# Infrastructure

{% hint style="info" %}
Most users should start with `ml launch`. It provisions a GPU cluster, runs your task, and handles the full lifecycle, so no manual VM management is needed. The workload commands (`launch`, `exec`, `status`, `stop`, `down`) are the fastest path from code to running on GPUs.
{% endhint %}

Infrastructure commands give you a lower-level interface to the Mithril spot marketplace. Instead of defining a task and letting SkyPilot manage the cluster, you manage instances and Kubernetes clusters directly. Use them when you need to:

* Build and manage Kubernetes node pools with GPU workers
* Maintain persistent VMs with your own orchestration, outside of SkyPilot's task-oriented model

### Quick reference

| Command                        | Purpose                                        |
| ------------------------------ | ---------------------------------------------- |
| **`ml instance create`**       | Submit a spot bid for GPU instances.           |
| **`ml instance list`**         | List instances and bids (active by default).   |
| **`ml instance info`**         | Show details for an instance.                  |
| **`ml instance delete`**       | Cancel a bid and terminate its instances.      |
| **`ml instance list-types`**   | Show available GPU instance types and specs.   |
| **`ml k8s list`**              | List Kubernetes clusters.                      |
| **`ml k8s info`**              | Show details for a Kubernetes cluster.         |
| **`ml k8s ssh`**               | SSH into a cluster's control node.             |
| **`ml k8s update-kubeconfig`** | Fetch credentials and update local kubeconfig. |
| **`ml ssh`**                   | SSH into a Mithril instance by bid name.       |

For full option lists, run `ml <command> --help`.

***

### ml instance create

Submit a spot bid for one or more GPU instances.

```
ml instance create [OPTIONS]
```

All instances are allocated through a blind second-price auction. You set a limit price (`-m`): the maximum you are willing to pay per hour per instance. The price you actually pay is set by market demand, not by your bid, so it can be lower than your limit. If the market price rises above your limit, the instance is preempted and the bid returns to the marketplace.
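
Because the limit price applies per hour per instance, the worst-case hourly spend for a bid is the limit multiplied by the instance count. The sketch below only illustrates that arithmetic; `max_hourly_spend` is a hypothetical helper, not part of the `ml` CLI, and the amount actually charged depends on the market price.

```bash
# Hypothetical helper (not part of the ml CLI): upper bound on hourly spend
# for a bid, given the limit price (-m) and the instance count (-N).
max_hourly_spend() {
  # $1 = limit price in USD per hour per instance, $2 = number of instances
  awk -v price="$1" -v count="$2" 'BEGIN { printf "%.2f\n", price * count }'
}

# Four instances at a $25.00 limit: the bid can cost at most $100.00 per hour.
max_hourly_spend 25.0 4
```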

#### Options

| Flag                               | Description                                                                            |
| ---------------------------------- | -------------------------------------------------------------------------------------- |
| `-i`, `--instance-type TYPE`       | **Required.** GPU instance type (e.g. `b200`, `8xa100`). See `ml instance list-types`. |
| `-r`, `--region REGION`            | **Required.** Region (e.g. `us-central5-a`).                                           |
| `-m`, `--max-price-per-hour PRICE` | **Required.** Maximum hourly price in USD. Accepts `8.0` or `$8.0`.                    |
| `-n`, `--name NAME`                | Bid name. Auto-generated if omitted.                                                   |
| `-N`, `--num-instances N`          | Number of instances in the bid. Default: 1.                                            |
| `-p`, `--project PROJECT`          | Project name. Skips interactive project selection.                                     |
| `--k8s CLUSTER`                    | Attach instances to a Kubernetes cluster as worker nodes.                              |
| `--wait`                           | Block until all instances reach the running state.                                     |
| `-w`, `--watch`                    | Watch instance progress interactively.                                                 |
| `-d`, `--dry-run`                  | Validate the configuration without submitting the bid.                                 |
| `--json`                           | Output JSON.                                                                           |

#### Examples

```bash
ml instance create -i b200 -r us-central5-a -m 32.0
ml instance create -i 8xa100 -r us-east1 -m 25.0 -n training -N 4
ml instance create -i b200 -r us-central5-a -m 32.0 --k8s my-cluster
ml instance create -i b200 -r us-central5-a -m 32.0 -d   # dry run
```
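
For scripted use, one possible pattern is to validate a bid with `--dry-run` before submitting it with `--wait`. The wrapper below is a sketch, not an official workflow: `submit_bid` is a hypothetical name, and it assumes that a failed dry run exits with a non-zero status, which this page does not confirm.

```bash
# Hypothetical wrapper (not part of the ml CLI): validate a bid with --dry-run,
# then submit it and block until the instances are running.
# Assumption: `ml instance create --dry-run` exits non-zero on invalid input.
submit_bid() {
  # All arguments are passed through to `ml instance create` unchanged.
  if ! ml instance create "$@" --dry-run; then
    echo "dry run failed; bid not submitted" >&2
    return 1
  fi
  ml instance create "$@" --wait
}

# Usage (requires the ml CLI and a configured project):
#   submit_bid -i b200 -r us-central5-a -m 32.0 -n training -N 2
```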

***

### ml instance list

List instances and spot bids.

```
ml instance list [OPTIONS]
```

By default, shows only active bids (pending through running). Use `--all` to include terminal states.

#### Options

| Flag                  | Description                                           |
| --------------------- | ----------------------------------------------------- |
| `--all`               | Include completed, failed, and cancelled bids.        |
| `-s`, `--state STATE` | Filter to a single state (e.g. `running`, `pending`). |
| `--limit N`           | Maximum number of results.                            |
| `--json`              | Output JSON.                                          |

#### Examples

```bash
ml instance list                    # active bids
ml instance list --all              # all bids
ml instance list --state running    # only running
ml instance list --json             # for scripting
```
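
Since `--state` filters to a single state, viewing several states means one call per state. A minimal sketch, where `list_by_state` is a hypothetical helper name:

```bash
# Hypothetical helper (not part of the ml CLI): print one listing per state,
# because --state accepts a single state at a time.
list_by_state() {
  local state
  for state in "$@"; do
    echo "== $state =="
    ml instance list --state "$state"
  done
}

# Usage (requires the ml CLI):
#   list_by_state running pending
```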

***

### ml instance info

Show detailed information about an instance or bid.

```
ml instance info [BID_NAME] [OPTIONS]
```

`BID_NAME` is the bid name or FID. Interactive if omitted.

Output includes status, instance type, region, current and maximum price, creation time, IP addresses (when running), SSH destination, and attached Kubernetes cluster (if any).

#### Options

| Flag     | Description  |
| -------- | ------------ |
| `--json` | Output JSON. |

#### Examples

```bash
ml instance info my-instance
ml instance info my-instance --json
ml instance info                      # interactive picker
```

***

### ml instance delete

Cancel a bid and terminate its instances.

```
ml instance delete [BID_NAME] [OPTIONS]
```

`BID_NAME` is the bid name or FID. Interactive if omitted. Running instances are terminated immediately; pending bids are cancelled. This cannot be undone.

#### Options

| Flag                           | Description                                             |
| ------------------------------ | ------------------------------------------------------- |
| `-y`, `--yes`                  | Skip the confirmation prompt.                           |
| `--all`                        | Cancel all active bids.                                 |
| `-n`, `--name-pattern PATTERN` | Cancel bids matching a wildcard pattern (e.g. `dev-*`). |

#### Examples

```bash
ml instance delete my-instance
ml instance delete my-instance -y         # skip confirmation
ml instance delete --all                  # cancel everything
ml instance delete -n 'training-*'        # pattern match
```
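
Because deletion cannot be undone, a script that deletes by pattern may want a guard against patterns that match everything. The function below is one way to sketch that; `delete_by_pattern` is a hypothetical name, and it deliberately omits `-y` so the CLI's confirmation prompt still applies.

```bash
# Hypothetical guard (not part of the ml CLI): cancel bids by name pattern,
# refusing an empty pattern or a bare '*' that would match every bid.
delete_by_pattern() {
  local pattern="$1"
  case "$pattern" in
    ''|'*')
      echo "refusing pattern '$pattern'; run 'ml instance delete --all' deliberately instead" >&2
      return 2
      ;;
  esac
  # No -y here, so the confirmation prompt is still shown.
  ml instance delete --name-pattern "$pattern"
}

# Usage (requires the ml CLI):
#   delete_by_pattern 'dev-*'
```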

***

### ml instance list-types

Show available GPU instance types and their specifications.

```
ml instance list-types [OPTIONS]
```

#### Options

| Flag                    | Description                        |
| ----------------------- | ---------------------------------- |
| `-r`, `--region REGION` | Filter to a specific region.       |
| `-v`, `--verbose`       | Include memory and detailed specs. |
| `--json`                | Output JSON.                       |

#### Examples

```bash
ml instance list-types                        # all types
ml instance list-types -r us-central5-a       # filter by region
ml instance list-types -v                     # with specs
```

***

### Instance states

Bids and instances have separate state models. `ml instance list` shows **bid status**; `ml instance info` shows both bid and instance-level detail.

#### Bid states

| State        | Meaning                                                                        |
| ------------ | ------------------------------------------------------------------------------ |
| `Open`       | Bid is active in the marketplace, waiting for or seeking allocation.           |
| `Allocated`  | Resources reserved; instances are provisioning or running.                     |
| `Preempting` | Instances are being preempted. The bid will return to `Open` for reallocation. |
| `Paused`     | Bid is temporarily paused.                                                     |
| `Terminated` | Bid is finished (cancelled by user or completed). Terminal state.              |

All states except `Terminated` are considered active and appear in `ml instance list` by default. Use `--all` to include terminated bids.

#### Instance states

Each instance within a bid progresses through its own lifecycle:

| State          | Meaning                                |
| -------------- | -------------------------------------- |
| `New`          | Instance record created.               |
| `Confirmed`    | Allocation confirmed.                  |
| `Scheduled`    | Scheduled for provisioning.            |
| `Initializing` | VM being initialized.                  |
| `Starting`     | VM booting.                            |
| `Running`      | Instance is up and accessible via SSH. |
| `Relocating`   | Being moved to a different host.       |
| `Preempting`   | Being preempted.                       |
| `Preempted`    | Preemption complete.                   |
| `Paused`       | Temporarily paused.                    |
| `Terminated`   | Permanently terminated.                |
| `Error`        | An error occurred.                     |

Instances in active states (`New` through `Running`) appear in `ml instance list` by default. Instances in terminal states (`Terminated`, `Error`) are hidden unless `--all` is passed.
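
The two state tables above can be encoded as small shell helpers for use in scripts. This is only a sketch: both function names are hypothetical, and it assumes state strings are spelled exactly as in the tables (the casing in actual CLI output may differ). Instance states that the text does not classify as active or terminal are reported as `other`.

```bash
# Hypothetical helpers (not part of the ml CLI) that encode the state tables.
# Assumption: states are spelled exactly as in the tables, e.g. "Running".

# Succeeds for bid states that are listed by default (all except Terminated).
is_active_bid_state() {
  case "$1" in
    Open|Allocated|Preempting|Paused) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints "active", "terminal", or "other" for an instance state.
classify_instance_state() {
  case "$1" in
    New|Confirmed|Scheduled|Initializing|Starting|Running) echo active ;;
    Terminated|Error) echo terminal ;;
    *) echo other ;;  # Relocating, Preempting, Preempted, Paused, or unknown
  esac
}
```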

#### Checking state

```bash
ml instance list                    # bid-level status
ml instance list --state running    # filter by state
ml instance info my-instance        # full detail including instance states
```

***

### ml k8s list

List Kubernetes clusters.

```
ml k8s list [OPTIONS]
```

By default, shows active clusters only.

#### Options

| Flag     | Description                  |
| -------- | ---------------------------- |
| `--all`  | Include terminated clusters. |
| `--json` | Output JSON.                 |

#### Examples

```bash
ml k8s list
ml k8s list --all
ml k8s list --json
```

***

### ml k8s info

Show detailed information about a Kubernetes cluster.

```
ml k8s info [CLUSTER] [OPTIONS]
```

`CLUSTER` is the cluster name or FID. Interactive if omitted.

Output includes cluster status, region, Kubernetes version, control-plane endpoint, node count, and attached instances.

#### Options

| Flag     | Description  |
| -------- | ------------ |
| `--json` | Output JSON. |

#### Examples

```bash
ml k8s info my-cluster
ml k8s info my-cluster --json
ml k8s info                     # interactive picker
```

***

### ml k8s ssh

SSH into the control node of a Kubernetes cluster.

```
ml k8s ssh [CLUSTER] [COMMAND ...] [OPTIONS]
```

`CLUSTER` is the cluster name or FID. Interactive if omitted. Without a command, opens an interactive shell. With a trailing command, executes it on the control node and returns.

#### Options

| Flag                    | Description                                    |
| ----------------------- | ---------------------------------------------- |
| `--show`                | Print the SSH command instead of executing it. |
| `-i`, `--identity PATH` | SSH identity file. Auto-detected if omitted.   |

#### Examples

```bash
ml k8s ssh my-cluster                           # interactive shell
ml k8s ssh my-cluster kubectl get nodes         # remote command
ml k8s ssh my-cluster --show                    # print ssh command
```
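
Output from a remote command can be post-processed locally, because an unquoted pipe is interpreted by your local shell rather than on the control node. The sketch below counts nodes whose `STATUS` column is exactly `Ready`; `ready_node_count` is a hypothetical helper, and it assumes the CLI passes the remote command's standard output through unchanged.

```bash
# Hypothetical helper (not part of the ml CLI): count Ready nodes by running
# `kubectl get nodes` on the control node and filtering the output locally.
# Assumption: remote stdout is passed through unmodified.
ready_node_count() {
  # Column 2 of `kubectl get nodes` is STATUS; the header row never matches.
  ml k8s ssh "$1" kubectl get nodes |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (requires the ml CLI):
#   ready_node_count my-cluster
```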

***

### ml k8s update-kubeconfig

Fetch cluster credentials and merge them into local kubeconfig.

```
ml k8s update-kubeconfig [CLUSTER] [OPTIONS]
```

`CLUSTER` is the cluster name or FID. Interactive if omitted.

Connects to the control node via SSH, downloads the kubeconfig, backs up the existing `~/.kube/config`, merges the new credentials, and validates with `kubectl cluster-info`.

#### Options

| Flag                    | Description                                        |
| ----------------------- | -------------------------------------------------- |
| `-i`, `--identity PATH` | SSH identity file. Auto-detected if omitted.       |
| `-y`, `--yes`           | Skip the confirmation prompt.                      |
| `--no-backup`           | Skip creating a backup of the existing kubeconfig. |
| `--skip-validation`     | Skip the `kubectl cluster-info` validation step.   |

#### Examples

```bash
ml k8s update-kubeconfig my-cluster
ml k8s update-kubeconfig my-cluster -y --no-backup
ml k8s update-kubeconfig                              # interactive picker
```

After updating:

```bash
kubectl config get-contexts
kubectl get nodes
```
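
Combined with `ml instance create --k8s`, these commands can be chained into a single step that adds GPU workers and then points `kubectl` at the cluster. The function below is a sketch with a hypothetical name, `add_workers_and_connect`. It assumes the cluster already exists, and a newly running instance may take additional time to register as a node, so the final listing might not show it immediately.

```bash
# Hypothetical workflow (not part of the ml CLI): attach GPU workers to an
# existing cluster, refresh the local kubeconfig, then list the nodes.
add_workers_and_connect() {
  local cluster="$1"
  shift
  # Remaining arguments go to `ml instance create`, e.g. -i, -r, -m, -N.
  ml instance create "$@" --k8s "$cluster" --wait || return 1
  # --yes skips the prompt; the existing kubeconfig is still backed up.
  ml k8s update-kubeconfig "$cluster" --yes || return 1
  kubectl get nodes
}

# Usage (requires the ml CLI and kubectl):
#   add_workers_and_connect my-cluster -i b200 -r us-central5-a -m 32.0
```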

***

### ml ssh

SSH into a Mithril instance by bid name.

```
ml ssh [BID_NAME] [OPTIONS] [-- COMMAND ...]
```

`BID_NAME` is the spot bid name. Interactive if omitted — presents a picker of running instances. By default, waits for the instance to become available and for SSH to be ready before connecting.

This command connects to **Mithril instances** (created via `ml instance create`). To SSH into a **SkyPilot cluster** (created via `ml launch`), use `ssh CLUSTER` directly — SkyPilot configures your SSH config automatically.

#### Options

| Flag           | Description                                                                                                                         |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `--node INDEX` | Node index for multi-instance bids. Defaults to 0 for interactive sessions; commands without `--node` run on all nodes in parallel. |
| `--show`       | Print the SSH command instead of executing it.                                                                                      |
| `--no-wait`    | Fail immediately if the instance is not running (do not wait for it to become available).                                           |

#### Examples

```bash
ml ssh my-instance                            # interactive shell
ml ssh my-instance nvidia-smi                 # run command on all nodes
ml ssh my-instance --node 0 nvidia-smi        # run on specific node
ml ssh my-instance --show                     # print ssh command
ml ssh                                        # interactive picker
```
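
Commands without `--node` run on all nodes in parallel. When sequential, labelled output is easier to read, a loop over node indices is one option. The sketch below uses a hypothetical helper, `run_on_each_node`, and assumes node indices run from 0 to N-1 for a bid with N instances.

```bash
# Hypothetical helper (not part of the ml CLI): run a command on each node of
# a multi-instance bid, one node at a time.
# Assumption: node indices are 0 .. N-1 for a bid with N instances.
run_on_each_node() {
  local bid="$1" count="$2" i=0
  shift 2
  while [ "$i" -lt "$count" ]; do
    echo "--- node $i ---"
    ml ssh "$bid" --node "$i" "$@"
    i=$((i + 1))
  done
}

# Usage (requires the ml CLI):
#   run_on_each_node my-instance 4 nvidia-smi
```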

***
