# Overview

The Mithril CLI removes the manual setup typically required to run ML workloads on remote infrastructure. Workloads are defined **declaratively in YAML** — specifying code, compute requirements, and runtime configuration in a single spec. Instead of SSHing into nodes or stitching together launch scripts, researchers submit runs directly from these definitions. The CLI handles packaging, scheduling, and execution **across Mithril and other clouds**, supporting common patterns like model training, offline batch inference, and large-scale evaluation.

#### Defining a basic workload

```yaml
# task.yaml
resources:
  infra: mithril
  accelerators: B200:8

num_nodes: 2

setup: |
  pip install -r requirements.txt

run: |
  # SKYPILOT_NODE_IPS lists one IP per line; the first node serves as the rendezvous host.
  MASTER_ADDR=$(echo "$SKYPILOT_NODE_IPS" | head -n1)
  torchrun \
    --nnodes=$SKYPILOT_NUM_NODES \
    --nproc_per_node=$SKYPILOT_NUM_GPUS_PER_NODE \
    --master_addr=$MASTER_ADDR \
    --node_rank=$SKYPILOT_NODE_RANK \
    train.py --distributed
```
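
For comparison, a single-node variant of the spec above might look like the following. This is a minimal sketch, assuming the same `train.py` and `requirements.txt`. It drops `num_nodes` and the rendezvous flags and relies on torchrun's `--standalone` mode; the file name `task-single.yaml` is hypothetical.

```yaml
# task-single.yaml (hypothetical file name; single-node sketch of the spec above)
resources:
  infra: mithril
  accelerators: B200:8

setup: |
  pip install -r requirements.txt

run: |
  # --standalone lets torchrun handle rendezvous locally, so no MASTER_ADDR is needed.
  torchrun \
    --standalone \
    --nproc_per_node=$SKYPILOT_NUM_GPUS_PER_NODE \
    train.py --distributed
```

Scaling back out to several nodes then means restoring `num_nodes` and the rendezvous flags shown earlier.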

#### Running the workload

```bash
ml launch task.yaml
```

#### Provisioning and scheduling

Before execution, the CLI evaluates available capacity and proposes a cluster configuration:

```
> ml launch task.yaml
mithril-client 0.1.0
SkyPilot API server 0.1.0a4
Considered resources (2 nodes):
-------------------------------------------------------------------------------------
 INFRA                     INSTANCE   vCPUs   Mem(GB)   GPUS     COST ($)   CHOSEN
-------------------------------------------------------------------------------------
 Mithril (us-central5-a)   b200.8x    232      192       B200:8   2.25          ✔
-------------------------------------------------------------------------------------
Launching a new cluster 'sky-692d-olivier'. Proceed? [Y/n]:
• Launching... View logs: ml logs --provision sky-692d-olivier
```

#### Features

* **Attach storage** – Mount persistent volumes or cloud buckets for datasets, checkpoints, and run outputs.
* **Scale from single-node to distributed training** – Provision multi-node GPU clusters with InfiniBand networking automatically configured.
* **Cost protection** – Set maximum price limits to control spend.
* **Idle auto-shutdown** – Instances pause automatically when GPUs are no longer in use.
* **AI-native** – Built with coding agents in mind.
* **Multi-cloud ready** – Launch workloads across Mithril, Nebius, Oracle, GCP, AWS, and 15 other providers.
* **No lock-in** – Workload specs and CLI workflows build on open-source SkyPilot — not proprietary tooling.
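
As an illustration of the storage and idle-shutdown features, the sketch below uses the `file_mounts` and `resources.autostop` fields from the SkyPilot task YAML. It assumes the Mithril CLI accepts these fields unchanged and that credentials for the bucket's cloud are configured; the bucket name, mount path, and `--ckpt-dir` flag are hypothetical. Whether autostop maps exactly onto the "pause" behavior described above is not confirmed here.

```yaml
# Sketch only: field names follow the SkyPilot task YAML; all values are placeholders.
resources:
  infra: mithril
  accelerators: B200:8
  autostop:
    idle_minutes: 10        # assumed idle threshold before the cluster is stopped

file_mounts:
  /checkpoints:             # hypothetical mount path inside the instance
    name: my-checkpoints    # hypothetical bucket name
    store: s3               # assumes S3 access is set up for this account
    mode: MOUNT

run: |
  # --ckpt-dir is a hypothetical flag of train.py, shown only to use the mount.
  python train.py --ckpt-dir /checkpoints
```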

#### Built on SkyPilot

The Mithril CLI is built on the open-source [SkyPilot](https://skypilot.co/) framework — adopting its workload definition model, provisioning engine, and multi-cloud integrations.

This means existing SkyPilot workflows run unchanged, and workloads remain portable across all SkyPilot-supported clouds.

Mithril extends this foundation at the capacity layer, integrating auction-based GPU allocation, flexible reservation models, and cost controls directly into the same declarative workflow.
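
To make the portability claim concrete, the sketch below uses SkyPilot's `any_of` resource list to let the same task run on more than one provider. It assumes the Mithril CLI honors `any_of` as upstream SkyPilot does and that the listed clouds are enabled for the account; the choice of AWS as the second candidate is only an example.

```yaml
# Sketch only: the same task with more than one candidate provider.
resources:
  accelerators: B200:8
  any_of:
    - infra: mithril
    - infra: aws            # example alternative; requires AWS to be enabled

num_nodes: 2

setup: |
  pip install -r requirements.txt

run: |
  MASTER_ADDR=$(echo "$SKYPILOT_NODE_IPS" | head -n1)
  torchrun \
    --nnodes=$SKYPILOT_NUM_NODES \
    --nproc_per_node=$SKYPILOT_NUM_GPUS_PER_NODE \
    --master_addr=$MASTER_ADDR \
    --node_rank=$SKYPILOT_NODE_RANK \
    train.py --distributed
```

Only the `resources` block differs from the basic workload; `setup` and `run` stay the same, which is the sense in which the spec is portable.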

