Workloads
The Mithril CLI manages clusters and jobs through SkyPilot, an open-source framework for running ML workloads. The commands below (launch, exec, status, logs, queue, start, stop, and down) provide the core workflow: provision GPUs, run tasks, monitor progress, and clean up when you're done. For options not covered here, run ml --help, or use ml sky for direct access to SkyPilot's CLI.
Quick reference
ml launch: Create cluster and run a task (or start interactive setup).
ml exec: Run a task or command on an existing cluster (or open a shell).
ml status: List clusters and jobs; optionally show IP or endpoints.
ml logs: Stream or download job logs; stream provision/autostop logs.
ml queue: Show job queue for cluster(s).
ml start: Start stopped (or failed) cluster(s).
ml stop: Stop cluster(s); keep disks for later ml start.
ml down: Tear down cluster(s) and delete resources.
For full option lists, run ml <command> --help.
ml launch
Provision a cluster and run a task. Given a task YAML or an inline command, ml launch provisions the cluster and, unless --detach-run is set, streams the job logs.
When invoked with no arguments, ml launch starts an interactive onboarding flow:
Creates a starter task YAML (task.yaml) with annotated fields for resources, setup, and run.
Optionally adds an AGENTS.md to your project so coding agents (Claude, Cursor, etc.) can discover the Mithril CLI docs bundled with the package.
Prints a ready-to-use launch command and, if AGENTS.md was written, a prompt for an agent-guided walkthrough.
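To give a feel for what a starter task file with resources, setup, and run fields can look like, here is a minimal sketch. It assumes the field layout of SkyPilot's task YAML; the file that ml launch actually generates may be annotated and structured differently. The file name example-task.yaml and the task name are hypothetical, chosen so the sketch does not overwrite a real task.yaml.

```shell
# Write a starter-style task file (sketch; assumes SkyPilot's task YAML fields).
cat > example-task.yaml <<'EOF'
name: hello-gpu            # hypothetical task name

resources:
  accelerators: A100:1     # GPU type and count

setup: |
  echo "install dependencies here"

run: |
  nvidia-smi
EOF

# Show the file that was written.
cat example-task.yaml
```

A file like this can then be passed to ml launch as the ENTRYPOINT argument.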
Synopsis
ml launch [OPTIONS] [ENTRYPOINT]
Arguments
ENTRYPOINT
Optional. Path to a task YAML (.yaml/.yml) or a single bash command in quotes. Omit to start interactive setup.
Options
-c, --cluster NAME
Cluster name. If the cluster exists, reuses it; otherwise creates it.
--gpus SPEC
GPU type and count (e.g. A100:4, H100:8).
--cpus SPEC
vCPU requirement (e.g. 4, 4+).
--memory SPEC
Memory in GB.
--cloud CLOUD
Cloud provider.
--region REGION
Region.
--num-nodes N
Number of nodes.
-i, --idle-minutes-to-autostop N
Auto-stop cluster after N minutes of idleness.
--down
Tear down the cluster after the job finishes.
-d, --detach-run
Do not stream job logs; return after the job is submitted.
-r, --retry-until-up
Retry provisioning until the cluster is up.
-y, --yes
Skip confirmation prompts.
--dryrun
Print cluster name, task, and resources only; do not launch.
-n, --name NAME
Task name.
--workdir DIR
Local directory to sync as the task workdir.
-e, --env KEY=VALUE
Set environment variables (repeatable).
Examples
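The invocations below are a sketch of common ml launch patterns, using only the options listed above. The cluster names (dev, batch), the file task.yaml, and the EPOCHS variable are hypothetical. Each command is an independent example, not a script meant to be run top to bottom. The first line is a preview fallback added for this sketch: it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Create (or reuse) a cluster named dev with four A100s and run task.yaml on it.
ml launch -c dev --gpus A100:4 task.yaml

# Run an inline command on a separate cluster, return once the job is
# submitted, and tear the cluster down when the job finishes.
ml launch -c batch --gpus H100:8 --down -d 'python train.py'

# Print the cluster name, task, and resources without launching anything.
ml launch --dryrun -c dev -e EPOCHS=3 task.yaml

# Skip prompts, sync the current directory as the workdir, and stop the
# cluster after 30 idle minutes.
ml launch -y -c dev -i 30 --workdir . task.yaml
```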
ml exec
Run a task or command on an existing cluster without re-provisioning. Use a task YAML or a bash command. For interactive use, open a shell with ml exec CLUSTER or use ml ssh CLUSTER.
Synopsis
ml exec [OPTIONS] CLUSTER [ENTRYPOINT]
Arguments
CLUSTER
Cluster name.
ENTRYPOINT
Optional. Task YAML path or bash command. Omit to open an interactive shell on the head node.
Options
-d, --detach-run
Submit the job and return; do not stream logs.
Additional task and resource options (e.g. --workdir, --gpus, --env) are supported; run ml exec --help for the full list.
Examples
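A few ml exec invocations, sketched from the arguments and options described above. The cluster name dev, the file task.yaml, and the training command are hypothetical, and each line is an independent example. The first line is a preview fallback added for this sketch so the block can be run without the CLI installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Run a one-off bash command on the existing cluster and stream its output.
ml exec dev 'nvidia-smi'

# Run a task YAML on the same cluster without re-provisioning.
ml exec dev task.yaml

# Submit a job and return immediately instead of streaming logs.
ml exec -d dev 'python train.py'

# Open an interactive shell on the head node (either form).
ml exec dev
ml ssh dev
```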
ml status
List clusters and their jobs. Running ml status also updates your local SSH config, so you can ssh CLUSTER directly or use ml exec. When exactly one cluster is given, --ip, --endpoints, or --endpoint PORT returns its connection details.
Synopsis
ml status [OPTIONS] [CLUSTER]...
Arguments
CLUSTER
Optional. One or more cluster names. Default: all clusters.
Options
-v, --verbose
Show all fields.
-r, --refresh
Query the latest status from the cloud. Use this when clusters may have changed outside Mithril or through autostop.
--ip
Show head node IP (only with exactly one cluster).
--endpoints
Show all exposed endpoints (only with exactly one cluster).
--endpoint PORT
Show URL for the given port (only with exactly one cluster).
--show-managed-jobs / --no-show-managed-jobs
Include in-progress managed jobs (default: show).
--show-services / --no-show-services
Include Sky Serve services (default: show).
--show-pools / --no-show-pools
Include pools (default: show).
--all-users
Include clusters for all users.
Cluster states
UP
Ready; provisioning and setup completed.
STOPPED
Stopped; use ml start to restart.
INIT
Provisioning or setup in progress, or cluster in an inconsistent state.
Examples
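The following sketch shows typical ml status calls built from the options above. The cluster name dev and port 8080 are hypothetical. The first line is a preview fallback added for this sketch; it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# List all of your clusters and their jobs.
ml status

# Refresh state from the cloud first, e.g. after an autostop may have fired.
ml status -r

# Show every field for a single cluster.
ml status -v dev

# Connection details; these options require exactly one cluster.
ml status --ip dev
ml status --endpoint 8080 dev
```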
ml logs
Stream or download job logs, or stream provisioning/autostop logs.
Synopsis
ml logs [OPTIONS] CLUSTER [JOB_ID]...
Arguments
CLUSTER
Cluster name.
JOB_ID
Optional. Job ID(s). If omitted, uses the latest job. For streaming, at most one job; for --sync-down, multiple allowed.
Options
--provision
Stream cluster provisioning logs (provision.log).
--autostop
Stream autostop hook logs.
-w, --worker ID
Worker ID for logs (only with --provision).
-s, --sync-down
Download job logs to ~/sky_logs (multiple job IDs allowed).
--status
Do not show logs; exit with status code: 0 = succeeded, 100 = failed, 101 = not finished, 102 = not found, 103 = cancelled.
--follow / --no-follow
Stream logs continuously (default: follow).
--tail N
Show only the last N lines (0 = all).
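Because --status reports the job state through the exit code, it lends itself to scripting. The sketch below maps the exit codes listed above to labels and polls until a job is no longer running. The helper names job_state and wait_for_job and the POLL_SECONDS variable are hypothetical, not part of the CLI.

```shell
# job_state CLUSTER [JOB_ID]: print a label for the job's state.
# Hypothetical helper; relies only on the exit codes documented for --status.
job_state() {
  ml logs --status "$@" >/dev/null 2>&1
  case $? in
    0)   echo "succeeded" ;;
    100) echo "failed" ;;
    101) echo "not finished" ;;
    102) echo "not found" ;;
    103) echo "cancelled" ;;
    *)   echo "unknown" ;;   # any other code, e.g. the CLI itself failed
  esac
}

# wait_for_job CLUSTER [JOB_ID]: poll until the job leaves the
# "not finished" state, then print the final label.
wait_for_job() {
  while [ "$(job_state "$@")" = "not finished" ]; do
    sleep "${POLL_SECONDS:-30}"
  done
  job_state "$@"
}
```

With these definitions, wait_for_job dev 3 would block until job 3 on a cluster named dev finishes and then print its final state.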
Examples
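A sketch of common ml logs calls, using only the arguments and options described above. The cluster name dev and the job IDs are hypothetical, and each line is an independent example. The first line is a preview fallback added for this sketch; it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Stream logs for the latest job on the cluster.
ml logs dev

# Stream logs for a specific job.
ml logs dev 3

# Print the last 100 lines of a job's log and exit instead of following.
ml logs --no-follow --tail 100 dev 3

# Stream the cluster's provisioning logs.
ml logs --provision dev

# Download logs for several jobs to ~/sky_logs.
ml logs -s dev 1 2 3
```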
ml queue
Show the job queue for one or more clusters. Finished jobs are listed alongside pending and running ones unless --skip-finished is set.
Synopsis
ml queue [OPTIONS] [CLUSTER]...
Arguments
CLUSTER
Optional. Cluster name(s). Default: all clusters.
Options
-s, --skip-finished
Show only pending and running jobs.
--all-users
Show queue for all users.
Examples
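The calls below sketch how ml queue can be used with the options above. The cluster name dev is hypothetical. The first line is a preview fallback added for this sketch; it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Show the job queue for all of your clusters.
ml queue

# Show the queue for one cluster.
ml queue dev

# Show only pending and running jobs on that cluster.
ml queue -s dev

# Include jobs submitted by all users.
ml queue --all-users
```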
ml start
Start one or more stopped clusters (or retry provisioning/setup for clusters in INIT). No effect if a cluster is already UP, unless --force is given.
Synopsis
ml start [OPTIONS] [CLUSTER]...
Arguments
CLUSTER
Optional. Cluster name(s). Default: all clusters (or the single cluster if only one exists).
Options
-a, --all
Start all clusters.
-y, --yes
Skip confirmation.
-i, --idle-minutes-to-autostop N
Set autostop after N minutes of idleness.
--down
Use autodown (tear down after idleness); requires --idle-minutes-to-autostop.
-r, --retry-until-up
On availability failures, keep retrying until the cluster is up.
-f, --force
Start even if already UP (e.g. to upgrade SkyPilot runtime).
Examples
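A sketch of ml start usage based on the options above. The cluster name dev is hypothetical, and each line is an independent example. The first line is a preview fallback added for this sketch; it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Restart a stopped cluster.
ml start dev

# Restart without a prompt and stop again after 60 idle minutes.
ml start -y -i 60 dev

# Restart and tear the cluster down after 60 idle minutes
# (--down requires --idle-minutes-to-autostop).
ml start -i 60 --down dev

# Keep retrying if capacity is unavailable.
ml start -r dev

# Start every cluster.
ml start -a
```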
ml stop
Stop one or more clusters. Billing for instances stops; attached disks are kept and reattached when you run ml start. Spot clusters cannot be stopped.
Synopsis
ml stop [OPTIONS] [CLUSTER]...
Arguments
CLUSTER
Optional. Cluster name(s) or glob (e.g. cluster*).
Options
-a, --all
Stop all clusters.
--all-users
Stop all clusters for all users.
-y, --yes
Skip confirmation.
--graceful
Wait for MOUNT_CACHED uploads to complete (cancels current jobs first).
--graceful-timeout N
Timeout in seconds for --graceful.
Examples
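The following sketch illustrates ml stop with the options above. The cluster names and the 300-second timeout are hypothetical. The glob is quoted so that the local shell passes it to ml unexpanded rather than matching it against local file names. The first line is a preview fallback added for this sketch; it prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Stop a single cluster; its disks are kept for a later ml start.
ml stop dev

# Stop every cluster whose name matches the glob, without a prompt.
ml stop -y 'cluster*'

# Wait up to 300 seconds for MOUNT_CACHED uploads before stopping.
ml stop --graceful --graceful-timeout 300 dev

# Stop all of your clusters.
ml stop -a
```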
ml down
Tear down one or more clusters. All associated resources are deleted and billing stops; data on attached disks is lost.
Synopsis
ml down [OPTIONS] [CLUSTER]...
Arguments
CLUSTER
Optional. Cluster name(s) or glob (e.g. cluster*).
Options
-a, --all
Tear down all clusters.
--all-users
Tear down all clusters for all users.
-y, --yes
Skip confirmation.
-p, --purge
(Advanced) Remove cluster(s) from SkyPilot’s table even if cloud teardown failed. Use only when troubleshooting; you are responsible for cleaning up leaked resources.
--graceful
Wait for MOUNT_CACHED uploads before terminating (cancels current jobs first).
--graceful-timeout N
Timeout in seconds for --graceful.
Examples
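A sketch of ml down usage based on the options above. The cluster names and the 300-second timeout are hypothetical, and each line is an independent example. Teardown deletes data on attached disks, so the preview fallback on the first line (added for this sketch) only prints the commands when the ml CLI is not installed.

```shell
# Preview fallback (not part of the CLI): print commands when ml is not installed.
command -v ml >/dev/null 2>&1 || ml() { printf 'ml %s\n' "$*"; }

# Tear down a single cluster and delete its resources.
ml down dev

# Tear down every cluster whose name matches the glob, without a prompt.
# The glob is quoted so the local shell does not expand it.
ml down -y 'cluster*'

# Let MOUNT_CACHED uploads finish (up to 300 seconds) before terminating.
ml down --graceful --graceful-timeout 300 dev

# Advanced: drop a cluster record even if cloud teardown failed.
# You are then responsible for cleaning up any leaked resources.
ml down -p dev
```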