flow.sdk.models

TaskSpec

Complete task specification - the core IR model.

Fields:

api_version (Literal): IR schema version (default: flow.ir/v1)
name (str): Task name
command (list): Command to execute
resources (ResourceSpec): Resource requirements
mounts (list): Volume mounts
params (RunParams): Runtime parameters

ResourceSpec

Hardware resource requirements.

Fields:

gpus (int): Number of GPUs required (default: 0)
gpu_type (str | None): GPU type (e.g., 'H100-80GB')
cpus (int): Number of CPUs required (default: 4)
memory_gb (int): Memory in GB (default: 16)
accelerator_hints (dict): Hints for accelerator configuration (MIG, NVLink, SXM/PCIe, compute capability)

MountSpec

Volume mount specification.

Fields:

kind (Literal): Type of mount
source (str): Source path or URI
target (str): Target mount path in container
read_only (bool): Whether mount is read-only (default: True)
cache (dict[str, str] | None): Cache configuration for remote mounts

RunParams

Runtime parameters for task execution.

Fields:

env (dict): Environment variables
working_dir (str | None): Working directory
retry (int): Number of retries on failure (default: 0)
preemptible_ok (bool): Allow preemptible instances (default: False)
time_limit_s (int | None): Time limit in seconds
image (str | None): Container image to use

TaskStatus

Task lifecycle states.

Values:

PENDING = "pending"
RUNNING = "running"
PAUSED = "paused"
PREEMPTING = "preempting"
COMPLETED = "completed"
FAILED = "failed"
CANCELLED = "cancelled"
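
The values above can be mirrored in a small stand-in enum to show lookup by string value. This is a sketch only: the class below is not imported from the SDK, the `str` mixin is an assumption, and both `TERMINAL_STATES` and `is_terminal` are hypothetical names (the reference does not state which states are final).

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Stand-in mirroring the documented values; not the SDK class."""
    PENDING = "pending"
    RUNNING = "running"
    PAUSED = "paused"
    PREEMPTING = "preempting"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: these three states are final; the reference does not say so.
TERMINAL_STATES = frozenset(
    {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED}
)


def is_terminal(status: TaskStatus) -> bool:
    """Hypothetical helper: True once a task is assumed to be finished."""
    return status in TERMINAL_STATES


# Look up a member from the string value returned by an API payload.
status = TaskStatus("preempting")
```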

InstanceStatus

Status of a compute instance.

Values:

PENDING = "pending"
RUNNING = "running"
STOPPED = "stopped"
TERMINATED = "terminated"

ReservationStatus

Reservation lifecycle states.

Values:

SCHEDULED = "scheduled"
ACTIVE = "active"
EXPIRED = "expired"
FAILED = "failed"

StorageInterface

Storage interface type.

Values:

BLOCK = "block"
FILE = "file"

Retries

Retry policy with fixed or exponential backoff.

Fields:

max_retries (int): Maximum retry attempts (0-10) (default: 3)
backoff_coefficient (float): Delay multiplier between retries (default: 2.0)
initial_delay (float): Initial delay in seconds before first retry (default: 1.0)
max_delay (float | None): Maximum delay between retries (seconds)

Retries.get_delay

get_delay(self, attempt: int) -> float

Calculate delay for a given retry attempt.

Parameters:

attempt: Retry attempt number (1-based)

Returns:

Delay in seconds before this retry attempt

Retries.validate_delays

validate_delays(self) -> Retries

Ensure max_delay is greater than initial_delay if set.
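
A minimal sketch of how these fields could combine into a delay, assuming the common formula `initial_delay * backoff_coefficient ** (attempt - 1)` capped at `max_delay`. The reference does not give the exact formula, so `RetrySketch` is a hypothetical stand-in rather than the SDK's `Retries` class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetrySketch:
    """Hypothetical stand-in using the documented field names and defaults."""
    max_retries: int = 3
    backoff_coefficient: float = 2.0
    initial_delay: float = 1.0
    max_delay: Optional[float] = None

    def __post_init__(self) -> None:
        # Documented range for max_retries is 0-10.
        if not 0 <= self.max_retries <= 10:
            raise ValueError("max_retries must be between 0 and 10")
        # Documented rule: max_delay, if set, must exceed initial_delay.
        if self.max_delay is not None and self.max_delay <= self.initial_delay:
            raise ValueError("max_delay must be greater than initial_delay")

    def get_delay(self, attempt: int) -> float:
        """Delay in seconds before the given 1-based retry attempt."""
        if attempt < 1:
            raise ValueError("attempt is 1-based")
        # Assumed formula: geometric growth from initial_delay.
        delay = self.initial_delay * self.backoff_coefficient ** (attempt - 1)
        if self.max_delay is not None:
            delay = min(delay, self.max_delay)
        return delay


exponential = RetrySketch()                      # delays of 1, 2, 4 seconds
fixed = RetrySketch(backoff_coefficient=1.0)     # constant 1 second delay
capped = RetrySketch(max_delay=3.0)              # growth stops at 3 seconds
```

Under this assumption, a `backoff_coefficient` of 1.0 yields the fixed backoff mentioned in the class description.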

Task

Task handle with lifecycle control (status, logs, wait, cancel, ssh).

Fields:

task_id (str): Task UUID
name (str): Human-readable name
status (TaskStatus): Execution state
config (flow.sdk.models.task_config.TaskConfig | None): Original configuration
created_at (datetime)
started_at (datetime.datetime | None)
completed_at (datetime.datetime | None)
instance_created_at (datetime.datetime | None): Creation time of current instance (for preempted/restarted tasks)
instance_type (str)
num_instances (int)
region (str)
cost_per_hour (str): Hourly cost
total_cost (str | None): Accumulated cost
created_by (str | None): Creator user ID
ssh_host (str | None): SSH endpoint
ssh_port (int | None): SSH port (default: 22)
ssh_user (str): SSH user (default: ubuntu)
shell_command (str | None): Complete shell command
endpoints (dict): Exposed service URLs
instances (list): Instance identifiers
message (str | None): Human-readable status
provider_metadata (dict): Provider-specific state and metadata (e.g., Mithril bid status, preemption reasons)

Task.cancel

cancel(self) -> None

Task.get_instances

get_instances(self) -> list[Instance]

Task.get_user

get_user(self) -> Any | None

Task.logs

logs(
    self,
    follow: bool = False,
    tail: int = 100,
    stderr: bool = False,
    source: str | None = None,
    stream: str | None = None
) -> str | Iterator[str]

Task.refresh

refresh(self) -> None

Task.result

result(self) -> Any

Task.shell

shell(
    self,
    command: str | None = None,
    node: int | None = None,
    progress_context = None,
    record: bool = False
) -> None

Task.stop

stop(self) -> None

Task.wait

wait(self, timeout: int | None = None) -> None
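
The reference lists `refresh()` and `wait()` without describing their behavior. One way such a wait could be built on top of `refresh()` and `status` is a polling loop, sketched below. `FakeTask`, `wait_until_terminal`, the plain-string statuses, and the set of terminal states are all assumptions made for illustration; none of them come from the SDK.

```python
import time
from typing import Callable, Iterable, Optional

# Assumption: these states are final. The SDK uses a TaskStatus enum;
# plain strings keep this sketch self-contained.
TERMINAL = {"completed", "failed", "cancelled"}


class FakeTask:
    """Hypothetical stand-in exposing only status and refresh()."""

    def __init__(self, states: Iterable[str]) -> None:
        self._states = iter(states)
        self.status = next(self._states)

    def refresh(self) -> None:
        # Advance to the next scripted state, or stay on the last one.
        self.status = next(self._states, self.status)


def wait_until_terminal(
    task,
    timeout: Optional[float] = None,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll task.refresh() until the status is terminal or time runs out."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while task.status not in TERMINAL:
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"task still {task.status!r} after {timeout}s")
        sleep(poll_s)
        task.refresh()
    return task.status


final = wait_until_terminal(
    FakeTask(["pending", "running", "completed"]), poll_s=0.0
)
```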

TaskConfig

Complete task specification used by Flow.run().

Provides a single, explicit way to express task requirements and fails fast with clear validation errors.

Fields:

name (str): Task identifier (default: flow-task)
unique_name (bool): Append unique suffix to name to ensure uniqueness (default: True)
instance_type (str | None): Explicit instance type
min_gpu_memory_gb (int | None): Minimum GPU memory requirement
command (str | list[str] | None): Command to execute when the task starts. Supports three formats: list format (recommended for precise control), single-line string (shell execution), or multi-line script (for complex workflows). Multi-line commands or scripts starting with shebang (#!) are automatically detected and executed as shell scripts. If not specified, defaults to 'sleep infinity' for interactive sessions.
Examples:
- ['python', 'train.py', '--epochs', '10']
- 'python train.py --epochs 10'
- Multi-line script:
  #!/bin/bash
  pip install -r requirements.txt
  python train.py
  python evaluate.py
- 'nvidia-smi'
image (str): Container image (default: nvidia/cuda:12.1.0-runtime-ubuntu22.04)
env (dict): Environment
working_dir (str): Container working directory (default: /workspace)
volumes (list)
data_mounts (list): Data to mount
ports (list): Container/instance ports to expose. High ports only (>=1024).
allow_docker_cache (bool): Allow mounting a volume at /var/lib/docker to persist Docker image layers. Single-node tasks only; use with caution. (default: False)
retries (flow.sdk.models.retry.Retries | None): Advanced retry configuration for task submission/execution
max_price_per_hour (float | None): Maximum hourly price (USD)
max_run_time_hours (float | None): Maximum runtime hours; 0 or None disables runtime monitoring
min_run_time_hours (float | None): Minimum guaranteed runtime hours
deadline_hours (float | None): Hours from submission until deadline
ssh_keys (list): Authorized SSH key IDs
allocation_mode (Literal): Allocation strategy: 'spot' (default, preemptible), 'reserved' (scheduled capacity), or 'auto'. (default: spot)
reservation_id (str | None): Target an existing reservation (advanced).
scheduled_start_time (str | None): When allocation_mode='reserved', schedule start (UTC).
reserved_duration_hours (int | None): When allocation_mode='reserved', reservation duration in hours (3-336).
region (str | None): Target region
num_instances (int): Instance count (default: 1)
priority (Literal): Task priority tier affecting limit price (default: med)
distributed_mode (Optional): Distributed rendezvous mode when num_instances > 1: 'auto' lets Flow assign rank and leader IP; 'manual' expects user-set FLOW_* envs.
internode_interconnect (str | None): Preferred inter-node network (e.g., InfiniBand, IB_3200, Ethernet)
intranode_interconnect (str | None): Preferred intra-node interconnect (e.g., SXM5, PCIe)
upload_code (bool): Upload current directory code to job (default: True)
dev_vm (bool | None): Hint: this task is a developer VM. When True, provider background code uploads are disabled and Docker startup adapts accordingly. If None, falls back to FLOW_DEV_VM env.
upload_strategy (Literal): Strategy for uploading code to instances (default: auto):
- auto: Use SCP for large uploads (>8KB), embedded for small ones
- embedded: Include in startup script (10KB limit)
- scp: Transfer after instance starts (no size limit)
- none: No code upload
terminate_on_exit (bool): When true, a watcher cancels the task as soon as the main container exits. (default: False)
upload_timeout (int): Maximum seconds to wait for code upload (60-3600) (default: 600)
code_root (str | pathlib.Path | None): Local project directory to upload when upload_code=True. Defaults to the current working directory when not set.
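
The `command` field above accepts several shapes. The sketch below shows one way the documented detection rules (list, single-line string, multi-line or shebang script, unset) could be expressed. `classify_command` and its return labels are hypothetical and are not part of the SDK.

```python
from typing import List, Union

Command = Union[str, List[str], None]


def classify_command(command: Command) -> str:
    """Return 'default', 'argv', 'script', or 'shell' for a command value."""
    if command is None:
        # Documented fallback: 'sleep infinity' for interactive sessions.
        return "default"
    if isinstance(command, list):
        # List format: arguments are passed through without shell parsing.
        return "argv"
    text = command.strip()
    # Documented rule: multi-line text or a leading shebang means a script.
    if text.startswith("#!") or "\n" in text:
        return "script"
    return "shell"


script = """#!/bin/bash
pip install -r requirements.txt
python train.py
python evaluate.py
"""

kinds = [
    classify_command(["python", "train.py", "--epochs", "10"]),
    classify_command("python train.py --epochs 10"),
    classify_command(script),
    classify_command(None),
]
```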

TaskConfig.to_spec

to_spec(self)

Convert TaskConfig into canonical IR TaskSpec.

The mapping is kept minimal so that the user-facing config stays simple. When upload_code=True, code is modeled as a first-class mount in the IR, without extra env flags or strategy knobs; providers decide the delivery details.

TaskConfig.to_yaml

to_yaml(self, path: str | Path) -> None

TaskConfig.validate_config

validate_config(self) -> TaskConfig
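
The TaskConfig field list states several limits: ports of 1024 or higher, an `upload_timeout` of 60-3600 seconds, a `reserved_duration_hours` of 3-336, and size thresholds for `upload_strategy`. The sketch below checks those documented limits on a plain dict. It does not reproduce what `validate_config` actually does, which the reference does not describe; the function names and the reading of "KB" as 1024 bytes are assumptions.

```python
from typing import Any, Dict, List

KB = 1024  # Assumption: the documented "KB" means 1024 bytes.


def resolve_upload_strategy(strategy: str, code_size_bytes: int) -> str:
    """Hypothetical helper: map a strategy and code size to a delivery mode."""
    if strategy not in {"auto", "embedded", "scp", "none"}:
        raise ValueError(f"unknown upload strategy: {strategy!r}")
    if strategy == "auto":
        # Documented rule: SCP for large (>8KB), embedded for small.
        return "scp" if code_size_bytes > 8 * KB else "embedded"
    if strategy == "embedded" and code_size_bytes > 10 * KB:
        raise ValueError("embedded upload is limited to 10KB")
    return strategy


def find_limit_violations(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list documented range limits the dict breaks."""
    problems: List[str] = []
    for port in config.get("ports", []):
        if port < 1024:
            problems.append(f"port {port} is below 1024")
    timeout = config.get("upload_timeout", 600)
    if not 60 <= timeout <= 3600:
        problems.append(f"upload_timeout {timeout} is outside 60-3600")
    hours = config.get("reserved_duration_hours")
    if hours is not None and not 3 <= hours <= 336:
        problems.append(f"reserved_duration_hours {hours} is outside 3-336")
    return problems


ok = find_limit_violations({"ports": [8080], "upload_timeout": 600})
bad = find_limit_violations(
    {"ports": [80], "upload_timeout": 30, "reserved_duration_hours": 2}
)
```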

VolumeSpec

Persistent volume specification (create or attach).

Fields:

name (str | None): Human-readable name (3-64 chars, lowercase alphanumeric with hyphens)
size_gb (int): Size in GB (default: 1)
mount_path (str | None): Mount path in container (default: /volumes/)
volume_id (str | None): ID of existing volume to attach
interface (StorageInterface): Storage interface type (default: StorageInterface.BLOCK)
iops (int | None): Provisioned IOPS
throughput_mb_s (int | None): Provisioned throughput

VolumeSpec.validate_volume_spec

validate_volume_spec(self) -> VolumeSpec

Validate volume specification.
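
The documented name rule (3-64 characters, lowercase alphanumeric with hyphens) can be expressed as a regular expression. This is a sketch of that one rule only: `is_valid_volume_name` is a hypothetical helper, and it assumes hyphens are allowed in any position, which the reference does not specify.

```python
import re

# 3 to 64 characters drawn from lowercase letters, digits, and hyphens.
# Assumption: a hyphen may appear anywhere, including first or last.
_VOLUME_NAME = re.compile(r"[a-z0-9-]{3,64}")


def is_valid_volume_name(name: str) -> bool:
    """Hypothetical check for the documented VolumeSpec name rule."""
    return _VOLUME_NAME.fullmatch(name) is not None


checks = {
    "training-data": is_valid_volume_name("training-data"),
    "ab": is_valid_volume_name("ab"),              # too short
    "Training": is_valid_volume_name("Training"),  # uppercase not allowed
}
```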

MountSpec

Mount specification for volumes, S3, or bind mounts.

Fields:

source (str): Source URL or path
target (str): Mount path in container
mount_type (Literal): Type of mount (default: bind)
options (dict): Provider-specific options
cache_key (str | None): Key for caching mount metadata
size_estimate_gb (float | None): Estimated size for planning

GPUSpec

Immutable GPU hardware specification used for matching.

Fields:

vendor (str): GPU vendor (default: NVIDIA)
model (str): GPU model (e.g., A100, H100)
memory_gb (int): GPU memory in GB
memory_type (str): Memory type (HBM2e, HBM3, GDDR6) (default: empty string)
architecture (str): GPU architecture (Ampere, Hopper) (default: empty string)
compute_capability (tuple): CUDA compute capability (default: (0, 0))
tflops_fp32 (float): FP32 performance in TFLOPS (default: 0.0)
tflops_fp16 (float): FP16 performance in TFLOPS (default: 0.0)
memory_bandwidth_gb_s (float): Memory bandwidth in GB/s (default: 0.0)

CPUSpec

CPU specification.

Fields:

vendor (str): CPU vendor (default: Intel)
model (str): CPU model (default: Xeon)
cores (int): Number of CPU cores
threads (int): Number of threads (0 = same as cores) (default: 0)
base_clock_ghz (float): Base clock speed in GHz (default: 0.0)

CPUSpec.set_threads_default

set_threads_default(self) -> CPUSpec

Default threads to cores when not specified.
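
The defaulting rule (a `threads` value of 0 means "same as cores") can be sketched with a plain dataclass. `CPUSketch` is a hypothetical stand-in that reuses the documented field names and defaults. The SDK signature suggests a model validator that returns the instance, whereas this sketch uses `__post_init__`.

```python
from dataclasses import dataclass


@dataclass
class CPUSketch:
    """Hypothetical stand-in with the documented CPUSpec fields."""
    cores: int
    vendor: str = "Intel"
    model: str = "Xeon"
    threads: int = 0
    base_clock_ghz: float = 0.0

    def __post_init__(self) -> None:
        # Documented rule: 0 threads means "same as cores".
        if self.threads == 0:
            self.threads = self.cores


no_smt = CPUSketch(cores=32)                 # threads defaults to 32
with_smt = CPUSketch(cores=32, threads=64)   # explicit value is kept
```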

MemorySpec

System memory specification.

Fields:

size_gb (int): Memory size in GB
type (str): Memory type (default: DDR4)
speed_mhz (int): Memory speed in MHz (default: 3200)
ecc (bool): ECC memory support (default: True)

StorageSpec

Storage specification.

Fields:

size_gb (int): Storage size in GB
type (str): Storage type (NVMe, SSD, HDD) (default: NVMe)
iops (int | None): IOPS rating
bandwidth_mb_s (int | None): Bandwidth in MB/s

NetworkSpec

Network specification.

Fields:

intranode (str): Intra-node interconnect (SXM4, SXM5, PCIe) (default: empty string)
internode (str | None): Inter-node network (InfiniBand, Ethernet)
bandwidth_gbps (float | None): Network bandwidth in Gbps

InstanceType

Canonical instance type specification (immutable).

Fields:

gpu (GPUSpec)
gpu_count (int): Number of GPUs
cpu (CPUSpec)
memory (MemorySpec)
storage (StorageSpec)
network (NetworkSpec)
id (uuid.UUID | None): Unique instance type ID
aliases (set): Alternative names
created_at (datetime)
version (int) (default: 1)

InstanceType.compute_id_and_aliases

compute_id_and_aliases(self) -> InstanceType

Compute a stable ID and default aliases.

InstanceMatch

Matched instance with price and availability.

Fields:

instance (InstanceType)
region (str)
availability (int): Number of available instances
price_per_hour (float): Price in USD per hour
match_score (float): Match quality score (default: 1.0)
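
The fields above are enough to rank candidates. The sketch below shows one possible selection policy (highest `match_score` first, then lowest `price_per_hour`, skipping entries with too little availability). The policy, `MatchSketch`, `pick_match`, and the region names are illustrative assumptions; the reference does not describe how the SDK ranks matches.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MatchSketch:
    """Stand-in with a subset of the documented InstanceMatch fields."""
    region: str
    availability: int
    price_per_hour: float
    match_score: float = 1.0


def pick_match(
    matches: List[MatchSketch], needed: int = 1
) -> Optional[MatchSketch]:
    """Hypothetical policy: best score first, then lowest hourly price."""
    usable = [m for m in matches if m.availability >= needed]
    if not usable:
        return None
    return min(usable, key=lambda m: (-m.match_score, m.price_per_hour))


candidates = [
    MatchSketch("region-a", availability=0, price_per_hour=1.50),
    MatchSketch("region-b", availability=4, price_per_hour=2.10),
    MatchSketch("region-c", availability=2, price_per_hour=1.90),
]
best = pick_match(candidates, needed=2)
```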

Instance

Compute instance entity.

Fields:

instance_id (str): Instance UUID
task_id (str): Parent task ID
status (InstanceStatus): Instance state
ssh_host (str | None): Public hostname/IP
private_ip (str | None): VPC-internal IP
created_at (datetime)
terminated_at (datetime.datetime | None)

AvailableInstance

Available compute resource.

Fields:

allocation_id (str): Resource allocation ID
instance_type (str): Instance type identifier
region (str): Availability region
price_per_hour (float): Hourly price (USD)
gpu_type (str | None): GPU type
gpu_count (int | None): Number of GPUs
cpu_count (int | None): Number of CPUs
memory_gb (int | None): Memory in GB
available_quantity (int | None): Number available
status (str | None): Allocation status
expires_at (datetime.datetime | None): Expiration time
internode_interconnect (str | None): Inter-node network (e.g., InfiniBand, IB_3200, Ethernet)
intranode_interconnect (str | None): Intra-node interconnect (e.g., SXM5, PCIe)

Reservation

Reservation details returned by providers.

Fields:

reservation_id (str): Reservation identifier
name (str | None): Display name
status (ReservationStatus): Lifecycle state
instance_type (str): Instance type identifier
region (str): Region/zone
quantity (int): Number of instances
start_time_utc (datetime): Scheduled start time (UTC)
end_time_utc (datetime.datetime | None): Scheduled end time (UTC)
price_total_usd (float | None): Quoted/actual total price
provider_metadata (dict)

ReservationSpec

Provider-agnostic spec for creating a reservation.

Fields:

name (str | None): Optional reservation name for display
project_id (str | None): Provider project/workspace ID
instance_type (str): Explicit instance type (e.g., 'a100', '8xh100')
region (str): Target region/zone for the reservation
quantity (int): Number of instances to reserve (default: 1)
start_time_utc (datetime): Reservation start time (UTC)
duration_hours (int): Reservation duration in hours (3-336)
ssh_keys (list): Authorized SSH key IDs
volumes (list): Volume IDs to attach (provider-specific)
startup_script (str | None): Optional startup script executed when instances boot

FlowConfig

Flow SDK configuration settings.

Immutable configuration for API authentication and default behaviors. Typically loaded from environment variables or config files.

Fields:

api_key (str): Authentication key
project (str): Project identifier
region (str): Default deployment region (default: us-central1-b)
api_url (str): API base URL (default: https://api.mithril.ai)

Project

Project metadata.

Fields:

name (str): Project identifier
region (str): Primary region

ValidationResult

Configuration validation result.

Fields:

is_valid (bool): Validation status
projects (list): Accessible projects
error_message (str | None): Validation error

SubmitTaskRequest

Task submission request.

Fields:

config (TaskConfig): Task specification
wait (bool): Block until complete (default: False)
dry_run (bool): Validation only (default: False)

SubmitTaskResponse

Task submission result.

Fields:

task_id (str): Assigned task ID
status (TaskStatus): Initial state
message (str | None): Status details

ListTasksRequest

Task listing request.

Fields:

status (flow.sdk.models.enums.TaskStatus | None): Status filter
limit (int): Page size (default: 100)
offset (int): Skip count (default: 0)

ListTasksResponse

Task listing result.

Fields:

tasks (list): Task collection
total (int): Total available
has_more (bool): Pagination indicator
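
The `limit`/`offset` request fields and the `has_more` response flag suggest offset-based pagination. The sketch below walks all pages under that assumption. `iter_tasks` and `fake_fetch` are hypothetical, and pages are plain dicts using the documented field names rather than SDK response objects.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def iter_tasks(
    fetch_page: Callable[[int, int], Page], limit: int = 100
) -> Iterator[Any]:
    """Yield every task by advancing offset until has_more is False."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        tasks: List[Any] = page["tasks"]
        yield from tasks
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page["has_more"] or not tasks:
            return
        offset += len(tasks)


ALL_TASKS = [f"task-{i}" for i in range(250)]


def fake_fetch(limit: int, offset: int) -> Page:
    """Stand-in for a list-tasks call backed by an in-memory list."""
    chunk = ALL_TASKS[offset:offset + limit]
    return {
        "tasks": chunk,
        "total": len(ALL_TASKS),
        "has_more": offset + len(chunk) < len(ALL_TASKS),
    }


collected = list(iter_tasks(fake_fetch, limit=100))
```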

User

User identity information.

Fields:

user_id (str): Unique user identifier (e.g., 'user_kfV4CCaapLiqCNlv')
username (str): Username for display
email (str): User email address

Volume

Backwards-compatible alias to the canonical Volume model.

Kept for import stability of the legacy Volume class while delegating to the real implementation in the volume module, which supports both persistent volumes and bind mounts (local/remote/read_only).

Fields:

local (str | None): Source path on host
remote (str | None): Target path in container
read_only (bool | None): Mount read-only
volume_id (str | None): Volume ID
name (str | None): Volume name
size_gb (int | None): Capacity (GB)
region (str | None): Storage region
interface (flow.sdk.models.enums.StorageInterface | None): Storage interface type
attached_to (list): Attached instance IDs
created_at (Any | None): Creation timestamp

