flow.sdk.models

TaskSpec

Complete task specification: the core intermediate representation (IR) model.

Fields:

  • api_version (Literal): IR schema version (default: flow.ir/v1)

  • name (str): Task name

  • command (list): Command to execute

  • resources (ResourceSpec): Resource requirements

  • mounts (list): Volume mounts

  • params (RunParams): Runtime parameters

ResourceSpec

Hardware resource requirements.

Fields:

  • gpus (int): Number of GPUs required (default: 0)

  • gpu_type (str | None): GPU type (e.g., 'H100-80GB')

  • cpus (int): Number of CPUs required (default: 4)

  • memory_gb (int): Memory in GB (default: 16)

  • accelerator_hints (dict): Hints for accelerator configuration (MIG, NVLink, SXM/PCIe, compute capability)

MountSpec

Volume mount specification.

Fields:

  • kind (Literal): Type of mount

  • source (str): Source path or URI

  • target (str): Target mount path in container

  • read_only (bool): Whether mount is read-only (default: True)

  • cache (dict[str, str] | None): Cache configuration for remote mounts

RunParams

Runtime parameters for task execution.

Fields:

  • env (dict): Environment variables

  • working_dir (str | None): Working directory

  • retry (int): Number of retries on failure (default: 0)

  • preemptible_ok (bool): Allow preemptible instances (default: False)

  • time_limit_s (int | None): Time limit in seconds

  • image (str | None): Container image to use

TaskStatus

Task lifecycle states.

Values:

  • PENDING = "pending"

  • RUNNING = "running"

  • PAUSED = "paused"

  • PREEMPTING = "preempting"

  • COMPLETED = "completed"

  • FAILED = "failed"

  • CANCELLED = "cancelled"
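The lifecycle values above map naturally onto a string-valued enum. The sketch below is a stand-in that mirrors the documented values, not the SDK class itself; the `TERMINAL_STATES` set and the `is_terminal` helper are illustrative assumptions, since this reference does not state which states are final.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Stand-in mirroring the documented values; not the SDK class itself."""

    PENDING = "pending"
    RUNNING = "running"
    PAUSED = "paused"
    PREEMPTING = "preempting"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumption: these three states are final. The reference does not say so explicitly.
TERMINAL_STATES = frozenset(
    {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED}
)


def is_terminal(status: TaskStatus) -> bool:
    """Return True when no further state change is expected (under the assumption above)."""
    return status in TERMINAL_STATES


# Raw strings, for example from an API payload, map onto members by value.
status = TaskStatus("preempting")
```

Looking a member up by value raises `ValueError` for unknown strings, which surfaces unexpected states early.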

InstanceStatus

Status of a compute instance.

Values:

  • PENDING = "pending"

  • RUNNING = "running"

  • STOPPED = "stopped"

  • TERMINATED = "terminated"

ReservationStatus

Reservation lifecycle states.

Values:

  • SCHEDULED = "scheduled"

  • ACTIVE = "active"

  • EXPIRED = "expired"

  • FAILED = "failed"

StorageInterface

Storage interface type.

Values:

  • BLOCK = "block"

  • FILE = "file"

Retries

Retry policy with fixed or exponential backoff.

Fields:

  • max_retries (int): Maximum retry attempts (0-10) (default: 3)

  • backoff_coefficient (float): Delay multiplier between retries (default: 2.0)

  • initial_delay (float): Initial delay in seconds before first retry (default: 1.0)

  • max_delay (float | None): Maximum delay between retries (seconds)

Retries.get_delay

get_delay(self, attempt: int) -> float

Calculate delay for a given retry attempt.

Parameters:

  • attempt: Retry attempt number (1-based)

Returns:

Delay in seconds before this retry attempt

Retries.validate_delays

validate_delays(self) -> Retries

Ensure max_delay is greater than initial_delay if set.
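To make the retry fields concrete, here is a minimal sketch of how such a policy could compute delays. `RetriesSketch` is a hypothetical stand-in that reuses the documented field names and defaults; the formula `initial_delay * backoff_coefficient ** (attempt - 1)`, capped at `max_delay`, is an assumption about how `get_delay` behaves rather than a description of the SDK's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetriesSketch:
    """Hypothetical stand-in using the documented field names and defaults."""

    max_retries: int = 3
    backoff_coefficient: float = 2.0
    initial_delay: float = 1.0
    max_delay: Optional[float] = None

    def __post_init__(self) -> None:
        if not 0 <= self.max_retries <= 10:
            raise ValueError("max_retries must be between 0 and 10")
        # Mirrors the documented validate_delays rule.
        if self.max_delay is not None and self.max_delay <= self.initial_delay:
            raise ValueError("max_delay must be greater than initial_delay")

    def get_delay(self, attempt: int) -> float:
        """Delay in seconds before the given 1-based retry attempt (assumed formula)."""
        if attempt < 1:
            raise ValueError("attempt is 1-based")
        delay = self.initial_delay * self.backoff_coefficient ** (attempt - 1)
        if self.max_delay is not None:
            delay = min(delay, self.max_delay)
        return delay


policy = RetriesSketch(max_delay=5.0)
delays = [policy.get_delay(n) for n in range(1, policy.max_retries + 1)]
```

Under this formula, a `backoff_coefficient` of 1.0 yields a fixed delay, and any larger value yields exponential growth until the cap is reached.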

Task

Task handle with lifecycle control (status, logs, wait, cancel, ssh).

Fields:

  • task_id (str): Task UUID

  • name (str): Human-readable name

  • status (TaskStatus): Execution state

  • config (flow.sdk.models.task_config.TaskConfig | None): Original configuration

  • created_at (datetime)

  • started_at (datetime.datetime | None)

  • completed_at (datetime.datetime | None)

  • instance_created_at (datetime.datetime | None): Creation time of current instance (for preempted/restarted tasks)

  • instance_type (str)

  • num_instances (int)

  • region (str)

  • cost_per_hour (str): Hourly cost

  • total_cost (str | None): Accumulated cost

  • created_by (str | None): Creator user ID

  • ssh_host (str | None): SSH endpoint

  • ssh_port (int | None): SSH port (default: 22)

  • ssh_user (str): SSH user (default: ubuntu)

  • shell_command (str | None): Complete shell command

  • endpoints (dict): Exposed service URLs

  • instances (list): Instance identifiers

  • message (str | None): Human-readable status

  • provider_metadata (dict): Provider-specific state and metadata (e.g., Mithril bid status, preemption reasons)

Task.cancel

cancel(self) -> None

Task.get_instances

get_instances(self) -> list[Instance]

Task.get_user

get_user(self) -> Any | None

Task.logs

logs(
    self,
    follow: bool = False,
    tail: int = 100,
    stderr: bool = False,
    source: str | None = None,
    stream: str | None = None
) -> str | Iterator[str]

Task.refresh

refresh(self) -> None

Task.result

result(self) -> Any

Task.shell

shell(
    self,
    command: str | None = None,
    node: int | None = None,
    progress_context = None,
    record: bool = False
) -> None

Task.stop

stop(self) -> None

Task.wait

wait(self, timeout: int | None = None) -> None
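The `refresh` and `wait` methods suggest a poll-until-done pattern. The sketch below shows that pattern against a minimal in-memory stand-in; `TaskLike`, `FakeTask`, `wait_until_done`, the `poll_s` parameter, and the set of terminal states are all hypothetical, and statuses are plain strings here rather than `TaskStatus` members. It is not how `Task.wait` is implemented.

```python
import time
from typing import Iterable, Optional, Protocol


class TaskLike(Protocol):
    """Minimal shape used here: a status attribute and a refresh() method."""

    status: str

    def refresh(self) -> None: ...


# Assumption: these states are final.
TERMINAL = {"completed", "failed", "cancelled"}


def wait_until_done(
    task: TaskLike, timeout: Optional[float] = None, poll_s: float = 2.0
) -> str:
    """Refresh the task until it reaches a terminal state or the timeout passes."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        task.refresh()
        if task.status in TERMINAL:
            return task.status
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"task still {task.status!r} after {timeout}s")
        time.sleep(poll_s)


class FakeTask:
    """In-memory stand-in that advances one state per refresh."""

    def __init__(self, states: Iterable[str]) -> None:
        self._states = iter(states)
        self.status = "pending"

    def refresh(self) -> None:
        # Keep the last known status once the scripted states run out.
        self.status = next(self._states, self.status)


final = wait_until_done(
    FakeTask(["running", "running", "completed"]), timeout=5, poll_s=0.01
)
```

Using `time.monotonic()` for the deadline keeps the timeout unaffected by wall-clock adjustments.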

TaskConfig

Complete task specification used by Flow.run().

Provides a single, explicit way to express task requirements and fails fast with clear validation errors.

Fields:

  • name (str): Task identifier (default: flow-task)

  • unique_name (bool): Append unique suffix to name to ensure uniqueness (default: True)

  • instance_type (str | None): Explicit instance type

  • min_gpu_memory_gb (int | None): Minimum GPU memory requirement

  • command (str | list[str] | None): Command to execute when the task starts. Supports three formats: list format (recommended for precise control), single-line string (shell execution), or multi-line script (for complex workflows). Multi-line commands or scripts starting with shebang (#!) are automatically detected and executed as shell scripts. If not specified, defaults to 'sleep infinity' for interactive sessions.

    Examples:

    • ['python', 'train.py', '--epochs', '10']

    • 'python train.py --epochs 10'

    • Multi-line script:

      #!/bin/bash
      pip install -r requirements.txt
      python train.py
      python evaluate.py

    • 'nvidia-smi'

  • image (str): Container image (default: nvidia/cuda:12.1.0-runtime-ubuntu22.04)

  • env (dict): Environment

  • working_dir (str): Container working directory (default: /workspace)

  • volumes (list)

  • data_mounts (list): Data to mount

  • ports (list): Container/instance ports to expose. High ports only (>=1024).

  • allow_docker_cache (bool): Allow mounting a volume at /var/lib/docker to persist Docker image layers. Single-node tasks only; use with caution. (default: False)

  • retries (flow.sdk.models.retry.Retries | None): Advanced retry configuration for task submission/execution

  • max_price_per_hour (float | None): Maximum hourly price (USD)

  • max_run_time_hours (float | None): Maximum runtime hours; 0 or None disables runtime monitoring

  • min_run_time_hours (float | None): Minimum guaranteed runtime hours

  • deadline_hours (float | None): Hours from submission until deadline

  • ssh_keys (list): Authorized SSH key IDs

  • allocation_mode (Literal): Allocation strategy: 'spot' (default, preemptible), 'reserved' (scheduled capacity), or 'auto'. (default: spot)

  • reservation_id (str | None): Target an existing reservation (advanced).

  • scheduled_start_time (str | None): When allocation_mode='reserved', schedule start (UTC).

  • reserved_duration_hours (int | None): When allocation_mode='reserved', reservation duration in hours (3-336).

  • region (str | None): Target region

  • num_instances (int): Instance count (default: 1)

  • priority (Literal): Task priority tier affecting limit price (default: med)

  • distributed_mode (Optional): Distributed rendezvous mode when num_instances > 1: 'auto' lets Flow assign rank and leader IP; 'manual' expects user-set FLOW_* envs.

  • internode_interconnect (str | None): Preferred inter-node network (e.g., InfiniBand, IB_3200, Ethernet)

  • intranode_interconnect (str | None): Preferred intra-node interconnect (e.g., SXM5, PCIe)

  • upload_code (bool): Upload current directory code to job (default: True)

  • dev_vm (bool | None): Hint: this task is a developer VM. When True, provider background code uploads are disabled and Docker startup adapts accordingly. If None, falls back to FLOW_DEV_VM env.

  • upload_strategy (Literal): Strategy for uploading code to instances (default: auto):

    • auto: Use SCP for large uploads (>8KB), embedded for small ones

    • embedded: Include in startup script (10KB limit)

    • scp: Transfer after instance starts (no size limit)

    • none: No code upload

  • terminate_on_exit (bool): When true, a watcher cancels the task as soon as the main container exits. (default: False)

  • upload_timeout (int): Maximum seconds to wait for code upload (60-3600) (default: 600)

  • code_root (str | pathlib.Path | None): Local project directory to upload when upload_code=True. Defaults to the current working directory when not set.
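The three `command` formats described in the field list above can be told apart with a few checks. The following is a minimal sketch of such a classification, assuming the rules stated there (lists are passed as-is, multi-line strings or strings starting with `#!` are scripts, everything else is a single shell line). The function name `classify_command` and its return labels are hypothetical and not part of the SDK.

```python
from typing import List, Optional, Union

Command = Optional[Union[str, List[str]]]


def classify_command(command: Command) -> str:
    """Return 'default', 'argv', 'script', or 'shell' for a command value."""
    if command is None:
        return "default"  # documented fallback: 'sleep infinity'
    if isinstance(command, list):
        return "argv"  # list format, recommended for precise control
    text = command.strip()
    # Multi-line text or a leading shebang is treated as a shell script.
    if text.startswith("#!") or "\n" in text:
        return "script"
    return "shell"  # single-line string, executed through a shell


script = """#!/bin/bash
pip install -r requirements.txt
python train.py
python evaluate.py
"""

kinds = [
    classify_command(["python", "train.py", "--epochs", "10"]),
    classify_command("python train.py --epochs 10"),
    classify_command(script),
    classify_command(None),
]
```

The list format avoids shell quoting issues because each argument is already a separate element.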

TaskConfig.to_spec

to_spec(self)

Convert TaskConfig into canonical IR TaskSpec.

The mapping is kept minimal so that the user-facing config stays simple. When upload_code=True, code is modeled as a first-class mount in the IR, without extra env flags or strategy knobs; providers decide the delivery details.

TaskConfig.to_yaml

to_yaml(self, path: str | Path) -> None

TaskConfig.validate_config

validate_config(self) -> TaskConfig

VolumeSpec

Persistent volume specification (create or attach).

Fields:

  • name (str | None): Human-readable name (3-64 chars, lowercase alphanumeric with hyphens)

  • size_gb (int): Size in GB (default: 1)

  • mount_path (str | None): Mount path in container (default: /volumes/)

  • volume_id (str | None): ID of existing volume to attach

  • interface (StorageInterface): Storage interface type (default: StorageInterface.BLOCK)

  • iops (int | None): Provisioned IOPS

  • throughput_mb_s (int | None): Provisioned throughput

VolumeSpec.validate_volume_spec

validate_volume_spec(self) -> VolumeSpec

Validate volume specification.

MountSpec

Mount specification for volumes, S3, or bind mounts.

Fields:

  • source (str): Source URL or path

  • target (str): Mount path in container

  • mount_type (Literal): Type of mount (default: bind)

  • options (dict): Provider-specific options

  • cache_key (str | None): Key for caching mount metadata

  • size_estimate_gb (float | None): Estimated size for planning

GPUSpec

Immutable GPU hardware specification used for matching.

Fields:

  • vendor (str): GPU vendor (default: NVIDIA)

  • model (str): GPU model (e.g., A100, H100)

  • memory_gb (int): GPU memory in GB

  • memory_type (str): Memory type (HBM2e, HBM3, GDDR6) (default: empty string)

  • architecture (str): GPU architecture (Ampere, Hopper) (default: empty string)

  • compute_capability (tuple): CUDA compute capability (default: (0, 0))

  • tflops_fp32 (float): FP32 performance in TFLOPS (default: 0.0)

  • tflops_fp16 (float): FP16 performance in TFLOPS (default: 0.0)

  • memory_bandwidth_gb_s (float): Memory bandwidth in GB/s (default: 0.0)

CPUSpec

CPU specification.

Fields:

  • vendor (str): CPU vendor (default: Intel)

  • model (str): CPU model (default: Xeon)

  • cores (int): Number of CPU cores

  • threads (int): Number of threads (0 = same as cores) (default: 0)

  • base_clock_ghz (float): Base clock speed in GHz (default: 0.0)

CPUSpec.set_threads_default

set_threads_default(self) -> CPUSpec

Default threads to cores when not specified.
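The threads default can be illustrated with a plain dataclass. `CPUSpecSketch` below is a hypothetical stand-in that copies the documented field names and defaults; it only demonstrates the "0 means same as cores" rule and does not claim to match the SDK's validator.

```python
from dataclasses import dataclass


@dataclass
class CPUSpecSketch:
    """Hypothetical stand-in with the documented CPUSpec fields."""

    cores: int
    vendor: str = "Intel"
    model: str = "Xeon"
    threads: int = 0
    base_clock_ghz: float = 0.0

    def __post_init__(self) -> None:
        # 0 means "not specified": fall back to one thread per core.
        if self.threads == 0:
            self.threads = self.cores


defaulted = CPUSpecSketch(cores=32)
explicit = CPUSpecSketch(cores=32, threads=64)
```

Here `defaulted.threads` ends up equal to `cores`, while an explicit value such as 64 is left untouched.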

MemorySpec

System memory specification.

Fields:

  • size_gb (int): Memory size in GB

  • type (str): Memory type (default: DDR4)

  • speed_mhz (int): Memory speed in MHz (default: 3200)

  • ecc (bool): ECC memory support (default: True)

StorageSpec

Storage specification.

Fields:

  • size_gb (int): Storage size in GB

  • type (str): Storage type (NVMe, SSD, HDD) (default: NVMe)

  • iops (int | None): IOPS rating

  • bandwidth_mb_s (int | None): Bandwidth in MB/s

NetworkSpec

Network specification.

Fields:

  • intranode (str): Intra-node interconnect (SXM4, SXM5, PCIe) (default: empty string)

  • internode (str | None): Inter-node network (InfiniBand, Ethernet)

  • bandwidth_gbps (float | None): Network bandwidth in Gbps

InstanceType

Canonical instance type specification (immutable).

Fields:

  • gpu (GPUSpec)

  • gpu_count (int): Number of GPUs

  • cpu (CPUSpec)

  • memory (MemorySpec)

  • storage (StorageSpec)

  • network (NetworkSpec)

  • id (uuid.UUID | None): Unique instance type ID

  • aliases (set): Alternative names

  • created_at (datetime)

  • version (int) (default: 1)

InstanceType.compute_id_and_aliases

compute_id_and_aliases(self) -> InstanceType

Compute a stable ID and default aliases.

InstanceMatch

Matched instance with price and availability.

Fields:

  • instance (InstanceType)

  • region (str)

  • availability (int): Number of available instances

  • price_per_hour (float): Price in USD per hour

  • match_score (float): Match quality score (default: 1.0)

Instance

Compute instance entity.

Fields:

  • instance_id (str): Instance UUID

  • task_id (str): Parent task ID

  • status (InstanceStatus): Instance state

  • ssh_host (str | None): Public hostname/IP

  • private_ip (str | None): VPC-internal IP

  • created_at (datetime)

  • terminated_at (datetime.datetime | None)

AvailableInstance

Available compute resource.

Fields:

  • allocation_id (str): Resource allocation ID

  • instance_type (str): Instance type identifier

  • region (str): Availability region

  • price_per_hour (float): Hourly price (USD)

  • gpu_type (str | None): GPU type

  • gpu_count (int | None): Number of GPUs

  • cpu_count (int | None): Number of CPUs

  • memory_gb (int | None): Memory in GB

  • available_quantity (int | None): Number available

  • status (str | None): Allocation status

  • expires_at (datetime.datetime | None): Expiration time

  • internode_interconnect (str | None): Inter-node network (e.g., InfiniBand, IB_3200, Ethernet)

  • intranode_interconnect (str | None): Intra-node interconnect (e.g., SXM5, PCIe)

Reservation

Reservation details returned by providers.

Fields:

  • reservation_id (str): Reservation identifier

  • name (str | None): Display name

  • status (ReservationStatus): Lifecycle state

  • instance_type (str): Instance type identifier

  • region (str): Region/zone

  • quantity (int): Number of instances

  • start_time_utc (datetime): Scheduled start time (UTC)

  • end_time_utc (datetime.datetime | None): Scheduled end time (UTC)

  • price_total_usd (float | None): Quoted/actual total price

  • provider_metadata (dict)

ReservationSpec

Provider-agnostic spec for creating a reservation.

Fields:

  • name (str | None): Optional reservation name for display

  • project_id (str | None): Provider project/workspace ID

  • instance_type (str): Explicit instance type (e.g., 'a100', '8xh100')

  • region (str): Target region/zone for the reservation

  • quantity (int): Number of instances to reserve (default: 1)

  • start_time_utc (datetime): Reservation start time (UTC)

  • duration_hours (int): Reservation duration in hours (3-336)

  • ssh_keys (list): Authorized SSH key IDs

  • volumes (list): Volume IDs to attach (provider-specific)

  • startup_script (str | None): Optional startup script executed when instances boot

FlowConfig

Flow SDK configuration settings.

Immutable configuration for API authentication and default behaviors. Typically loaded from environment variables or config files.

Fields:

  • api_key (str): Authentication key

  • project (str): Project identifier

  • region (str): Default deployment region (default: us-central1-b)

  • api_url (str): API base URL (default: https://api.mithril.ai)

Project

Project metadata.

Fields:

  • name (str): Project identifier

  • region (str): Primary region

ValidationResult

Configuration validation result.

Fields:

  • is_valid (bool): Validation status

  • projects (list): Accessible projects

  • error_message (str | None): Validation error

SubmitTaskRequest

Task submission request.

Fields:

  • config (TaskConfig): Task specification

  • wait (bool): Block until complete (default: False)

  • dry_run (bool): Validation only (default: False)

SubmitTaskResponse

Task submission result.

Fields:

  • task_id (str): Assigned task ID

  • status (TaskStatus): Initial state

  • message (str | None): Status details

ListTasksRequest

Task listing request.

Fields:

  • status (flow.sdk.models.enums.TaskStatus | None): Status filter

  • limit (int): Page size (default: 100)

  • offset (int): Skip count (default: 0)

ListTasksResponse

Task listing result.

Fields:

  • tasks (list): Task collection

  • total (int): Total available

  • has_more (bool): Pagination indicator
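The `limit`, `offset`, and `has_more` fields support a standard offset-based pagination loop. The sketch below runs against an in-memory fake so that it is self-contained; `Page`, `iter_tasks`, and `fake_fetch` are hypothetical names, and the actual call that sends a `ListTasksRequest` is not shown in this reference.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Page:
    """Stand-in with the documented ListTasksResponse fields."""

    tasks: List[str]
    total: int
    has_more: bool


def iter_tasks(fetch: Callable[[int, int], Page], limit: int = 100) -> Iterator[str]:
    """Yield every task by advancing offset until has_more is False."""
    offset = 0
    while True:
        page = fetch(limit, offset)
        yield from page.tasks
        # Stop on the last page, and guard against an empty page looping forever.
        if not page.has_more or not page.tasks:
            break
        offset += len(page.tasks)


ALL_TASKS = [f"task-{i}" for i in range(250)]


def fake_fetch(limit: int, offset: int) -> Page:
    """In-memory replacement for a real listing call."""
    chunk = ALL_TASKS[offset : offset + limit]
    return Page(
        tasks=chunk,
        total=len(ALL_TASKS),
        has_more=offset + len(chunk) < len(ALL_TASKS),
    )


collected = list(iter_tasks(fake_fetch))
```

Advancing `offset` by the number of items actually returned, rather than by `limit`, keeps the loop correct if a page comes back shorter than requested.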

User

User identity information.

Fields:

  • user_id (str): Unique user identifier (e.g., 'user_kfV4CCaapLiqCNlv')

  • username (str): Username for display

  • email (str): User email address

Volume

Backwards-compatible alias to the canonical Volume model.

Kept for import stability of the legacy Volume class while delegating to the real implementation in the volume module, which supports both persistent volumes and bind mounts (local/remote/read_only).

Fields:

  • local (str | None): Source path on host

  • remote (str | None): Target path in container

  • read_only (bool | None): Mount read-only

  • volume_id (str | None): Volume ID

  • name (str | None): Volume name

  • size_gb (int | None): Capacity (GB)

  • region (str | None): Storage region

  • interface (flow.sdk.models.enums.StorageInterface | None): Storage interface type

  • attached_to (list): Attached instance IDs

  • created_at (Any | None): Creation timestamp
