# alta-inference-sdk
Unified client and lifecycle manager for Azure ML batch inference endpoints.
## Overview
This library provides two classes with distinct responsibilities:
| Class | Who uses it | Purpose |
|---|---|---|
| `InferenceClient` | Inference code, AFA, pipelines | Submit jobs, monitor execution, get results |
| `InferenceManager` | Setup scripts, infra team | Create / update / delete endpoints and deployments |
## Installation

```bash
pip install alta-inference-sdk --index-url https://pkgs.dev.azure.com/altametris/DEV-IA/_packaging/altametris/pypi/simple/
```
## Configuration

### .env file
Create a .env file at the root of your project. All variables are optional if passed explicitly to the constructor, but the file centralises configuration for local development.
```ini
# Azure ML workspace
SUBSCRIPTION_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
RESOURCE_GROUP=my-resource-group
WORKSPACE_NAME=my-aml-workspace

# Endpoint & deployment (optional — can be overridden by constructor args)
ENDPOINT_NAME=randlanet-dev-endpoint
DEPLOYMENT_NAME=deployment-vg-v100
```
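The `.env` format is plain `KEY=VALUE` lines; `#` comments and blank lines are ignored. As an illustration only (the SDK presumably relies on a library such as python-dotenv for this; `parse_dotenv` below is a hypothetical helper, not part of the SDK), parsing can be sketched as:

```python
def parse_dotenv(text: str) -> dict:
    """Minimal .env parser: KEY=VALUE lines; '#' comments and blanks skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```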
### Resolution order
Priority: explicit constructor arg > .env / env var > error.
| Variable | Required | If absent |
|---|---|---|
| `SUBSCRIPTION_ID` | Yes | raises |
| `RESOURCE_GROUP` | Yes | raises |
| `WORKSPACE_NAME` | Yes | raises |
| `ENDPOINT_NAME` | Yes | raises |
| `DEPLOYMENT_NAME` | No | Azure ML uses the endpoint’s default deployment |
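The precedence rule can be made concrete with a small sketch. `resolve` is a hypothetical helper written for this README, not an SDK function; it only illustrates "explicit constructor arg > env var > error":

```python
import os


def resolve(name, explicit=None, required=True, env=os.environ):
    """Illustrative precedence: explicit arg, then environment, then error."""
    if explicit is not None:
        return explicit
    value = env.get(name)
    if value is not None:
        return value
    if required:
        raise ValueError(f"{name} is not set: pass it explicitly or add it to .env")
    return None  # optional variable: Azure ML falls back to the default deployment
```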
```python
# reads everything from .env
client = InferenceClient("batch_pipeline")

# explicit arg overrides .env
client = InferenceClient("batch_pipeline", "my-other-endpoint", deployment_name="dep-v2")
```
## Authentication
The SDK picks credentials automatically, in this order:
1. **Managed identity** — if `IDENTITY_ENDPOINT` or `MSI_ENDPOINT` is set (AFA, VM, ACI, AKS)
2. **Service principal** — if `AZURE_CLIENT_ID` + `AZURE_CLIENT_SECRET` + `AZURE_TENANT_ID` are set (CI/CD, `.env`)
3. **Azure CLI session** — `az login` for local development
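The selection order amounts to a simple check on environment variables. As a sketch (the `pick_credential_source` helper is hypothetical, written for this README; internally the SDK may simply use azure-identity's `DefaultAzureCredential`, which implements a similar chain):

```python
def pick_credential_source(env: dict) -> str:
    """Return which credential source the chain above would select."""
    if "IDENTITY_ENDPOINT" in env or "MSI_ENDPOINT" in env:
        return "managed identity"
    if all(k in env for k in ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")):
        return "service principal"
    return "azure cli"
```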
## `InferenceClient` — submitting and monitoring jobs
Use this class from inference code, Azure Function Apps, or any system that needs to trigger a batch job.
### Inputs
Pipeline inputs are model-specific and built in the calling repo.
The example below corresponds to the randlanet-vegetation batch pipeline.
```python
from azure.ai.ml import Input

inputs = {
    # --- Required ---
    "ept_url": Input(type="string", default="https://api.altametris.xyz/.../ept.json"),
    "traj_url": Input(type="string", default="https://api.altametris.xyz/.../sbet.out"),
    "traj_time_min": Input(type="number", default=443955905.06),  # GPS start time (s)
    "traj_time_max": Input(type="number", default=443956173.43),  # GPS end time (s)
    "url_blob": Input(type="string", default="https://api.altametris.xyz/.../addons/"),
    "env_deployment": Input(type="string", default="dev"),  # "dev" | "test" | "prod"

    # --- Optional (pipeline defaults apply if omitted) ---
    "batch_size": Input(type="integer", default=8),  # inference batch size
    "splitting_distance": Input(type="number", default=500),  # trajectory split (m)
    "buffer_length": Input(type="number", default=8.3),  # buffer each side (m)
    "min_splitting_distance": Input(type="number", default=250),  # min split size (m)
}
```
Two separate upload mechanisms:

- `url_blob` (input) — the pipeline uploads addons to the U3D API during execution. Always required.
- `outputs` (optional) — if provided, Azure ML automatically redirects `output_dir` to the specified datastore path. Use this so the DS team can access the files from Azure ML Studio.

  ```python
  from azure.ai.ml import Output
  from azure.ai.ml.constants import AssetTypes

  outputs = {
      "output_dir": Output(
          type=AssetTypes.URI_FOLDER,
          path="azureml://datastores/data4ds/paths/batch/addons",
      )
  }

  job_name = client.invoke(inputs=inputs, outputs=outputs)
  ```

  Omit `outputs` entirely if DS access to Azure ML Studio is not needed.
### Submit a job (fire-and-forget)
```python
from altametris.inference_sdk import InferenceClient

client = InferenceClient("batch_pipeline")  # reads .env
job_name = client.invoke(inputs=inputs)
print(job_name)  # pipelinejob-XX-XXXX
```
### Blocking run (submit + stream + results in one call)
```python
result = client.run(inputs=inputs)
print(result["status"])        # Completed
print(result["job_name"])      # pipelinejob-XX-XXXX
print(result["output_url"])    # azureml://datastores/... (None if pipeline manages its own upload)
print(result["retrieved_at"])  # 2026-04-08T15:00:02+00:00
```
### Monitor a running job
```python
# Stream logs in real time (blocks until done)
status = client.stream(job_name)

# Or poll every 60 seconds
status = client.poll(job_name, interval=60)

# Get result summary
result = client.results(job_name)
```
### Check endpoint health
```python
health = client.health()
print(health["status"])  # healthy / degraded / unhealthy
```
## `InferenceManager` — endpoint and deployment lifecycle
Use this class from setup scripts only, not from inference code. Requires Azure ML contributor permissions on the workspace.
### One-shot setup
```python
from pathlib import Path

from altametris.inference_sdk import InferenceManager

manager = InferenceManager("batch_pipeline")  # reads .env
manager.ensure_ready(
    deployment_name="my-deployment",
    pipeline_yml=Path("config/pipeline.yml"),
    compute_name="my-cluster",
)
```
### Step by step
```python
# Endpoint
manager.create_endpoint(description="Vegetation pipeline endpoint")

# Compute cluster
manager.ensure_compute(Path("config/compute.yml"))

# Pipeline component + deployment
manager.create_deployment("my-deployment", Path("config/pipeline.yml"), "my-cluster")

# Default deployment
manager.set_default_deployment("my-deployment")
```
### Listing
```python
manager.list_endpoints()    # all batch endpoints in the workspace
manager.list_deployments()  # deployments on this endpoint
manager.list_computes()     # all compute clusters in the workspace
manager.get_default_deployment()
```
### Cleanup
```python
manager.delete_deployment("my-deployment")
manager.delete_endpoint()
```