README

alta-inference-sdk

Unified client and lifecycle manager for Azure ML batch inference endpoints.

Overview

This library provides two classes with distinct responsibilities:

| Class | Who uses it | Purpose |
| --- | --- | --- |
| InferenceClient | Inference code, AFA, pipelines | Submit jobs, monitor execution, get results |
| InferenceManager | Setup scripts, infra team | Create / update / delete endpoints and deployments |


Installation

pip install alta-inference-sdk --index-url https://pkgs.dev.azure.com/altametris/DEV-IA/_packaging/altametris/pypi/simple/

Configuration

.env file

Create a .env file at the root of your project. Every variable can instead be passed explicitly to the constructor; the file simply centralises configuration for local development.

# Azure ML workspace
SUBSCRIPTION_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
RESOURCE_GROUP=my-resource-group
WORKSPACE_NAME=my-aml-workspace

# Endpoint & deployment (optional — can be overridden by constructor args)
ENDPOINT_NAME=randlanet-dev-endpoint
DEPLOYMENT_NAME=deployment-vg-v100

Resolution order

Priority: explicit constructor arg > .env / env var > error.

| Variable | Required | If absent |
| --- | --- | --- |
| SUBSCRIPTION_ID | Yes | raises ValueError |
| RESOURCE_GROUP | Yes | raises ValueError |
| WORKSPACE_NAME | Yes | raises ValueError |
| ENDPOINT_NAME | Yes | raises ValueError |
| DEPLOYMENT_NAME | No | Azure ML uses the endpoint's default deployment |

# reads everything from .env
client = InferenceClient("batch_pipeline")

# explicit arg overrides .env
client = InferenceClient("batch_pipeline", "my-other-endpoint", deployment_name="dep-v2")
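The resolution order can be sketched as plain Python. This is illustrative only, not the SDK's actual implementation; the variable names match the .env keys above:

```python
import os
from typing import Optional

def resolve(name: str, explicit: Optional[str] = None, required: bool = True) -> Optional[str]:
    """Resolve one configuration value: explicit arg > .env / env var > error."""
    if explicit is not None:
        return explicit          # constructor argument always wins
    value = os.environ.get(name)
    if value is not None:
        return value             # fall back to the environment / .env
    if required:
        raise ValueError(f"Missing required configuration: {name}")
    return None                  # optional variables (e.g. DEPLOYMENT_NAME) may be absent
```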

Authentication

The SDK picks credentials automatically, in this order:

  1. Managed identity — if IDENTITY_ENDPOINT or MSI_ENDPOINT is set (AFA, VM, ACI, AKS)

  2. Service principal — if AZURE_CLIENT_ID + AZURE_CLIENT_SECRET + AZURE_TENANT_ID are set (CI/CD, .env)

  3. Azure CLI session (az login) for local development
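The selection order can be illustrated with a small helper. This is a sketch of the documented precedence, not the SDK's code; it only inspects the environment variables listed above:

```python
def credential_kind(env: dict) -> str:
    """Pick a credential type using the documented precedence."""
    # 1. Managed identity: endpoint variables injected by AFA / VM / ACI / AKS
    if "IDENTITY_ENDPOINT" in env or "MSI_ENDPOINT" in env:
        return "managed_identity"
    # 2. Service principal: all three AZURE_* variables must be present
    if all(k in env for k in ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")):
        return "service_principal"
    # 3. Fall back to the local Azure CLI session (az login)
    return "azure_cli"
```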


InferenceClient — submitting and monitoring jobs

Use this class from inference code, Azure Function Apps, or any system that needs to trigger a batch job.

Inputs

Pipeline inputs are model-specific and built in the calling repo. The example below corresponds to the randlanet-vegetation batch pipeline.

from azure.ai.ml import Input

inputs = {
    # --- Required ---
    "ept_url":               Input(type="string", default="https://api.altametris.xyz/.../ept.json"),
    "traj_url":              Input(type="string", default="https://api.altametris.xyz/.../sbet.out"),
    "traj_time_min":         Input(type="number", default=443955905.06),  # GPS start time (s)
    "traj_time_max":         Input(type="number", default=443956173.43),  # GPS end time  (s)
    "url_blob":              Input(type="string", default="https://api.altametris.xyz/.../addons/"),
    "env_deployment":        Input(type="string", default="dev"),         # "dev" | "test" | "prod"
    # --- Optional (pipeline defaults apply if omitted) ---
    "batch_size":            Input(type="integer", default=8),    # inference batch size
    "splitting_distance":    Input(type="number",  default=500),  # trajectory split (m)
    "buffer_length":         Input(type="number",  default=8.3),  # buffer each side (m)
    "min_splitting_distance": Input(type="number",  default=250),  # min split size   (m)
}

Two separate upload mechanisms:

  • url_blob (input) — the pipeline uploads addons to the U3D API during execution. Always required.

  • outputs (optional) — if provided, Azure ML automatically redirects output_dir to the specified datastore path. Use this so the DS team can access the files from Azure ML Studio.

from azure.ai.ml import Output
from azure.ai.ml.constants import AssetTypes

outputs = {
    "output_dir": Output(
        type=AssetTypes.URI_FOLDER,
        path="azureml://datastores/data4ds/paths/batch/addons",
    )
}
job_name = client.invoke(inputs=inputs, outputs=outputs)

Omit outputs entirely if DS access to Azure ML Studio is not needed.

Submit a job (fire-and-forget)

from altametris.inference_sdk import InferenceClient

client = InferenceClient("batch_pipeline")  # reads .env

job_name = client.invoke(inputs=inputs)
print(job_name)  # pipelinejob-XX-XXXX

Blocking run (submit + stream + results in one call)

result = client.run(inputs=inputs)
print(result["status"])       # Completed
print(result["job_name"])     # pipelinejob-XX-XXXX
print(result["output_url"])   # azureml://datastores/... (None if pipeline manages its own upload)
print(result["retrieved_at"]) # 2026-04-08T15:00:02+00:00
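Callers will typically gate on result["status"] before touching the output URL. A minimal sketch, assuming the result dict shown above; check_result is a hypothetical helper, not part of the SDK:

```python
from typing import Optional

def check_result(result: dict) -> Optional[str]:
    """Return output_url when the job completed, raise otherwise."""
    status = result.get("status")
    if status != "Completed":
        raise RuntimeError(f"Job {result.get('job_name')} ended in state {status!r}")
    # output_url is None when the pipeline manages its own upload (url_blob)
    return result.get("output_url")
```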

Monitor a running job

# Stream logs in real time (blocks until done)
status = client.stream(job_name)

# Or poll every 60 seconds
status = client.poll(job_name, interval=60)

# Get result summary
result = client.results(job_name)

Check endpoint health

health = client.health()
print(health["status"])  # healthy / degraded / unhealthy
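In CI or a readiness probe, the health string can be mapped to a process exit code. The three status values come from the example above; the mapping itself is an assumption, not an SDK feature:

```python
def health_exit_code(status: str) -> int:
    """Map the endpoint health string to a process exit code."""
    codes = {"healthy": 0, "degraded": 1, "unhealthy": 2}
    return codes.get(status, 2)  # unknown statuses are treated as unhealthy
```

For example, a probe script could end with sys.exit(health_exit_code(health["status"])).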

InferenceManager — endpoint and deployment lifecycle

Use this class from setup scripts only, not from inference code. Requires Azure ML contributor permissions on the workspace.

One-shot setup

from pathlib import Path
from altametris.inference_sdk import InferenceManager

manager = InferenceManager("batch_pipeline")  # reads .env

manager.ensure_ready(
    deployment_name="my-deployment",
    pipeline_yml=Path("config/pipeline.yml"),
    compute_name="my-cluster",
)

Step by step

# Endpoint
manager.create_endpoint(description="Vegetation pipeline endpoint")

# Compute cluster
manager.ensure_compute(Path("config/compute.yml"))

# Pipeline component + deployment
manager.create_deployment("my-deployment", Path("config/pipeline.yml"), "my-cluster")

# Default deployment
manager.set_default_deployment("my-deployment")

Listing

manager.list_endpoints()    # all batch endpoints in the workspace
manager.list_deployments()  # deployments on this endpoint
manager.list_computes()     # all compute clusters in the workspace
manager.get_default_deployment()

Cleanup

manager.delete_deployment("my-deployment")
manager.delete_endpoint()