# alta-inference-sdk
Unified client and lifecycle manager for Azure ML batch inference endpoints.
## Overview
This library provides two classes with distinct responsibilities:
| Class | Who uses it | Purpose |
|---|---|---|
| `InferenceClient` | Inference code, AFA, pipelines | Submit jobs, monitor execution, get results |
| `InferenceManager` | Setup scripts, infra team | Create / update / delete endpoints and deployments |
## Installation

```bash
pip install alta-inference-sdk --index-url https://pkgs.dev.azure.com/altametris/DEV-IA/_packaging/altametris/pypi/simple/
```
## Configuration

### .env file
Create a .env file at the root of your project. All variables are optional if passed explicitly to the constructor, but the file centralises configuration for local development.
```ini
# Azure ML workspace
SUBSCRIPTION_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
RESOURCE_GROUP=my-resource-group
WORKSPACE_NAME=my-aml-workspace

# Endpoint & deployment (optional — can be overridden by constructor args)
ENDPOINT_NAME=randlanet-dev-endpoint
DEPLOYMENT_NAME=deployment-vg-v100
```
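The `.env` format is plain `KEY=VALUE` lines; `#` comments and blank lines are ignored. As an illustration only (the SDK presumably relies on a library such as python-dotenv for this; `parse_dotenv` below is a hypothetical helper, not part of the SDK), parsing can be sketched as:

```python
def parse_dotenv(text: str) -> dict:
    """Minimal .env parser: KEY=VALUE lines; '#' comments and blanks skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```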
### Resolution order
Priority: explicit constructor arg > .env / env var > error.
| Variable | Required | If absent |
|---|---|---|
| `SUBSCRIPTION_ID` | Yes | raises |
| `RESOURCE_GROUP` | Yes | raises |
| `WORKSPACE_NAME` | Yes | raises |
| `ENDPOINT_NAME` | Yes | raises |
| `DEPLOYMENT_NAME` | No | Azure ML uses the endpoint’s default deployment |
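The precedence rule can be made concrete with a small sketch. `resolve` is a hypothetical helper written for this README, not an SDK function; it only illustrates "explicit constructor arg > env var > error":

```python
import os


def resolve(name, explicit=None, required=True, env=os.environ):
    """Illustrative precedence: explicit arg, then environment, then error."""
    if explicit is not None:
        return explicit
    value = env.get(name)
    if value is not None:
        return value
    if required:
        raise ValueError(f"{name} is not set: pass it explicitly or add it to .env")
    return None  # optional variable: Azure ML falls back to the default deployment
```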
```python
# reads everything from .env
client = InferenceClient("batch_pipeline")

# explicit arg overrides .env
client = InferenceClient("batch_pipeline", "my-other-endpoint", deployment_name="dep-v2")
```
## Authentication
The SDK picks credentials automatically, in this order:
1. **Managed identity** — if `IDENTITY_ENDPOINT` or `MSI_ENDPOINT` is set (AFA, VM, ACI, AKS)
2. **Service principal** — if `AZURE_CLIENT_ID` + `AZURE_CLIENT_SECRET` + `AZURE_TENANT_ID` are set (CI/CD, `.env`)
3. **Azure CLI session** — `az login` for local development
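The selection order amounts to a simple check on environment variables. As a sketch (the `pick_credential_source` helper is hypothetical, written for this README; internally the SDK may simply use azure-identity's `DefaultAzureCredential`, which implements a similar chain):

```python
def pick_credential_source(env: dict) -> str:
    """Return which credential source the chain above would select."""
    if "IDENTITY_ENDPOINT" in env or "MSI_ENDPOINT" in env:
        return "managed identity"
    if all(k in env for k in ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")):
        return "service principal"
    return "azure cli"
```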
## `InferenceClient` — submitting and monitoring jobs
Use this class from inference code, Azure Function Apps, or any system that needs to trigger a batch job.
### Inputs
Pipeline inputs are model-specific and built in the calling repo.
The example below corresponds to the randlanet-vegetation batch pipeline.
```python
from azure.ai.ml import Input

inputs = {
    # --- Required ---
    "ept_url": Input(type="string", default="https://api.altametris.xyz/.../ept.json"),
    "traj_url": Input(type="string", default="https://api.altametris.xyz/.../sbet.out"),
    "traj_time_min": Input(type="number", default=443955905.06),  # GPS start time (s)
    "traj_time_max": Input(type="number", default=443956173.43),  # GPS end time (s)
    "url_blob": Input(type="string", default="https://api.altametris.xyz/.../addons/"),
    "env_deployment": Input(type="string", default="dev"),  # "dev" | "test" | "prod"

    # --- Optional (pipeline defaults apply if omitted) ---
    "batch_size": Input(type="integer", default=8),  # inference batch size
    "splitting_distance": Input(type="number", default=500),  # trajectory split (m)
    "buffer_length": Input(type="number", default=8.3),  # buffer each side (m)
    "min_splitting_distance": Input(type="number", default=250),  # min split size (m)
}
```
Two separate upload mechanisms:

- `url_blob` (input) — the pipeline uploads addons to the U3D API during execution. Always required.
- `outputs` (optional) — if provided, Azure ML automatically redirects `output_dir` to the specified datastore path. Use this so the DS team can access the files from Azure ML Studio.

  ```python
  from azure.ai.ml import Output
  from azure.ai.ml.constants import AssetTypes

  outputs = {
      "output_dir": Output(
          type=AssetTypes.URI_FOLDER,
          path="azureml://datastores/data4ds/paths/batch/addons",
      )
  }

  job_name = client.invoke(inputs=inputs, outputs=outputs)
  ```

  Omit `outputs` entirely if DS access to Azure ML Studio is not needed.
### Submit a job (fire-and-forget)
```python
from altametris.inference_sdk import InferenceClient

client = InferenceClient("batch_pipeline")  # reads .env
job_name = client.invoke(inputs=inputs)
print(job_name)  # pipelinejob-XX-XXXX
```
### Blocking run (submit + stream + results in one call)
```python
result = client.run(inputs=inputs)
print(result["status"])        # Completed
print(result["job_name"])      # pipelinejob-XX-XXXX
print(result["output_url"])    # azureml://datastores/... (None if pipeline manages its own upload)
print(result["retrieved_at"])  # 2026-04-08T15:00:02+00:00
```
### Monitor a running job
```python
# Stream logs in real time (blocks until done)
status = client.stream(job_name)

# Or poll every 60 seconds
status = client.poll(job_name, interval=60)

# Get result summary
result = client.results(job_name)
```
### Check endpoint health
```python
health = client.health()
print(health["status"])  # healthy / degraded / unhealthy
```
## `InferenceManager` — endpoint and deployment lifecycle
Use this class from setup scripts only, not from inference code. Requires Azure ML contributor permissions on the workspace.
### One-shot setup
```python
from pathlib import Path

from altametris.inference_sdk import InferenceManager

manager = InferenceManager("batch_pipeline")  # reads .env
manager.ensure_ready(
    deployment_name="my-deployment",
    pipeline_yml=Path("config/pipeline.yml"),
    compute_name="my-cluster",
)
```
### Step by step
```python
# Endpoint
manager.create_endpoint(description="Vegetation pipeline endpoint")

# Compute cluster
manager.ensure_compute(Path("config/compute.yml"))

# Pipeline component + deployment
manager.create_deployment("my-deployment", Path("config/pipeline.yml"), "my-cluster")

# Default deployment
manager.set_default_deployment("my-deployment")
```
### Listing
```python
manager.list_endpoints()    # all batch endpoints in the workspace
manager.list_deployments()  # deployments on this endpoint
manager.list_computes()     # all compute clusters in the workspace
manager.get_default_deployment()
```
### Cleanup
```python
manager.delete_deployment("my-deployment")
manager.delete_endpoint()
```