Altametris - alta-inference-sdk

Welcome to inference_sdk’s documentation

Note

For complete documentation, please visit the inference_sdk repository.

Owner: altametris.

Project Description

Unified client and lifecycle manager for Azure ML batch and online inference endpoints.

Steps

  1. Install the package by following the Installation section

  2. Learn how to use it by following the Usage section

README

alta-inference-sdk

Unified client and lifecycle manager for Azure ML batch inference endpoints.

Overview

This library provides two classes with distinct responsibilities:

Class              Who uses it                      Purpose
InferenceClient    Inference code, AFA, pipelines   Submit jobs, monitor execution, get results
InferenceManager   Setup scripts, infra team        Create / update / delete endpoints and deployments


Installation

pip install alta-inference-sdk --index-url https://pkgs.dev.azure.com/altametris/DEV-IA/_packaging/altametris/pypi/simple/

Configuration

.env file

Create a .env file at the root of your project. All variables are optional if passed explicitly to the constructor, but the file centralises configuration for local development.

# Azure ML workspace
SUBSCRIPTION_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
RESOURCE_GROUP=my-resource-group
WORKSPACE_NAME=my-aml-workspace

# Endpoint & deployment (optional — can be overridden by constructor args)
ENDPOINT_NAME=randlanet-dev-endpoint
DEPLOYMENT_NAME=deployment-vg-v100
Resolution order

Priority: explicit constructor arg > .env / env var > error.

Variable          Required   If absent
SUBSCRIPTION_ID   Yes        raises ValueError
RESOURCE_GROUP    Yes        raises ValueError
WORKSPACE_NAME    Yes        raises ValueError
ENDPOINT_NAME     Yes        raises ValueError
DEPLOYMENT_NAME   No         Azure ML uses the endpoint’s default deployment
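The resolution order above can be sketched as a small helper. This is not the SDK's actual code, only an illustration of the documented rule (explicit constructor arg > .env / env var > error); the `env` parameter is added here so the sketch is easy to test without touching the real environment.

```python
import os


def resolve_setting(name, explicit=None, env=None, required=True):
    """Illustrative sketch of the documented order: explicit arg > env var > error."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit          # constructor argument always wins
    value = env.get(name)
    if value:
        return value             # falls back to .env / environment variable
    if required:
        raise ValueError(f"{name} is not set: pass it explicitly or define it in .env")
    return None                  # optional settings (e.g. DEPLOYMENT_NAME) defer to Azure ML defaults
```

With this rule, `InferenceClient("batch_pipeline", "my-other-endpoint")` overrides `ENDPOINT_NAME` from .env, while a missing required variable fails fast with a `ValueError`.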

# reads everything from .env
client = InferenceClient("batch_pipeline")

# explicit arg overrides .env
client = InferenceClient("batch_pipeline", "my-other-endpoint", deployment_name="dep-v2")
Authentication

The SDK picks credentials automatically, in this order:

  1. Managed identity — if IDENTITY_ENDPOINT or MSI_ENDPOINT is set (AFA, VM, ACI, AKS)

  2. Service principal — if AZURE_CLIENT_ID + AZURE_CLIENT_SECRET + AZURE_TENANT_ID are set (CI/CD, .env)

  3. Azure CLI session — az login for local development
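The detection order above can be summarised as a small sketch. This is not the SDK's internal implementation, just an illustration of the documented precedence based on which environment variables are present:

```python
import os


def pick_credential_source(env=os.environ):
    """Illustrative only: mirrors the documented credential detection order."""
    # 1. Managed identity: Azure injects one of these endpoints on AFA, VM, ACI, AKS
    if env.get("IDENTITY_ENDPOINT") or env.get("MSI_ENDPOINT"):
        return "managed_identity"
    # 2. Service principal: all three variables must be set (CI/CD, .env)
    if all(env.get(k) for k in ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")):
        return "service_principal"
    # 3. Fallback: the local `az login` session
    return "azure_cli"
```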


InferenceClient — submitting and monitoring jobs

Use this class from inference code, Azure Function Apps, or any system that needs to trigger a batch job.

Inputs

Pipeline inputs are model-specific and built in the calling repo. The example below corresponds to the randlanet-vegetation batch pipeline.

from azure.ai.ml import Input

inputs = {
    # --- Required ---
    "ept_url":               Input(type="string", default="https://api.altametris.xyz/.../ept.json"),
    "traj_url":              Input(type="string", default="https://api.altametris.xyz/.../sbet.out"),
    "traj_time_min":         Input(type="number", default=443955905.06),  # GPS start time (s)
    "traj_time_max":         Input(type="number", default=443956173.43),  # GPS end time  (s)
    "url_blob":              Input(type="string", default="https://api.altametris.xyz/.../addons/"),
    "env_deployment":        Input(type="string", default="dev"),         # "dev" | "test" | "prod"
    # --- Optional (pipeline defaults apply if omitted) ---
    "batch_size":            Input(type="integer", default=8),    # inference batch size
    "splitting_distance":    Input(type="number",  default=500),  # trajectory split (m)
    "buffer_length":         Input(type="number",  default=8.3),  # buffer each side (m)
    "min_splitting_distance":Input(type="number",  default=250),  # min split size   (m)
}

Two separate upload mechanisms:

  • url_blob (input) — the pipeline uploads addons to the U3D API during execution. Always required.

  • outputs (optional) — if provided, Azure ML automatically redirects output_dir to the specified datastore path. Use this so the DS team can access the files from Azure ML Studio.

from azure.ai.ml import Output
from azure.ai.ml.constants import AssetTypes

outputs = {
    "output_dir": Output(
        type=AssetTypes.URI_FOLDER,
        path="azureml://datastores/data4ds/paths/batch/addons",
    )
}
job_name = client.invoke(inputs=inputs, outputs=outputs)

Omit outputs entirely if DS access to Azure ML Studio is not needed.

Submit a job (fire-and-forget)
from altametris.inference_sdk import InferenceClient

client = InferenceClient("batch_pipeline")  # reads .env

job_name = client.invoke(inputs=inputs)
print(job_name)  # pipelinejob-XX-XXXX
Blocking run (submit + stream + results in one call)
result = client.run(inputs=inputs)
print(result["status"])       # Completed
print(result["job_name"])     # pipelinejob-XX-XXXX
print(result["output_url"])   # azureml://datastores/... (None if pipeline manages its own upload)
print(result["retrieved_at"]) # 2026-04-08T15:00:02+00:00
Monitor a running job
# Stream logs in real time (blocks until done)
status = client.stream(job_name)

# Or poll every 60 seconds
status = client.poll(job_name, interval=60)

# Get result summary
result = client.results(job_name)
Check endpoint health
health = client.health()
print(health["status"])  # healthy / degraded / unhealthy

InferenceManager — endpoint and deployment lifecycle

Use this class from setup scripts only, not from inference code. Requires Azure ML contributor permissions on the workspace.

One-shot setup
from pathlib import Path
from altametris.inference_sdk import InferenceManager

manager = InferenceManager("batch_pipeline")  # reads .env

manager.ensure_ready(
    deployment_name="my-deployment",
    pipeline_yml=Path("config/pipeline.yml"),
    compute_name="my-cluster",
)
Step by step
# Endpoint
manager.create_endpoint(description="Vegetation pipeline endpoint")

# Compute cluster
manager.ensure_compute(Path("config/compute.yml"))

# Pipeline component + deployment
manager.create_deployment("my-deployment", Path("config/pipeline.yml"), "my-cluster")

# Default deployment
manager.set_default_deployment("my-deployment")
Listing
manager.list_endpoints()    # all batch endpoints in the workspace
manager.list_deployments()  # deployments on this endpoint
manager.list_computes()     # all compute clusters in the workspace
manager.get_default_deployment()
Cleanup
manager.delete_deployment("my-deployment")
manager.delete_endpoint()
