README

alta-azure-sdk 2.0.0

A Python SDK for Azure authentication, Azure ML workspace management, and Data Science cloud services.

Installation

  1. Configure pip.ini by following the wiki to access the Altametris feed.

  2. Install the package:

# Standard install
pip install alta-azure-sdk

# Development (includes tests, linting, docs, notebooks)
pip install "alta-azure-sdk[dev]"

Structure

altametris/azure_sdk/
├── authentication/     Credential management — Azure Identity, Blob Storage, APIM, MLClient
├── storage/            Low-level Blob and Container clients (Azure Blob SDK backend)
├── storage_manager/    BlobManager composition root + CacheManager (local TTL cache)
├── api/                DataScience API clients (APIM OAuth2 backend)
└── aml/                Azure ML workspace management
    ├── asset/          Compute, environment, dataset, model managers + WorkspaceLoader
    ├── config/         YAML config parsing (jobs, pipelines, endpoints)
    ├── deployment/     Online and batch endpoint CRUD
    ├── runs/           Pipeline job submission (CLI entry point)
    └── utils/          Validators, MLflow model registry, asset orchestration

Authentication

Overview

altametris.azure_sdk.authentication provides Credentials, a unified credential manager that handles authentication for Azure Blob Storage, APIM, and Azure ML workspaces using a single cached Azure Identity credential.

Five authentication modes are available:

Mode          Class                          Use case
default       DefaultAzureCredential         Local development (az login, env vars, MI tried in sequence)
interactive   InteractiveBrowserCredential   Local dev without az login
sp            ClientSecretCredential         CI/CD pipelines; fails fast if SP vars are missing
mi            ManagedIdentityCredential      Production Azure resources (ACI, VM, AKS)
workload      WorkloadIdentityCredential     Azure DevOps Workload Identity Federation
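
The "fails fast" behaviour of the sp mode can be pictured as a check that runs before any credential is built. The following is a minimal sketch using only the standard library, not the SDK's implementation; check_auth_mode and MissingCredentialError are hypothetical names introduced for illustration, and the mapping only mirrors the modes listed above.

```python
import os
from typing import Mapping, Optional

# Variables this README lists for the Service Principal mode.
SP_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")

# Mode name -> azure.identity class name, copied from the list above.
AUTH_MODES = {
    "default": "DefaultAzureCredential",
    "interactive": "InteractiveBrowserCredential",
    "sp": "ClientSecretCredential",
    "mi": "ManagedIdentityCredential",
    "workload": "WorkloadIdentityCredential",
}


class MissingCredentialError(RuntimeError):
    """Hypothetical error raised when required variables are absent."""


def check_auth_mode(mode: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the credential class name for `mode`, validating SP variables first."""
    environ = os.environ if environ is None else environ
    if mode not in AUTH_MODES:
        raise ValueError(f"unknown auth_mode: {mode!r}")
    if mode == "sp":
        # Collect every missing or empty variable so the error names them all.
        missing = [name for name in SP_VARS if not environ.get(name)]
        if missing:
            raise MissingCredentialError("missing: " + ", ".join(missing))
    return AUTH_MODES[mode]
```

Checking before building the credential means a misconfigured CI job stops with a message naming the missing variables instead of failing later on the first Azure call.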

Environment variables

# Service Principal (auth_mode="sp")
AZURE_TENANT_ID=""
AZURE_CLIENT_ID=""
AZURE_CLIENT_SECRET=""

# Blob Storage
AZURE_STORAGE_ACCOUNT_NAME=""
AZURE_STORAGE_CONNECTION_STRING=""   # local dev fallback

# APIM
DS_API_BASE_URL="https://am-ds-apim-dev.azure-api.net"
DS_API_SCOPE="https://api.altametris.xyz/datascience/dev/.default"
DS_WEIGHTS_CONTAINER="weights-dev"
DS_WEIGHTS_PREFIX="segrails"

# Workspace parameters per environment
DEV_SUBSCRIPTION_ID=""
DEV_RESOURCE_GROUP="AM-DS-AI-DEV"
DEV_WORKSPACE_NAME="mlw-mlops-dev"
# same pattern for TEST_ / BETA_ / PROD_

Copy .env.example and fill in your values.
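
The per-environment naming pattern (DEV_, TEST_, BETA_, PROD_) lends itself to a small lookup helper. The sketch below shows one way such a lookup could work; it is not the SDK's code, and resolve_workspace_params and WorkspaceParams are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Environment prefixes mentioned in this README.
KNOWN_ENVS = ("dev", "test", "beta", "prod")


@dataclass(frozen=True)
class WorkspaceParams:
    """Hypothetical container for the three workspace settings."""
    subscription_id: str
    resource_group: str
    workspace_name: str


def resolve_workspace_params(
    env: str, environ: Optional[Mapping[str, str]] = None
) -> WorkspaceParams:
    """Read <ENV>_SUBSCRIPTION_ID, <ENV>_RESOURCE_GROUP and <ENV>_WORKSPACE_NAME."""
    environ = os.environ if environ is None else environ
    if env.lower() not in KNOWN_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    prefix = env.upper()
    names = {
        "subscription_id": f"{prefix}_SUBSCRIPTION_ID",
        "resource_group": f"{prefix}_RESOURCE_GROUP",
        "workspace_name": f"{prefix}_WORKSPACE_NAME",
    }
    # Treat empty strings like missing values, since .env templates ship blank.
    missing = [var for var in names.values() if not environ.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return WorkspaceParams(**{field: environ[var] for field, var in names.items()})
```

Passing a mapping instead of reading os.environ directly keeps the helper easy to test without touching the real process environment.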

Use case 1 — Downloading model weights

from altametris.azure_sdk.authentication import Credentials

# Local development (az login)
creds = Credentials()
blob_client = creds.get_blob_service_client()

# Production — Managed Identity
creds = Credentials(auth_mode="mi")
blob_client = creds.get_blob_service_client()

# APIM OAuth2 token
token = creds.get_api_token()

Use case 2 — Azure ML workspace access

from altametris.azure_sdk.authentication import get_credentials

# Resolves DEV_SUBSCRIPTION_ID / DEV_RESOURCE_GROUP / DEV_WORKSPACE_NAME from env
creds = get_credentials("dev")    # auth_mode="default"
creds = get_credentials("test")   # auth_mode="sp"   (CI/CD)
creds = get_credentials("prod")   # auth_mode="mi"   (production)

ml_client = creds.get_ml_client()

Or pass workspace parameters explicitly:

from altametris.azure_sdk.authentication import Credentials

creds = Credentials(
    auth_mode="sp",
    subscription_id="ad2d1318-...",
    resource_group="AM-DS-AI-DEV",
    workspace_name="mlw-mlops-dev",
)
ml_client = creds.get_ml_client()

Use case 3 — Weights + workspace (combined)

A single Credentials instance covers both. The underlying Azure Identity credential is instantiated once and shared across all SDK clients.

from altametris.azure_sdk.authentication import Credentials, DsAuthConfig

creds = Credentials(
    auth_mode="mi",
    subscription_id="...",
    resource_group="AM-DS-AI-PROD",
    workspace_name="mlw-mlops-prod",
    config=DsAuthConfig(storage_account_name="amdsaiazmlprod"),
)

blob_client = creds.get_blob_service_client()   # downloads model.pt
ml_client   = creds.get_ml_client()             # Azure ML workspace
# same Azure Identity credential reused — no double authentication
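
The "instantiated once and shared" behaviour can be illustrated with a lazily cached attribute. This is a sketch of the general pattern only, assuming nothing about how Credentials is written internally; FakeIdentityCredential and SharedCredentialHolder are hypothetical stand-ins, and plain dicts replace the real SDK clients so the example runs without Azure packages.

```python
from functools import cached_property


class FakeIdentityCredential:
    """Stand-in for an Azure Identity credential; counts how often it is built."""

    instances = 0

    def __init__(self) -> None:
        type(self).instances += 1


class SharedCredentialHolder:
    """Hypothetical holder that builds the credential lazily, exactly once."""

    @cached_property
    def credential(self) -> FakeIdentityCredential:
        # Runs on first access only; later accesses return the cached object.
        return FakeIdentityCredential()

    def get_blob_service_client(self) -> dict:
        # A real client would be constructed here; a dict keeps the sketch runnable.
        return {"kind": "blob", "credential": self.credential}

    def get_ml_client(self) -> dict:
        return {"kind": "ml", "credential": self.credential}


holder = SharedCredentialHolder()
blob = holder.get_blob_service_client()
ml = holder.get_ml_client()
```

Both clients end up holding the same credential object, so any token cache attached to that object is shared between them.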

Validate authentication

from altametris.azure_sdk.authentication import Credentials

creds = Credentials()
creds.validate_access()   # lists one blob; raises DsAuthError if it fails

Storage module (storage/)

Low-level clients for direct Azure Blob Storage operations, backed by the Azure SDK. Both clients receive an already-built BlobServiceClient injected by BlobManager.

Class             Operations
BlobClient        download, exists, size
ContainerClient   list (with optional prefix filter)

from pathlib import Path

from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.storage.blob import BlobClient
from altametris.azure_sdk.storage.container import ContainerClient

creds = Credentials()
sdk_client = creds.get_blob_service_client()

blob = BlobClient(sdk_client)
blob.download(Path("/tmp/model.pt"), container="weights-dev", blob_path="segrails/model.pt")
print(blob.exists(container="weights-dev", blob_path="segrails/model.pt"))
print(blob.size(container="weights-dev", blob_path="segrails/model.pt"))

container = ContainerClient(sdk_client)
blobs = container.list(container="weights-dev", prefix="segrails/")
# [{"name": "segrails/model.pt", "size_bytes": 123456, "size_mb": 0.12, ...}, ...]

Storage manager (storage_manager/)

BlobManager is the composition root — the only entry point that imports Credentials. It builds the BlobServiceClient once and injects it into BlobClient and ContainerClient.

Two backends are available: Azure Blob SDK and APIM API.

from pathlib import Path

from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.storage_manager import BlobManager

creds = Credentials(auth_mode="default")

# Azure Blob SDK backend
manager = BlobManager.from_blob_sdk(creds)
manager.blob.download(Path("/tmp/model.pt"), container="weights-dev", blob_path="segrails/model.pt")
manager.blob.exists(container="weights-dev", blob_path="segrails/model.pt")
blobs = manager.container.list(container="weights-dev", prefix="segrails/")

# APIM backend
manager = BlobManager.from_api(creds, base_url="https://am-ds-apim-dev.azure-api.net")
manager.blob.download(Path("/tmp/model.pt"), container="weights", blob_path="segrails/model.pt")
blobs = manager.container.list(container="weights", prefix="segrails")
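
The composition-root idea (one place wires a backend, callers only see a shared interface) can be sketched without any Azure dependency. Everything below is illustrative: BlobBackend, InMemorySdkBackend, InMemoryApiBackend and Manager are hypothetical names, and the method signature is an assumption modelled on the calls shown above rather than the SDK's actual base class.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Dict, Tuple


class BlobBackend(ABC):
    """Hypothetical shared interface that both backends implement."""

    @abstractmethod
    def download(self, dest: Path, container: str, blob_path: str) -> Path:
        ...


class InMemorySdkBackend(BlobBackend):
    """Stands in for the Blob SDK backend; the service object is injected."""

    def __init__(self, service: Dict[Tuple[str, str], bytes]) -> None:
        self._service = service

    def download(self, dest: Path, container: str, blob_path: str) -> Path:
        dest.write_bytes(self._service[(container, blob_path)])
        return dest


class InMemoryApiBackend(BlobBackend):
    """Stands in for the HTTP backend; `fetch` plays the role of a GET request."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch

    def download(self, dest: Path, container: str, blob_path: str) -> Path:
        dest.write_bytes(self._fetch(f"{container}/{blob_path}"))
        return dest


class Manager:
    """Composition root: the only place where a backend is chosen and wired."""

    def __init__(self, blob: BlobBackend) -> None:
        self.blob = blob

    @classmethod
    def from_sdk(cls, service: Dict[Tuple[str, str], bytes]) -> "Manager":
        return cls(InMemorySdkBackend(service))

    @classmethod
    def from_api(cls, fetch: Callable[[str], bytes]) -> "Manager":
        return cls(InMemoryApiBackend(fetch))
```

Code that calls manager.blob.download(...) does not change when the backend does, which is what makes the two backends interchangeable.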

CacheManager provides a local file cache with configurable TTL, used by WeightStep in downstream packages to avoid re-downloading unchanged blobs.

from pathlib import Path

from altametris.azure_sdk.storage_manager import CacheManager

cache = CacheManager(cache_dir="~/.cache/altametris", ttl_seconds=86400)
local_path = cache.get("weights-dev/segrails/model.pt")   # None if expired or absent
cache.put("weights-dev/segrails/model.pt", Path("/tmp/model.pt"))
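
To make the TTL behaviour concrete, here is a minimal file cache built on the standard library. It is a sketch of the general technique, not CacheManager itself: TtlFileCache is a hypothetical class, and using the file's modification time as the freshness marker is an assumption of this example.

```python
import shutil
import time
from pathlib import Path
from typing import Optional, Union


class TtlFileCache:
    """Hypothetical cache: an entry is valid while its mtime is younger than the TTL."""

    def __init__(self, cache_dir: Union[str, Path], ttl_seconds: float) -> None:
        self.cache_dir = Path(cache_dir).expanduser()
        self.ttl_seconds = ttl_seconds
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _entry(self, key: str) -> Path:
        # Keys such as "weights-dev/segrails/model.pt" map to nested paths.
        return self.cache_dir / key

    def put(self, key: str, source: Path) -> Path:
        target = self._entry(key)
        target.parent.mkdir(parents=True, exist_ok=True)
        # shutil.copy does not preserve mtime, so the copy is stamped "now".
        shutil.copy(source, target)
        return target

    def get(self, key: str) -> Optional[Path]:
        target = self._entry(key)
        if not target.is_file():
            return None
        age = time.time() - target.stat().st_mtime
        return target if age < self.ttl_seconds else None
```

An expired entry is simply reported as absent; a later put for the same key overwrites the stale file.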

API module (api/)

HTTP clients for DataScience APIM endpoints, backed by OAuth2 JWT authentication. They implement the same BlobClientBase / ContainerClientBase interfaces as the storage/ clients, so the two backends are interchangeable via BlobManager.from_api().

Class                Operations
ApiBlobClient        download (streaming, atomic write, optional tqdm progress)
ApiContainerClient   list (with optional prefix filter)
ApiMlClient          Azure ML operations via API

from pathlib import Path

from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.api.api_blob import ApiBlobClient
from altametris.azure_sdk.api.api_container import ApiContainerClient

creds = Credentials(auth_mode="sp")

blob = ApiBlobClient(credentials=creds, base_url="https://am-ds-apim-dev.azure-api.net")
blob.download(Path("/tmp/model.pt"), container="weights", blob_path="segrails/model.pt")

container = ApiContainerClient(credentials=creds, base_url="https://am-ds-apim-dev.azure-api.net")
items = container.list(container="weights", prefix="segrails/")
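
The "streaming, atomic write" download mentioned for ApiBlobClient refers to a common pattern: write chunks to a temporary file next to the destination, then rename it into place. The sketch below shows that pattern with the standard library only; atomic_write is a hypothetical helper, and how the SDK actually implements its download is not shown in this README.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def atomic_write(dest: Path, chunks: Iterable[bytes]) -> Path:
    """Stream `chunks` to a temp file, then rename it over `dest` in one step."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=dest.parent, prefix=dest.name + ".", suffix=".part"
    )
    try:
        with os.fdopen(fd, "wb") as handle:
            for chunk in chunks:
                handle.write(chunk)
        # os.replace overwrites an existing destination atomically.
        os.replace(tmp_name, dest)
    except BaseException:
        # Interrupted download: drop the partial file and keep `dest` untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return dest
```

With this approach a reader never sees a half-written model file: either the previous version is still in place or the complete new one is.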

Azure ML module (aml/)

Asset management

from altametris.azure_sdk.authentication import get_credentials
from altametris.azure_sdk.aml.asset import ComputeManager, EnvironmentManager, DatasetManager, ModelManager

ml_client = get_credentials("dev").get_ml_client()

# Compute
comp = ComputeManager(ml_client)
comp.create("my-cluster", "Standard_DS2_v2", min_nodes=0, max_nodes=4, priority="dedicated")
print(comp.list())

# Environment
env = EnvironmentManager(ml_client)
print(env.list())
print(env.get_latest_version("my-env"))

# Dataset
ds = DatasetManager(ml_client)
ds.register(datastore_name="workspaceblobstore", data_path="data/train", dataset_name="my-dataset")

# Workspace info
from altametris.azure_sdk.aml.asset import WorkspaceLoader
ws = WorkspaceLoader(ml_client).get_ws("mlw-mlops-dev")

YAML config parsing

from altametris.azure_sdk.aml.config import YamlParser

parser = YamlParser("path/to/pipeline.yml")
config = parser.parse()
assets = parser.get_assets()   # {"environments": [...], "computes": [...], "models": [...]}

Endpoint management

from altametris.azure_sdk.authentication import get_credentials
from altametris.azure_sdk.aml.deployment import OnlineEndpointManager, BatchEndpointManager

ml_client = get_credentials("dev").get_ml_client()

# Online endpoint
online_mgr = OnlineEndpointManager(ml_client)
endpoint = online_mgr.create_endpoint("my-endpoint", "description")
print(endpoint.scoring_uri)

# Batch endpoint
batch_mgr = BatchEndpointManager(ml_client)
batch_ep = batch_mgr.get_endpoint("my-batch-endpoint")
job_name = batch_ep.invoke(input=my_input)   # my_input: batch job input prepared by the caller (not shown)

Pipeline submission (CLI)

alta-submit-job

Reads configuration from environment variables (EXPERIMENT_NAME, TRAIN_DATASET, etc.) and submits an Azure ML pipeline job.

Notebooks

Notebook                        Description
tutorial-authentication.ipynb   End-to-end authentication walkthrough covering all use cases
tutorial-az-ml-services.ipynb   Asset management, deployments, workspace operations
tutorial-parser.ipynb           YAML config parsing and asset extraction
tutorial_mlflow.ipynb           MLflow model registry queries
tutorial_ml_pipeline.ipynb      Full ML pipeline example