README
alta-azure-sdk 2.0.0
A Python SDK for Azure authentication, Azure ML workspace management, and Data Science cloud services.
Installation
Configure pip.ini by following the wiki to access the Altametris feed.
Install the package:
# Standard install
pip install alta-azure-sdk
# Development (includes tests, linting, docs, notebooks)
pip install "alta-azure-sdk[dev]"
Structure
altametris/azure_sdk/
├── authentication/   Credential management: Azure Identity, Blob Storage, APIM, MLClient
├── storage/          Low-level Blob and Container clients (Azure Blob SDK backend)
├── storage_manager/  BlobManager composition root + CacheManager (local TTL cache)
├── api/              DataScience API clients (APIM OAuth2 backend)
└── aml/              Azure ML workspace management
    ├── asset/        Compute, environment, dataset, model managers + WorkspaceLoader
    ├── config/       YAML config parsing (jobs, pipelines, endpoints)
    ├── deployment/   Online and batch endpoint CRUD
    ├── runs/         Pipeline job submission (CLI entry point)
    └── utils/        Validators, MLflow model registry, asset orchestration
Authentication
Overview
altametris.azure_sdk.authentication provides Credentials, a unified credential manager
that handles authentication for Azure Blob Storage, APIM, and Azure ML workspaces
using a single cached Azure Identity credential.
Five authentication modes are available:
| Mode | Class | Use case |
|---|---|---|
| default | | Local development ( |
| | | Local dev without |
| sp | | CI/CD pipelines: fails fast if SP vars are missing |
| mi | | Production Azure resources (ACI, VM, AKS) |
| | | Azure DevOps Workload Identity Federation |
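The fail-fast behaviour noted for CI/CD pipelines can be sketched as a check that runs before any credential is built. This is a minimal illustration, not the SDK's implementation: require_sp_env and MissingServicePrincipalEnv are hypothetical names, and only the three AZURE_* variable names come from this README.

```python
import os
from typing import Mapping

# Variable names as listed in this README's environment variable section.
SP_ENV_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


class MissingServicePrincipalEnv(RuntimeError):
    """Hypothetical error type; the SDK's real exception may differ."""


def require_sp_env(env: Mapping[str, str] = os.environ) -> dict:
    """Return the service principal settings, or raise if any is unset or empty."""
    missing = [name for name in SP_ENV_VARS if not env.get(name)]
    if missing:
        # Report every missing variable at once instead of failing later,
        # in the middle of a pipeline run.
        raise MissingServicePrincipalEnv(
            "Missing service principal variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in SP_ENV_VARS}
```

Calling such a check at start-up surfaces a misconfigured pipeline in the first seconds of a run rather than at the first authenticated request.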
Environment variables
# Service Principal (auth_mode="sp")
AZURE_TENANT_ID=""
AZURE_CLIENT_ID=""
AZURE_CLIENT_SECRET=""
# Blob Storage
AZURE_STORAGE_ACCOUNT_NAME=""
AZURE_STORAGE_CONNECTION_STRING="" # local dev fallback
# APIM
DS_API_BASE_URL="https://am-ds-apim-dev.azure-api.net"
DS_API_SCOPE="https://api.altametris.xyz/datascience/dev/.default"
DS_WEIGHTS_CONTAINER="weights-dev"
DS_WEIGHTS_PREFIX="segrails"
# Workspace parameters per environment
DEV_SUBSCRIPTION_ID=""
DEV_RESOURCE_GROUP="AM-DS-AI-DEV"
DEV_WORKSPACE_NAME="mlw-mlops-dev"
# same pattern for TEST_ / BETA_ / PROD_
Copy .env.example and fill in your values.
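The per-environment prefix pattern (DEV_, TEST_, BETA_, PROD_) lends itself to a small lookup helper. The sketch below shows one way such a lookup could work; WorkspaceParams and resolve_workspace_params are hypothetical names, and how the SDK resolves these variables internally is not shown in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Environment names implied by the DEV_ / TEST_ / BETA_ / PROD_ prefixes.
KNOWN_ENVIRONMENTS = ("dev", "test", "beta", "prod")


@dataclass(frozen=True)
class WorkspaceParams:
    """Hypothetical container for the three workspace settings."""

    subscription_id: str
    resource_group: str
    workspace_name: str


def resolve_workspace_params(
    environment: str, env: Mapping[str, str] = os.environ
) -> WorkspaceParams:
    """Read <PREFIX>_SUBSCRIPTION_ID / _RESOURCE_GROUP / _WORKSPACE_NAME."""
    if environment not in KNOWN_ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {environment!r}")
    prefix = environment.upper()
    # A missing variable raises KeyError naming the exact variable.
    return WorkspaceParams(
        subscription_id=env[f"{prefix}_SUBSCRIPTION_ID"],
        resource_group=env[f"{prefix}_RESOURCE_GROUP"],
        workspace_name=env[f"{prefix}_WORKSPACE_NAME"],
    )
```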
Use case 1: Downloading model weights
from altametris.azure_sdk.authentication import Credentials
# Local development (az login)
creds = Credentials()
blob_client = creds.get_blob_service_client()
# Production — Managed Identity
creds = Credentials(auth_mode="mi")
blob_client = creds.get_blob_service_client()
# APIM OAuth2 token
token = creds.get_api_token()
Use case 2: Azure ML workspace access
from altametris.azure_sdk.authentication import get_credentials
# Resolves DEV_SUBSCRIPTION_ID / DEV_RESOURCE_GROUP / DEV_WORKSPACE_NAME from env
creds = get_credentials("dev") # auth_mode="default"
creds = get_credentials("test") # auth_mode="sp" (CI/CD)
creds = get_credentials("prod") # auth_mode="mi" (production)
ml_client = creds.get_ml_client()
Or pass workspace parameters explicitly:
from altametris.azure_sdk.authentication import Credentials
creds = Credentials(
    auth_mode="sp",
    subscription_id="ad2d1318-...",
    resource_group="AM-DS-AI-DEV",
    workspace_name="mlw-mlops-dev",
)
ml_client = creds.get_ml_client()
Use case 3: Weights + workspace (combined)
A single Credentials instance covers both. The underlying Azure Identity credential
is instantiated once and shared across all SDK clients.
from altametris.azure_sdk.authentication import Credentials, DsAuthConfig
creds = Credentials(
    auth_mode="mi",
    subscription_id="...",
    resource_group="AM-DS-AI-PROD",
    workspace_name="mlw-mlops-prod",
    config=DsAuthConfig(storage_account_name="amdsaiazmlprod"),
)
blob_client = creds.get_blob_service_client() # downloads model.pt
ml_client = creds.get_ml_client() # Azure ML workspace
# same Azure Identity credential reused — no double authentication
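The "instantiated once and shared" behaviour can be illustrated with a lazily cached attribute. This is only a sketch of the pattern using stand-in classes: FakeIdentityCredential and SharedCredentialHolder are hypothetical, and the real Credentials class may cache its credential differently.

```python
from functools import cached_property


class FakeIdentityCredential:
    """Stand-in for an Azure Identity credential; counts how often it is built."""

    instances = 0

    def __init__(self) -> None:
        FakeIdentityCredential.instances += 1


class SharedCredentialHolder:
    """Hypothetical holder that builds the credential lazily and reuses it."""

    @cached_property
    def credential(self) -> FakeIdentityCredential:
        # Runs on first access only; the result is stored on the instance.
        return FakeIdentityCredential()

    def get_blob_service_client(self) -> dict:
        # A real implementation would hand self.credential to the Blob SDK client.
        return {"client": "blob", "credential": self.credential}

    def get_ml_client(self) -> dict:
        # A real implementation would hand self.credential to the Azure ML client.
        return {"client": "ml", "credential": self.credential}


holder = SharedCredentialHolder()
blob = holder.get_blob_service_client()
ml = holder.get_ml_client()
print(blob["credential"] is ml["credential"])  # True: same object in both clients
print(FakeIdentityCredential.instances)        # 1: built a single time
```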
Validate authentication
creds = Credentials()
creds.validate_access() # lists one blob — raises DsAuthError if it fails
Storage module (storage/)
Low-level clients for direct Azure Blob Storage operations, backed by the Azure SDK.
Both clients receive an already-built BlobServiceClient injected by BlobManager.
| Class | Operations |
|---|---|
| BlobClient | |
| ContainerClient | |
from pathlib import Path
from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.storage.blob import BlobClient
from altametris.azure_sdk.storage.container import ContainerClient
creds = Credentials()
sdk_client = creds.get_blob_service_client()
blob = BlobClient(sdk_client)
blob.download(Path("/tmp/model.pt"), container="weights-dev", blob_path="segrails/model.pt")
print(blob.exists(container="weights-dev", blob_path="segrails/model.pt"))
print(blob.size(container="weights-dev", blob_path="segrails/model.pt"))
container = ContainerClient(sdk_client)
blobs = container.list(container="weights-dev", prefix="segrails/")
# [{"name": "segrails/model.pt", "size_bytes": 123456, "size_mb": 0.12, ...}, ...]
Storage manager (storage_manager/)
BlobManager is the composition root — the only entry point that imports Credentials.
It builds the BlobServiceClient once and injects it into BlobClient and ContainerClient.
Two backends are available: Azure Blob SDK and APIM API.
from pathlib import Path
from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.storage_manager import BlobManager
creds = Credentials(auth_mode="default")
# Azure Blob SDK backend
manager = BlobManager.from_blob_sdk(creds)
manager.blob.download(Path("/tmp/model.pt"), container="weights-dev", blob_path="segrails/model.pt")
manager.blob.exists(container="weights-dev", blob_path="segrails/model.pt")
blobs = manager.container.list(container="weights-dev", prefix="segrails/")
# APIM backend
manager = BlobManager.from_api(creds, base_url="https://am-ds-apim-dev.azure-api.net")
manager.blob.download(Path("/tmp/model.pt"), container="weights", blob_path="segrails/model.pt")
blobs = manager.container.list(container="weights", prefix="segrails")
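The composition-root idea behind BlobManager can be sketched with in-memory stand-ins: two backends share one interface, and a single factory class decides which one is wired in. Every class name below is hypothetical and no Azure call is made; the sketch only illustrates why calling code stays the same across backends.

```python
from abc import ABC, abstractmethod


class BlobBackend(ABC):
    """Hypothetical shared interface (the README's BlobClientBase plays this role)."""

    @abstractmethod
    def exists(self, container: str, blob_path: str) -> bool:
        """Return True if the blob is present."""


class SdkBlobBackend(BlobBackend):
    """Stand-in for the Blob SDK backend; a dict replaces the real service client."""

    def __init__(self, service_client: dict) -> None:
        self._service_client = service_client  # injected, never built here

    def exists(self, container: str, blob_path: str) -> bool:
        return blob_path in self._service_client.get(container, ())


class ApiBlobBackend(BlobBackend):
    """Stand-in for the APIM backend; a set of known paths replaces HTTP calls."""

    def __init__(self, base_url: str, known_paths: set) -> None:
        self.base_url = base_url
        self._known_paths = known_paths

    def exists(self, container: str, blob_path: str) -> bool:
        return f"{container}/{blob_path}" in self._known_paths


class Manager:
    """Composition root: the only place that chooses and builds a backend."""

    def __init__(self, blob: BlobBackend) -> None:
        self.blob = blob

    @classmethod
    def from_sdk(cls, service_client: dict) -> "Manager":
        return cls(SdkBlobBackend(service_client))

    @classmethod
    def from_api(cls, base_url: str, known_paths: set) -> "Manager":
        return cls(ApiBlobBackend(base_url, known_paths))


sdk_manager = Manager.from_sdk({"weights-dev": {"segrails/model.pt"}})
api_manager = Manager.from_api("https://example.invalid", {"weights/segrails/model.pt"})
```

Code that receives either manager calls manager.blob.exists(...) without knowing which backend sits behind it.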
CacheManager provides a local file cache with configurable TTL, used by WeightStep
in downstream packages to avoid re-downloading unchanged blobs.
from pathlib import Path
from altametris.azure_sdk.storage_manager import CacheManager
cache = CacheManager(cache_dir="~/.cache/altametris", ttl_seconds=86400)
local_path = cache.get("weights-dev/segrails/model.pt") # None if expired or absent
cache.put("weights-dev/segrails/model.pt", Path("/tmp/model.pt"))
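One way a TTL-based file cache can work is to treat a cached file's modification time as its insertion time. The sketch below follows that approach under stated assumptions: TtlFileCache is a hypothetical class, and the hashing of keys into file names and the use of mtime are illustrative choices, not a description of how CacheManager is implemented.

```python
import hashlib
import shutil
import time
from pathlib import Path
from typing import Optional


class TtlFileCache:
    """Hypothetical TTL cache keyed by blob path; not the SDK's CacheManager."""

    def __init__(self, cache_dir: Path, ttl_seconds: float) -> None:
        self.cache_dir = Path(cache_dir).expanduser()
        self.ttl_seconds = ttl_seconds
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _entry(self, key: str) -> Path:
        # Hash the key so that slashes in blob paths never create subfolders.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def put(self, key: str, source: Path) -> Path:
        target = self._entry(key)
        # copyfile copies contents only, so the target's mtime is "now".
        shutil.copyfile(source, target)
        return target

    def get(self, key: str) -> Optional[Path]:
        entry = self._entry(key)
        if not entry.is_file():
            return None  # never cached
        age = time.time() - entry.stat().st_mtime
        return entry if age <= self.ttl_seconds else None  # None once expired
```

A caller would download the blob only when get returns None, then call put with the freshly downloaded file.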
API module (api/)
HTTP clients for DataScience APIM endpoints, backed by OAuth2 JWT authentication.
Implements the same BlobClientBase / ContainerClientBase interfaces as storage/,
so they are interchangeable via BlobManager.from_api().
| Class | Operations |
|---|---|
| ApiBlobClient | |
| ApiContainerClient | |
| | Azure ML operations via API |
from pathlib import Path
from altametris.azure_sdk.authentication import Credentials
from altametris.azure_sdk.api.api_blob import ApiBlobClient
from altametris.azure_sdk.api.api_container import ApiContainerClient
creds = Credentials(auth_mode="sp")
blob = ApiBlobClient(credentials=creds, base_url="https://am-ds-apim-dev.azure-api.net")
blob.download(Path("/tmp/model.pt"), container="weights", blob_path="segrails/model.pt")
container = ApiContainerClient(credentials=creds, base_url="https://am-ds-apim-dev.azure-api.net")
items = container.list(container="weights", prefix="segrails/")
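For readers unfamiliar with OAuth2 bearer authentication, the sketch below shows how a JWT is typically attached to an HTTP request using only the standard library. build_authorized_request is a hypothetical helper and the route in the usage line is invented; the actual APIM routes and the way the API clients send the token are not documented in this README.

```python
import urllib.request


def build_authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries an OAuth2 bearer token."""
    return urllib.request.Request(
        url,
        headers={
            # Standard bearer scheme: the JWT travels in the Authorization header.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


# The path below is a placeholder, not a real DataScience API route.
request = build_authorized_request(
    "https://am-ds-apim-dev.azure-api.net/hypothetical/route", "example-token"
)
```

The request object is only constructed here; nothing is sent until it is passed to urllib.request.urlopen.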
Azure ML module (aml/)
Asset management
from altametris.azure_sdk.authentication import get_credentials
from altametris.azure_sdk.aml.asset import ComputeManager, EnvironmentManager, DatasetManager, ModelManager
ml_client = get_credentials("dev").get_ml_client()
# Compute
comp = ComputeManager(ml_client)
comp.create("my-cluster", "Standard_DS2_v2", min_nodes=0, max_nodes=4, priority="dedicated")
print(comp.list())
# Environment
env = EnvironmentManager(ml_client)
print(env.list())
print(env.get_latest_version("my-env"))
# Dataset
ds = DatasetManager(ml_client)
ds.register(datastore_name="workspaceblobstore", data_path="data/train", dataset_name="my-dataset")
# Workspace info
from altametris.azure_sdk.aml.asset import WorkspaceLoader
ws = WorkspaceLoader(ml_client).get_ws("mlw-mlops-dev")
YAML config parsing
from altametris.azure_sdk.aml.config import YamlParser
parser = YamlParser("path/to/pipeline.yml")
config = parser.parse()
assets = parser.get_assets() # {"environments": [...], "computes": [...], "models": [...]}
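To make the shape of the get_assets() result concrete, the sketch below walks an already-parsed config dictionary and groups asset references into the three buckets shown in the comment above. It is an assumption-laden illustration: collect_assets, the key-to-bucket mapping, and the sample config layout are all hypothetical, and YamlParser may extract assets quite differently.

```python
from typing import Any, Optional

# Hypothetical mapping from config keys to the buckets named in this README.
KEY_TO_BUCKET = {
    "environment": "environments",
    "compute": "computes",
    "model": "models",
}


def collect_assets(node: Any, found: Optional[dict] = None) -> dict:
    """Recursively gather string asset references, without duplicates."""
    if found is None:
        found = {bucket: [] for bucket in KEY_TO_BUCKET.values()}
    if isinstance(node, dict):
        for key, value in node.items():
            bucket = KEY_TO_BUCKET.get(key)
            if bucket and isinstance(value, str):
                if value not in found[bucket]:
                    found[bucket].append(value)
            else:
                collect_assets(value, found)  # descend into nested sections
    elif isinstance(node, list):
        for item in node:
            collect_assets(item, found)
    return found


# Invented sample, standing in for the dictionary a YAML loader would return.
sample_config = {
    "compute": "azureml:cpu-cluster",
    "jobs": {
        "train": {"environment": "azureml:my-env:1", "compute": "azureml:cpu-cluster"},
        "score": {"environment": "azureml:my-env:1", "inputs": {"model": "azureml:my-model:2"}},
    },
}
sample_assets = collect_assets(sample_config)
```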
Endpoint management
from altametris.azure_sdk.aml.deployment import OnlineEndpointManager, BatchEndpointManager
# Online endpoint
online_mgr = OnlineEndpointManager(ml_client)
endpoint = online_mgr.create_endpoint("my-endpoint", "description")
print(endpoint.scoring_uri)
# Batch endpoint
batch_mgr = BatchEndpointManager(ml_client)
batch_ep = batch_mgr.get_endpoint("my-batch-endpoint")
job_name = batch_ep.invoke(input=my_input)
Pipeline submission (CLI)
alta-submit-job
Reads configuration from environment variables (EXPERIMENT_NAME, TRAIN_DATASET, etc.) and submits an Azure ML pipeline job.
Notebooks
| Notebook | Description |
|---|---|
| | End-to-end authentication walkthrough: all use cases |
| | Asset management, deployments, workspace operations |
| | YAML config parsing and asset extraction |
| | MLflow model registry queries |
| | Full ML pipeline example |