Azure Core Services

Microsoft Azure. Second-largest public cloud globally by most market-share estimates, and dominant in enterprise (Microsoft 365 integration, Entra ID as de facto corporate identity). Strong for .NET workloads, hybrid cloud (Azure Arc), and the OpenAI partnership. Roughly 23% cloud market share (2026).


Compute

Azure Virtual Machines

VMs. Size families: B-series (burstable, cheap), D-series (general purpose), F-series (compute-optimised), E-series (memory-optimised), N-series (GPU, ND A100 for ML). Spot VMs are up to 90% cheaper but can be evicted with 30 seconds' notice.
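Spot eviction notices reach the VM through the Scheduled Events endpoint of the Instance Metadata Service, as events of type Preempt. The following is a minimal sketch, assuming the documented response shape (an `Events` list whose entries carry `EventType` and `Resources`). The helper name `pending_preemptions`, the VM names, and the sample payload are illustrative and not taken from a real VM.

```python
import json

# Scheduled Events endpoint as seen from inside the VM (needs the header
# "Metadata: true"). Shown for reference only: this sketch parses a payload
# and does not call the endpoint.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def pending_preemptions(payload: str, vm_name: str) -> list[dict]:
    """Return the Preempt events in a Scheduled Events document that target vm_name."""
    document = json.loads(payload)
    return [
        event
        for event in document.get("Events", [])
        if event.get("EventType") == "Preempt"
        and vm_name in event.get("Resources", [])
    ]


# Hypothetical payload for illustration.
sample = json.dumps({
    "DocumentIncarnation": 2,
    "Events": [
        {"EventId": "a1", "EventType": "Preempt", "ResourceType": "VirtualMachine",
         "Resources": ["spot-vm-0"], "EventStatus": "Scheduled"},
        {"EventId": "b2", "EventType": "Reboot", "ResourceType": "VirtualMachine",
         "Resources": ["spot-vm-1"], "EventStatus": "Scheduled"},
    ],
})

evictions = pending_preemptions(sample, "spot-vm-0")
print([event["EventId"] for event in evictions])
```

A workload would typically poll the endpoint every few seconds and start draining or checkpointing as soon as a Preempt event for its own VM appears.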

Azure App Service

PaaS for web apps and APIs. No container management. Supports .NET, Java, Python, Node.js, PHP. Scale out via App Service Plan. Deployment slots for blue/green with traffic splitting.

Azure Functions

Serverless event-driven functions. Consumption plan (scale to zero), Premium plan (no cold start, VNet integration), Dedicated plan (App Service). Durable Functions for stateful orchestrations (equivalent to AWS Step Functions).

import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    name = req.params.get("name", "World")
    return func.HttpResponse(f"Hello, {name}!")

AKS — Azure Kubernetes Service

Managed Kubernetes. Free control plane. Node pools: system (core cluster components) and user (workloads). Supports Windows node pools (useful for .NET workloads). Azure CNI for pod-level networking in the VNet.

# Create cluster
az aks create \
  --resource-group my-rg \
  --name my-cluster \
  --node-count 3 \
  --enable-managed-identity \
  --enable-oidc-issuer \
  --enable-workload-identity

# Get credentials
az aks get-credentials --resource-group my-rg --name my-cluster

Workload Identity is Azure's equivalent of GKE Workload Identity and AWS IRSA. The pod's Kubernetes service account is annotated with the client ID of a managed identity in Entra ID (formerly Azure AD), so no credentials are stored in pods.
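The service account annotation and the pod label that opt a workload into Workload Identity can be sketched as Kubernetes manifests. This is a minimal sketch that builds them as Python dictionaries and prints JSON, which `kubectl apply -f` also accepts. The namespace, service account name, image, and the all-zero client ID are hypothetical placeholders. A federated identity credential linking the managed identity to this service account is also required and is not shown here.

```python
import json


def workload_identity_manifests(namespace: str, service_account: str, client_id: str):
    """Build a ServiceAccount and a Pod manifest wired for AKS Workload Identity."""
    service_account_manifest = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": service_account,
            "namespace": namespace,
            # Client ID of the user-assigned managed identity to impersonate.
            "annotations": {"azure.workload.identity/client-id": client_id},
        },
    }
    pod_manifest = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "demo-app",  # hypothetical pod name
            "namespace": namespace,
            # Label that tells the workload identity webhook to inject the token.
            "labels": {"azure.workload.identity/use": "true"},
        },
        "spec": {
            "serviceAccountName": service_account,
            "containers": [
                {"name": "app", "image": "example.azurecr.io/app:latest"}  # placeholder image
            ],
        },
    }
    return service_account_manifest, pod_manifest


sa, pod = workload_identity_manifests(
    "my-namespace", "my-app-sa", "00000000-0000-0000-0000-000000000000"
)
print(json.dumps(sa, indent=2))
print(json.dumps(pod, indent=2))
```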


Storage

Blob Storage

Object storage (equivalent to S3). Account → Container → Blob hierarchy. Access tiers: Hot → Cool → Cold → Archive. Smart Tiering (GA 2025): automatically moves blobs between Hot/Cool/Cold based on access patterns. Lifecycle management policies for rule-based transitions.
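A rule-based lifecycle policy is a JSON document attached to the storage account. The sketch below builds one rule that tiers block blobs under a prefix to Cool, then Archive, then deletes them. The rule name, prefix, and day thresholds are hypothetical and should be chosen from real access patterns.

```python
import json


def lifecycle_policy(prefix: str, cool_after: int, archive_after: int, delete_after: int) -> dict:
    """Build a lifecycle management policy with a single tier-then-delete rule."""
    if not 0 <= cool_after < archive_after < delete_after:
        raise ValueError("thresholds must increase: cool < archive < delete")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-then-delete",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                            "delete": {"daysAfterModificationGreaterThan": delete_after},
                        }
                    },
                },
            }
        ]
    }


policy = lifecycle_policy("mycontainer/logs/", cool_after=30, archive_after=90, delete_after=365)
print(json.dumps(policy, indent=2))
```

Saved to a file, the output could be applied with `az storage account management-policy create --account-name mystorageaccount --resource-group my-rg --policy @policy.json`.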

# Upload
az storage blob upload \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --file ./data.csv \
  --name data.csv

# Generate a read-only SAS token (set --expiry roughly 1 hour ahead, in UTC)
az storage blob generate-sas \
  --account-name mystorageaccount \
  --container-name mycontainer \
  --name data.csv \
  --permissions r \
  --expiry "2026-05-01T12:00:00Z"
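The `--expiry` value above is a fixed timestamp, so it has to be recomputed for every token. This is a minimal sketch of deriving it from the current UTC time. The helper name `sas_expiry` and the one-hour default are illustrative choices.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def sas_expiry(ttl: timedelta = timedelta(hours=1), now: Optional[datetime] = None) -> str:
    """Return a UTC expiry timestamp in the format the az CLI expects."""
    current = now if now is not None else datetime.now(timezone.utc)
    expiry = current.astimezone(timezone.utc) + ttl
    return expiry.strftime("%Y-%m-%dT%H:%M:%SZ")


# Fixed clock used here so the result is reproducible.
fixed_now = datetime(2026, 5, 1, 11, 0, 0, tzinfo=timezone.utc)
print(sas_expiry(now=fixed_now))
print(sas_expiry(ttl=timedelta(minutes=15), now=fixed_now))
```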

Azure SQL / Cosmos DB

  • Azure SQL Database — managed SQL Server (PaaS). Serverless tier scales to zero. Hyperscale tier for large databases (100TB+).
  • Cosmos DB — multi-model NoSQL (document, key-value, graph, table). Multi-region writes, <10ms latency SLA at p99. API compatibility: Core SQL, MongoDB (wire protocol), Cassandra, Gremlin, Table Storage.

Azure Files / Managed Disks

  • Azure Files — managed SMB/NFS file shares. Premium tier (SSD) for AKS ReadWriteMany volumes.
  • Managed Disks — block storage for VMs. Premium SSD (p-series), Standard SSD, Standard HDD, Ultra Disk (160,000 IOPS, sub-ms latency for databases).

AI and ML

Azure OpenAI Service

OpenAI models (GPT-4o, o-series, DALL-E, Whisper, Embeddings) on Azure infrastructure. Data privacy guarantees (no training on your data). Regional deployments (UK South, Sweden Central, East US 2).

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-instance.openai.azure.com/",
    api_key="...",
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="gpt-4o",  # deployment name
    messages=[{"role": "user", "content": "Hello"}]
)

Azure Machine Learning

Managed ML platform. Compute clusters for training, managed endpoints for inference. MLflow integration for experiment tracking. Prompt Flow for LLM orchestration (alternative to LangChain for Azure-native teams).


Networking

Azure Virtual Network (VNet)

Regional network. Subnets within a VNet. VNet Peering for inter-VNet connectivity. Azure Firewall for egress control. Network Security Groups (NSGs) act as a stateful firewall at subnet/NIC level (same role as AWS Security Groups).

Application Gateway / Azure Front Door

  • Application Gateway — regional L7 LB with WAF.
  • Azure Front Door — global CDN + L7 LB (equivalent to AWS CloudFront + ALB). DDoS protection, WAF, SSL termination.

Private Endpoints (Azure Private Link) expose PaaS services (Storage, SQL, Cosmos) inside your VNet via a private IP. This eliminates public internet exposure for data-path traffic.


Security and Identity

Entra ID (formerly Azure Active Directory)

Microsoft's cloud identity platform. The de facto enterprise identity standard. Most corporates already have it. Conditional Access for MFA enforcement. B2C variant for customer identity. SAML/OIDC federation.

Managed Identities (system-assigned or user-assigned) replace service account credentials for Azure-to-Azure auth.

Azure Key Vault

Secrets, keys, certificates. HSM-backed key generation. Soft-delete and purge protection prevent accidental deletion. Reference Key Vault secrets directly from App Service (Key Vault references in app settings) or mount them in AKS via the Secrets Store CSI driver; no code change required.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
client = SecretClient(vault_url="https://myvault.vault.azure.net/", credential=credential)
secret = client.get_secret("db-password")
print(secret.value)

Microsoft Defender for Cloud

Unified security posture management. Secure Score, CSPM (Cloud Security Posture Management), threat detection across VMs, containers, databases. MCSB (Microsoft Cloud Security Benchmark) as the default policy baseline.


Key CLI

# Auth
az login
az account set --subscription "My Subscription"

# Resource groups
az group create --name my-rg --location uksouth

# List resources
az vm list --output table
az aks list --output table
az storage account list --output table

# Managed identity
az identity create --name my-identity --resource-group my-rg

Azure vs AWS Equivalents

Azure            | AWS
-----------------|----------------------
Virtual Machines | EC2
App Service      | Elastic Beanstalk
Azure Functions  | Lambda
AKS              | EKS
Blob Storage     | S3
Azure SQL        | RDS
Cosmos DB        | DynamoDB
Entra ID         | IAM + Cognito
Key Vault        | Secrets Manager + KMS
Azure OpenAI     | Amazon Bedrock
Azure ML         | SageMaker
Front Door       | CloudFront + ALB
NSG              | Security Groups

Common Failure Cases

Managed Identity not available at container startup
Why: The identity endpoint inside the container becomes reachable a few seconds after pod/container start, not instantly.
Detect: DefaultAzureCredential raises CredentialUnavailableError during the first seconds of a pod lifecycle or after a node restart.
Fix: Add retry logic with exponential backoff around the first credential acquisition, or use ManagedIdentityCredential directly with a retry policy.
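Retrying the first credential acquisition with exponential backoff can be sketched as a small generic helper. This is a minimal sketch using only the standard library. `IdentityNotReady` is a stand-in for `azure.identity.CredentialUnavailableError`, `flaky_get_token` simulates an identity endpoint that is not reachable yet, and the delay values are illustrative.

```python
import random
import time


def retry_with_backoff(operation, retry_on, attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on retry_on with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small jitter (up to 10%) so many pods do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))


class IdentityNotReady(Exception):
    """Stand-in for azure.identity.CredentialUnavailableError."""


calls = {"count": 0}


def flaky_get_token():
    # Simulates an identity endpoint that only answers on the third call.
    calls["count"] += 1
    if calls["count"] < 3:
        raise IdentityNotReady("identity endpoint not reachable yet")
    return "token"


delays = []  # record requested delays instead of really sleeping
result = retry_with_backoff(flaky_get_token, IdentityNotReady, sleep=delays.append)
print(result, len(delays))
```

In a real service the operation would be the first `get_token` call on the credential, and `retry_on` would be the SDK's own exception type.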

AKS workload identity token not refreshed after expiry
Why: OIDC service account tokens have a bounded TTL; if the application caches the token long-term rather than re-requesting it from the projected volume, calls fail after expiry.
Detect: Azure SDK calls return 401 errors roughly 24 hours after pod start without restart.
Fix: Always request tokens through the credential object (for example DefaultAzureCredential) when they are needed and rely on the Azure SDK's built-in token refresh; never cache the raw access token string.
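The difference between caching a raw token string and asking a refreshing source each time can be illustrated with a small sketch. The Azure SDK credentials already refresh tokens internally, so this only shows the pattern and is not a replacement for them. `RefreshingTokenSource`, `fake_fetch`, the five-minute margin, and the fake clock are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_on: float  # Unix timestamp at which the token stops being valid


class RefreshingTokenSource:
    """Hands out a cached token and re-fetches it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Token], refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[Token] = None

    def get(self) -> str:
        expired = (
            self._token is None
            or self._clock() >= self._token.expires_on - self._margin
        )
        if expired:
            self._token = self._fetch()
        return self._token.value


# Fake clock and fetcher so the behaviour can be shown without waiting.
now = [0.0]
issued = []


def fake_fetch() -> Token:
    issued.append(now[0])
    return Token(value=f"token-{len(issued)}", expires_on=now[0] + 3600)


source = RefreshingTokenSource(fake_fetch, clock=lambda: now[0])
first = source.get()   # fetched at t=0
now[0] = 1000.0
second = source.get()  # still well inside the validity window
now[0] = 3400.0
third = source.get()   # within 300 s of expiry, so a new token is fetched
print(first, second, third)
```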

Cosmos DB request units (RU/s) exceeded causing 429s
Why: Each Cosmos operation consumes RUs proportional to document size and index complexity; default provisioned throughput is often undersized for write spikes.
Detect: HTTP 429 responses with the header x-ms-retry-after-ms from the Cosmos SDK; TotalRequestUnits metric in Azure Monitor spikes to provisioned limit.
Fix: Enable autoscale RU/s, or switch to serverless mode for spiky workloads; reduce RU consumption by projecting only needed fields and limiting cross-partition queries.
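When a throttled response has to be handled by hand, for example in a custom REST client, the wait time comes from the `x-ms-retry-after-ms` header. This is a minimal sketch of reading it. The official SDKs generally retry throttled requests on their own, and the helper name and the 100 ms fallback are assumptions made for illustration.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status_code: int, headers: Mapping[str, str],
                        default_ms: float = 100.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if the request was not throttled."""
    if status_code != 429:
        return None
    raw = headers.get("x-ms-retry-after-ms")
    try:
        delay_ms = float(raw)
    except (TypeError, ValueError):
        delay_ms = default_ms  # header missing or malformed: use the fallback
    return max(delay_ms, 0.0) / 1000.0


print(retry_delay_seconds(429, {"x-ms-retry-after-ms": "250"}))
print(retry_delay_seconds(200, {}))
```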

Blob Storage SAS token leaks causing unauthorised access
Why: SAS tokens embedded in client-side code or logged in application traces grant whatever access their permissions allow until expiry.
Detect: Azure Storage diagnostic logs showing access from unexpected IPs or outside business hours using the same SAS token.
Fix: Generate SAS tokens server-side with short TTLs (minutes, not hours), use User Delegation SAS (backed by Entra ID), and rotate the storage account key if a token signed with it is compromised.

App Service deployment slot swap not warming up dependencies
Why: Slot swap completes before the new slot's app has established database connection pools or loaded in-memory caches, causing errors immediately after swap.
Detect: Elevated error rates and high latency in the first 30-60 seconds after a slot swap.
Fix: Use the applicationInitialization element in web.config or a startup probe endpoint that returns 200 only after warming up critical dependencies before the swap completes.
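The logic behind such a warm-up endpoint can be sketched independently of any web framework: report 200 only when every critical dependency check passes, and 503 otherwise. The check names and the lambda checks below are hypothetical. A real handler would ping the database pool, the cache, and so on.

```python
from typing import Callable, Mapping


def warmup_status(checks: Mapping[str, Callable[[], bool]]) -> tuple[int, dict[str, bool]]:
    """Run each dependency check and return (HTTP status, per-check results)."""
    results: dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as "not warm yet".
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results


def failing_cache_check() -> bool:
    raise ConnectionError("cache not reachable yet")


cold = warmup_status({"database": lambda: True, "cache": failing_cache_check})
warm = warmup_status({"database": lambda: True, "cache": lambda: True})
print(cold)
print(warm)
```

Wiring this into the route that the swap warm-up pings lets the swap wait until the slot reports 200.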

Open Questions

  • What monitoring and alerting matter most when this is deployed in production?
  • At what scale or workload does this approach hit its practical limits?