Serverless Patterns
Architectural patterns for serverless compute on Lambda, Cloud Run, and Azure Functions. Serverless shifts operational overhead to the cloud provider: no VM management, automatic scaling down to zero, and pay-per-invocation pricing.
When Serverless Fits
Good fit:
- Event-driven workloads (S3 upload triggers, DynamoDB streams, SQS consumers)
- Webhooks and API backends with spiky or low traffic
- Scheduled tasks (EventBridge schedule, formerly CloudWatch Events → Lambda)
- Background processing (image resize, PDF generation, notifications)
- AI inference APIs — pair with provisioned concurrency to avoid cold starts
Poor fit:
- Long-running jobs > 15 min (Lambda limit)
- Stateful applications requiring persistent connections (WebSocket servers)
- High-throughput, consistent traffic (ECS/Kubernetes is cheaper at scale)
- Applications requiring GPU (not available in Lambda; use ECS GPU tasks)
Lambda — Execution Model
Invocation types:
- Synchronous (RequestResponse): API Gateway, ALB, Lambda URL; caller waits for response
- Asynchronous (Event): S3, SNS, EventBridge; Lambda retries on failure
- Polling (Event Source): SQS, Kinesis, DynamoDB Streams; Lambda polls and batches
Concurrency:
- Reserved concurrency: guarantees capacity, limits maximum (isolation)
- Provisioned concurrency: pre-warmed instances (eliminates cold starts, costs more)
Timeout:
- Max 15 minutes. Set to 2x your expected p99 latency.
- API workloads: 30s. Background: 5-15 min.
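The timeout guidance above can be paired with a handler that watches its own deadline. The sketch below is one possible approach, not a prescribed pattern: it relies on `context.get_remaining_time_in_millis()`, which the Lambda context object provides, while `SAFETY_MARGIN_MS`, the `items` event key, the response shape, and `FakeContext` are assumptions made for illustration.

```python
import time

SAFETY_MARGIN_MS = 2_000  # assumed buffer reserved for cleanup and the response


def process_items(items, context, work=lambda item: item):
    """Process items until the remaining invocation time drops below the margin.

    Returns (done, leftover) so the caller can re-queue whatever was not handled.
    """
    done = []
    for index, item in enumerate(items):
        # get_remaining_time_in_millis() is provided by the Lambda context object
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            return done, items[index:]
        done.append(work(item))
    return done, []


def handler(event, context):
    done, leftover = process_items(event.get("items", []), context)
    # Hypothetical response shape: report leftovers instead of timing out mid-item
    return {"processed": len(done), "remaining": leftover}


class FakeContext:
    """Local stand-in for the Lambda context, for running the sketch off-Lambda."""

    def __init__(self, budget_ms):
        self._deadline = time.monotonic() + budget_ms / 1000

    def get_remaining_time_in_millis(self):
        return int((self._deadline - time.monotonic()) * 1000)
```

Stopping early and returning the leftover work keeps a background job from being killed mid-item when the timeout is set too tightly.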
SAM Template — API + Lambda Stack
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: python3.12
    MemorySize: 512
    Timeout: 30
    Environment:
      Variables:
        LOG_LEVEL: INFO
        POWERTOOLS_SERVICE_NAME: myapp

Resources:
  ProductsApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: CognitoAuthorizer
        Authorizers:
          CognitoAuthorizer:
            # UserPool (an AWS::Cognito::UserPool) is not shown; it must be defined in this template
            UserPoolArn: !GetAtt UserPool.Arn

  ListProductsFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handlers.products.list_handler
      CodeUri: src/
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /products
            Method: GET
            RestApiId: !Ref ProductsApi
      Policies:
        - DynamoDBReadPolicy:
            TableName: !Ref ProductsTable
      Environment:
        Variables:
          TABLE_NAME: !Ref ProductsTable

  ProductsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
AWS Lambda Powertools
# handlers/products.py
import os

import boto3
from aws_lambda_powertools import Logger, Tracer, Metrics
from aws_lambda_powertools.metrics import MetricUnit
from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()
tracer = Tracer()
# Metrics needs a namespace: pass one here or set POWERTOOLS_METRICS_NAMESPACE ("MyApp" is illustrative)
metrics = Metrics(namespace="MyApp")
app = APIGatewayRestResolver()

# Created once per execution environment; TABLE_NAME is set in template.yaml
table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])


@app.get("/products")
@tracer.capture_method
def list_products():
    # scan() returns a single page (up to 1 MB); paginate with LastEvaluatedKey for larger tables
    products = table.scan()["Items"]
    metrics.add_metric(name="ProductsListed", unit=MetricUnit.Count, value=len(products))
    return {"products": products}


@logger.inject_lambda_context(log_event=True)
@tracer.capture_lambda_handler
@metrics.log_metrics
def list_handler(event: dict, context: LambdaContext) -> dict:
    return app.resolve(event, context)
Cold Start Mitigation
# Move heavy initialisation outside the handler (runs once per container)
import os

import boto3
from myapp.db import create_engine

# load_model() and process() are application helpers that are not shown here

# These run at cold start, not per invocation
_engine = create_engine(os.environ["DATABASE_URL"])
_s3_client = boto3.client("s3")
_model = load_model("s3://models/latest")  # load ML model once


def handler(event, context):
    # _engine, _s3_client, _model already initialised
    return process(event, _engine)

# Provisioned concurrency: pre-warm N instances
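aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier production \
  --provisioned-concurrent-executions 10

# Register the alias as a scalable target before attaching a policy (capacity bounds are illustrative)
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-function:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 1 \
  --max-capacity 100

# Auto-scale provisioned concurrency with target tracking at 70% utilisation
# (for a morning ramp and evening scale-down, use put-scheduled-action instead)
aws application-autoscaling put-scaling-policy \
  --service-namespace lambda \
  --resource-id function:my-function:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --policy-name provisioned-concurrency-utilization \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
  '{"TargetValue": 0.7, "PredefinedMetricSpecification": {"PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"}}'
Event-Driven Saga with Lambda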
Order placement saga (Lambda-native, no Step Functions):
API Gateway → Lambda (PlaceOrder)
  → DynamoDB (write order, status=pending)
  → EventBridge (publish order.placed)
    → Lambda (ReserveInventory) [triggered by EventBridge]
    → Lambda (ChargePayment) [triggered by EventBridge after inventory OK]
    → Lambda (FulfillOrder) [triggered after payment]
    → Lambda (NotifyCustomer) [triggered after fulfillment]
Each Lambda handles one step. EventBridge is the backbone. Each Lambda publishes
the next event on success, or a compensating event on failure.
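A single saga step might look like the sketch below. It assumes the step receives an EventBridge event with the order in `detail` and that `events_client` behaves like `boto3.client("events")`. The source name, bus name, event names, the `reserve` callable, and `RecordingEventsClient` are hypothetical and exist only to make the example runnable locally.

```python
import json


def reserve_inventory_step(event, events_client, reserve, bus_name="orders"):
    """Run one saga step, then publish the follow-up or compensating event.

    `events_client` is expected to behave like boto3.client("events");
    `reserve` is a hypothetical callable that raises on failure.
    """
    order = event["detail"]
    try:
        reserve(order)
        detail_type, detail = "inventory.reserved", order
    except Exception as exc:  # broad on purpose: any failure triggers compensation
        detail_type = "inventory.reservation.failed"
        detail = {**order, "reason": str(exc)}

    response = events_client.put_events(
        Entries=[{
            "Source": "myapp.inventory",   # assumed source name
            "DetailType": detail_type,
            "Detail": json.dumps(detail),
            "EventBusName": bus_name,      # assumed bus name
        }]
    )
    # put_events reports per-entry failures in the response instead of raising
    if response.get("FailedEntryCount", 0):
        raise RuntimeError("event was not accepted by the bus")
    return detail_type


class RecordingEventsClient:
    """Local stand-in for the EventBridge client that records published entries."""

    def __init__(self):
        self.entries = []

    def put_events(self, Entries):
        self.entries.extend(Entries)
        return {"FailedEntryCount": 0}
```

Passing the client in as an argument keeps the step testable without AWS credentials; in a deployed function it would typically be created once at module level.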
Google Cloud Run
# Deploy a container to Cloud Run
gcloud run deploy myapp \
  --image gcr.io/myproject/myapp:latest \
  --platform managed \
  --region europe-west1 \
  --allow-unauthenticated \
  --min-instances 0 \
  --max-instances 100 \
  --memory 512Mi \
  --cpu 1 \
  --concurrency 80 \
  --timeout 30s \
  --set-env-vars LOG_LEVEL=INFO \
  --set-secrets DATABASE_URL=myapp-db-url:latest
Common Failure Cases
Lambda cold start spikes cause API timeout errors under bursty traffic
Why: a sudden burst of concurrent requests forces Lambda to spin up many new execution environments at once; each cold start can add several seconds for a Python function with heavy dependencies, which pushes slow requests past the function timeout or API Gateway's 29-second default integration limit even though the function works fine in steady state.
Detect: Init Duration values in the function's REPORT log lines (queryable as @initDuration in CloudWatch Logs Insights) spike in step with ConcurrentExecutions; the error rate rises only at burst onset and recovers within minutes.
Fix: configure provisioned concurrency scaled to your expected burst level; for AI inference Lambdas, pre-load the model outside the handler so the cold start cost is paid only once per container lifecycle.
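To size provisioned concurrency for an expected burst, a rough starting point is that concurrency is approximately the request rate multiplied by the average duration. The sketch below applies that estimate and adds headroom; the function name and the default `target_utilization` of 0.7 (chosen to match the TargetValue in the scaling policy shown earlier) are assumptions, and real sizing should be checked against observed metrics.

```python
import math


def required_concurrency(requests_per_second, avg_duration_s, target_utilization=0.7):
    """Estimate provisioned concurrency for a burst.

    Concurrency ~= arrival rate x average duration; dividing by the target
    utilisation leaves headroom above the expected in-flight requests.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    in_flight = requests_per_second * avg_duration_s
    return math.ceil(in_flight / target_utilization)
```

For example, 100 requests per second at an average of 200 ms each keeps about 20 requests in flight, so the estimate with 70% target utilisation rounds up to 29 instances.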
SQS-triggered Lambda accumulates DLQ messages because retry logic is not idempotent
Why: a transient error causes Lambda to fail a batch, SQS retries all messages in the batch, and the non-idempotent handler processes some messages twice before failing on the duplicate, landing everything in the DLQ.
Detect: DLQ depth grows steadily; DLQ messages carry the same messageId (or, on FIFO queues, the same MessageDeduplicationId) as messages already processed successfully; downstream effects (charges, emails) appear duplicated.
Fix: use Lambda Powertools idempotency decorator with a DynamoDB persistence store keyed on messageId; also set FunctionResponseTypes: [ReportBatchItemFailures] so only failed records are retried.
Event-driven saga leaves the system in an inconsistent state after a downstream Lambda fails
Why: there is no compensating event handler; if ChargePayment fails after ReserveInventory has already succeeded, inventory stays reserved for an order that never completes.
Detect: inventory counts drift from expected values; orders in pending status with no payment event in the audit trail accumulate over time.
Fix: publish a compensating event (inventory.reservation.cancelled) from the payment Lambda's error handler; implement a periodic reconciliation job that detects orphaned pending orders older than 5 minutes and triggers rollback.
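The detection half of the reconciliation job could be as small as the sketch below. The order dictionary keys (`id`, `status`, `created_at`) are assumptions about the data model; only the 5-minute threshold comes from the fix described above. Triggering the rollback itself is left out.

```python
from datetime import datetime, timedelta, timezone

MAX_PENDING_AGE = timedelta(minutes=5)  # threshold from the fix described above


def find_orphaned_orders(orders, now=None):
    """Return ids of orders stuck in 'pending' longer than MAX_PENDING_AGE.

    Each order is a dict with hypothetical keys: id, status, created_at
    (a timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    return [
        order["id"]
        for order in orders
        if order["status"] == "pending" and now - order["created_at"] > MAX_PENDING_AGE
    ]
```

Accepting `now` as a parameter makes the age check deterministic in tests; a scheduled Lambda would call it with the default.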
Cloud Run container exits immediately after deploy with exit code 1
Why: the container's startup command fails because an environment variable or secret (--set-secrets) was not injected correctly — the secret version does not exist or the service account lacks Secret Manager accessor permission.
Detect: Cloud Run console shows revision deployment failed with "Container failed to start"; Cloud Logging shows the startup error; gcloud run services describe shows no active revision.
Fix: verify the secret version exists and is not destroyed; grant the Cloud Run service account the roles/secretmanager.secretAccessor IAM role on the specific secret; redeploy.
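A startup check inside the container can make this failure easier to read in Cloud Logging. The sketch below is one way to do it, assuming the application reads the two variables set in the deploy command shown earlier (DATABASE_URL and LOG_LEVEL); the function names are illustrative.

```python
import os
import sys

REQUIRED_VARS = ("DATABASE_URL", "LOG_LEVEL")  # names from the deploy command shown earlier


def missing_settings(environ=os.environ, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def check_or_exit(environ=os.environ):
    """Exit with a clear log line instead of failing later with an opaque error."""
    missing = missing_settings(environ)
    if missing:
        print(f"startup aborted, missing settings: {', '.join(missing)}", file=sys.stderr)
        sys.exit(1)
```

Calling `check_or_exit()` as the first step of the entrypoint still produces exit code 1, but the log line names the setting that was not injected.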
Connections
cloud-hub · cloud/aws-lambda-patterns · cloud/aws-step-functions · cloud/aws-sqs-sns · cloud/aws-api-gateway · cloud/finops-cost-management · llms/ae-hub
Open Questions
- What monitoring and alerting matter most when this is deployed in production?
- At what scale or workload does this approach hit its practical limits?