Amazon Bedrock AgentCore Runtime
AWS's serverless hosting platform for AI agents. GA October 2025. Each session runs in a dedicated microVM with isolated CPU/memory. Per-second billing, with no CPU charge while the agent waits on LLM or tool I/O. Supports LangGraph, Strands, CrewAI, and any Python framework.
AWS's purpose-built hosting layer for agentic workloads. Released into general availability on October 13, 2025. Solves the operational problem of running long-running, I/O-heavy agents without paying for idle compute. AgentCore Runtime is one component of the broader Amazon Bedrock AgentCore suite, which also includes memory, identity, and gateway services.
[Source: AWS official documentation and blog, 2026-05-03]
Key Facts
- GA date: October 13, 2025
- Landing page: https://aws.amazon.com/bedrock/agentcore/
- Framework support: LangGraph, Strands Agents, CrewAI, or any Python code
- Model agnostic: Bedrock models, Anthropic Claude direct, Google Gemini, OpenAI
- Billing model: per-second, consumption-based
- Session isolation: dedicated microVM per session
Why AgentCore Runtime Exists
Traditional serverless (Lambda) doesn't fit agents:
- Lambda max execution: 15 minutes — many agent tasks run longer
- Lambda cold starts interrupt conversational sessions
- Lambda doesn't guarantee in-memory state across invocations, so a multi-turn agent has to externalise its context between calls
EC2/ECS/Fargate fits agents but is wasteful:
- Agents spend 30–70% of session time in I/O wait (LLM calls, tool calls, DB queries)
- You pay for provisioned compute for the whole session, including the time the agent sits idle waiting on I/O
AgentCore Runtime solves this: you only pay for CPU cycles actually consumed. During LLM/tool I/O wait, no CPU charge accrues.
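The gap between wall-clock time and CPU time is easy to see locally. The sketch below is only an illustration, not a measurement of AgentCore itself: `simulated_agent_turn` is a hypothetical stand-in in which `time.sleep` plays the role of an LLM or tool call and a short loop plays the role of local processing.

```python
import time

def measure(task):
    """Return (wall_seconds, cpu_seconds) spent running task()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu

def simulated_agent_turn():
    # Stand-in for an LLM or tool call: the process sleeps and burns no CPU.
    time.sleep(0.3)
    # Stand-in for local work such as parsing a response.
    sum(i * i for i in range(50_000))

wall, cpu = measure(simulated_agent_turn)
wait_fraction = 1 - cpu / wall
print(f"wall={wall:.3f}s cpu={cpu:.3f}s waiting={wait_fraction:.0%}")
```

On an always-on host the whole wall-clock interval is paid for; under consumption-based CPU billing only the `cpu` portion would count toward the CPU line.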
Pricing
| Resource | Price |
|---|---|
| CPU | $0.0895 per vCPU-hour |
| Memory | $0.00945 per GB-hour |
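Billing is in per-second increments with a 1-second minimum. Only CPU actually consumed and peak memory are charged.
Example: a 10-minute agent session that spends 65% of its time waiting on LLM responses:
- Effective CPU billing: ~3.5 minutes of actual compute (35% of 10 minutes)
- Memory is still billed across the session, so the saving applies to the CPU line only
- Fargate, by contrast, bills its provisioned vCPU for the full 10 minutes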
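The arithmetic behind the example can be worked through with the two rates from the pricing table. This is a rough estimate under stated assumptions: the session size (1 vCPU, 2 GB) is hypothetical, CPU is assumed to be billed only for the non-waiting share of the session, and memory is assumed to stay at a constant level for the full duration.

```python
CPU_PRICE_PER_VCPU_HOUR = 0.0895    # from the pricing table
MEMORY_PRICE_PER_GB_HOUR = 0.00945  # from the pricing table

def session_cost(wall_seconds, wait_fraction, vcpus, memory_gb):
    """Estimate one session's (cpu_cost, memory_cost) in USD.

    Assumes CPU is billed only for the non-waiting share of the session and
    memory is billed at a constant level for the whole session.
    """
    active_cpu_seconds = wall_seconds * (1 - wait_fraction)
    cpu_cost = active_cpu_seconds / 3600 * vcpus * CPU_PRICE_PER_VCPU_HOUR
    memory_cost = wall_seconds / 3600 * memory_gb * MEMORY_PRICE_PER_GB_HOUR
    return cpu_cost, memory_cost

# 10-minute session, 65% of it spent waiting, hypothetical 1 vCPU / 2 GB footprint.
cpu_cost, memory_cost = session_cost(wall_seconds=600, wait_fraction=0.65, vcpus=1, memory_gb=2)
print(f"CPU ${cpu_cost:.5f} + memory ${memory_cost:.5f} = ${cpu_cost + memory_cost:.5f}")
```

Under these assumptions the session costs a little under one cent, with memory making up more than a third of the total, which is why the memory footprint matters when comparing against always-on options.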
Session Lifecycle and Isolation
Each session runs in a dedicated microVM:
Boot → Initialize agent code → Process requests → (idle during I/O) → Terminate
Isolation guarantees:
- Separate CPU, memory, and filesystem per session
- After session end: microVM terminated, memory sanitised
- No cross-session contamination — each user's agent runs in its own VM
This makes AgentCore suitable for multi-tenant AI agents: each session gets its own microVM, so one user's state is never visible to another's.
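As a mental model of these guarantees, the toy router below keeps one private sandbox per session ID and wipes it on termination. It is a conceptual sketch only, not how AgentCore is implemented: `Sandbox` and `SessionRouter` are hypothetical names, and it assumes requests carrying the same session ID reach the same sandbox, which the Open Questions section lists as unconfirmed.

```python
from dataclasses import dataclass, field

@dataclass
class Sandbox:
    """Stand-in for one microVM: private state that dies with the session."""
    session_id: str
    memory: dict = field(default_factory=dict)

class SessionRouter:
    def __init__(self):
        self._sandboxes = {}

    def invoke(self, session_id, key, value=None):
        # The first request for a session "boots" a fresh sandbox; later ones reuse it.
        sandbox = self._sandboxes.setdefault(session_id, Sandbox(session_id))
        if value is not None:
            sandbox.memory[key] = value
        return sandbox.memory.get(key)

    def terminate(self, session_id):
        # Ending the session drops the sandbox and wipes its state.
        sandbox = self._sandboxes.pop(session_id, None)
        if sandbox is not None:
            sandbox.memory.clear()

router = SessionRouter()
router.invoke("alice-session", "secret", "a1")
router.invoke("bob-session", "secret", "b2")
assert router.invoke("alice-session", "secret") == "a1"  # state persists within a session
assert router.invoke("bob-session", "secret") == "b2"    # and is invisible to other sessions
router.terminate("alice-session")
assert router.invoke("alice-session", "secret") is None  # a new sandbox starts empty
```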
Deployment Methods
1. Container-based (for production)
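# Build your agent as a Docker image, tag it for ECR, and push it
docker build -t my-agent .
docker tag my-agent:latest <account>.dkr.ecr.us-east-1.amazonaws.com/my-agent:latest
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/my-agent:latest
# Deploy to AgentCore Runtime (control-plane CLI; <execution-role-arn> is a placeholder)
aws bedrock-agentcore-control create-agent-runtime \
  --agent-runtime-name my_agent \
  --agent-runtime-artifact containerConfiguration={containerUri=<ecr-uri>} \
  --role-arn <execution-role-arn> \
  --network-configuration networkMode=PUBLIC
2. Direct code upload (added November 2025)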
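Zip your Python code and upload it to S3, with no Docker required:
zip -r agent.zip . -x "*.pyc" "__pycache__/*"
aws s3 cp agent.zip s3://my-bucket/agent.zip
# Artifact field names below are an assumption based on the CreateAgentRuntime API; main.py is a hypothetical entry point
aws bedrock-agentcore-control create-agent-runtime \
  --agent-runtime-name my_agent \
  --agent-runtime-artifact '{"codeConfiguration":{"code":{"s3":{"bucket":"my-bucket","prefix":"agent.zip"}},"runtime":"PYTHON_3_13","entryPoint":["main.py"]}}' \
  --role-arn <execution-role-arn> \
  --network-configuration networkMode=PUBLIC
Best for rapid prototyping and iteration.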
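The packaging step can also be scripted in Python, which is convenient in CI jobs where the `zip` binary may be missing. The helper below is a sketch using only the standard library; `build_archive` is a hypothetical name, and it applies the same exclusions as the `zip` command (compiled bytecode and `__pycache__` directories).

```python
import zipfile
from pathlib import Path

def build_archive(source_dir, archive_path):
    """Zip source_dir into archive_path, skipping bytecode and __pycache__.

    Returns the list of archive member names that were written.
    """
    source = Path(source_dir)
    target = Path(archive_path).resolve()
    names = []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories and the archive itself if it lives inside source_dir.
            if not path.is_file() or path.resolve() == target:
                continue
            if path.suffix == ".pyc" or "__pycache__" in path.parts:
                continue
            arcname = path.relative_to(source).as_posix()
            archive.write(path, arcname)
            names.append(arcname)
    return names
```

Calling `build_archive(".", "agent.zip")` from the project root produces an archive with paths relative to that root, ready to upload to S3.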
Framework Integration: LangGraph
from langgraph.graph import StateGraph, END
from typing import TypedDict
class AgentState(TypedDict):
    messages: list
def build_graph():
    graph = StateGraph(AgentState)
    # ... your LangGraph agent (add nodes and edges to `graph` here)
    return graph.compile()
# AgentCore Runtime invokes this as your agent endpoint
# Configure via runtime environment variables
agent = build_graph()
AgentCore Runtime wraps your agent in an HTTP server and manages session routing.
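To picture what that HTTP wrapping involves, here is a minimal standard-library sketch of a server that fronts an agent. Everything specific in it is an assumption for illustration and should be checked against the AgentCore Runtime service contract: the `/ping` health path, the `/invocations` request path, port 8080, the `prompt` payload key, and the `run_agent` stand-in.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(prompt):
    # Hypothetical stand-in for your compiled agent (e.g. a LangGraph graph's invoke()).
    return f"echo: {prompt}"

def handle_invocation(body: bytes) -> bytes:
    """Decode a JSON request, call the agent, and encode a JSON response."""
    payload = json.loads(body or b"{}")
    result = run_agent(payload.get("prompt", ""))
    return json.dumps({"result": result}).encode("utf-8")

class AgentHandler(BaseHTTPRequestHandler):
    def _send(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check path (assumed).
        if self.path == "/ping":
            self._send(200, b'{"status": "healthy"}')
        else:
            self._send(404, b"{}")

    def do_POST(self):
        # Invocation path (assumed).
        if self.path != "/invocations":
            self._send(404, b"{}")
            return
        length = int(self.headers.get("Content-Length", 0))
        self._send(200, handle_invocation(self.rfile.read(length)))

def serve(port=8080):
    """Start the server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), AgentHandler).serve_forever()
```

Keeping `handle_invocation` separate from the handler class means the request logic can be unit-tested without opening a socket; call `serve()` to run it locally.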
Framework Integration: Strands Agents
from strands import Agent
from strands.models import BedrockModel
agent = Agent(
    # Example Bedrock inference-profile ID; substitute a model enabled in your account
    model=BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0"),
    system_prompt="You are a DevOps assistant.",
)
# Strands agents deploy natively to AgentCore Runtime via the Strands SDK
See agents/strands-agents-sdk: Strands is AWS's own agent SDK and is designed to deploy to AgentCore Runtime.
MCP Server Hosting
AgentCore Runtime can host MCP servers as well as agents. Deploy an MCP server alongside an agent, or make MCP tools available to multiple agents:
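# Deploy an MCP server to AgentCore
# Artifact field names are an assumption based on the CreateAgentRuntime API; server.py is a hypothetical entry point
aws bedrock-agentcore-control create-agent-runtime \
  --agent-runtime-name my_mcp_server \
  --agent-runtime-artifact '{"codeConfiguration":{"code":{"s3":{"bucket":"my-bucket","prefix":"mcp-server.zip"}},"runtime":"PYTHON_3_13","entryPoint":["server.py"]}}' \
  --role-arn <execution-role-arn> \
  --network-configuration networkMode=PUBLIC \
  --protocol-configuration serverProtocol=MCP
Enterprise Features (GA)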
All available at GA:
- VPC integration: deploy agents inside your VPC for private networking
- AWS PrivateLink: connect to AgentCore endpoints without public internet
- CloudFormation: provision and manage runtimes as IaC
- Resource tagging: cost allocation, access control
vs Alternatives
| Option | Best for | Weakness |
|---|---|---|
| AgentCore Runtime | Long-running agents on AWS, multi-tenant | AWS vendor lock-in |
| Lambda | Short async tasks (<15 min) | No session state, time limit |
| Fargate | Always-on agents, custom networking | Pay for idle compute |
| Modal | GPU-heavy agents, Python-first | Smaller ecosystem |
| Cloud Run | GCP-native agents | No AWS ecosystem integration |
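The table can be read as a small decision procedure. The function below is a rough heuristic that only encodes the rows above; `suggest_platform` and its parameters are hypothetical, and the ordering of the checks is a judgment call rather than official guidance.

```python
def suggest_platform(max_minutes, needs_session_state=False, always_on=False,
                     needs_gpu=False, cloud="aws"):
    """Map workload traits to a row of the comparison table (heuristic only)."""
    if cloud == "gcp":
        return "Cloud Run"          # GCP-native agents
    if needs_gpu:
        return "Modal"              # GPU-heavy, Python-first
    if always_on:
        return "Fargate"            # always-on, accepts paying for idle compute
    if max_minutes < 15 and not needs_session_state:
        return "Lambda"             # short, stateless tasks under the time limit
    return "AgentCore Runtime"      # long-running or stateful sessions on AWS

print(suggest_platform(max_minutes=45, needs_session_state=True))
```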
Connections
- agents/strands-agents-sdk — AWS's agent SDK; designed to deploy natively to AgentCore Runtime
- cloud/aws-core — EC2, Lambda, ECS context for understanding when AgentCore Runtime wins
- agents/langgraph — LangGraph agents can be hosted on AgentCore Runtime
- protocols/mcp — AgentCore Runtime can host MCP servers alongside agents
- infra/deployment — deployment options comparison for AI systems
- cloud/aws-lambda-patterns — Lambda is the alternative for short-running tasks
Open Questions
- How does AgentCore Runtime handle agent checkpointing for very long sessions (>1 hour)?
- What is the microVM boot latency at session start — does it add cold-start overhead?
- Can AgentCore Runtime session affinity route the same user to the same microVM across multiple turns?