Getting Started with AI Engineering
First working Anthropic API call in under 10 minutes — API key setup, SDK install, stateless multi-turn conversation pattern, streaming, and the nine most common beginner mistakes.
Your first working LLM integration in under 10 minutes. This page assumes you know Python. Everything else is explained from scratch.
1. Get an API Key
- Go to console.anthropic.com
- Create an account, verify email
- Navigate to API Keys → Create Key
- Copy it — you won't see it again
Store it as an environment variable, never in code:
```bash
# .env
ANTHROPIC_API_KEY=sk-ant-...
```

```python
# Load it in Python
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.environ["ANTHROPIC_API_KEY"]  # KeyError if missing — intentional
```

Add .env to .gitignore before your first commit. Leaked API keys are a real cost risk.
2. Install the SDK
```bash
pip install anthropic python-dotenv

# or with uv (faster):
uv add anthropic python-dotenv
```

3. Your First API Call
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from environment

response = client.messages.create(
    model="claude-haiku-4-5-20251001",  # cheapest model — right for experiments
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
)
print(response.content[0].text)  # Paris
```

Run it. If you see "Paris", you're connected.
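If you get an exception instead, it's almost always an auth problem. A minimal sketch using the SDK's exception classes:

```python
import anthropic

client = anthropic.Anthropic()
try:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(response.content[0].text)
except anthropic.AuthenticationError:
    # Key missing or invalid: check ANTHROPIC_API_KEY and that load_dotenv() ran
    print("Auth failed: check ANTHROPIC_API_KEY")
except anthropic.APIStatusError as e:
    # Any other non-2xx response from the API
    print(f"API error {e.status_code}: {e}")
```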
4. Understanding the Response
```python
print(response.id)               # "msg_01XFDUDYJgAACzvnptvVoYEL" — unique message ID
print(response.model)            # "claude-haiku-4-5-20251001"
print(response.stop_reason)      # "end_turn" — model finished naturally
print(response.content)          # list of content blocks
print(response.content[0].type)  # "text"
print(response.content[0].text)  # the actual answer

# Token usage — this is what you pay for
print(response.usage.input_tokens)   # tokens you sent
print(response.usage.output_tokens)  # tokens the model generated
```

The `content` field is a list because the model can return multiple blocks (text plus tool calls). For simple chat, `response.content[0].text` is always what you want.
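Since `usage` is what you pay for, it's worth computing cost per call from the start. A minimal sketch: the per-token prices below are placeholder values, not real pricing, so look up current rates for your model.

```python
# Hypothetical prices in USD per million tokens; substitute current pricing.
INPUT_PRICE_PER_MTOK = 1.00
OUTPUT_PRICE_PER_MTOK = 5.00

def estimate_cost(usage) -> float:
    """Rough USD cost of one call, computed from its usage block."""
    return (
        usage.input_tokens * INPUT_PRICE_PER_MTOK
        + usage.output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000

print(f"${estimate_cost(response.usage):.6f}")
```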
5. System Prompts
A system prompt sets the model's role and behaviour before the conversation starts. It's the most impactful thing you control.
```python
response = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=256,
    system="You are a concise assistant. Answer in one sentence maximum.",
    messages=[
        {"role": "user", "content": "Explain quantum computing."}
    ],
)
print(response.content[0].text)
# "Quantum computing uses quantum mechanical phenomena like superposition
# and entanglement to process information differently than classical computers."
```

Without the system prompt, the model gives a longer answer. The system prompt is not magic: it's an instruction the model follows. Write it like you'd brief a new colleague.
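For real applications the brief is usually longer. A hypothetical example for a support-bot use case (the product name, rules, and email address are made up):

```python
SYSTEM_PROMPT = """You are the support assistant for Acme Invoicing.
Rules:
- Answer only questions about invoicing, billing, and account settings.
- If you don't know, say so and point the user to support@example.com.
- Keep answers under three sentences."""

response = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=256,
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": "How do I change my billing address?"}],
)
```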
6. Multi-Turn Conversation
The API is stateless. To have a conversation, you pass the full history on every call:
```python
messages = []

def chat(user_message: str) -> str:
    messages.append({"role": "user", "content": user_message})
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=messages,
    )
    assistant_message = response.content[0].text
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message

print(chat("My name is Lewis."))
print(chat("What's my name?"))  # Correctly says "Lewis" — it has the history
```

The messages list grows with every turn. In production you'll need to trim it; see the sketch below and prompting/context-engineering.
7. Streaming
Without streaming, you wait for the full response before seeing anything. With streaming, text appears word by word. Much better UX for longer responses.
```python
with client.messages.stream(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{"role": "user", "content": "Write a haiku about Python."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()  # newline at end
```
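Streaming doesn't mean losing the metadata: if you also want the usage numbers, the SDK's stream helper exposes `get_final_message()`, which assembles the complete `Message` object once the stream ends:

```python
with client.messages.stream(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{"role": "user", "content": "Write a haiku about Python."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    final = stream.get_final_message()  # full Message object, including usage

print()
print(final.usage.output_tokens)
```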
8. Async (for Django / FastAPI)
If your backend is async (FastAPI, Django async views), use the async client:
```python
import asyncio
import anthropic

async def async_chat(query: str) -> str:
    client = anthropic.AsyncAnthropic()
    response = await client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": query}],
    )
    return response.content[0].text

# Test it
print(asyncio.run(async_chat("Hello")))
```
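In a real web backend, create the client once at module scope and reuse it across requests rather than building one per call. A hypothetical FastAPI endpoint sketch (the route and request model are made up):

```python
import anthropic
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
client = anthropic.AsyncAnthropic()  # one client, reused across requests

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat_endpoint(req: ChatRequest) -> dict:
    response = await client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": req.message}],
    )
    return {"reply": response.content[0].text}
```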
9. Common First Mistakes

| Mistake | Fix |
|---|---|
| `AuthenticationError` | Check `ANTHROPIC_API_KEY` is set in your environment, not just the .env file — run `load_dotenv()` first |
| `max_tokens` too low — response cuts off mid-sentence | Increase `max_tokens`. Set it to what the task actually needs, not 50 |
| Passing `"assistant"` as the first message role | The first message must be `"user"` |
| Alternating roles wrong | Messages must strictly alternate: user → assistant → user → assistant |
| Using `response.content` as a string | It's a list. Always `response.content[0].text` |
| Hardcoding the API key in code | Use environment variables. Always. |
| Using Opus for everything | Use Haiku for experiments and classification; Sonnet for production tasks |
| Not setting a system prompt | The model has no context about its role — always set one |
| `max_tokens=4096` everywhere | Output tokens cost 5x input tokens. Set it to what the task needs |
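The truncation mistake from the table is easy to catch programmatically: check `stop_reason` after every call.

```python
if response.stop_reason == "max_tokens":
    # The answer was cut off. Retry with a higher max_tokens,
    # or ask for a shorter answer in the prompt.
    print("Warning: response truncated")
```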
10. What's Next
You've made your first API call. The natural next steps, in order:
- Add a system prompt to shape the model's behaviour for your use case
- Build multi-turn chat with history management
- Add tool use — give the model access to functions your code defines
- Add RAG — let the model answer questions from your own documents
- Write tests with mocked API calls — see test-automation/testing-llm-apps
- Add evals — measure quality before shipping — see evals/methodology
See synthesis/learning-path for a structured progression through the whole stack.
Key Facts
- SDK install: `pip install anthropic python-dotenv` or `uv add anthropic python-dotenv`
- Always read the API key from the environment: `os.environ["ANTHROPIC_API_KEY"]` — never hardcode it
- First message role must be `"user"` — `"assistant"` as the first role causes an error
- Messages must strictly alternate: user → assistant → user → assistant
- `response.content` is a list of content blocks — always access text as `response.content[0].text`
- API is stateless: pass the full conversation history on every call
- Use Haiku for experiments and classification; Sonnet for production tasks; Opus only for complex reasoning
- Output tokens cost 5x input tokens — set `max_tokens` to what the task actually needs, not 4096
- `stop_reason: "end_turn"` means the model finished naturally; `"max_tokens"` means it was cut off
Connections
- apis/anthropic-api — full API reference: caching, batch, streaming, tool use
- llms/claude — which model to use for what task and why
- synthesis/cost-optimisation — how to keep costs low as you scale
- synthesis/learning-path — what to learn next and in what order
- prompting/techniques — system prompt design and XML structuring once the basics work
Open Questions
- Is `uv` universally better than `pip install` for new AI engineering projects, or are there cases where pip is still preferable?
- Should beginners start with Haiku even for quality-sensitive tasks just to validate the integration first?
- At what point does the `AsyncAnthropic` client become necessary vs optional for typical web backends?