Build real skills.
Hands-on exercises per role path. Each one is a concrete task you can complete in a sitting: not reading, not watching, building.
AI Engineer
Prompt engineering, RAG, agents, evals, and production operations.
Build a RAG pipeline from scratch
Chunk a PDF, embed it, store it in Chroma, and answer questions against it with Claude.
Write an LLM-as-judge eval
Build a 10-case golden set and score faithfulness using Claude as the judge.
Build a LangGraph agent with memory
A stateful agent that plans, searches the web, and remembers what it has found.
Implement and measure prompt caching
Add cache_control blocks to a long system prompt and measure real token savings.
Trace an LLM call end-to-end in Langfuse
Instrument a chatbot with Langfuse spans and identify the highest-cost query.
Wire up Claude tool use with multiple tools
Define 3 tools in JSON Schema, handle tool_use blocks, and chain a multi-step tool call sequence.
Extract structured data reliably with Pydantic and Claude
Define a Pydantic model, prompt Claude to return matching JSON, and validate 20 real-world inputs.
Build a document understanding pipeline with Claude vision
Send PDF pages as images to Claude, extract structured data, and benchmark accuracy against ground truth.
Fine-tune a small model with QLoRA
Prepare a dataset, run a QLoRA fine-tune on a 7B model, and measure task performance before and after.
Software Engineer
Clean code, design patterns, system design, and production-grade implementation.
Refactor a God class using SOLID
Split a 50-line class doing HTTP, database, and email into three single-responsibility classes.
Implement a circuit breaker decorator
A Python decorator that opens the circuit after 3 failures, half-opens it after 30 seconds, and closes it on success.
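One possible shape for this exercise is sketched below, using only the standard library. The names `circuit_breaker`, `CircuitOpenError`, `max_failures`, `reset_after`, and the injectable `clock` parameter are assumptions made for illustration, and the sketch is not thread-safe.

```python
import time
from functools import wraps


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


def circuit_breaker(max_failures=3, reset_after=30.0, clock=time.monotonic):
    """Decorator factory; `clock` is injectable so tests need not wait 30 seconds."""

    def decorator(func):
        failures = 0      # failures since the last success
        opened_at = None  # clock reading when the circuit opened, None while closed

        @wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal failures, opened_at
            if opened_at is not None and clock() - opened_at < reset_after:
                # Open: fail fast without calling the wrapped function.
                raise CircuitOpenError(f"circuit open for {func.__name__}")
            # Closed, or half-open because the timeout elapsed: let the call through.
            try:
                result = func(*args, **kwargs)
            except Exception:
                failures += 1
                if opened_at is not None or failures >= max_failures:
                    # A failed half-open trial, or too many failures: (re)open.
                    opened_at = clock()
                raise
            # Success closes the circuit and resets the failure count.
            failures = 0
            opened_at = None
            return result

        return wrapper

    return decorator
```

Passing a fake clock (for example `clock=lambda: now[0]`) lets a test move time forward past the 30-second window instead of sleeping.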
TDD a log line parser
Build a structured log parser using strict red-green-refactor; all tests written first.
Design a multi-tenant Postgres schema
Row-level security policies that isolate tenant data; verified with two test tenants.
Find and fix an N+1 query
Trigger 100 extra queries with lazy loading, then fix it with selectinload.
Build a concurrent API client with asyncio and httpx
Fetch 50 URLs concurrently, cap with a semaphore, and measure the speedup vs sequential.
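The concurrency cap can be sketched without touching the network. In the sketch below, `fake_fetch` is a hypothetical stand-in that sleeps instead of making an httpx request, and the limit of 10 is an arbitrary assumption; in the real exercise the body of `fake_fetch` would be replaced by an HTTP call.

```python
import asyncio
import time


async def fake_fetch(url, delay=0.02):
    """Hypothetical stand-in for an HTTP GET; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    return f"body of {url}"


async def fetch_all(urls, limit=10):
    """Fetch every URL concurrently, with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def bounded(url):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` fetches are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_fetch(url)
            finally:
                in_flight -= 1

    bodies = await asyncio.gather(*(bounded(u) for u in urls))
    return bodies, peak


async def fetch_sequential(urls):
    """Baseline: one request at a time."""
    return [await fake_fetch(u) for u in urls]


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(50)]
    _, sequential = timed(fetch_sequential(urls))
    (_, peak), concurrent = timed(fetch_all(urls, limit=10))
    print(f"sequential {sequential:.2f}s, concurrent {concurrent:.2f}s, peak in flight {peak}")
```

Tracking `peak` is one way to confirm the semaphore is actually enforcing the cap rather than trusting the timing alone.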
Implement cache-aside with Redis
Add a Redis cache to a slow function, measure hit rate and latency, and handle cache invalidation.
Build an append-only event store with projection and replay
Implement event sourcing for a bank account domain, replay history from scratch, and query point-in-time state.
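A minimal in-memory sketch of the idea follows. The `Event` fields, the two event kinds, and the `up_to` parameter are assumptions chosen for illustration; a real store would persist events and likely carry timestamps.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Event:
    sequence: int    # position in the log, starting at 1
    account_id: str
    kind: str        # "deposited" or "withdrawn"
    amount: Decimal


class EventStore:
    """Append-only log: events can be added and read, never changed or removed."""

    def __init__(self):
        self._events = []

    def append(self, account_id, kind, amount):
        event = Event(len(self._events) + 1, account_id, kind, Decimal(amount))
        self._events.append(event)
        return event

    def read(self, account_id, up_to=None):
        """Yield an account's events in order, optionally stopping at sequence `up_to`."""
        for event in self._events:
            if up_to is not None and event.sequence > up_to:
                break
            if event.account_id == account_id:
                yield event


def project_balance(events):
    """Projection: fold events into a balance, starting from zero."""
    balance = Decimal("0")
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


store = EventStore()
store.append("acc-1", "deposited", "100")
store.append("acc-1", "withdrawn", "30")
store.append("acc-1", "deposited", "5")

current = project_balance(store.read("acc-1"))            # replay everything: 75
earlier = project_balance(store.read("acc-1", up_to=2))   # state after event 2: 70
```

Because the balance is never stored, replaying from scratch and querying point-in-time state are the same operation with a different cut-off.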
Cloud Engineer
AWS, containers, Kubernetes, infrastructure as code, and cost engineering.
Containerise and deploy a FastAPI app
Multi-stage Dockerfile, health check endpoint, non-root user; deployed to ECS Fargate.
Provision infrastructure with Terraform
S3 bucket with versioning, CloudFront distribution, and an origin access identity (OAI); destroyed and recreated cleanly.
Build a CI/CD pipeline with GitHub Actions
Lint, test, build Docker image, push to ECR, deploy to ECS; failing fast on test failure.
Set up billing alerts and cost reporting
CloudWatch billing alarm at $100/month plus a boto3 script showing spend by service.
Configure Kubernetes autoscaling under load
HPA targeting 60% CPU, 200 VU load test with k6; verify pods scale up and back down.
Write and test least-privilege IAM policies
Scope an IAM role to exactly the permissions a Lambda needs, then verify deny rules with the IAM simulator.
Deploy a serverless API with Lambda and API Gateway
Write a Lambda handler, expose it via API Gateway HTTP API, and deploy the whole stack with Terraform.
Build a production CloudWatch dashboard for a running service
Emit custom metrics, set alarms with anomaly detection, and build a Terraform-managed dashboard.
QA Engineer
Test strategy, exploratory testing, risk-based thinking, and quality in modern delivery.
Write and execute test charters
Three exploratory charters for a login page, one executed for 45 minutes, bugs filed.
Build a risk matrix for a checkout flow
Top 5 risks scored by likelihood × impact, 3 test cases per risk, severity justified.
Design boundary value test cases
Complete boundary and equivalence class cases for a text field and a number field.
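For the number field, the boundary cases can be generated mechanically. The sketch below assumes a hypothetical integer field with an inclusive valid range; the function name and the 1 to 100 range are illustrative only, and equivalence classes for the text field are not covered.

```python
def boundary_values(minimum, maximum):
    """Return (value, should_be_accepted) pairs for an integer field valid on minimum..maximum."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    candidates = [
        minimum - 1, minimum, minimum + 1,   # around the lower boundary
        maximum - 1, maximum, maximum + 1,   # around the upper boundary
    ]
    # sorted(set(...)) removes duplicates that appear when the range is narrow
    return [(value, minimum <= value <= maximum) for value in sorted(set(candidates))]


cases = boundary_values(1, 100)
# [(0, False), (1, True), (2, True), (99, True), (100, True), (101, False)]
```

Each pair can feed a parametrised test directly: the value is the input, and the boolean is the expected accept or reject decision.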
File a reproducible bug report
Find a real bug through exploratory testing and write a complete, reproducible report.
Audit test coverage against user flows
Map a feature's test suite against user flows, find 3 gaps, estimate business risk.
Audit a web page for accessibility issues
Run a WCAG 2.1 audit using axe, keyboard navigation, and a screen reader, then write remediation notes.
Exploratory test a REST API with boundary thinking
Map an API surface, write a session charter, and find at least 3 bugs using structured exploration.
Design a regression testing strategy for a feature
Map user flows, prioritise by risk, define automation boundaries, and estimate CI cost.
SDET
API testing, performance, test architecture, and distributed systems debugging.
Test a streaming LLM endpoint with Playwright
Capture SSE chunks as they arrive, reconstruct the response, assert content and format.
Build an isolated database fixture chain
pytest fixtures that spin up Postgres, run migrations, seed 5 rows, and roll back after each test.
Write and run a k6 load test
Ramp from 50 to 200 VUs over 2 minutes, then assert p95 latency < 300 ms and error rate < 0.1%.
Implement consumer-driven contract testing
Pact contract between a Python consumer and FastAPI provider; CI breaks when provider violates contract.
Debug a flaky Playwright test
Identify the exact race condition using the Playwright trace viewer and fix it without adding any sleep calls.
Set up visual regression testing with Playwright screenshots
Capture baseline screenshots, detect pixel diffs on UI changes, and wire the suite into CI.
Automate accessibility audits with axe-core and Playwright
Inject axe into running pages, assert zero critical violations, and add the check to your test suite.
Mock external APIs in tests with respx and Playwright route()
Isolate Python tests with respx and browser tests with page.route(); cover error and timeout cases.
Analytics Engineer
SQL, schema design, data tools, and the query performance skills production demands.
Write a window function ranking query
Rank customers by spend per category with RANK() OVER, then find who dropped out of the top 10.
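The ranking half of this exercise can be tried in Python's bundled SQLite, assuming a build with window function support (SQLite 3.25 or later). The `orders` table, its columns, and the sample rows are made up for illustration; the "dropped out of the top 10" step needs two time periods and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "books", 120.0),
        ("ana", "books", 30.0),
        ("ben", "books", 150.0),
        ("cy", "books", 40.0),
        ("ana", "games", 10.0),
        ("ben", "games", 60.0),
    ],
)

RANKING_SQL = """
WITH spend AS (
    SELECT customer, category, SUM(amount) AS total
    FROM orders
    GROUP BY customer, category
)
SELECT category, customer, total,
       RANK() OVER (PARTITION BY category ORDER BY total DESC) AS spend_rank
FROM spend
ORDER BY category, spend_rank, customer
"""

rows = conn.execute(RANKING_SQL).fetchall()
for row in rows:
    print(row)
# ('books', 'ana', 150.0, 1)
# ('books', 'ben', 150.0, 1)
# ('books', 'cy', 40.0, 3)
# ('games', 'ben', 60.0, 1)
# ('games', 'ana', 10.0, 2)
```

Note the gap after the tie in `books`: `RANK()` skips to 3, whereas `DENSE_RANK()` would assign 2.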
Design a star schema
fact_orders, dim_customer, dim_product, dim_date; with indexes and a clear grain decision.
Analyse a large CSV with DuckDB
Query a 500MB CSV without loading it into memory; top categories, peak month, busiest hour.
Fix an N+1 query in SQLAlchemy
Trigger 100 customer queries from lazy loading, fix with joinedload, verify the count drops to 1.
Build a dbt revenue model with tests
Transform raw orders into daily_revenue with a 7-day rolling average and schema tests.
Profile a dataset for quality issues with DuckDB
Measure nulls, duplicates, outliers, and cardinality across a 50k-row CSV in under 50 lines of SQL.
Implement Type 2 slowly changing dimensions in SQL
Track customer attribute history with effective dates, expire old rows, and query point-in-time state.
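The expire-then-insert pattern can be sketched against an in-memory SQLite database. The `dim_customer` columns, the `9999-12-31` open-end sentinel, the helper names, and the sample data are all assumptions for illustration; dates are ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

OPEN_END = "9999-12-31"  # sentinel meaning "still current"

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE dim_customer (
           customer_id INTEGER, city TEXT,
           valid_from TEXT, valid_to TEXT, is_current INTEGER)"""
)


def apply_change(conn, customer_id, city, change_date):
    """Expire the current row if the attribute changed, then insert a new version."""
    current = conn.execute(
        "SELECT city FROM dim_customer WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if current is not None and current[0] == city:
        return  # nothing changed, keep the existing version
    conn.execute(
        "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (change_date, customer_id),
    )
    conn.execute(
        "INSERT INTO dim_customer VALUES (?, ?, ?, ?, 1)",
        (customer_id, city, change_date, OPEN_END),
    )


def city_as_of(conn, customer_id, as_of):
    """Point-in-time lookup; validity is the half-open interval [valid_from, valid_to)."""
    row = conn.execute(
        "SELECT city FROM dim_customer "
        "WHERE customer_id = ? AND valid_from <= ? AND ? < valid_to",
        (customer_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


apply_change(conn, 1, "Leeds", "2023-01-01")
apply_change(conn, 1, "Bristol", "2024-06-01")

print(city_as_of(conn, 1, "2023-12-31"))  # Leeds
print(city_as_of(conn, 1, "2024-06-01"))  # Bristol
```

Using a half-open interval means the expiry date of the old row and the start date of the new row can be the same value without any date matching both versions.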
Write comprehensive dbt tests including custom macros
Cover your dbt models with singular tests, a custom generic macro, source freshness, and CI.