Build a concurrent API client with asyncio and httpx
Write an async Python script that fetches data from 50 endpoints concurrently using httpx and asyncio, applies a semaphore to cap concurrency at 10, handles retries with exponential backoff for 429 and 5xx responses, and measures total elapsed time against a sequential baseline.
Why this matters
Most I/O-bound Python code in production either blocks the event loop accidentally or spawns too many concurrent requests and gets rate-limited. Understanding asyncio's building blocks (coroutines, tasks, semaphores, and gather) is what separates Python engineers who can actually use the async ecosystem from those who copy-paste it and wonder why it is slow.
Before you start
- Python 3.10+ with httpx installed (pip install httpx)
- Basic Python (functions, loops, exception handling)
- No prior async experience required; you will build from first principles
Step-by-step guide
1. Write a synchronous baseline
Use httpx.Client (the sync version) to fetch each URL in a for loop. Bracket the loop with time.perf_counter() calls and record total elapsed time for 10 URLs. This is your benchmark to beat.
2. Write the first async version
Replace httpx.Client with httpx.AsyncClient, make the function async, and await each request inside the loop. Run it with asyncio.run(). Measure again. You will likely see similar performance; sequential async is not faster, it just does not block the thread. Understanding this distinction is the point.
3. Use asyncio.gather for true concurrency
Wrap each fetch in an async task, collect them in a list, and pass the list to asyncio.gather(). Measure again. You should see near-linear speedup up to the limit of your network. Print per-URL elapsed time to verify they overlap.
4. Add a semaphore to cap concurrency
Create asyncio.Semaphore(10). Wrap each fetch in async with sem:. Without a semaphore, 50 concurrent requests can trigger rate limits or overwhelm slow servers. Measure throughput at sem=5, 10, 20, 50 and find the sweet spot.
5. Add retry with exponential backoff
Wrap the fetch in a loop: call response.raise_for_status(), catch httpx.HTTPStatusError for 429 and 5xx responses, await asyncio.sleep(2 ** attempt + random.random()), and retry up to 3 times. Verify the retry fires by temporarily pointing at a URL that returns 500.
6. Compare all three versions
Print a summary table: sequential, sequential async, concurrent with semaphore. For 50 requests hitting a remote API, the concurrent version is typically 5-20x faster. Identify the bottleneck: it is almost always the semaphore size or server-side rate limits, not Python overhead.
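A small helper for the summary table; it treats the first row as the baseline and the row labels are up to you:

```python
def print_summary(rows: list[tuple[str, float]]) -> None:
    """Render a comparison table; the first row is treated as the baseline."""
    print(f"{'version':<28}{'elapsed (s)':>12}{'speedup':>10}")
    baseline = rows[0][1]
    for name, elapsed in rows:
        print(f"{name:<28}{elapsed:>12.2f}{baseline / elapsed:>9.1f}x")

# print_summary([
#     ("sequential",            42.3),
#     ("sequential async",      41.8),
#     ("concurrent (sem=10)",    4.7),
# ])
```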