Intermediate · Software Engineer

Build a concurrent API client with asyncio and httpx

Write an async Python script that fetches data from 50 endpoints concurrently using httpx and asyncio, applies a semaphore to cap concurrency at 10, handles retries with exponential backoff for 429 and 5xx responses, and measures total elapsed time against a sequential baseline.

Why this matters

Most I/O-bound Python code in production either blocks the event loop accidentally or spawns too many concurrent requests and gets rate-limited. Understanding asyncio's core primitives (coroutines, tasks, semaphores, and gather) is what separates Python engineers who can actually use the async ecosystem from those who copy-paste it and wonder why it is slow.

Step-by-step guide

  1. Write a synchronous baseline

    Use httpx.Client (the sync version) to fetch each URL in a for loop. Bracket the loop with time.perf_counter() calls and record total elapsed time for all 50 URLs. This is your benchmark to beat.

  2. Write the first async version

    Replace with httpx.AsyncClient and await each request inside a plain for loop. Wrap the top-level coroutine in asyncio.run(). Measure again. You will likely see similar performance; sequential async is not faster, it just does not block the thread. Understanding this distinction is the point.

  3. Use asyncio.gather for true concurrency

    Wrap each fetch in an async task, collect them in a list, and pass the list to asyncio.gather(). Measure again. You should see near-linear speedup up to the limit of your network. Print per-URL elapsed time to verify they overlap.

  4. Add a semaphore to cap concurrency

    Create asyncio.Semaphore(10). Wrap each fetch in async with sem:. Without a semaphore, 50 concurrent requests can trigger rate limits or overwhelm slow servers. Measure throughput at sem=5, 10, 20, 50 and find the sweet spot.

  5. Add retry with exponential backoff

    Wrap the fetch in a loop: call response.raise_for_status(), catch httpx.HTTPStatusError for 429 and 5xx responses, await asyncio.sleep(2 ** attempt + random.random()), and retry up to 3 times. Verify the retry fires by temporarily pointing at a URL that returns 500.

  6. Compare all three versions

    Print a summary table: sequential, sequential async, concurrent with semaphore. For 50 requests hitting a remote API, the concurrent version is typically 5-20x faster. Identify the bottleneck; it is almost always the semaphore size or server-side rate limits, not Python overhead.
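One way to print the summary table; the numbers in the example call are placeholders, so substitute the times your own runs produced.

```python
def print_summary(timings: dict[str, float]) -> None:
    """Print each version's elapsed time and its speedup over the first entry."""
    baseline = next(iter(timings.values()))
    print(f"{'version':<24}{'seconds':>10}{'speedup':>10}")
    for name, secs in timings.items():
        print(f"{name:<24}{secs:>10.2f}{baseline / secs:>9.1f}x")

if __name__ == "__main__":
    # Placeholder numbers; use the times you measured in steps 1-4.
    print_summary({
        "sequential": 25.0,
        "sequential async": 24.0,
        "concurrent (sem=10)": 2.8,
    })
```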
