IntermediateAnalytics Engineer

Fix an N+1 query in SQLAlchemy

Load 100 Order records and access each order's customer name in a loop, triggering the N+1 pattern. Measure the exact query count using SQLAlchemy event listeners. Fix it using a joined load, confirm the query count drops from 101 to 1, and understand when to use joinedload versus selectinload.

Why this matters

N+1 queries are responsible for a significant share of production database load and are invisible until they are not. An ORM hides the SQL, which makes this pattern easy to write and hard to spot in code review. Understanding how to detect and fix it using the ORM's eager loading options is a baseline skill for any backend engineer working with a relational database.

Before you start

Step-by-step guide

  1. 1

    Create the models

    Define Customer (id, name) and Order (id, customer_id FK, total) models. Add the relationship on Order: customer = relationship('Customer', lazy='select'). This lazy='select' is the default and the source of the N+1. Create the tables and seed 10 customers with 100 orders spread across them.

  2. 2

    Instrument the session to count queries

    Add an event listener: from sqlalchemy import event; event.listen(engine, 'before_cursor_execute', lambda *args: counter.increment()). A simple class with an integer attribute is sufficient for the counter. Reset the counter before each measurement.

  3. 3

    Trigger and count the N+1

    Load all orders: orders = session.query(Order).all(). Then loop: for o in orders: print(o.customer.name). Reset and increment the counter inside the event listener. After the loop, print the count; it should be 101: 1 query for orders, 1 query per order for the customer.

  4. 4

    Fix with joinedload

    Change the query: session.query(Order).options(joinedload(Order.customer)).all(). Reset the counter and run the same loop. The count should be 1: a single query using a JOIN. Print the SQL using echo=True on the engine to confirm the JOIN is present.

  5. 5

    Compare joinedload vs selectinload

    Repeat the measurement with selectinload instead of joinedload. The count will be 2 (separate SELECT with IN clause) rather than 1. Run both against 100 rows and against 10,000 rows and measure elapsed time. selectinload typically wins at scale when the related table is large; joinedload wins for small datasets.

Relevant Axiom pages

What to do next

Back to Practice Lab