Find and fix an N+1 query
Set up SQLAlchemy models with a lazy-loaded relationship between Orders and Customers. Load 100 order records and access each order's customer name in a loop, deliberately triggering the N+1 pattern. Measure the query count using SQLAlchemy's event system, fix it with selectinload, and confirm the count drops from 101 to 1.
Why this matters
N+1 queries are responsible for a disproportionate share of production database load. They are invisible in development (where N is small), catastrophic in production (where N is 10,000), and trivially fixable once you know what to look for. The ability to spot one in a stack trace or a slow query log is one of the most commercially valuable debugging skills a backend engineer has.
Before you start
- Python with sqlalchemy and a database (SQLite is fine for this exercise)
- Basic understanding of what an ORM is and what a relationship() means
- Understanding of what a SQL JOIN is
- pip install sqlalchemy
Step-by-step guide
- 1
Define models with a lazy relationship
Create Customer (id, name) and Order (id, customer_id FK, amount) SQLAlchemy models. Define the relationship on Order as: customer = relationship('Customer', lazy='select'). This is the default and the source of the N+1.
- 2
Seed 100 orders across 10 customers
Insert 10 customers and 100 orders distributed across them. Use a session.bulk_insert_mappings for speed. Verify the data is there with a simple SELECT before moving to the measurement step.
- 3
Trigger and count the N+1
Add a SQLAlchemy event listener on engine 'before_cursor_execute' that increments a counter. Load all orders with session.query(Order).all() and then loop over them accessing order.customer.name. Print the query count; it should be 101: 1 for orders, 100 for customer lookups.
- 4
Fix with selectinload
Change the query to: session.query(Order).options(selectinload(Order.customer)).all(). Reset the counter and run the same loop. The count should be 2: 1 for orders, 1 for all customers via an IN clause. Print both queries to confirm this is what happened.
- 5
Understand when to use joinedload vs selectinload
Replace selectinload with joinedload and print the generated SQL. You will see a JOIN instead of a second SELECT. Run both versions on 100 and then 10,000 rows and note which is faster. selectinload wins when the related table is large; joinedload wins for simple one-to-one cases.