N+1 Queries: How the Repository Pattern Solves Them

Every Python developer has faced N+1 queries. You add eager loading everywhere, use SQL profilers religiously, and still find yourself playing whack-a-mole with performance issues. There's a better way.

The Problem with Implicit Loading

SQLAlchemy's lazy loading is convenient. Too convenient. When you write:

orders = session.query(Order).all()
for order in orders:
    print(order.customer.name)  # N+1!

You get 1 query for orders, plus N queries for customers. The fix seems obvious:

orders = session.query(Order).options(joinedload(Order.customer)).all()

Problem solved? Not really. You've just moved the problem. Now your route handler "knows" about database optimization. And when requirements change, you forget to update the eager loading. Or you over-fetch data you don't need.

The Repository Solution

With the Repository pattern, data fetching becomes explicit. You define exactly what you need, where you need it:

class OrderRepositoryInterface(Protocol):
    def find_all_with_customers(self) -> list[Order]: ...
    def find_by_id(self, id: OrderId) -> Order | None: ...
    def find_by_id_with_items(self, id: OrderId) -> Order | None: ...

Each method has a clear purpose. No ambiguity about what data will be loaded:

class SQLAlchemyOrderRepository:
    def __init__(self, session: Session):
        self.session = session

    def find_all_with_customers(self) -> list[Order]:
        models = self.session.query(OrderModel).options(
            joinedload(OrderModel.customer)
        ).all()
        return [self._to_domain(m) for m in models]

    def find_by_id_with_items(self, id: OrderId) -> Order | None:
        model = self.session.query(OrderModel).options(
            joinedload(OrderModel.items)
        ).filter_by(id=id.value).first()

        return self._to_domain(model) if model else None

Why This Works Better

Explicit contracts - The interface documents exactly what data loading options exist
Single responsibility - Optimization logic lives in the repository, not scattered across routes
Easy to test - Mock the repository interface, not SQLAlchemy internals
AI-friendly - Clear method names help AI assistants understand intent

The CQRS Optimization

For read-heavy operations, combine this with dedicated Query objects:

@dataclass(frozen=True)
class GetOrdersWithCustomerQuery:
    customer_id: CustomerId | None = None
    date_range: DateRange | None = None

class GetOrdersWithCustomerHandler:
    def __init__(self, session: Session):
        self.session = session

    def handle(self, query: GetOrdersWithCustomerQuery) -> list[dict]:
        stmt = select(
            Order.id,
            Order.total,
            Customer.name.label("customer_name"),
        ).join(Customer)

        if query.customer_id:
            stmt = stmt.where(Order.customer_id == query.customer_id.value)

        return self.session.execute(stmt).mappings().all()

Now your reads bypass the ORM entirely. No N+1 possible. No over-fetching. Just the exact data you need.

Key Takeaway

N+1 queries aren't a technical problem - they're a design problem. SQLAlchemy's implicit loading encourages implicit decisions about data fetching. The Repository pattern makes those decisions explicit, centralized, and testable.

Want to learn more?

This pattern is covered in detail in Chapter 12 (Repositories) and Chapter 14 (Queries and Read Models) of Pragmatic DDD with Python.

Get the book