Every Python developer has faced N+1 queries. You add eager loading everywhere, use SQL profilers religiously, and still find yourself playing whack-a-mole with performance issues. There's a better way.
The Problem with Implicit Loading
SQLAlchemy's lazy loading is convenient. Too convenient. When you write:
orders = session.query(Order).all()
for order in orders:
print(order.customer.name) # N+1! You get 1 query for orders, plus N queries for customers. The fix seems obvious:
orders = session.query(Order).options(joinedload(Order.customer)).all() Problem solved? Not really. You've just moved the problem. Now your route handler "knows" about database optimization. And when requirements change, you forget to update the eager loading. Or you over-fetch data you don't need.
The Repository Solution
With the Repository pattern, data fetching becomes explicit. You define exactly what you need, where you need it:
class OrderRepositoryInterface(Protocol):
def find_all_with_customers(self) -> list[Order]: ...
def find_by_id(self, id: OrderId) -> Order | None: ...
def find_by_id_with_items(self, id: OrderId) -> Order | None: ... Each method has a clear purpose. No ambiguity about what data will be loaded:
class SQLAlchemyOrderRepository:
def __init__(self, session: Session):
self.session = session
def find_all_with_customers(self) -> list[Order]:
models = self.session.query(OrderModel).options(
joinedload(OrderModel.customer)
).all()
return [self._to_domain(m) for m in models]
def find_by_id_with_items(self, id: OrderId) -> Order | None:
model = self.session.query(OrderModel).options(
joinedload(OrderModel.items)
).filter_by(id=id.value).first()
return self._to_domain(model) if model else None Why This Works Better
- Explicit contracts - The interface documents exactly what data loading options exist
- Single responsibility - Optimization logic lives in the repository, not scattered across routes
- Easy to test - Mock the repository interface, not SQLAlchemy internals
- AI-friendly - Clear method names help AI assistants understand intent
The CQRS Optimization
For read-heavy operations, combine this with dedicated Query objects:
@dataclass(frozen=True)
class GetOrdersWithCustomerQuery:
customer_id: CustomerId | None = None
date_range: DateRange | None = None
class GetOrdersWithCustomerHandler:
def __init__(self, session: Session):
self.session = session
def handle(self, query: GetOrdersWithCustomerQuery) -> list[dict]:
stmt = select(
Order.id,
Order.total,
Customer.name.label("customer_name"),
).join(Customer)
if query.customer_id:
stmt = stmt.where(Order.customer_id == query.customer_id.value)
return self.session.execute(stmt).mappings().all() Now your reads bypass the ORM entirely. No N+1 possible. No over-fetching. Just the exact data you need.
Key Takeaway
N+1 queries aren't a technical problem - they're a design problem. SQLAlchemy's implicit loading encourages implicit decisions about data fetching. The Repository pattern makes those decisions explicit, centralized, and testable.
Want to learn more?
This pattern is covered in detail in Chapter 12 (Repositories) and Chapter 14 (Queries and Read Models) of Pragmatic DDD with Python.
Get the book