Python GraphQL Tutorial: Building APIs with Strawberry, Graphene, and Ariadne

GraphQL gives clients control over exactly which data they fetch, eliminating the over-fetching and under-fetching that plagues REST. Python has three mature, production-ready libraries for building GraphQL servers: Graphene, Strawberry, and Ariadne. Knowing which one fits your project, how to structure schemas and resolvers correctly, and how to avoid the performance traps that ambush developers in production is what separates a working prototype from a robust API.
This tutorial walks through how to build production-ready GraphQL APIs in Python using the three major libraries in the ecosystem: Strawberry, Graphene, and Ariadne. You will learn how to define schemas, implement resolvers, prevent the N+1 query problem with DataLoader, implement cursor pagination, and deploy GraphQL APIs using FastAPI.
GraphQL was developed internally at Facebook starting in 2012 by engineers Lee Byron, Dan Schafer, and Nick Schrock to solve data-fetching challenges in Facebook's mobile applications, including the rebuilt native iOS News Feed. As Schrock wrote on the Facebook Engineering Blog when the project was open-sourced in September 2015, GraphQL was invented during the shift from HTML5-driven mobile apps to purely native applications and had been powering the majority of data interactions in the Facebook iOS and Android apps for years. It has since grown into a standard tool for teams that need flexible data access across complex frontends, now governed by the GraphQL Foundation under the Linux Foundation since 2018. Unlike REST, which forces you to design endpoints around backend resources, GraphQL lets the client describe a precise shape for the response. That means a mobile client can fetch a condensed user profile while a dashboard fetches the same user plus their full activity history, all from a single endpoint with no additional backend code. Python's clean syntax, powerful type system, and deep library ecosystem make it an excellent host for GraphQL servers, and the community has spent the last decade building tools that make the integration feel natural.
Why GraphQL and Python Work Well Together
The case for GraphQL over REST comes down to precision and contract. In a REST architecture, a GET /users/42 endpoint returns whatever the backend developer decided to include, often far more than the caller needs. A mobile client requesting a username and avatar might receive a payload containing dozens of fields it discards. GraphQL inverts this: the client sends a query declaring exactly which fields it wants, and the server resolves only those fields. The result is smaller payloads and fewer round trips.
But the precision argument only scratches the surface of what makes this inversion consequential. Consider what happens as an application scales from one frontend to three: a web dashboard, a mobile app, and an internal admin tool. Under REST, the backend team either builds three sets of endpoints optimized for each consumer, or ships a single set of bloated endpoints and forces each client to discard what it does not need. The first option creates an explosion of endpoint permutations that compounds with every new feature. The second option shifts the cost from backend maintenance to network overhead and client-side filtering. GraphQL dissolves this tension entirely. The schema becomes the contract, and each client shapes its own payload within that contract. The organizational implication is as important as the technical one: frontend and backend teams can evolve independently as long as the schema remains the shared boundary, and that boundary is enforced at the type level rather than by convention.
Python earns its place in this ecosystem through several concrete advantages. The language's type annotation system, introduced formally in PEP 484 and expanded in subsequent releases, maps naturally onto GraphQL's strongly typed schema language. Libraries like Strawberry exploit this directly, using Python dataclass-style syntax to define types that are simultaneously valid Python and valid GraphQL schema. The result is a single source of truth for both runtime behavior and API contract, which reduces the maintenance burden that comes from keeping hand-written SDL in sync with resolver code.
This convergence between Python's type system and GraphQL's schema language has a second-order benefit that is easy to overlook: it makes the entire resolver layer statically analyzable. When your GraphQL types are standard Python dataclasses, tools like mypy, pyright, and your IDE's built-in type checker can trace data flow from the database layer through the resolver and into the serialized response. A field that changes from str to Optional[str] produces a type error at every call site before the code ever runs. In a traditional REST framework, that same change might not surface until an integration test or, worse, a production null pointer. The tighter the feedback loop between schema change and error detection, the faster teams can iterate without introducing regressions, and that feedback loop is what makes the Python-GraphQL combination more than just syntactic convenience.
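To make the feedback loop concrete, here is a minimal sketch using a plain dataclass (no GraphQL library required); the User type and greeting function are hypothetical stand-ins for a schema type and a resolver:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    username: str
    # This field was `str`; changing it to Optional[str] makes every
    # unguarded call site a static type error, not a runtime surprise.
    display_name: Optional[str]

def greeting(user: User) -> str:
    # Without this None guard, mypy/pyright would flag the f-string
    # below as potentially formatting a None value.
    name = user.display_name if user.display_name is not None else user.username
    return f"Hello, {name}"
```

A type checker traces the `Optional[str]` change from the dataclass through every function that touches `display_name`, which is exactly the pre-runtime error surface the paragraph describes.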
The GraphQL Foundation describes the core philosophy as giving clients full control over the shape and scope of every response, so that APIs return precisely the data requested and nothing beyond it. — GraphQL Foundation, graphql.org/learn
Dan Schafer, one of GraphQL's co-creators, has described the language as a mechanism that redefines the client-server contract: the server declares its capabilities through a typed schema, and the client expresses its data requirements declaratively, giving product developers the freedom to build interfaces on their own terms rather than conforming to rigid endpoint structures. — Dan Schafer, GraphQL Co-Creator, via LevelUp Engineering
Schafer, along with Lee Byron and Nick Schrock, built the first version of GraphQL at Facebook in 2012 to power the rebuilt native iOS News Feed. Their philosophy of client-driven data fetching with a strongly typed contract maps cleanly onto Python's emphasis on readability and explicit type declarations. Byron would later explain in a Reactiflux Q&A that GraphQL was designed with product development as its central concern and that the typed schema serves as an organizing principle that brings structural clarity to server-side code, reinforcing why Python's clean syntax is a natural match for building GraphQL servers.
Python also benefits from a mature async runtime. GraphQL's resolver model, where each field in a query is resolved by a dedicated function, is a natural fit for async/await. A resolver that fetches user data from a database and another that fetches order data can run concurrently instead of sequentially, which is critical for queries that touch multiple data sources. FastAPI, which natively supports asyncio, pairs particularly well with Strawberry for this reason.
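A minimal sketch of that concurrency win, using asyncio.gather and two hypothetical data-source calls (the sleeps stand in for real I/O latency):

```python
import asyncio

# Hypothetical data-source calls; the sleeps stand in for real I/O.
async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.05)  # e.g. a database round trip
    return {"id": user_id, "name": "Ada"}

async def fetch_orders(user_id: str) -> list[dict]:
    await asyncio.sleep(0.05)  # e.g. a call to an orders service
    return [{"id": "o1", "total": 42}]

async def resolve_profile(user_id: str) -> dict:
    # Both fetches run concurrently: total latency is roughly one
    # round trip, not the sum of the two.
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    return {"user": user, "orders": orders}

result = asyncio.run(resolve_profile("42"))
```

In a real server, each of these would be a field resolver; the GraphQL executor schedules sibling async resolvers concurrently in the same way.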
REST uses multiple endpoints, each returning a fixed shape. GraphQL uses a single endpoint and lets the client define the shape of every response. Both have their place, but GraphQL shines when frontends have heterogeneous data needs or when multiple client types (web, mobile, IoT) consume the same backend.
The Three Libraries: Graphene, Strawberry, and Ariadne
Three libraries dominate Python GraphQL development. They are philosophically different in important ways, and choosing the wrong one for your context will cost you time.
Graphene
Graphene, originally created by Syrus Akbary and now maintained by the GraphQL-Python community, is the oldest and still the best-supported library for teams embedded in the Django ecosystem. Its latest release is 3.4.3. It uses a code-first philosophy: you define types using Python classes, and the library generates the GraphQL schema from those definitions. Graphene integrates tightly with Django's ORM through graphene-django, and with SQLAlchemy through graphene-sqlalchemy. If your team already runs graphene-django and has established patterns for resolvers and mutations, migrating away carries real risk with little upside.
```python
import graphene

class Book(graphene.ObjectType):
    id = graphene.ID(required=True)
    title = graphene.String(required=True)
    author = graphene.String(required=True)
    year = graphene.Int()

class Query(graphene.ObjectType):
    book = graphene.Field(Book, id=graphene.ID(required=True))
    all_books = graphene.List(Book)

    def resolve_book(root, info, id):
        # fetch from DB or data source
        return {"id": id, "title": "Clean Code", "author": "Robert C. Martin", "year": 2008}

    def resolve_all_books(root, info):
        return []

schema = graphene.Schema(query=Query)
```
The class-based syntax feels natural to developers accustomed to Django's models and views, but it can feel verbose compared to modern Python idioms. Graphene supports async resolvers through graphql-core, but its ecosystem—particularly graphene-django—has historically emphasized synchronous execution. In contrast, Strawberry was designed from the beginning for async-first frameworks such as FastAPI and asyncio-based stacks, which matters if you are building a high-concurrency API.
Strawberry
Strawberry is the library you should reach for on any new Python project in 2026. It leverages Python's native type hints and dataclass syntax to define schemas, which makes code dramatically more readable and reduces boilerplate. As of March 2026, the latest stable version is 0.308.3, released March 4, 2026. It requires Python 3.10 or higher, supports experimental @defer and @stream directives for incremental delivery (requiring graphql-core >= 3.3.0a12), and provides Apollo Federation v2 support for distributed GraphQL architectures. Strawberry's creator, Patrick Arminio, has built the library around the principle that GraphQL types should look like standard Python dataclasses, making the code simultaneously valid Python and valid GraphQL schema.
```python
import strawberry
from typing import Optional

@strawberry.type
class Book:
    id: strawberry.ID
    title: str
    author: str
    year: Optional[int] = None

@strawberry.type
class Query:
    @strawberry.field
    def book(self, id: strawberry.ID) -> Optional[Book]:
        # fetch from data source
        return Book(id=id, title="Clean Code", author="Robert C. Martin", year=2008)

    @strawberry.field
    def all_books(self) -> list[Book]:
        return []

schema = strawberry.Schema(query=Query)
```
Notice how the Strawberry version reads like standard Python. There are no base classes to inherit and no library-specific field descriptors. This is not just an aesthetic preference; it means your IDE's type checker understands the code natively, flagging type mismatches before they become runtime errors.
Ariadne
Ariadne takes the opposite approach: schema-first development. You write your schema in GraphQL's Schema Definition Language (SDL), then bind Python resolver functions to it. This model is the dominant pattern in the broader GraphQL community and is well supported by tooling like GraphQL Playground, Apollo Studio, and most code generators. According to the Ariadne documentation, Ariadne Codegen can take a schema and a set of operations and generate a fully typed Python client backed by Pydantic models.
```python
from ariadne import QueryType, gql, make_executable_schema
from ariadne.asgi import GraphQL

type_defs = gql("""
    type Query {
        book(id: ID!): Book
        allBooks: [Book!]!
    }

    type Book {
        id: ID!
        title: String!
        author: String!
        year: Int
    }
""")

query = QueryType()

@query.field("book")
def resolve_book(*_, id):
    return {"id": id, "title": "Clean Code", "author": "Robert C. Martin", "year": 2008}

@query.field("allBooks")
def resolve_all_books(*_):
    return []

schema = make_executable_schema(type_defs, query)
app = GraphQL(schema, debug=True)
```
Ariadne, created by Mirumee Software, is the right choice when your team wants to write SDL first and treat the schema as a contract that frontend and backend teams agree on before writing any implementation code. It also integrates well with legacy Django projects and supports both synchronous and asynchronous resolvers. The latest stable release is 0.29.0 (February 19, 2026), which requires Python 3.10 or higher. Ariadne is built on top of graphql-core, the Python reference implementation of the GraphQL specification that tracks the behavior of the GraphQL.js reference library.
For greenfield projects: use Strawberry. For legacy Django codebases with existing Graphene schemas: stay on Graphene or migrate gradually. For teams that write SDL first and want schema-as-contract: use Ariadne. All three libraries support Flask, Django, and FastAPI.
| Library | Style | Best For | Async Support |
|---|---|---|---|
| Strawberry | Code-first | New projects, FastAPI stacks | Excellent |
| Graphene | Code-first | Django ecosystems | Moderate |
| Ariadne | Schema-first | Teams using SDL as contract | Excellent |
If you are migrating from Graphene to Strawberry, watch for three traps. First, Graphene's resolve_* method naming convention does not exist in Strawberry; resolvers are defined as decorated methods on the type class itself, so a mechanical rename will fail. Second, Graphene passes info as a graphql-core ResolveInfo object, while Strawberry wraps it in its own strawberry.types.Info type. Code that accesses info.context or info.field_name may need adjustments. Third, since Strawberry 0.296.0, parameter injection in resolvers is strictly type-hint-based rather than name-based, meaning resolvers that relied on implicit naming conventions will break silently unless annotated correctly.
Building a Real API with Strawberry and FastAPI
The combination of Strawberry and FastAPI has emerged as the preferred stack for async Python GraphQL APIs. FastAPI's dependency injection, automatic OpenAPI docs, and first-class asyncio support complement Strawberry's type-safe schema definitions cleanly. Here is a complete, runnable example modeling a small library catalog.
Start by installing the required packages in a virtual environment:
```bash
python -m venv graphql-env
source graphql-env/bin/activate  # Windows: graphql-env\Scripts\activate
pip install "strawberry-graphql[fastapi]" uvicorn sqlalchemy asyncpg
```
Now define the schema and wire it into FastAPI:
```python
import strawberry
from fastapi import FastAPI
from strawberry.fastapi import GraphQLRouter
from typing import Optional

# ---- Data layer (in-memory for illustration) ----
BOOKS_DB: dict[str, dict] = {
    "1": {"id": "1", "title": "Fluent Python", "author": "Luciano Ramalho", "year": 2022},
    "2": {"id": "2", "title": "Architecture Patterns with Python", "author": "Harry Percival", "year": 2020},
    "3": {"id": "3", "title": "Python Cookbook", "author": "David Beazley", "year": 2013},
}

# ---- GraphQL types ----
@strawberry.type
class Book:
    id: strawberry.ID
    title: str
    author: str
    year: Optional[int] = None

# ---- Resolvers ----
@strawberry.type
class Query:
    @strawberry.field
    async def book(self, id: strawberry.ID) -> Optional[Book]:
        data = BOOKS_DB.get(str(id))
        if data is None:
            return None
        return Book(**data)

    @strawberry.field
    async def all_books(self) -> list[Book]:
        return [Book(**b) for b in BOOKS_DB.values()]

# ---- Schema and app ----
schema = strawberry.Schema(query=Query)
graphql_app = GraphQLRouter(schema)
app = FastAPI(title="Library API")
app.include_router(graphql_app, prefix="/graphql")

# Run with: uvicorn main:app --reload
```
With this running, open http://localhost:8000/graphql to access Strawberry's built-in GraphiQL explorer. You can execute queries like the following directly in the browser interface:
```graphql
query GetBook {
  book(id: "1") {
    title
    author
    year
  }
}
```

```graphql
query AllBooks {
  allBooks {
    id
    title
    author
  }
}
```
Notice that allBooks does not return year in the second query because the client did not ask for it. That is GraphQL's core value proposition in action: the server resolves and serializes only the fields each query selects, so the year field is never even touched for that request.
Mutations, Input Types, and Error Handling
Queries fetch data. Mutations change it. In Strawberry, mutations are defined on a Mutation type using the same decorator-based syntax as queries. Input types, defined with @strawberry.input, provide a clean way to validate incoming data before it reaches your business logic.
```python
@strawberry.input
class AddBookInput:
    title: str
    author: str
    year: Optional[int] = None

@strawberry.type
class BookResult:
    book: Optional[Book] = None
    error: Optional[str] = None

@strawberry.type
class Mutation:
    @strawberry.mutation
    async def add_book(self, input: AddBookInput) -> BookResult:
        if not input.title.strip():
            return BookResult(error="Title cannot be empty.")
        new_id = str(len(BOOKS_DB) + 1)
        new_book = {
            "id": new_id,
            "title": input.title,
            "author": input.author,
            "year": input.year,
        }
        BOOKS_DB[new_id] = new_book
        return BookResult(book=Book(**new_book))

schema = strawberry.Schema(query=Query, mutation=Mutation)
```
The BookResult pattern, a wrapper type carrying either a book or an error string, is a widely recommended approach for GraphQL error handling. Unlike REST, GraphQL typically returns 200 OK at the transport level for any operation that executes, regardless of the business outcome. Many production APIs therefore return business-logic errors through structured payload types instead of relying on the GraphQL errors array. This keeps error handling consistent for clients and avoids mixing execution errors with application-level failures. Encapsulating the outcome in a result type keeps clients from needing to check multiple places for failure information.
The design reasoning behind this pattern runs deeper than convenience. In a GraphQL response, the top-level errors array is fundamentally ambiguous: it conflates schema validation failures, resolver runtime exceptions, authorization denials, and business logic rejections into a single flat list. A client parsing that array has no reliable way to distinguish between a query that asked for a field that does not exist (schema error), a resolver that crashed due to a database timeout (infrastructure error), and a mutation that was rejected because the user exceeded their quota (business error). The structured result type eliminates this ambiguity by giving business outcomes a dedicated, typed location in the response. The practical payoff is that client teams can pattern-match on concrete types rather than parsing error message strings, which makes frontend error handling deterministic and testable. As your schema grows, you can extend the error side of the union with domain-specific error types (validation errors with field-level detail, quota errors with limit information, conflict errors with retry guidance) without polluting the transport-level error channel.
Unhandled exceptions in resolvers propagate into the top-level errors array and can leak internal stack traces to clients. Always catch exceptions inside resolver logic and return structured error types instead. Use Strawberry's permission classes for authorization failures rather than raising raw Python exceptions.
A mutation call from a client looks like this:
```graphql
mutation CreateBook {
  addBook(input: {
    title: "High Performance Python"
    author: "Micha Gorelick"
    year: 2020
  }) {
    book {
      id
      title
    }
    error
  }
}
```
The N+1 Problem and How DataLoader Fixes It
The N+1 problem is the most common performance failure mode in GraphQL, and it is subtle enough that developers often do not notice it until a production system is under load. The problem occurs when resolving a list of parent objects triggers one database query per parent to load related children. Fetch 50 books, and resolving each book's author might fire 50 separate queries, giving you N=50 child queries plus 1 parent query: 51 total database round trips to serve one GraphQL request.
What makes the N+1 problem architecturally insidious is that it emerges from the same design property that makes GraphQL powerful: per-field resolution. In REST, an endpoint handler typically issues a small, fixed number of database queries regardless of what the caller asked for. The inefficiency shows up in the payload (over-fetching), not in the query count. GraphQL flips this trade-off. Each field in the query tree gets its own resolver, which means the system fetches precisely the data the client requested, but the naive implementation fires one database call per field invocation. The N+1 problem is not a bug in the framework; it is the natural consequence of a resolver architecture that optimizes for precision at the expense of batching. Understanding this trade-off is critical because it explains why every relationship field in a GraphQL schema is a potential N+1 site, not just the obvious ones. A three-level-deep query (books, their authors, each author's publisher) can compound from 51 queries to 51 plus another N batch for publishers, and the multiplicative growth accelerates with every level of nesting the client adds.
Lee Byron, GraphQL's co-creator, described DataLoader as the formalization of a batching and caching pattern that Facebook's engineering team had relied on internally for years to build performant GraphQL servers at scale. — Lee Byron, GraphQL Co-Creator, InfoQ, October 2015
Byron later recounted the story of DataLoader's origins: after open-sourcing GraphQL, the team met with engineers at companies like Pinterest and realized that the batching and caching patterns Facebook had relied on internally for years were not widely understood outside the company. DataLoader was extracted and published to fill that gap. The Python implementation, ported to work with asyncio, solves the same problem: instead of a resolver immediately hitting the database for each key, DataLoader collects all the keys requested during a single tick of the event loop, fires a single batch query, and distributes the results back to each waiting resolver. In practical terms, a query that fetches 50 books and their authors drops from 51 database round trips (1 for books, 50 for authors) to 2 (1 for books, 1 batch query for all authors). The exact reduction depends on the depth and shape of the query, but the pattern consistently eliminates the multiplicative growth that makes N+1 dangerous at scale.
The mental model that helps here is to think of DataLoader as a scheduling layer that sits between your resolvers and your database. Without it, each resolver acts as an independent agent issuing its own query the moment it executes. With DataLoader, resolvers register their intent (a key they need) and yield control back to the event loop. Once all resolvers at a given level of the query tree have registered their keys, DataLoader fires a single batch operation, then distributes the results. This is the same principle behind database connection pooling and write-behind caches: aggregate demand, batch execution, distribute results. The difference is that DataLoader operates at the application layer within a single request lifecycle rather than at the infrastructure layer across requests.
Here is a concrete Strawberry example using DataLoader to batch author lookups for a list of books:
```python
import strawberry
from strawberry.dataloader import DataLoader
from strawberry.types import Info
from typing import Optional

# Simulate an authors table
AUTHORS_DB = {
    "a1": {"id": "a1", "name": "Luciano Ramalho", "country": "Brazil"},
    "a2": {"id": "a2", "name": "Harry Percival", "country": "UK"},
}
BOOKS_DB = {
    "1": {"id": "1", "title": "Fluent Python", "author_id": "a1"},
    "2": {"id": "2", "title": "Architecture Patterns with Python", "author_id": "a2"},
    "3": {"id": "3", "title": "Fluent Python 2nd Ed.", "author_id": "a1"},
}

@strawberry.type
class Author:
    id: strawberry.ID
    name: str
    country: str

@strawberry.type
class Book:
    id: strawberry.ID
    title: str
    author_id: strawberry.ID

    @strawberry.field
    async def author(self, info: Info) -> Optional[Author]:
        # Uses DataLoader: all author_id values across the list
        # are batched into ONE query, not N queries
        return await info.context["author_loader"].load(self.author_id)

# DataLoader batch function - called ONCE with all collected keys
async def load_authors(keys: list[str]) -> list[Optional[Author]]:
    # In production: SELECT * FROM authors WHERE id = ANY($1)
    result = []
    for key in keys:
        data = AUTHORS_DB.get(key)
        result.append(Author(**data) if data else None)
    return result

@strawberry.type
class Query:
    @strawberry.field
    async def all_books(self) -> list[Book]:
        return [Book(**b) for b in BOOKS_DB.values()]

async def get_context():
    return {
        # DataLoader is created per-request so cache does not bleed between requests
        "author_loader": DataLoader(load_fn=load_authors)
    }

schema = strawberry.Schema(query=Query)

# FastAPI integration
from fastapi import FastAPI
from strawberry.fastapi import GraphQLRouter

graphql_app = GraphQLRouter(schema, context_getter=get_context)
app = FastAPI()
app.include_router(graphql_app, prefix="/graphql")
```
The critical detail is how the DataLoader is instantiated. The Strawberry documentation is explicit on this point: the loader should be created when building the GraphQL context, not as a module-level global. A global DataLoader would cache results across requests, meaning one user's data could potentially populate responses for a different user. The context getter creates a fresh DataLoader for every incoming request, so the cache is request-scoped and safe.
DataLoader works best when keys map one-to-one with records. When loading by foreign key (for example, all books written by a given author), the batch function must return a list of lists: one inner list per key, in the same order as the keys array. Returning a flat list will corrupt result-to-key mapping.
DataLoader also ships with a per-request in-memory cache. If the same author ID appears in multiple books in a single query, the second load hits the cache and never touches the database. For distributed environments where in-memory per-request caching is insufficient, you can implement a custom cache backend by passing a cache_map parameter to the DataLoader constructor, allowing integration with Redis or another external store while preserving the same batching semantics.
Real-Time Data with GraphQL Subscriptions
GraphQL subscriptions extend the protocol from request-response to a persistent event stream. A client subscribes to a named event, and the server pushes updates whenever that event fires. This maps directly to WebSockets at the transport layer. Both Strawberry and Ariadne support subscriptions in production, and both integrate cleanly with Redis or Kafka for pub/sub broadcasting in distributed deployments.
A Strawberry subscription uses Python's AsyncGenerator type. The resolver is an async generator function decorated with @strawberry.subscription:
```python
import strawberry
import asyncio
from typing import AsyncGenerator

@strawberry.type
class Subscription:
    @strawberry.subscription
    async def book_added(self) -> AsyncGenerator[Book, None]:
        # In production, subscribe to a Redis channel or message queue.
        # This example simulates periodic events for illustration.
        for i in range(5):
            await asyncio.sleep(2)
            yield Book(
                id=strawberry.ID(str(i)),
                title=f"New Arrival {i}",
                author="Various",
                year=2026,
            )

schema = strawberry.Schema(query=Query, mutation=Mutation, subscription=Subscription)
```
Clients connect via WebSocket and send a subscription operation. Strawberry handles the connection lifecycle, supporting both the newer graphql-transport-ws protocol and the legacy graphql-ws protocol depending on what the client requests. By default, both protocols are accepted, and developers can configure which protocols to allow. For production workloads where many clients subscribe to shared events, broadcasting through Redis pub/sub prevents each subscription from polling the database independently.
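To show the broadcast shape without requiring a Redis server, here is a sketch using a hypothetical in-process queue-based broker; in production the Broker class would be replaced by Redis pub/sub, and book_added_stream mirrors the body of a @strawberry.subscription resolver:

```python
import asyncio
from typing import AsyncGenerator

# Minimal in-process broker standing in for Redis pub/sub: every
# subscriber gets its own queue, and publish fans out to all of them.
class Broker:
    def __init__(self) -> None:
        self.subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self.subscribers.append(q)
        return q

    async def publish(self, event: dict) -> None:
        for q in self.subscribers:
            await q.put(event)

broker = Broker()

async def book_added_stream() -> AsyncGenerator[dict, None]:
    # Shape of a subscription resolver: yield events as they arrive
    # from the broker instead of polling the database.
    queue = broker.subscribe()
    while True:
        yield await queue.get()

async def demo() -> dict:
    stream = book_added_stream()
    consumer = asyncio.ensure_future(stream.__anext__())
    await asyncio.sleep(0)  # let the subscriber register its queue
    await broker.publish({"title": "New Arrival"})
    event = await consumer
    await stream.aclose()
    return event

event = asyncio.run(demo())
```

The one-queue-per-subscriber fan-out is the same topology a Redis channel gives you across processes; the async-generator resolver shape is unchanged either way.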
Pagination and the Relay Connection Pattern
Any GraphQL API that returns lists will eventually need pagination. Without it, a query like allBooks returns every record in the table, which is fine when the table has three rows and catastrophic when it has three million. GraphQL supports two primary pagination strategies: offset-based and cursor-based. Offset-based pagination passes limit and offset arguments and maps directly to SQL's LIMIT and OFFSET clauses. It is simple to implement and easy for clients to understand, but it breaks under concurrent writes. If a new record is inserted between page requests, offset calculations shift and the client sees duplicates or skips entries entirely.
The offset problem is worth understanding at the database level because it reveals why cursor-based pagination is not just a different API shape but a fundamentally different performance characteristic. When a database engine executes OFFSET 10000 LIMIT 20, it must scan and discard the first 10,000 rows before returning the 20 the caller wants. The deeper into the dataset the client scrolls, the more work the database does per page, and the response time degrades linearly with depth. This is the hidden cost of offset pagination: early pages are fast, but deep pages become progressively slower, and the degradation is invisible during development when test datasets are small. Cursor-based pagination avoids this entirely because the cursor resolves to a WHERE id > $cursor_value clause that uses an index seek rather than a sequential scan. Page retrieval time remains constant whether the client is on page 1 or page 5,000, which is the kind of performance guarantee that matters once real traffic arrives.
Cursor-based pagination eliminates the consistency problem as well by using an opaque cursor, typically a base64-encoded identifier derived from the last item in the previous page, to mark where the next page begins. The server resolves the cursor to a WHERE clause rather than an offset, which means inserts and deletes between requests do not corrupt the page sequence. The GraphQL community has largely standardized on the Relay Cursor Connections Specification, which defines a structured response shape built around edges, nodes, cursors, and a PageInfo object that tells the client whether more data exists in either direction.
Strawberry has built-in support for the Relay connection pattern. Here is a paginated version of the book catalog:
```python
import strawberry
from strawberry import relay
from typing import Iterable, Optional

# Assumes the id -> {"id", "title", "author", "year"} BOOKS_DB mapping
# from the earlier library-catalog example.

@strawberry.type
class Book(relay.Node):
    id: relay.NodeID[str]
    title: str
    author: str
    year: Optional[int] = None

    @classmethod
    def resolve_nodes(cls, *, info, node_ids, required):
        return [Book(**BOOKS_DB[nid]) for nid in node_ids if nid in BOOKS_DB]

@strawberry.type
class BookConnection(relay.ListConnection[Book]):
    """Paginated list of books following the Relay Connection spec."""

@strawberry.type
class Query:
    @relay.connection(BookConnection)
    def all_books(self) -> Iterable[Book]:
        # Strawberry slices this iterable based on first/after/last/before
        return [Book(**b) for b in BOOKS_DB.values()]

schema = strawberry.Schema(query=Query)
```
A client query using this schema specifies how many results it wants and where to start:
```graphql
query PaginatedBooks {
  allBooks(first: 10, after: "YXJyYXljb25uZWN0aW9uOjk=") {
    edges {
      cursor
      node {
        id
        title
        author
      }
    }
    pageInfo {
      hasNextPage
      endCursor
    }
  }
}
```
The response includes pageInfo.endCursor, which the client passes as the after argument in the next request to fetch the following page. This pattern works identically for backward pagination using last and before. The critical implementation detail is what backs the cursor: in production, the cursor should resolve to a WHERE clause on an indexed column rather than an array offset. Using WHERE id > $cursor_value ORDER BY id LIMIT $first leverages database indexes and avoids scanning over discarded rows, which keeps page retrieval time constant regardless of how deep into the dataset the client has scrolled.
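A runnable sketch of that keyset approach using SQLite and a base64 cursor (the `id:<pk>` cursor format is illustrative, not part of any spec):

```python
import base64
import sqlite3
from typing import Optional

def encode_cursor(last_id: int) -> str:
    # Opaque to clients; here it is just a base64-wrapped primary key.
    return base64.b64encode(f"id:{last_id}".encode()).decode()

def decode_cursor(cursor: str) -> int:
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])

def fetch_page(conn: sqlite3.Connection, first: int, after: Optional[str]) -> list[tuple]:
    last_id = decode_cursor(after) if after else 0
    # Keyset pagination: an index seek on `id`, never an OFFSET scan.
    return conn.execute(
        "SELECT id, title FROM books WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, first),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO books VALUES (?, ?)",
                 [(i, f"Book {i}") for i in range(1, 8)])

page1 = fetch_page(conn, 3, None)                          # ids 1..3
page2 = fetch_page(conn, 3, encode_cursor(page1[-1][0]))   # ids 4..6
```

The `WHERE id > ?` predicate is what keeps page cost flat: the database seeks directly to the cursor position instead of scanning and discarding everything before it.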
Offset pagination is acceptable for admin dashboards and internal tools where the dataset is small and stable. For any client-facing API where data changes between requests or datasets are large, cursor-based pagination with the Relay Connection pattern is the safer and more performant default.
Persisted Queries and Caching Strategies
Caching is one of the areas where GraphQL introduces friction that REST does not. A REST endpoint like GET /books/42 maps cleanly to an HTTP cache key: the URL is unique, deterministic, and naturally cacheable by browsers and CDNs. GraphQL, by contrast, sends variable query strings inside POST request bodies to a single /graphql endpoint, which makes HTTP-level caching ineffective out of the box. This is not an unsolvable problem, but it requires deliberate architectural decisions that a production API cannot afford to skip.
The underlying tension is worth understanding because it shapes every caching decision you will make. REST caching works because the URL is a natural cache key and the HTTP method signals intent: GET requests are idempotent and cacheable, POST requests are not. GraphQL collapses this distinction. Every operation, read or write, travels through the same URL via POST. The cache key that REST gets for free, the URL, does not exist in GraphQL's model. You have to reconstruct it deliberately, and the reconstruction strategy you choose determines which layer of the stack can participate in caching. This is the fundamental reason why GraphQL caching is not a single solution but an assembly of coordinated strategies across the CDN, the HTTP layer, the resolver layer, and the client.
The first line of defense is persisted queries, sometimes called trusted documents. Instead of sending the full GraphQL query string with every request, the client sends a hash, typically SHA-256, that maps to a query the server already knows. The server looks up the hash in a store, retrieves the full query, and executes it normally. This has three benefits: it dramatically reduces request payload size for complex queries, it enables GET-based requests that CDNs and browsers can cache natively, and it restricts execution to a known set of operations, which closes the door on arbitrary query injection. The security benefit here is often underappreciated: persisted queries transform a GraphQL endpoint from an open interpreter that executes arbitrary client-supplied code into a controlled dispatcher that only runs pre-approved operations. This is a meaningful shift in threat posture, especially for public-facing APIs.
Strawberry supports persisted queries through its extension system. For teams using Apollo Client on the frontend, Automatic Persisted Queries (APQ) work without a build step: the client sends the hash first, and if the server does not recognize it, the client retransmits the full query, which the server caches for subsequent requests. For stricter environments, a build step can extract all query strings from the client codebase and register them with the server ahead of deployment, rejecting any query that does not match a known hash. The stricter approach is worth the build complexity for any API exposed to untrusted clients: it eliminates the entire class of attacks that rely on crafting malicious queries at runtime, including depth attacks, alias-based amplification, and field-level enumeration probes.
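The APQ handshake described above can be sketched server-side. The in-memory store and plain-string error markers here are illustrative stand-ins; Apollo's actual protocol wraps these in structured GraphQL error responses, and a production store would be Redis or similar rather than a process-local dict:

```python
import hashlib

# Illustrative server-side APQ lookup store.
QUERY_STORE: dict[str, str] = {}


def resolve_persisted_query(query_hash: str, query=None):
    """Return {"query": ...} to execute, or an APQ error marker."""
    if query is not None:
        # Full query supplied: verify the hash before registering it.
        if hashlib.sha256(query.encode()).hexdigest() != query_hash:
            return {"error": "PERSISTED_QUERY_HASH_MISMATCH"}
        QUERY_STORE[query_hash] = query
        return {"query": query}
    stored = QUERY_STORE.get(query_hash)
    if stored is None:
        # The client responds to this by retransmitting the full query.
        return {"error": "PERSISTED_QUERY_NOT_FOUND"}
    return {"query": stored}


QUERY = "{ allBooks { id title } }"
QUERY_HASH = hashlib.sha256(QUERY.encode()).hexdigest()

first = resolve_persisted_query(QUERY_HASH)          # cache miss
second = resolve_persisted_query(QUERY_HASH, QUERY)  # registers the query
third = resolve_persisted_query(QUERY_HASH)          # cache hit
```

The strict build-step variant simply drops the registration branch: any hash not preloaded into the store is rejected outright instead of triggering a retransmit.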
Beyond persisted queries, response-level caching requires a layered approach, and each layer solves a different problem. Strawberry's CacheControl extension can set per-field cache hints, which the server aggregates into an HTTP Cache-Control header on the response. Fields that return static reference data might carry a max-age of 3600 seconds, while fields that return user-specific data are marked private, no-cache. The aggregate header reflects the most restrictive hint in the resolved field set, which prevents a CDN from caching a response that contains even one uncacheable field. This aggregation behavior creates an important design pressure: queries that mix public reference data with private user data will always receive the most restrictive cache policy. Teams that care about CDN hit rates should encourage clients to split these into separate queries, one for cacheable public data and one for uncacheable private data, so the public query can be served entirely from the edge.
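The most-restrictive aggregation rule can be illustrated with a small standalone function. The hint dictionaries here are an assumed shape for illustration, not Strawberry's internal representation:

```python
# Aggregate per-field cache hints into a single Cache-Control header value.
# The most restrictive hint in the resolved field set wins.
def aggregate_cache_control(hints: list) -> str:
    if any(h.get("scope") == "private" for h in hints):
        # One private field makes the whole response uncacheable at the CDN.
        return "private, no-cache"
    max_ages = [h["max_age"] for h in hints if "max_age" in h]
    if not max_ages:
        return "no-cache"
    return f"public, max-age={min(max_ages)}"


# Public reference data alone caches for the shortest field max-age:
public_only = aggregate_cache_control([{"max_age": 3600}, {"max_age": 300}])
# A single private field drags the entire response down:
mixed = aggregate_cache_control([{"max_age": 3600}, {"scope": "private"}])
```

The second case is exactly the design pressure described above: mixing one private field into an otherwise cacheable query forfeits the CDN for the whole response.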
```python
# Application-level caching with Redis
import json
from typing import Optional

import strawberry
from strawberry.types import Info


async def cached_resolve(info: Info, key: str, ttl: int, fetch_fn):
    """Generic resolver-level cache wrapper using Redis.

    fetch_fn is expected to return a JSON-serializable dict.
    """
    redis = info.context["redis"]
    cached = await redis.get(key)
    if cached:
        return json.loads(cached)
    result = await fetch_fn()
    await redis.setex(key, ttl, json.dumps(result))
    return result


@strawberry.type
class Query:
    @strawberry.field
    async def book(self, info: Info, id: strawberry.ID) -> Optional[Book]:
        # The cache stores plain dicts; rebuild the Book type on the way out.
        data = await cached_resolve(
            info,
            key=f"book:{id}",
            ttl=300,
            fetch_fn=lambda: fetch_book_from_db(id),
        )
        return Book(**data) if data else None
```
For high-traffic public APIs, placing a CDN like Cloudflare or Fastly in front of the GraphQL endpoint is effective when combined with persisted queries and GET requests. The CDN caches responses keyed on the query hash and variables, which offloads repetitive read traffic from the application servers entirely. Redis or Memcached at the resolver level then handles cache invalidation for mutations, ensuring that a write operation flushes the relevant cache entries before the next read resolves stale data. The invalidation strategy matters more than the caching strategy: a cache that is fast but serves stale data after a mutation will produce user-visible bugs that are difficult to reproduce and diagnose. The safest pattern is to make every mutation explicitly invalidate the cache keys it affects as part of the mutation's commit phase, not as an afterthought in a background job. For cross-service invalidation where a mutation in one service affects cached data in another, Redis pub/sub or a lightweight message broker can propagate invalidation events, but the coordination overhead should be weighed against simply setting shorter TTLs for data that crosses service boundaries.
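The write-then-invalidate pattern can be shown with a library-free sketch; the two in-memory dicts stand in for the database and Redis:

```python
# Write-then-invalidate: the mutation path evicts the affected cache key
# as part of the same code path that commits the write.
DB = {"42": {"id": "42", "title": "Original Title"}}
CACHE: dict = {}


def get_book(book_id: str) -> dict:
    """Read-through cache: populate on miss, serve on hit."""
    if book_id not in CACHE:
        CACHE[book_id] = DB[book_id]
    return CACHE[book_id]


def update_book_title(book_id: str, title: str) -> None:
    """Mutation: commit the write, then invalidate in the same path."""
    DB[book_id] = {**DB[book_id], "title": title}
    CACHE.pop(book_id, None)  # not deferred to a background job


get_book("42")                        # warms the cache
update_book_title("42", "New Title")  # writes and evicts in one path
```

If the eviction were deferred to a background job instead, there would be a window in which `get_book` serves the stale title — the exact class of hard-to-reproduce bug described above.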
Cache at the resolver level for data that changes per-user or per-mutation. Cache at the CDN level for public, read-heavy queries served via persisted queries over GET. Never cache responses that mix public and private data without splitting them into separate queries or using field-level cache directives to set the correct scope.
Production Considerations
Query Depth and Complexity Limits
GraphQL's flexibility is also its primary attack surface. A malicious or poorly written client can send a deeply nested query that causes your resolvers to recurse through thousands of relationships, consuming memory and compute until the server exhausts its resources. Both Strawberry and Ariadne support validation rules that enforce query depth limits and complexity scoring. Complexity scoring assigns a cost value to each field based on its computational weight; the server rejects any query whose total complexity exceeds a configured ceiling.
The threat model here is worth thinking through carefully, because naive depth limiting alone leaves significant gaps. A depth limiter prevents vertical attacks (deeply nested queries), but it does nothing against horizontal attacks where a shallow query requests thousands of fields or uses aliases to multiply resolution cost. Consider a query with a depth of only 2 that uses 500 aliases to resolve the same expensive field 500 times: each alias triggers a separate resolver execution, and the total cost is 500x the cost of a single resolution. Similarly, a query that requests every field on a type with 200 fields is shallow but expensive. This is why defense-in-depth requires layering multiple limiters: depth limits cap vertical complexity, alias limits cap horizontal duplication, and token limits cap overall query size. The three together form a containment envelope that constrains the total computational surface area a single request can consume.
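A condensed form of the alias attack described above looks like the following query; `expensiveReport` is a hypothetical field, and a real attack would repeat the alias hundreds of times rather than three:

```graphql
query AliasAmplification {
  a1: expensiveReport { total }
  a2: expensiveReport { total }
  a3: expensiveReport { total }
  # ...repeated hundreds of times: depth stays at 2,
  # but every alias triggers a separate resolver execution
}
```

A depth limiter accepts this query without complaint, which is precisely why alias and token limits are needed alongside it.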
```python
from strawberry.extensions import QueryDepthLimiter

schema = strawberry.Schema(
    query=Query,
    mutation=Mutation,
    extensions=[
        QueryDepthLimiter(max_depth=7),
    ],
)
```
Strawberry also provides MaxAliasesLimiter and MaxTokensLimiter extensions that prevent alias-based denial-of-service attacks and excessively large queries, respectively. A production configuration should combine all three:
```python
from strawberry.extensions import (
    QueryDepthLimiter,
    MaxAliasesLimiter,
    MaxTokensLimiter,
)

schema = strawberry.Schema(
    query=Query,
    mutation=Mutation,
    extensions=[
        QueryDepthLimiter(max_depth=7),
        MaxAliasesLimiter(max_alias_count=15),
        MaxTokensLimiter(max_token_count=1000),
    ],
)
```
Authentication and Authorization
FastAPI's dependency injection system provides a clean path for authentication. A dependency that validates a JWT can be injected into the GraphQL context getter, making the authenticated user available to every resolver through info.context. Strawberry also supports permission classes, which are attached directly to field decorators and evaluated before the resolver executes.
The distinction between authentication and authorization is especially important in GraphQL because the granularity of access control is finer than in REST. A REST endpoint typically maps to one resource, so a single permission check per endpoint is usually sufficient. In GraphQL, a single query can traverse multiple types and relationships, each with different access requirements. A user might be authorized to see their own order history but not another customer's, and that boundary exists within a single query execution. Permission classes attached to individual fields give you this per-field granularity, but they also introduce a coordination challenge: when a field is denied, the error must propagate correctly through the response without corrupting neighboring fields that the user is authorized to see. Strawberry handles this by returning null for denied fields and populating the errors array with structured permission failure messages, but teams should test these edge cases explicitly rather than assuming the framework handles every combination of partial authorization correctly.
```python
import strawberry
from strawberry.permission import BasePermission
from strawberry.types import Info


class IsAuthenticated(BasePermission):
    message = "You must be logged in."

    def has_permission(self, source, info: Info, **kwargs) -> bool:
        user = info.context.get("user")
        return user is not None and user.is_active


@strawberry.type
class Query:
    @strawberry.field(permission_classes=[IsAuthenticated])
    async def my_account(self, info: Info) -> Account:
        return await fetch_account(info.context["user"].id)
```
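A context getter that performs the JWT validation described above might look like the following sketch. The decoder here is a deliberately simplified stand-in: it reads the payload segment without verifying the signature, which a real implementation must do through a library such as PyJWT, and in a real app `get_context` would be wired into `GraphQLRouter` via its `context_getter` parameter:

```python
import base64
import json
from typing import Optional


def decode_jwt_payload(token: str) -> Optional[dict]:
    """Stand-in decoder: reads a JWT's payload segment WITHOUT verifying
    the signature. Production code must verify signatures (e.g. PyJWT)."""
    try:
        payload_b64 = token.split(".")[1]
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))
    except (IndexError, ValueError):
        return None


def get_context(authorization: Optional[str]) -> dict:
    """Build the per-request context; resolvers read info.context['user']."""
    user = None
    if authorization and authorization.startswith("Bearer "):
        user = decode_jwt_payload(authorization[len("Bearer "):])
    return {"user": user}
```

A missing or malformed token yields `{"user": None}`, which the `IsAuthenticated` permission class then rejects before the resolver runs.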
Schema Modularization
As a schema grows, keeping all types and resolvers in a single file becomes unmanageable. The recommended pattern for Strawberry is to define types and resolvers per domain module and merge them at the application layer using merge_types from strawberry.tools:
```python
# books/schema.py
@strawberry.type
class BookQuery:
    @strawberry.field
    async def all_books(self) -> list[Book]:
        ...


# authors/schema.py
@strawberry.type
class AuthorQuery:
    @strawberry.field
    async def all_authors(self) -> list[Author]:
        ...


# main.py
from strawberry.tools import merge_types

CombinedQuery = merge_types("Query", (BookQuery, AuthorQuery))
schema = strawberry.Schema(query=CombinedQuery)
```
This keeps domain logic isolated and allows teams to work on different parts of the schema in parallel without merge conflicts in a single schema file.
For Django-based projects, the strawberry-graphql-django package (version 0.80.0 as of March 2026) provides deep integration including automatic type generation from Django models, support for Django 4.2 through 6.0, and a query optimizer that analyzes the incoming GraphQL query to apply select_related and prefetch_related automatically, reducing N+1 queries without manual DataLoader setup. Teams migrating from graphene-django should note that strawberry-graphql-django uses a fundamentally different approach to model type generation: instead of inheriting from a base class, you annotate Django model fields with Strawberry's type system, which gives the type checker full visibility into the schema.
Testing GraphQL Resolvers
Strawberry schemas can execute queries in-process through schema.execute, without spinning up an HTTP server. This makes unit testing resolvers fast and straightforward:

```python
import asyncio


def test_all_books():
    result = asyncio.run(schema.execute("""
        query {
            allBooks {
                id
                title
                author
            }
        }
    """))
    assert result.errors is None
    assert len(result.data["allBooks"]) > 0


def test_add_book_mutation():
    result = asyncio.run(schema.execute("""
        mutation {
            addBook(input: {
                title: "Test Book"
                author: "Test Author"
            }) {
                book { id title }
                error
            }
        }
    """))
    assert result.errors is None
    assert result.data["addBook"]["error"] is None
    assert result.data["addBook"]["book"]["title"] == "Test Book"
```

For a more natural async style, use pytest-asyncio and await schema.execute directly inside async test functions. FastAPI's TestClient also exercises the full HTTP path, including the context getter, when end-to-end coverage of the ASGI transport is needed.
Introspection in Production
GraphQL's introspection feature allows clients to query the schema itself, which is invaluable during development but represents an information disclosure risk in production. An attacker can use introspection to map your entire data model before crafting targeted queries. Strawberry provides a dedicated DisableIntrospection extension for this purpose, and you can also disable field name suggestions separately to prevent schema leakage through error messages:
```python
from strawberry.extensions import DisableIntrospection
from strawberry.schema.config import StrawberryConfig

schema = strawberry.Schema(
    query=Query,
    mutation=Mutation,
    extensions=[
        DisableIntrospection,
    ],
    config=StrawberryConfig(disable_field_suggestions=True),
)

# And in the router, restrict queries to POST-only:
graphql_app = GraphQLRouter(schema, allow_queries_via_get=False)
```
The DisableIntrospection extension blocks all introspection queries by adding a validation rule that rejects any query containing the __schema or __type fields. The separate disable_field_suggestions setting in StrawberryConfig prevents Strawberry from suggesting similar field names in error messages, which could otherwise leak schema details to an attacker probing the endpoint. Combining both, along with restricting queries to POST-only, is a standard set of hardening steps recommended by the OWASP API Security Project and the Apollo GraphQL security guidelines.
Observability and Resolver Tracing
A GraphQL endpoint surfaces as a single route in traditional HTTP monitoring, which means tools that measure latency per URL give you one unhelpful number for the entire API. Identifying which resolvers are slow, which queries are expensive, and which clients are sending pathological requests requires GraphQL-aware instrumentation. Strawberry ships with an OpenTelemetry extension that creates spans for each resolver execution, tagging them with the field name, parent type, and return type. When connected to a tracing backend like Jaeger or Datadog, this produces a flamegraph for every query, showing exactly how time is distributed across the resolver tree. Install the required extra with pip install 'strawberry-graphql[opentelemetry]':
```python
from strawberry.extensions import QueryDepthLimiter, DisableIntrospection
from strawberry.extensions.tracing import OpenTelemetryExtension
from strawberry.schema.config import StrawberryConfig

schema = strawberry.Schema(
    query=Query,
    mutation=Mutation,
    extensions=[
        QueryDepthLimiter(max_depth=7),
        DisableIntrospection,
        OpenTelemetryExtension,
    ],
    config=StrawberryConfig(disable_field_suggestions=True),
)
```
Beyond tracing, logging the query string, variables, and execution time for every request is essential for identifying abuse patterns and debugging production issues. Strawberry's extension lifecycle provides hooks at query parsing, validation, and execution phases, which means you can log or meter at whatever granularity your observability stack requires. For teams running Datadog, Strawberry also includes a dedicated DatadogTracingExtension that handles tagging and resource naming automatically. The goal is to treat every GraphQL operation with the same visibility you would give a REST endpoint: if you cannot answer the question "which queries had the highest p99 latency in the last hour," your monitoring has a blind spot that will eventually become a production incident.
Key Takeaways
- Match the library to the project: Strawberry is the right default for new async Python projects. Graphene remains the practical choice for mature Django codebases. Ariadne fits teams that want SDL as the authoritative contract between frontend and backend. The choice is not purely technical; it also determines how your team communicates about the API. Code-first libraries make the Python code the source of truth, which works when the backend team owns the schema. Schema-first approaches make the SDL the source of truth, which works better when frontend and backend teams negotiate the contract as peers.
- Use DataLoader for every relationship field: Any resolver that loads related objects from a database inside a list query will produce N+1 queries without DataLoader. Create loaders in the context getter, not as globals, to keep the cache request-scoped and safe. Treat the absence of a DataLoader on a relationship field as a latent performance bug, because it will only manifest under production load when list sizes exceed development fixtures.
- Paginate every list field from day one: Returning unbounded lists is a performance and security liability. Use the Relay Connection pattern with cursor-based pagination backed by indexed `WHERE` clauses, not array offsets, to keep page retrieval time constant. Adding pagination retroactively is a breaking change that forces every client to update simultaneously; adding it upfront costs almost nothing and preserves your ability to evolve the schema without coordination overhead.
- Encode business errors in the response type, not in exceptions: The top-level GraphQL `errors` array conflates schema failures, runtime crashes, and business rejections into a single undifferentiated list. Structured result types give business outcomes a typed, predictable location that clients can pattern-match on without parsing error strings. As the schema grows, extend the error side of the union with domain-specific types that carry actionable detail: validation errors with field-level messages, quota errors with limit metadata, conflict errors with retry guidance.
- Use persisted queries and layer your caching: Persisted queries transform a GraphQL endpoint from an open interpreter into a controlled dispatcher that only runs pre-approved operations. Cache at the resolver level with Redis for user-specific data and at the CDN level for public read-heavy queries served via GET. Separate public and private data into distinct queries so CDN cache policies are not dragged down by the most restrictive field in a mixed response.
- Layer your defenses against query abuse: Depth limiters block vertical nesting attacks but leave horizontal amplification untouched. Combine `QueryDepthLimiter`, `MaxAliasesLimiter`, and `MaxTokensLimiter` to form a containment envelope that constrains vertical depth, horizontal duplication, and raw query size simultaneously. For public APIs, add persisted queries as the outermost defense so that only pre-registered operations reach the limiter layer at all.
- Instrument resolvers with OpenTelemetry: A single `/graphql` endpoint is invisible to traditional HTTP monitoring. Per-resolver tracing with Strawberry's OpenTelemetry extension gives you the flamegraphs and p99 latency data you need to diagnose production slowdowns. Deploy tracing before the first production release, not after the first incident, because retrofitting instrumentation under pressure produces incomplete coverage.
- Disable introspection and field suggestions in production: Introspection exposes the full shape of your data model to anyone who can reach the endpoint. Use Strawberry's `DisableIntrospection` extension and set `disable_field_suggestions=True` in `StrawberryConfig` to prevent schema leakage through both introspection queries and error messages. Restricting queries to POST-only further eliminates the casual browser-based probing that introspection enables.
- Modularize schemas by domain: Use `merge_types` in Strawberry or Ariadne's schema merging utilities to keep each domain's types and resolvers in their own module. The organizational benefit matters as much as the code hygiene: domain-scoped schema modules let separate teams own their portion of the graph without merge conflicts, which becomes critical once more than two or three developers are working on the schema concurrently.
Python's GraphQL ecosystem has matured considerably. Strawberry in particular has closed the ergonomics gap with JavaScript-based GraphQL servers, offering type safety, async-native resolvers, federation support, and a clean integration path with FastAPI that rivals anything available in other languages. The concepts in this tutorial, from schema design through DataLoader batching, cursor-based pagination, caching strategy, and production hardening, apply equally whether you are building an internal microservice or a customer-facing API. Start with a small schema, add DataLoaders as soon as you introduce relationship fields, paginate every list field from the beginning, and put resolver tracing in place before your first deployment. In GraphQL, the performance and observability gaps you skip over in development tend to surface in production before you expect them.
Frequently Asked Questions
Is GraphQL faster than REST?
GraphQL is not inherently faster than REST. Performance depends on schema design, resolver efficiency, and caching strategy. GraphQL often reduces over-fetching and round trips, but poorly designed resolvers can introduce N+1 query problems that make it slower than a well-designed REST API. The advantage is precision: clients request only the fields they need, which typically reduces payload sizes and total network round trips for complex frontends that would otherwise require multiple REST calls.
Which Python GraphQL library should I use for a new project?
For new Python projects, Strawberry GraphQL is the recommended default. It uses native Python type hints and dataclass syntax, supports async/await natively, integrates cleanly with FastAPI, and provides Apollo Federation v2 support. Use Graphene if you have an existing Django codebase with established Graphene schemas. Use Ariadne if your team prefers a schema-first approach where the SDL file serves as the contract between frontend and backend.
What is the N+1 problem in GraphQL and how do you fix it in Python?
The N+1 problem occurs when resolving a list of parent objects triggers one database query per parent to load related children. For example, fetching 50 books and resolving each book's author fires 51 total queries. DataLoader solves this by collecting all requested keys during one tick of the event loop and firing a single batch query. In Strawberry, create the DataLoader in the context getter (not as a global) so the cache stays request-scoped.
References: GraphQL Foundation — graphql.org/learn; GraphQL Foundation, Performance — graphql.org/learn/performance; GraphQL Foundation, Pagination — graphql.org/learn/pagination; Relay Cursor Connections Specification — relay.dev/graphql/connections.htm; Strawberry GraphQL documentation — strawberry.rocks/docs; Strawberry GraphQL, DataLoaders — strawberry.rocks/docs/guides/dataloaders; Strawberry GraphQL, Deployment — strawberry.rocks/docs/operations/deployment; Strawberry GraphQL, Tracing — strawberry.rocks/docs/operations/tracing; Strawberry GraphQL, Breaking Changes — strawberry.rocks/docs/breaking-changes; Strawberry GraphQL PyPI (v0.308.3) — pypi.org/project/strawberry-graphql; strawberry-graphql-django PyPI (v0.80.0) — pypi.org/project/strawberry-graphql-django; Graphene PyPI (v3.4.3) — pypi.org/project/graphene; Ariadne documentation — ariadnegraphql.org; Ariadne PyPI (v0.29.0) — pypi.org/project/ariadne; graphql-core PyPI — pypi.org/project/graphql-core; Apollo GraphQL, Automatic Persisted Queries — apollographql.com/docs/apollo-server/performance/apq; Apollo GraphQL Blog, Why You Should Disable Introspection in Production — apollographql.com; Lee Byron, GraphQL Co-Creator, InfoQ Interview (October 2015) — infoq.com; Lee Byron, Reactiflux Q&A Transcript — reactiflux.com; Lee Byron, DataLoader v2.0 (November 2019) — leebyron.com/dataloader-v2; Dan Schafer, GraphQL Co-Creator quote via LevelUp Engineering — levelup.gitconnected.com; Nick Schrock, Facebook Engineering Blog, GraphQL: A data query language (September 2015) — engineering.fb.com; GraphQL Wikipedia — en.wikipedia.org/wiki/GraphQL; Python PEP 484, Type Hints — peps.python.org/pep-0484; Mirumee Software (Ariadne creators) — mirumee.com; BrowserStack, Building Efficient GraphQL APIs with Python (October 2025) — browserstack.com/guide/graphql-python.