Python does not just talk to one kind of API. It speaks several dialects fluently — REST, SOAP, GraphQL, gRPC, WebSocket, and Webhook — and each one was built to solve a different set of problems. Knowing which type to reach for, and exactly how Python implements it, is what separates a functional script from a well-engineered system.
An Application Programming Interface is the contract that lets two pieces of software exchange information without either one needing to understand the other's internal workings. When you write Python code that calls an external service — a payment processor, a weather feed, a database microservice — you are working with one of a handful of API architectures that have emerged over the past three decades. Each architecture makes different tradeoffs around data format, transport protocol, performance, and ease of use. This article walks through each major type, shows what Python code that uses it actually looks like, and explains the real-world conditions under which each one earns its keep.
What an API Actually Is in Python Context
The term "API" is used loosely enough that it sometimes creates confusion. In Python, it can refer to a web API that your code calls over a network, or it can refer to a library's public interface — the functions and classes that a module exposes for other code to use. This article focuses on web APIs: services accessed over a network using a defined protocol. These are the APIs you reach when you call requests.get(), spin up a WebSocket connection, or generate a gRPC stub from a .proto file.
The Postman Blog describes the landscape plainly: REST APIs treat each piece of data as a resource with a unique endpoint, while other styles like GraphQL, gRPC, and WebSocket were built to address specific limitations that REST exposes under certain workloads. Understanding the full set of options means you can match a tool to a problem rather than forcing every problem into the one tool you already know.
The six API types covered here — REST, SOAP, GraphQL, gRPC, WebSocket, and Webhook — are not mutually exclusive in a real system. A single Python application may consume a REST API for user data, a WebSocket API for live notifications, and a gRPC service for internal microservice calls, all at the same time.
The Four Tensions Every API Resolves
Before walking through each API type in isolation, it is worth establishing the conceptual framework that connects all of them. Every API architecture is a response to the same four engineering tensions. The differences between REST, GraphQL, gRPC, SOAP, WebSocket, and Webhook come down to which tensions each one prioritizes and which compromises each one accepts.
The Four Tensions
1. Human readability vs. machine efficiency. REST and GraphQL use JSON — a format that a developer can read in a terminal. gRPC uses Protocol Buffers — a binary format that is faster to serialize but opaque to the human eye. SOAP uses XML, which is technically human-readable but so verbose that in practice it functions more like a machine format. This tension runs through every decision about data serialization.
2. Client control vs. server control. GraphQL hands the client a query language and says "ask for exactly what you need." REST hands the server the authority to define fixed response shapes. gRPC splits control through a contract (.proto file) that both sides agree on at compile time. SOAP takes this to the extreme with WSDL, a machine-readable specification that leaves no room for ambiguity. The question is always: who decides what the data looks like?
3. Statelessness vs. persistence. REST is stateless by design — every request is independent. WebSocket is stateful by design — the connection persists. gRPC occupies a middle ground where unary calls are stateless but streaming calls maintain a channel. Webhooks sidestep the question entirely because the provider manages the state and notifies you when it changes.
4. Universality vs. specialization. REST works everywhere — browsers, mobile apps, IoT devices, command-line tools. gRPC works best between backend services that can handle HTTP/2 and code generation. WebSocket requires a persistent connection that not every network topology supports. SOAP works in environments that need WS-Security and formal contracts. The more universal the protocol, the fewer optimizations it can make for specific workloads.
Keep these four tensions in mind as the article moves through each API type. They are the recurring thread that explains why six different approaches exist rather than one. Each section that follows is not just a description of a technology — it is a specific answer to the question of which tensions matter more for a particular class of problem.
"REST is old, GraphQL is the future, and everything else is niche." This framing collapses a multi-dimensional decision into a linear timeline, and it leads to poor architectural choices. GraphQL does not replace REST any more than a screwdriver replaces a hammer. They solve different problems along different tension axes. A system that uses GraphQL for its public API might still use REST internally for simple service-to-service CRUD, gRPC for latency-critical paths, and webhooks for event-driven integrations. The right question is never "which is newest?" but "which set of tradeoffs aligns with this specific integration?"
REST APIs
REST, which stands for Representational State Transfer, was proposed by Roy Fielding in his 2000 doctoral dissertation and has since become the dominant architecture for public web APIs. REST is not a protocol — it is a set of architectural constraints. A RESTful API communicates over HTTP, organizes data as resources identified by URLs, and uses standard HTTP verbs to describe operations: GET to retrieve, POST to create, PUT or PATCH to update, and DELETE to remove. Responses are almost always formatted as JSON, though XML is also valid.
The stateless constraint is one of REST's defining characteristics. Each request from the client must carry all the information needed for the server to fulfill it. The server does not store session state between requests. This makes REST services easy to scale horizontally because any server in a pool can handle any incoming request without needing to consult shared session storage.
"REST APIs use standard HTTP methods to access and manipulate resources." — Postman Blog, A Guide to the Different Types of APIs (2025)
In Python, the standard tool for consuming REST APIs is the requests library. It is not part of the standard library but is so universally used that it is effectively the default. For building REST APIs, the two leading frameworks are Flask and FastAPI. FastAPI has grown rapidly since its release because it generates OpenAPI documentation automatically and uses Python type hints to validate request and response data at the framework level.
# Consuming a REST API with the requests library
import requests

# GET request: retrieve a list of public repositories for a user
response = requests.get(
    "https://api.github.com/users/python/repos",
    headers={"Accept": "application/vnd.github+json"},
)

# Check the HTTP status code before reading the body
response.raise_for_status()
repos = response.json()

for repo in repos[:3]:
    print(repo["name"], "-", repo["description"])

# POST request: create a resource (simplified example)
payload = {"title": "New Issue", "body": "Something went wrong."}
create_response = requests.post(
    "https://api.example.com/issues",
    json=payload,
    headers={"Authorization": "Bearer YOUR_TOKEN"},
)
print(create_response.status_code)  # 201 Created
Always call response.raise_for_status() immediately after a requests call. It raises an HTTPError exception for any 4xx or 5xx status code, which prevents silent failures where your code continues processing an error response body as if it were valid data.
REST is the right default choice for public-facing APIs, CRUD-oriented web services, and integrations where broad client compatibility matters. Its weakness appears when a client needs data from multiple related resources — each requires a separate round trip — or when the fixed response shape of an endpoint returns far more data than the client actually needs.
REST maximizes universality and human readability at the expense of client control. The server defines the response shape, and the client gets exactly what the endpoint returns — no more, no less. When that fixed shape stops fitting the client's needs, the standard REST response is to build another endpoint. GraphQL's entire value proposition begins at the exact point where this approach starts to strain.
SOAP APIs
SOAP, Simple Object Access Protocol, predates REST by several years and was designed for a different era of enterprise computing. Where REST is flexible and resource-oriented, SOAP is rigid, contract-first, and protocol-agnostic — it can technically run over HTTP, SMTP, or other transports, though HTTP is by far the most common. Every SOAP message is an XML document structured as an envelope containing a header (metadata) and a body (the actual payload). The service's capabilities are described in a WSDL file, Web Services Description Language, which acts as a machine-readable contract that clients use to know what operations are available and what data types those operations expect.
The Py-Core Python Programming series on Medium (April 2025) describes SOAP's architecture as fundamentally rigid: every message follows a strict envelope structure containing metadata in the header and the payload in the body, with no room for informal deviation. This formality is what makes SOAP suitable for environments where ambiguity in message structure is not acceptable.
SOAP's verbosity is a genuine cost in terms of bandwidth and parsing overhead, but its strictness is also its strength in regulated industries. The WS-Security standard, which SOAP supports natively, provides message-level encryption and signing that is distinct from transport-level TLS. This makes SOAP a common choice in banking, healthcare, and government systems where audit trails, non-repudiation, and compliance requirements demand that the security guarantees be embedded in the message itself rather than the channel.
In Python, the standard library for working with SOAP is zeep. It parses WSDL documents and generates Python objects that map directly to the service's data types, which makes the client code considerably cleaner than constructing raw XML by hand.
# Consuming a SOAP API with zeep
# pip install zeep
from zeep import Client
# zeep reads the WSDL and builds a client automatically
wsdl_url = "http://www.dneonline.com/calculator.asmx?WSDL"
client = Client(wsdl=wsdl_url)
# Call a SOAP operation by name; zeep handles XML marshalling
result = client.service.Add(intA=10, intB=32)
print(result) # 42
# Inspect what operations the service exposes
print(client.wsdl.services)
SOAP is rarely chosen for new projects today, but it is very much alive in enterprise and financial environments. If you integrate with a bank, a legacy healthcare system, or certain government data services, there is a strong chance you will be handed a WSDL URL and expected to consume it. Knowing zeep is a practical skill for these situations.
SOAP and REST sit at opposite ends of the server control axis. SOAP's WSDL contract specifies every operation, every data type, and every fault condition before a single byte crosses the wire. REST's constraints are looser — resource URLs and HTTP verbs provide structure, but the response schema is informal and often documented only in human-readable text. GraphQL, which comes next, occupies a middle ground: a typed schema that defines available data, combined with a query language that lets the client choose which parts of that schema it needs.
GraphQL APIs
GraphQL was developed internally at Facebook (now Meta) starting in 2012 and released publicly in 2015. It was created to solve a specific problem that Facebook engineers encountered at scale: mobile clients on slow connections were being forced to make multiple REST round-trips to assemble a single screen's worth of data, and many of those responses contained far more fields than the app actually used. GraphQL addresses both problems with a single mechanism — a query language that lets the client describe exactly the shape of the data it wants, and a single endpoint that resolves those queries server-side.
Instead of dozens of resource endpoints, a GraphQL API exposes one endpoint (typically /graphql) that accepts either a query (read) or a mutation (write). The client sends a document describing which fields it needs, and the server responds with JSON shaped exactly to that specification, no more and no less. This eliminates both over-fetching (receiving unused data) and under-fetching (needing a second request to get related data).
Industry data cited in the Dev.to article Which API Style Is Crushing It in 2025 indicates that GraphQL usage among Python developers building data-heavy applications has seen roughly 30% growth over the preceding two years, a trend driven by the efficiency gains that single-endpoint query models provide in bandwidth-constrained environments.
Python clients consume GraphQL APIs using the requests library perfectly well, since a GraphQL request is just an HTTP POST with a JSON body containing the query string. For more structured work, the gql library provides a dedicated client with schema validation and transport abstraction.
# Consuming a GraphQL API using the requests library
import requests

endpoint = "https://api.github.com/graphql"
token = "YOUR_GITHUB_TOKEN"

# GraphQL query: request only the fields we actually need
query = """
query {
  viewer {
    login
    repositories(first: 3, orderBy: {field: UPDATED_AT, direction: DESC}) {
      nodes {
        name
        stargazerCount
        primaryLanguage {
          name
        }
      }
    }
  }
}
"""

response = requests.post(
    endpoint,
    json={"query": query},
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()
data = response.json()

for repo in data["data"]["viewer"]["repositories"]["nodes"]:
    lang = repo["primaryLanguage"]["name"] if repo["primaryLanguage"] else "N/A"
    print(f"{repo['name']} | Stars: {repo['stargazerCount']} | Lang: {lang}")
The GitHub API v4 is GraphQL-only. If you are writing automation against GitHub — repository analytics, code search, project management — learning GraphQL queries pays off immediately because you can retrieve deeply nested relational data in a single request that would require five or six separate REST calls against the v3 API.
GraphQL is the right choice for client-driven development where the front-end team controls what data they request, for mobile apps where bandwidth efficiency matters, and for APIs that expose richly relational data. Its main cost is added server complexity: the server must implement a resolver function for every field in the schema, and poorly written resolvers can trigger the "N+1 query" problem where resolving a list of objects causes a separate database query for each item in the list.
REST: Multiple endpoints, each returning a fixed shape. The client makes three calls (/users, /users/1/repos, /repos/42/issues) to assemble one screen. The server defines every response shape. Simple to cache, because each URL is a cache key.
GraphQL: A single endpoint; the client describes the exact data shape it needs. One query retrieves user, repos, and issues in a single round trip. The client controls the response. Caching is harder, because POST requests go to one URL with varying query bodies.
"GraphQL is always more efficient because it eliminates over-fetching." This is true at the network level but misleading at the system level. A GraphQL query that asks for deeply nested related data can trigger a cascade of database queries on the server — the N+1 problem — that makes the total system cost higher than three well-optimized REST endpoints with proper indexing. The efficiency gain is real for the client, but it shifts complexity to the server's resolver layer. Tools like DataLoader exist to batch these queries, but they add engineering work that REST endpoints never require.
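The batching idea behind tools like DataLoader can be seen in a toy sketch. The naive resolver below issues one lookup per item (the N+1 pattern), while the batched version collects the keys first and issues a single lookup. The "database" here is just a dict and the query counter stands in for real I/O; the data is invented for illustration:

```python
# Toy illustration of the N+1 problem and batched resolution
AUTHORS = {1: "Ada", 2: "Grace", 3: "Edsger"}
POSTS = [
    {"id": 10, "author_id": 1},
    {"id": 11, "author_id": 2},
    {"id": 12, "author_id": 1},
]

query_count = 0

def fetch_author(author_id):
    global query_count
    query_count += 1          # one "database query" per call
    return AUTHORS[author_id]

def fetch_authors_batch(author_ids):
    global query_count
    query_count += 1          # one query for the whole batch
    return {i: AUTHORS[i] for i in author_ids}

# N+1: one query per post in the list
naive = [fetch_author(p["author_id"]) for p in POSTS]
print("naive queries:", query_count)     # 3: one per post

query_count = 0
# Batched: collect the keys, then resolve them in a single query
batch = fetch_authors_batch({p["author_id"] for p in POSTS})
batched = [batch[p["author_id"]] for p in POSTS]
print("batched queries:", query_count)   # 1
```

Both strategies produce identical results for the client; the difference is entirely in how many round-trips the server's resolver layer makes, which is exactly the cost the paragraph above describes.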
gRPC APIs
gRPC is an open-source Remote Procedure Call framework originally developed by Google, open-sourced in 2015, and reaching its 1.0 release in 2016. Where REST and GraphQL are oriented around resources and queries expressed in human-readable text, gRPC is oriented around calling functions on a remote server as if they were local functions in your code. The data is not serialized as JSON or XML but as Protocol Buffers (protobuf), a binary format that is significantly more compact and faster to serialize and deserialize than text-based alternatives.
gRPC uses HTTP/2 as its transport, which enables multiplexing (many simultaneous requests over a single connection), header compression, and native streaming in both directions. The official gRPC documentation describes it as a high-performance, open-source RPC framework designed to run in any environment, with built-in support for load balancing, distributed tracing, health checking, and authentication. The service interface is defined in a .proto file, and the grpcio-tools package compiles that file into Python stubs — a generated server base class and a generated client class.
Performance comparisons cited by Wallarm (2025) and referenced in the Dev.to article Which API Style Is Crushing It in 2025 suggest that gRPC can deliver latency reductions of up to 50% compared to equivalent REST implementations, particularly in internal service-to-service communication where binary serialization and HTTP/2 multiplexing compound their advantages.
gRPC supports four communication patterns: unary (one request, one response), server streaming (one request, a stream of responses), client streaming (a stream of requests, one response), and bidirectional streaming (simultaneous streams in both directions). This flexibility makes it the dominant choice for internal microservice-to-microservice communication in high-performance systems.
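The four patterns map directly onto .proto syntax through the stream keyword. A hypothetical service definition showing all four (the service and message names here are invented for illustration):

```proto
syntax = "proto3";

service PriceFeed {
  // Unary: one request, one response
  rpc GetQuote (Ticker) returns (Quote);
  // Server streaming: one request, a stream of responses
  rpc WatchQuotes (Ticker) returns (stream Quote);
  // Client streaming: a stream of requests, one response
  rpc UploadTrades (stream Trade) returns (UploadSummary);
  // Bidirectional streaming: both sides stream independently
  rpc Negotiate (stream Offer) returns (stream Offer);
}

message Ticker { string symbol = 1; }
message Quote { double price = 1; }
message Trade { string symbol = 1; double quantity = 2; }
message UploadSummary { int32 accepted = 1; }
message Offer { double price = 1; }
```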
# Step 1: Define the service in a .proto file (greeting.proto)
#   syntax = "proto3";
#   service Greeter {
#     rpc SayHello (HelloRequest) returns (HelloReply);
#   }
#   message HelloRequest { string name = 1; }
#   message HelloReply { string message = 1; }

# Step 2: Generate Python stubs
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. greeting.proto

# Step 3: Implement the client
# pip install grpcio grpcio-tools
import grpc
import greeting_pb2
import greeting_pb2_grpc

def run():
    # Open a channel to the gRPC server
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = greeting_pb2_grpc.GreeterStub(channel)
        request = greeting_pb2.HelloRequest(name="PythonCodeCrack")
        response = stub.SayHello(request)
        print("Server replied:", response.message)

run()
gRPC has a steeper initial setup curve than REST because it requires defining a .proto contract and running a code generation step before writing any application logic. That cost is front-loaded; once the stubs exist, the client code is clean and type-safe. The current production Python package is grpcio, with the most recent stable release being v1.78.0 as of early 2026 per PyPI (note: v1.78.1 was published but subsequently yanked due to a known issue).
The trade-off with gRPC is browser support. Because it relies on HTTP/2 features that browsers do not expose at the JavaScript layer, gRPC is not suitable for direct browser-to-server communication without a proxy layer such as gRPC-Web. It is, however, the standard for backend-to-backend communication in cloud-native microservices architectures, and Python's grpcio and grpcio-tools packages provide full support for all four streaming patterns.
gRPC pushes machine efficiency and specialization to their logical extreme. Binary serialization, HTTP/2 multiplexing, code-generated type safety — every design choice optimizes for throughput between services that are tightly coupled at the contract level. The cost is universality: browsers cannot consume gRPC natively, debugging requires specialized tools because the data is not human-readable, and the code generation step adds a build dependency that REST and GraphQL never impose.
GraphQL: Human-readable text sent as JSON over HTTP/1.1. The client writes a query string describing what it wants. The response is JSON. The schema is introspectable at runtime. Any HTTP client can send it.
gRPC: A binary protobuf message sent over HTTP/2. The client calls a generated function stub. The response is a deserialized protobuf object. The schema is compiled at build time. Sending it requires generated code.
WebSocket APIs
HTTP is a request-response protocol: the client sends a request and waits for a response. For applications that need a continuous, bidirectional channel — a live chat window, a trading dashboard updating in real time, a multiplayer game — the request-response model introduces unnecessary overhead and latency. WebSocket solves this by upgrading an HTTP connection to a persistent, full-duplex TCP channel. After the initial handshake (which uses HTTP), both the client and the server can send messages to the other at any time without waiting for a request.
The Postman Blog characterizes WebSocket as a persistent, full-duplex channel over a single TCP connection that eliminates HTTP overhead after the initial handshake completes. The connection remains open for the life of the session, and either party can push data the moment it is available. This is fundamentally different from REST's stateless model, where each request is independent and the server has no persistent awareness of the client between calls.
# WebSocket client using the websockets library
# pip install websockets
import asyncio
import json

from websockets.asyncio.client import connect

async def stream_prices():
    uri = "wss://stream.example-exchange.com/ws/btcusdt@ticker"
    async with connect(uri) as websocket:
        print("Connected. Waiting for price updates...")
        # The connection stays open; messages arrive as the server pushes them
        async for raw_message in websocket:
            data = json.loads(raw_message)
            print(f"Price: {data.get('c')} | 24h Change: {data.get('P')}%")
            # Send a message back to the server (bidirectional)
            await websocket.send(json.dumps({"action": "acknowledge"}))

asyncio.run(stream_prices())
The websockets library is async-native and integrates naturally with asyncio. If you are building a WebSocket server in Python rather than a client, the same library handles that too. For a FastAPI or Starlette application, WebSocket support is built directly into the framework under the WebSocket class, so you do not need a separate package on the server side.
WebSocket is the right choice when the server needs to push data to the client without the client asking for it first, and when the volume and frequency of messages makes polling via repeated HTTP requests impractical. It carries its own complexity: connections are long-lived, which means your server must manage connection state, handle disconnections gracefully, and scale connection counts rather than simply scaling request throughput. For use cases where only the server needs to push (and the client does not send back), Server-Sent Events (SSE) are a simpler alternative available through Python's standard HTTP machinery.
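To make the SSE comparison concrete, here is a minimal sketch of the parsing side of an SSE client. In a real client the lines would come from requests.get(url, stream=True).iter_lines(); a canned payload stands in for the network here so the logic is self-contained:

```python
# Minimal Server-Sent Events (SSE) parser sketch
def parse_sse(lines):
    """Yield the data payload of each SSE event; a blank line ends an event."""
    data_parts = []
    for line in lines:
        if line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and data_parts:
            yield "\n".join(data_parts)
            data_parts = []

# Canned stream standing in for response.iter_lines() on a real connection
sample = [
    'data: {"price": 101.5}',
    "",
    'data: {"price": 101.7}',
    "",
]
for event in parse_sse(sample):
    print("event:", event)
```

Because SSE rides on a plain long-lived HTTP response, it needs none of the connection-state machinery WebSocket requires, which is why it is the simpler choice when traffic flows in only one direction.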
WebSocket trades statelessness for persistence. Every other API type discussed so far operates on discrete request-response cycles, even if those cycles happen rapidly. WebSocket breaks that model entirely: the connection stays open, state accumulates, and either side can speak at any time. This is exactly what real-time applications need, but it introduces operational complexity — connection lifecycle management, heartbeat monitoring, graceful reconnection — that stateless architectures avoid by design.
WebSocket and Webhook both address the same fundamental problem — the server has new information and the client needs to know about it — but they solve it from opposite directions. WebSocket keeps a persistent channel open so the server can push immediately. Webhook has no persistent channel at all; instead, it sends a one-shot HTTP POST to the client's registered URL when an event occurs. The choice between them often comes down to latency requirements: if milliseconds matter (trading data, live chat), WebSocket. If seconds-to-minutes are acceptable (payment confirmations, CI/CD triggers), Webhook.
Webhooks
A webhook is often described as a "reverse API." With a standard API, your code initiates the conversation by sending a request to a server and waiting for a response. With a webhook, the roles are inverted: an external service sends an HTTP POST to a URL that you expose whenever a specific event occurs. Your server is the listener, not the caller. You register your URL with the external service, and it delivers event notifications to you in real time, unprompted.
Solwey Consulting's architecture guide puts it concisely: "Unlike APIs that demand data requests, webhooks proactively alert you to specific events." A payment processor sends a webhook when a transaction is completed. A source control platform sends a webhook when a pull request is merged. A customer support system sends a webhook when a ticket is updated. The pattern eliminates the need for your code to poll an endpoint repeatedly and check whether anything has changed.
# Receiving a webhook with Flask
# pip install flask
from flask import Flask, request, jsonify
import hmac
import hashlib

app = Flask(__name__)
WEBHOOK_SECRET = b"your_shared_secret"

@app.route("/webhook/payment", methods=["POST"])
def receive_payment_event():
    # 1. Verify the signature to confirm the payload came from the real sender
    signature_header = request.headers.get("X-Signature-SHA256", "")
    expected = hmac.new(WEBHOOK_SECRET, request.data, hashlib.sha256).hexdigest()
    expected_header = f"sha256={expected}"
    if not hmac.compare_digest(signature_header, expected_header):
        return jsonify({"error": "Invalid signature"}), 401

    # 2. Parse the event payload
    event = request.json
    event_type = event.get("type")
    if event_type == "payment.completed":
        order_id = event["data"]["order_id"]
        amount = event["data"]["amount"]
        print(f"Payment received for order {order_id}: ${amount}")
        # Queue fulfillment, send receipt, update database, etc.

    # 3. Acknowledge receipt immediately with a 200 response
    return jsonify({"status": "received"}), 200

if __name__ == "__main__":
    app.run(port=5000)
Always verify webhook signatures before processing a payload. An endpoint that accepts unverified POSTs from the internet can be spoofed by anyone who knows its URL. Nearly every major webhook provider (Stripe, GitHub, Shopify) includes an HMAC signature in a request header for exactly this reason. Use hmac.compare_digest() rather than a plain equality check to prevent timing-attack vulnerabilities.
Two operational details matter enormously when working with webhooks. First, your handler must respond with a 200 status code quickly — typically within a few seconds. If it does not, the sending service will assume the delivery failed and retry. If your processing logic is slow (database writes, sending emails, calling other APIs), queue the work and acknowledge immediately. Second, webhook deliveries are not guaranteed to arrive exactly once. Build your handler to be idempotent: processing the same event twice should not produce duplicate side effects. A database record keyed on the event's unique ID is a standard safeguard.
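The idempotency guard described above can be sketched with an in-memory set. A production system would use a database table or a Redis key keyed on the event's unique ID so the record survives restarts; the event shape here is hypothetical:

```python
# Idempotent webhook processing: the same event ID is handled at most once
processed_event_ids = set()  # stand-in for a persistent store

def handle_event(event):
    event_id = event["id"]
    if event_id in processed_event_ids:
        return "duplicate"          # already handled; skip side effects
    processed_event_ids.add(event_id)
    # ... real side effects (fulfillment, emails, DB writes) happen once ...
    return "processed"

# A retried delivery carries the same event ID as the original
print(handle_event({"id": "evt_123", "type": "payment.completed"}))  # processed
print(handle_event({"id": "evt_123", "type": "payment.completed"}))  # duplicate
```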
A System in Practice: How APIs Compose
The sections above treat each API type individually, which is necessary for understanding their mechanics but incomplete as a picture of how real systems work. Production Python applications rarely use a single API type. They compose several, each handling the communication pattern it was designed for. The following vignette traces a realistic e-commerce system to show how these technologies interconnect.
An E-Commerce Order Flow
A customer places an order through a web application. The browser calls the backend's REST API to create the order — a straightforward POST /api/orders with a JSON body. REST is the right choice here because the operation is a simple resource creation, the client is a browser, and broad compatibility matters.
The order service needs to check inventory and calculate shipping. These are internal microservices owned by the same team, running in the same Kubernetes cluster, where latency and throughput matter and external compatibility does not. The order service calls the inventory service and the shipping service via gRPC, using protobuf-defined contracts that were compiled into Python stubs at build time. The binary serialization and HTTP/2 multiplexing make these internal calls significantly faster than equivalent REST calls would be.
Once the order is confirmed, the system needs to update the customer's dashboard in real time. The browser maintains a WebSocket connection to a notification service. The moment the order status changes, the server pushes an update through the open channel — no polling, no delay.
Payment processing happens through a third-party provider. The order service does not poll the provider to check if payment cleared. Instead, the payment provider sends a webhook — an HTTP POST to a registered endpoint — when the transaction completes. The webhook handler verifies the HMAC signature, acknowledges with a 200, and queues the fulfillment job asynchronously.
The analytics team needs flexible access to order data for dashboards and reports. Rather than building dozens of REST endpoints for every possible data slice, the team exposes a GraphQL API that lets internal dashboard clients query exactly the fields and relationships they need. Over-fetching vanishes because each dashboard component requests only its own data shape.
One legacy integration remains: a partner bank that processes refunds. The bank exposes a SOAP service with a WSDL contract. The refund service uses zeep to consume it, because that is what the bank requires.
This is not a contrived example. It is the shape of real systems built by teams that understand the tradeoffs each API type makes. Every API in the vignette above exists because it solves one specific communication need better than the alternatives. REST for external simplicity. gRPC for internal speed. WebSocket for real-time push. Webhook for event-driven reactions. GraphQL for flexible data queries. SOAP for regulatory compliance. The four tensions from earlier — readability vs. efficiency, client vs. server control, statelessness vs. persistence, universality vs. specialization — are resolved differently at each boundary in the system.
Boundary Thinking
Every point where two services or systems exchange data is a boundary. The API type you choose at each boundary should be determined by the constraints at that specific boundary, not by a blanket organizational preference. External boundaries facing browsers and third-party consumers tend toward REST or GraphQL because universality matters. Internal boundaries between your own services lean toward gRPC because efficiency matters and you control both sides. Event-driven boundaries where timing is unpredictable use webhooks. Persistent, high-frequency boundaries use WebSocket. Compliance boundaries use whatever the regulator or partner mandates. The discipline is recognizing each boundary for what it is and choosing accordingly.
Authentication Across API Types
Every API type covered so far requires some form of authentication, yet the mechanisms differ in meaningful ways. Understanding the authentication landscape is not optional — it is the difference between a working integration and a rejected request.
API Keys
The simplest form of API authentication is the API key: a unique string that identifies your application or account. You include it in the request, typically as a header or query parameter, and the server verifies it against its records. API keys are common in public REST APIs where the primary goal is tracking usage and enforcing rate limits rather than identifying individual users. The GitHub REST API, OpenAI, and weather services all use API keys or personal access tokens in this pattern.
# Authenticating with an API key via headers
import requests

response = requests.get(
    "https://api.example.com/data",
    headers={"X-API-Key": "your_api_key_here"},
)
response.raise_for_status()
print(response.json())
API keys are easy to implement but carry a significant limitation: they authenticate the application, not the user. If a key is leaked, anyone who has it can make requests on your behalf until you revoke and rotate it. Never embed API keys directly in client-side code or commit them to version control. Environment variables and secret management tools exist for this reason.
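One lightweight pattern for keeping keys out of source code is to read them from the environment at startup and fail loudly when they are missing. A minimal sketch, assuming an arbitrary placeholder variable name (API_EXAMPLE_KEY is not from any real service):

```python
import os

def load_api_key(var_name="API_EXAMPLE_KEY"):
    """Fetch an API key from the environment, failing loudly if it is absent."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "refusing to start without credentials"
        )
    return key
```

For local development, the python-dotenv package can load a .env file into the environment before this check runs, so the key never needs to appear in the repository.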
OAuth 2.0
OAuth 2.0 is the standard for delegated authorization. It allows a user to grant your application limited access to their account on a third-party service without handing over their password. The flow involves redirecting the user to the service's authorization page, receiving an authorization code, and exchanging that code for an access token that your application uses on subsequent API requests. This is how applications connect to Google, GitHub, Slack, and hundreds of other services on a user's behalf.
# Using an OAuth 2.0 Bearer token with requests
import requests

access_token = "eyJhbGciOiJSUzI1NiIsInR..."  # Obtained via OAuth flow
response = requests.get(
    "https://api.example.com/user/profile",
    headers={"Authorization": f"Bearer {access_token}"},
)
response.raise_for_status()
print(response.json())
The requests-oauthlib library handles the full OAuth 2.0 handshake in Python, including token refresh. For OpenID Connect — the identity layer built on top of OAuth 2.0 — the same library provides support for obtaining identity tokens alongside access tokens.
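A sketch of the Authorization Code flow with requests-oauthlib. Every URL, scope, and credential below is an illustrative placeholder; the real values come from the provider's developer console:

```python
# Sketch of the OAuth 2.0 Authorization Code flow with requests-oauthlib
# pip install requests-oauthlib
from requests_oauthlib import OAuth2Session

def run_authorization_code_flow(client_id, client_secret, callback_url):
    """Walk the three legs of the Authorization Code flow.
    All endpoint URLs here are placeholders, not real provider endpoints."""
    oauth = OAuth2Session(
        client_id,
        redirect_uri="https://yourapp.example.com/callback",
        scope=["profile"],
    )
    # Leg 1: build the URL the user must visit to grant access.
    authorization_url, state = oauth.authorization_url(
        "https://provider.example.com/oauth/authorize"
    )
    print("Visit to authorize:", authorization_url)
    # Leg 2: after the redirect back, exchange the authorization code
    # (carried in callback_url) for an access token.
    oauth.fetch_token(
        "https://provider.example.com/oauth/token",
        client_secret=client_secret,
        authorization_response=callback_url,
    )
    # Leg 3: the session now attaches the Bearer token automatically.
    return oauth.get("https://api.provider.example.com/user/profile")
```

In a web application, legs 1 and 2 happen across two separate request handlers, with the state value stored server-side between them to prevent CSRF.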
JWT (JSON Web Tokens)
JWTs are signed tokens that carry claims (user ID, role, expiration time) in a compact, self-contained format. The server generates a JWT after initial authentication, and the client includes it in subsequent requests. Because the token is cryptographically signed, the server can verify its authenticity without querying a database on every request — which is why JWTs are favored in stateless architectures and microservice communication. FastAPI has built-in support for OAuth2 with JWT through its security utilities, and the PyJWT library handles token encoding and decoding.
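To make the "verify without a database lookup" property concrete, here is a stdlib-only sketch of how HS256 signing and verification work under the hood. In real code, use PyJWT rather than rolling this yourself; this is purely illustrative:

```python
# Illustrative HS256 JWT signing/verification sketch (use PyJWT in production)
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> bytes:
    # JWT uses URL-safe base64 with the padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def make_jwt(claims: dict, secret: bytes) -> str:
    """Sign a claims dict into a compact HS256 JWT string."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = header + b"." + payload
    signature = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return (signing_input + b"." + signature).decode()

def verify_jwt(token: str, secret: bytes) -> dict:
    """Check the signature and expiry, then return the claims.
    No database lookup is needed: the token proves its own authenticity."""
    header_b64, payload_b64, signature = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest()).decode()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims
```

The verification step is pure computation over the token and the shared secret, which is exactly why a JWT-protected microservice can authorize requests without a round trip to an auth database.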
The authentication method often depends on the API type. REST and GraphQL APIs commonly use API keys, OAuth 2.0, or JWT Bearer tokens over HTTPS. gRPC supports per-call credentials and TLS client certificates natively through its channel credentials API. SOAP has its own standard, WS-Security, which embeds authentication tokens and digital signatures directly inside the XML envelope. WebSocket connections typically authenticate during the initial HTTP handshake, passing a token as a query parameter or header before the upgrade occurs.
REST, GraphQL, gRPC, and WebSocket all rely on TLS (HTTPS / WSS) to encrypt data in transit. Authentication tokens ride inside headers over an encrypted channel. If the channel is compromised, the tokens are exposed.
SOAP's WS-Security signs and encrypts the message itself, independent of the transport. Even if intercepted, the message body remains protected. This is why regulated industries mandate it — the security guarantee is embedded in the data, not the pipe.
"JWTs are more secure than API keys." This conflates two different dimensions. JWTs carry claims (user identity, expiration, permissions) and can be verified without a database lookup, which makes them useful for stateless architectures and microservice-to-microservice trust. API keys are simpler identifiers that require server-side lookup. The security of either mechanism depends on how it is stored, rotated, and transmitted — not on the format itself. A JWT leaked in a client-side JavaScript bundle is no more secure than an exposed API key. The correct framing is that JWTs and API keys serve different trust models, not that one is inherently stronger.
Error Handling, Retries, and Rate Limits
Calling an API over a network means accepting that things will fail. Servers go down. Connections time out. You exceed a rate limit. A well-engineered API integration handles all of these conditions without crashing or producing silent corruption. The question is not whether errors will occur but how your code responds when they do.
HTTP Status Codes
Every HTTP-based API response includes a status code that tells you what happened. The 2xx range means success. The 4xx range means the client did something wrong: 400 for a malformed request, 401 for missing or invalid credentials, 403 for insufficient permissions, 404 for a resource that does not exist, and 429 for exceeding the rate limit. The 5xx range means the server failed: 500 for an internal error, 502 for a bad gateway, 503 for a temporarily unavailable service. Your code should distinguish between these categories because the correct response to each is different. A 401 means you need to refresh your credentials. A 429 means you need to wait. A 500 means you should retry after a delay.
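Those categories translate directly into a small dispatch function. A minimal sketch (the action labels are illustrative, not from any library):

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to the coarse action a client should take.
    The returned labels are illustrative placeholders."""
    if 200 <= status_code < 300:
        return "success"
    if status_code == 401:
        return "refresh-credentials"
    if status_code == 429:
        return "wait-then-retry"
    if 400 <= status_code < 500:
        return "fix-request-do-not-retry"
    if 500 <= status_code < 600:
        return "retry-with-backoff"
    return "unexpected"
```

The ordering matters: the specific codes (401, 429) are checked before the general 4xx bucket, because their remedies differ from the "do not retry" default for client errors.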
Retry Logic with Exponential Backoff
Retrying a failed request immediately, in a tight loop, is one of the fastest ways to make a bad situation worse. If the server is overwhelmed, hammering it with retries only deepens the problem. Exponential backoff solves this by increasing the delay between each retry attempt. The tenacity library is the standard tool for this in Python, and it integrates cleanly with both requests and httpx.
# Retry with exponential backoff using tenacity
# pip install tenacity requests
import requests
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, min=2, max=30),
    retry=retry_if_exception_type(requests.exceptions.RequestException),
    reraise=True,  # re-raise the original exception rather than tenacity's RetryError
)
def fetch_with_retry(url, headers=None):
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()

# Usage
try:
    data = fetch_with_retry("https://api.example.com/resource")
    print(data)
except requests.exceptions.RequestException as e:
    print(f"All retries exhausted: {e}")
Respecting Rate Limits
Rate limits cap how many requests you can make within a time window. Exceeding them returns a 429 Too Many Requests status, and repeated violations can result in temporary or permanent bans. Many APIs include a Retry-After header in their 429 response that tells you exactly how many seconds to wait. Your code should read this header and honor it rather than guessing. For client-side rate limiting — throttling your own requests before the server has to tell you to stop — the requests-ratelimiter package wraps a standard requests.Session with configurable limits.
# Handling rate limits by reading the Retry-After header
import requests
import time

def fetch_respecting_limits(url, headers=None, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, timeout=10)
        if response.status_code == 429:
            wait_seconds = int(response.headers.get("Retry-After", 5))
            print(f"Rate limited. Waiting {wait_seconds}s before retry...")
            time.sleep(wait_seconds)
            continue
        response.raise_for_status()
        return response.json()
    raise Exception("Exceeded maximum retries due to rate limiting")
Always set a timeout parameter on every outbound request. A request without a timeout will hang indefinitely if the server never responds, eventually consuming all available connections or threads in your application. The requests library defaults to no timeout at all. Set timeout=10 (or a value appropriate to your use case) as a baseline, and adjust from there.
Paginating API Responses
APIs that return collections — a list of users, a feed of transactions, search results — rarely return the full dataset in a single response. Instead, they break it into pages and provide a mechanism for your code to request the next batch. This is pagination, and ignoring it means your code will only ever see the first page of results.
Three pagination patterns appear in practice. Offset-based pagination uses a page number or offset parameter: ?page=2&per_page=50. This is simple but can produce inconsistent results when items are inserted or deleted between requests, causing records to shift across page boundaries. Cursor-based pagination provides an opaque token (a cursor) in each response that points to the next batch. The GitHub REST API, Stripe, and Slack all use this pattern because it is stable under concurrent writes. Link header pagination includes a Link HTTP header with URLs for the next, previous, first, and last pages, following RFC 5988. The requests library parses this header automatically via response.links.
# Cursor-based pagination: following a "next" cursor until exhaustion
import requests

def fetch_all_pages(base_url, headers=None):
    all_items = []
    url = base_url
    while url:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        data = response.json()
        all_items.extend(data.get("results", []))
        # The API provides the next page URL directly; None when exhausted
        url = data.get("next")
    return all_items

# Usage with a cursor-paginated API
items = fetch_all_pages("https://api.example.com/items?limit=100")
print(f"Fetched {len(items)} total items across all pages")
When paginating large datasets, use a requests.Session() object instead of bare requests.get() calls. A session reuses the underlying TCP connection across requests to the same host, which eliminates the overhead of establishing a new connection on every page. For APIs that return thousands of pages, this reduces total fetch time significantly.
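The same Session advice applies to Link header pagination. A sketch that follows the parsed response.links mapping until the next relation disappears (the start URL is a placeholder):

```python
# Link header pagination (RFC 5988) with connection reuse via Session
import requests

def next_link(links: dict):
    """Extract the 'next' URL from requests' parsed Link header, if present."""
    return links.get("next", {}).get("url")

def fetch_link_paginated(start_url, headers=None):
    """Follow Link headers page by page, reusing one TCP connection."""
    items = []
    with requests.Session() as session:
        if headers:
            session.headers.update(headers)
        url = start_url
        while url:
            response = session.get(url, timeout=10)
            response.raise_for_status()
            items.extend(response.json())
            # requests parses the Link header into response.links for us.
            url = next_link(response.links)
    return items
```

The with block closes the session (and its pooled connections) when the final page has been fetched, so nothing leaks even if an exception interrupts the loop.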
GraphQL handles pagination differently. The standard pattern is Connections, where a query returns a pageInfo object with hasNextPage and endCursor fields. You pass the cursor as an after argument in the next query to fetch the subsequent page. This mechanism comes from the Relay connection specification rather than the core GraphQL spec, but it is consistent across the many GraphQL APIs that follow the Relay connection pattern.
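A sketch of walking a Relay-style connection from Python. The endpoint, field names, and page size are assumptions about a hypothetical API, not any real schema:

```python
# Paging through a Relay-style GraphQL connection (hypothetical schema)
import requests

# Assumed schema: an `items` connection whose nodes expose `id` and `name`.
ITEMS_QUERY = """
query Items($after: String) {
  items(first: 100, after: $after) {
    pageInfo { hasNextPage endCursor }
    edges { node { id name } }
  }
}
"""

def fetch_all_connection(endpoint, headers=None):
    """Follow endCursor page by page until hasNextPage is false."""
    nodes, cursor = [], None
    while True:
        response = requests.post(
            endpoint,
            json={"query": ITEMS_QUERY, "variables": {"after": cursor}},
            headers=headers,
            timeout=10,
        )
        response.raise_for_status()
        connection = response.json()["data"]["items"]
        nodes.extend(edge["node"] for edge in connection["edges"])
        if not connection["pageInfo"]["hasNextPage"]:
            return nodes
        cursor = connection["pageInfo"]["endCursor"]
```

Note that the cursor is opaque: the client never parses or constructs it, only echoes back the endCursor value the server returned.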
Pagination and async concurrency are deeply related problems that surface at different scales. Pagination solves the problem of too much data to return at once. Async concurrency solves the problem of too many requests to make sequentially. When you combine them — paginating through a large dataset where each page triggers additional API calls — the performance difference between synchronous requests and async httpx becomes dramatic. A paginated scraping job that takes 45 minutes synchronously might complete in 3 minutes with async concurrency and proper connection pooling.
Async Clients and httpx
The requests library is synchronous. Every call blocks the current thread until the server responds. For scripts that make a handful of API calls, this is fine. For applications that need to make dozens or hundreds of concurrent requests — scraping data from multiple endpoints, orchestrating microservice calls, feeding data pipelines — synchronous execution becomes a bottleneck because each request waits in line behind the one before it.
httpx is a modern Python HTTP client built by Encode (the team behind Starlette, Uvicorn, and Django REST Framework) that provides both synchronous and asynchronous APIs in a single library. Its synchronous interface is nearly identical to requests, making the transition straightforward. Its asynchronous interface uses Python's native async/await syntax and integrates with asyncio, which allows you to fire off many requests concurrently without spawning threads.
# Concurrent API calls with httpx (async)
# pip install httpx
import httpx
import asyncio

async def fetch_multiple(urls):
    async with httpx.AsyncClient(timeout=10.0) as client:
        tasks = [client.get(url) for url in urls]
        responses = await asyncio.gather(*tasks)
        return [r.json() for r in responses if r.status_code == 200]

# Fetch three endpoints concurrently instead of sequentially
urls = [
    "https://api.example.com/users",
    "https://api.example.com/orders",
    "https://api.example.com/products",
]
results = asyncio.run(fetch_multiple(urls))
print(f"Received {len(results)} successful responses")
httpx also supports HTTP/2 natively when installed with the optional httpx[http2] dependency. HTTP/2 multiplexes multiple requests over a single TCP connection, compresses headers, and supports stream prioritization — the same transport benefits that make gRPC fast. Enabling it is a single parameter: httpx.AsyncClient(http2=True). The client falls back to HTTP/1.1 automatically for servers that do not support HTTP/2.
You do not need to choose between requests and httpx permanently. Use requests for simple scripts and synchronous work where its massive ecosystem and documentation are an advantage. Use httpx when you need async concurrency, HTTP/2, or when you are already working inside an async framework like FastAPI or Starlette. Both can coexist in the same project.
Choosing the Right API Type
The Postman Blog offers a practical summary that holds up well as a decision rule: use REST for CRUD operations and broad client compatibility, WebSocket for real-time bidirectional communication, GraphQL for complex nested data where client flexibility matters, gRPC for high-performance internal microservices, and SOAP for regulated enterprise systems where the XML contract and WS-Security compliance are mandated. Webhooks sit outside this comparison because they invert the direction of communication entirely.
A few cross-cutting questions help sharpen the decision further:
- Who initiates the communication? If your code calls out to a service, you are consuming an API (REST, GraphQL, gRPC, SOAP, WebSocket). If the external service calls your code when something happens, you are receiving webhooks.
- Does the connection need to stay open? If data needs to flow continuously in real time in both directions, WebSocket. If only the server pushes and the client listens, consider Server-Sent Events or WebSocket. If communication is request-response, use REST, GraphQL, or gRPC.
- How much data do you need, and from how many sources? If you are stitching together data from multiple related objects and over-fetching is a real cost, GraphQL is worth its complexity. If your queries are simple and predictable, REST is sufficient.
- Is performance at the binary level a requirement? Internal microservices where latency and throughput matter should strongly consider gRPC. The protobuf binary format and HTTP/2 multiplexing provide measurable gains over JSON over HTTP/1.1 at scale.
- Is this a legacy or compliance-governed integration? If you are handed a WSDL file by a financial institution or a healthcare system, SOAP is not a choice — it is the requirement. Use zeep.
Those questions narrow the field, but the decision still requires evaluating constraints that surface only once you move past the textbook comparison and into the specifics of your system.
Evaluating the Deployment and Operational Constraints
Network topology and infrastructure restrictions. gRPC requires HTTP/2 end-to-end, which means every proxy, load balancer, and CDN in the request path must support it. AWS Application Load Balancers gained HTTP/2 target support, but many corporate proxy chains and legacy infrastructure still downgrade to HTTP/1.1 silently. If your deployment environment includes a corporate proxy, a WAF, or a CDN layer you do not control, test gRPC connectivity through the entire path before committing to it. The fallback is gRPC-Web behind an Envoy proxy, which adds an operational dependency. WebSocket connections face a similar gatekeeping problem: corporate firewalls and load balancers with aggressive idle timeouts will kill persistent connections unless you implement heartbeat pings at the application layer. If you know your users sit behind restrictive networks, Server-Sent Events over standard HTTPS may be the pragmatic alternative to WebSocket.
Team capability and debugging ergonomics. gRPC and GraphQL both increase system power at the cost of tooling familiarity. A team that is comfortable with curl and Postman for debugging REST endpoints will find gRPC's binary payloads opaque without tools like grpcurl or Postman's gRPC support. GraphQL's single endpoint means that standard HTTP logging does not distinguish between a lightweight profile lookup and a deeply nested query that triggers dozens of resolver functions. Before choosing either, evaluate whether your team has the observability infrastructure — distributed tracing with OpenTelemetry, per-resolver metrics, protobuf-aware log decoders — to debug them effectively in production. The cost of adopting a more powerful API type without the corresponding observability is paid in incident response time.
Schema evolution and versioning strategy. REST APIs commonly version through URL paths (/v1/, /v2/) or custom headers. This is simple but creates long-lived parallel endpoints that must be maintained. GraphQL handles evolution through schema deprecation annotations — fields are marked @deprecated with a reason, and clients are expected to migrate on their own timeline. gRPC's protobuf schema supports backward-compatible evolution by design: new fields can be added without breaking existing clients because unknown fields are silently ignored during deserialization. If your API serves clients you do not control (mobile apps with staggered update cycles, third-party integrations), the versioning model should be a primary factor in the decision. GraphQL and gRPC both handle gradual evolution more gracefully than REST's blunt versioning, but they also demand stricter schema governance to avoid bloat.
Data sensitivity and compliance surface area. The API type you choose determines where security enforcement happens and how auditable the data path is. SOAP's WS-Security embeds encryption and digital signatures in the message body itself, which means the security guarantees survive even if the transport layer is terminated and re-established at a proxy boundary. REST and GraphQL rely on transport-level TLS, which protects data in transit but provides no protection at rest between hops in a multi-proxy architecture. gRPC supports mutual TLS (mTLS) natively through its channel credentials API, which gives you strong service-to-service authentication in zero-trust environments. If your system handles PII, financial data, or health records, map the data path through every network hop and evaluate whether transport-level encryption is sufficient or whether message-level security is required by your compliance framework.
Cost at scale and resource consumption. WebSocket connections consume server memory for the entire duration they remain open. A REST API serving 10,000 concurrent users processes requests and releases resources immediately. A WebSocket server maintaining 10,000 concurrent connections holds 10,000 open TCP sockets in memory continuously, each requiring heartbeat management and state tracking. The infrastructure cost profile is fundamentally different: REST scales with request throughput, WebSocket scales with concurrent connection count. If your real-time feature serves a large user base but pushes updates infrequently (a delivery tracking page that updates every 30 seconds), polling a REST endpoint with appropriate caching headers may be cheaper and simpler than maintaining persistent connections. Similarly, GraphQL's flexibility comes with a server-side cost: resolvers for deeply nested queries can generate unpredictable database load. Without query complexity analysis and depth limiting (libraries like graphql-core support this), a single expensive query from one client can degrade performance for all users.
In practice, modern Python applications frequently use more than one API type. A SaaS product might expose a REST API for third-party developers, use gRPC internally between its own microservices, push real-time notifications to browser clients via WebSocket, and receive payment events through webhooks. Python's ecosystem has mature, well-maintained libraries for every one of these patterns, which means the language does not impose an architectural constraint — the constraint should come from the problem.
Key Takeaways
- REST is the sensible default. For public-facing web APIs with predictable, resource-oriented access patterns, REST over HTTP with JSON remains the most universally understood and tooled architecture. The requests library and FastAPI cover nearly every use case.
- GraphQL gives the client control over the data shape. When your clients have diverse or rapidly changing data requirements, and when over-fetching or under-fetching is a real cost, GraphQL's single-endpoint query model is the correct tool. The GitHub API v4 is a well-documented public example to learn from.
- gRPC wins on performance and streaming between services. For internal microservice communication where latency, throughput, and strong typing matter, gRPC with Protocol Buffers and HTTP/2 offers measurable advantages over REST. The grpcio and grpcio-tools packages provide full Python support.
- WebSocket is the only option for true bidirectional real-time communication. When the server needs to push data to the client unprompted, and the client needs to respond back in an ongoing session, WebSocket is the correct architecture. Python's async websockets library and FastAPI's native WebSocket support make this straightforward.
- SOAP is still relevant in regulated industries. It may rarely be chosen for new work, but it is alive in banking, healthcare, and government systems. The zeep library handles WSDL parsing and XML marshalling so you do not have to construct SOAP envelopes by hand.
- Webhooks invert the model. When you need to react to external events rather than poll for them, expose an HTTPS endpoint, register it with the provider, and always verify the signature. Respond with 200 immediately and process asynchronously to avoid delivery timeouts and duplicate retries.
- Authentication is not one-size-fits-all. API keys work for simple identification. OAuth 2.0 handles delegated user authorization. JWTs enable stateless verification across microservices. Match the authentication method to the API type and the trust model of your integration.
- Errors are part of the contract. Read HTTP status codes. Implement exponential backoff with tenacity. Respect rate limits by reading Retry-After headers. Set a timeout on every outbound request. These are not optional hardening steps — they are baseline requirements for any integration that will run in production.
- Paginate or lose data. APIs return pages, not full datasets. Understand whether your API uses offset-based, cursor-based, or Link header pagination, and write your fetching logic to follow pages until the dataset is complete.
- Async changes the throughput equation. When your application makes many concurrent API calls, httpx with asyncio eliminates the sequential bottleneck that requests imposes. It also provides HTTP/2 support, which multiplexes requests over a single connection for additional performance gains.
Understanding these API types — and the practical engineering that surrounds them — gives you a complete vocabulary for the systems you will build and the integrations you will maintain. Each one reflects a distinct set of engineering priorities that emerged from real problems at real scale. The four tensions introduced at the beginning of this article — readability vs. efficiency, client vs. server control, statelessness vs. persistence, universality vs. specialization — are not abstract categories. They are the actual forces pulling on every architectural decision you make at every system boundary. The next time you are deciding how to connect two systems in Python, the question is not "how do I call an API?" but "which tensions matter here, which API type resolves them in my favor, how will I authenticate it, and what happens when the call fails?" The answers to those questions shape the reliability, performance, and maintainability of everything built on top of it.