You have a working Python script that calls APIs using the requests library. It runs fine for 10 calls. But now you need it to handle 200 calls, and the synchronous loop takes over a minute. You know async can fix this, but staring at a codebase full of requests.get() calls and wondering where to start is paralyzing. This guide walks through the conversion step by step, using the same function at each stage so you can see exactly what changes and why. The migration path uses httpx, created by Tom Christie (also the creator of Django REST Framework), which provides both synchronous and asynchronous APIs with a requests-compatible interface.
The strategy is incremental. Each step produces working code that you can test before moving to the next. You do not need to rewrite everything at once.
The Starting Point: Synchronous Code
Here is a typical synchronous script that fetches user data from an API in a loop. It works, it is readable, and it is slow.
```python
import requests
import time


def fetch_user(user_id):
    response = requests.get(
        f"https://jsonplaceholder.typicode.com/users/{user_id}"
    )
    response.raise_for_status()
    return response.json()


def fetch_all_users(user_ids):
    results = []
    for user_id in user_ids:
        user = fetch_user(user_id)
        results.append(user)
    return results


if __name__ == "__main__":
    start = time.perf_counter()
    users = fetch_all_users(range(1, 11))
    elapsed = time.perf_counter() - start
    for u in users:
        print(f"{u['id']}: {u['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f}s")
```
With 10 users and a typical response time of 200-300 milliseconds per call, this takes roughly 2-3 seconds. Each request waits for the previous one to complete before starting. The CPU sits idle during every network round trip.
The httpx Client documentation warns that top-level API calls create a fresh connection per request -- nothing gets reused. With many requests to the same host, this overhead compounds quickly. Using a client instance solves this by maintaining a connection pool.
Step 1: Replace requests with httpx (Stay Synchronous)
The first step is the easiest: swap requests for httpx without changing anything about the structure of your code. This is a near drop-in replacement. The goal is to verify that httpx works correctly with your API before introducing async complexity.
```python
import httpx  # Changed from: import requests
import time


def fetch_user(user_id):
    response = httpx.get(  # Changed from: requests.get
        f"https://jsonplaceholder.typicode.com/users/{user_id}"
    )
    response.raise_for_status()
    return response.json()


def fetch_all_users(user_ids):
    results = []
    for user_id in user_ids:
        user = fetch_user(user_id)
        results.append(user)
    return results


if __name__ == "__main__":
    start = time.perf_counter()
    users = fetch_all_users(range(1, 11))
    elapsed = time.perf_counter() - start
    for u in users:
        print(f"{u['id']}: {u['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f}s")
```
Two lines changed: the import and the function call. Everything else stays the same. The response object, .raise_for_status(), .json(), .status_code, .headers -- all work identically. Run your tests. If they pass, move on.
Install httpx with pip install httpx. It has no dependency on the requests library, so both can coexist in the same project during migration. The current version (0.28.1) requires Python 3.8 or higher.
httpx is not a 100% drop-in replacement for requests. Two differences will trip you up immediately if you do not address them. First, httpx does not follow redirects by default -- you must pass follow_redirects=True or set it on the client. Second, httpx enforces a default 5-second timeout on all operations, while requests has no timeout by default. The httpx compatibility guide documents the full list of differences.
Step 2: Convert to async def and await
Now convert the functions to coroutines. This step introduces three changes: def becomes async def, httpx.get() becomes await client.get(), and the entry point uses asyncio.run().
```python
import asyncio
import httpx
import time


async def fetch_user(client, user_id):  # async def
    response = await client.get(  # await
        f"https://jsonplaceholder.typicode.com/users/{user_id}"
    )
    response.raise_for_status()
    return response.json()


async def fetch_all_users(user_ids):  # async def
    async with httpx.AsyncClient() as client:  # AsyncClient
        results = []
        for user_id in user_ids:
            user = await fetch_user(client, user_id)  # await
            results.append(user)
        return results


if __name__ == "__main__":
    start = time.perf_counter()
    users = asyncio.run(fetch_all_users(range(1, 11)))  # asyncio.run
    elapsed = time.perf_counter() - start
    for u in users:
        print(f"{u['id']}: {u['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f}s")
```
This version is async, but it is not yet faster. The for loop still processes one request at a time because each await pauses until the response arrives before starting the next iteration. The value of this step is structural: the code is now in the correct shape for the next optimization.
A common mistake at this stage is forgetting to change the entry point. You cannot call an async def function directly from synchronous code. You must use asyncio.run() or the function will return a coroutine object instead of the result.
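A tiny self-contained illustration of that failure mode, with no HTTP involved:

```python
import asyncio

async def double(x):
    return x * 2

coro = double(21)                 # calling it does NOT run it
print(type(coro).__name__)        # "coroutine" -- not the result
coro.close()                      # avoid a "never awaited" warning

print(asyncio.run(double(21)))    # 42 -- the actual result
```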
The Mental Model: How the Event Loop Sees Your Code
Before jumping to asyncio.gather, it is worth understanding why the previous step did not speed anything up. The answer is in how Python's event loop schedules work.
Think of the event loop as a single dispatcher at a restaurant who manages every table. When a coroutine hits await, it is telling the dispatcher: "I am waiting on the kitchen -- go help someone else." The dispatcher then switches to another coroutine that is ready to run. When the kitchen finishes, the dispatcher picks that coroutine back up where it left off.
In the Step 2 code, the for loop creates this pattern: request 1, wait, request 2, wait, request 3, wait. There is only ever one coroutine registered with the event loop at a time. The dispatcher has nobody else to switch to. It is a restaurant with one table occupied.
The fix is not about making individual requests faster. It is about giving the dispatcher multiple tables. When you create all the coroutines up front and hand them to the event loop together, every await becomes an opportunity for the dispatcher to advance a different request. Ten requests waiting on 10 different network round trips means the event loop can interleave them, and total wall-clock time collapses from the sum of all latencies to the maximum of any single one.
This distinction matters because it reshapes how you think about performance tuning. The bottleneck in I/O-bound async code is not the CPU -- it is how many coroutines are waiting simultaneously. More concurrent waiters means less idle time. Fewer concurrent waiters means you are back to sequential behavior, even if every function is marked async def.
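The sum-versus-max collapse is easy to measure without any network at all, using asyncio.sleep as a stand-in for I/O latency:

```python
import asyncio
import time

async def fake_request(delay):
    await asyncio.sleep(delay)  # stands in for a network round trip
    return delay

async def sequential():
    # One table occupied at a time: total time is the SUM of delays.
    return [await fake_request(0.1) for _ in range(5)]

async def concurrent():
    # Five tables at once: total time is roughly the MAX delay.
    return await asyncio.gather(*(fake_request(0.1) for _ in range(5)))

for fn in (sequential, concurrent):
    start = time.perf_counter()
    asyncio.run(fn())
    print(f"{fn.__name__}: {time.perf_counter() - start:.2f}s")
# sequential takes ~0.5s, concurrent ~0.1s
```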
Step 3: Replace the Loop with asyncio.gather
This is the step that delivers the performance improvement. Instead of awaiting each request sequentially inside a for loop, you create all the coroutines up front and pass them to asyncio.gather. They all run concurrently.
```python
import asyncio
import httpx
import time


async def fetch_user(client, user_id):
    response = await client.get(
        f"https://jsonplaceholder.typicode.com/users/{user_id}"
    )
    response.raise_for_status()
    return response.json()


async def fetch_all_users(user_ids):
    async with httpx.AsyncClient() as client:
        tasks = [fetch_user(client, uid) for uid in user_ids]  # Build list
        results = await asyncio.gather(*tasks)  # Run all
        return results


if __name__ == "__main__":
    start = time.perf_counter()
    users = asyncio.run(fetch_all_users(range(1, 11)))
    elapsed = time.perf_counter() - start
    for u in users:
        print(f"{u['id']}: {u['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f}s")
```
The for loop that built results one at a time is now a list comprehension that creates coroutine objects. asyncio.gather(*tasks) schedules all of them simultaneously. Total execution time drops from the sum of all response times to the duration of the single slowest response. For 10 calls at 200ms each, this means going from ~2 seconds to ~200 milliseconds. As the Python documentation confirms, asyncio.gather returns results in the same order that coroutines were passed in, so your results list maps directly to your input IDs.
Handling Errors in asyncio.gather
By default, if any coroutine inside asyncio.gather raises an exception, it is immediately propagated to the caller. The remaining coroutines continue running, but their results are lost. For production code, pass return_exceptions=True to capture exceptions as return values instead of letting them propagate.
```python
results = await asyncio.gather(*tasks, return_exceptions=True)

for i, result in enumerate(results):
    if isinstance(result, Exception):
        print(f"Task {i} failed: {result}")
    else:
        print(f"Task {i}: {result['name']}")
```
This pattern lets you process partial results rather than losing everything when a single request fails.
If you are on Python 3.11 or higher, asyncio.TaskGroup provides stronger safety guarantees than asyncio.gather. When any task in the group raises an exception, TaskGroup automatically cancels the remaining tasks and raises an ExceptionGroup. The official Python documentation recommends it for structured concurrency. For migration scenarios targeting Python 3.8-3.10 compatibility, asyncio.gather remains the standard choice.
Throttling Concurrent Requests with asyncio.Semaphore
Firing 200 requests simultaneously will overwhelm the API or trigger rate limits. Use asyncio.Semaphore to cap concurrency at a fixed number of in-flight requests.
```python
import asyncio
import httpx

SEM = asyncio.Semaphore(10)  # Max 10 concurrent requests


async def fetch_user(client, user_id):
    async with SEM:
        response = await client.get(f"/users/{user_id}")
        response.raise_for_status()
        return response.json()


async def fetch_all_users(user_ids):
    async with httpx.AsyncClient(
        base_url="https://jsonplaceholder.typicode.com",
        timeout=httpx.Timeout(10.0, connect=5.0),
    ) as client:
        tasks = [fetch_user(client, uid) for uid in user_ids]
        return await asyncio.gather(*tasks)
```
The semaphore wrapping fetch_user ensures that no more than 10 HTTP requests are active at any given moment, even if you pass 500 user IDs to asyncio.gather. All 500 coroutines are created immediately, but only 10 proceed past async with SEM at a time. This is essential for production code that interacts with rate-limited third-party APIs. One caveat: on Python 3.9 and earlier, a semaphore created at module level can bind to a different event loop than the one asyncio.run() later creates, producing "attached to a different loop" errors. If you target those versions, create the semaphore inside your top-level coroutine and pass it down as an argument.
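A sketch of the loop-safe variant, with the semaphore created inside the coroutine and asyncio.sleep standing in for the HTTP call:

```python
import asyncio

async def fetch_user(sem, user_id):
    async with sem:                # at most N coroutines pass this point
        await asyncio.sleep(0.01)  # stand-in for: await client.get(...)
        return user_id

async def fetch_all_users(user_ids, max_concurrency=10):
    # Created inside a running coroutine, so the semaphore is
    # guaranteed to belong to the active event loop on any version.
    sem = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(fetch_user(sem, uid) for uid in user_ids))

print(asyncio.run(fetch_all_users(range(5))))  # [0, 1, 2, 3, 4]
```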
Step 4: Add a Session for Connection Pooling
The previous step already uses httpx.AsyncClient as a context manager, which provides connection pooling. But you can optimize further by configuring the client with timeouts and connection limits.
```python
import asyncio
import httpx
import time


async def fetch_user(client, user_id):
    response = await client.get(f"/users/{user_id}")  # Relative URL
    response.raise_for_status()
    return response.json()


async def fetch_all_users(user_ids):
    async with httpx.AsyncClient(
        base_url="https://jsonplaceholder.typicode.com",
        timeout=httpx.Timeout(10.0, connect=5.0),
        limits=httpx.Limits(max_connections=20, max_keepalive_connections=10),
    ) as client:
        tasks = [fetch_user(client, uid) for uid in user_ids]
        return await asyncio.gather(*tasks)


if __name__ == "__main__":
    start = time.perf_counter()
    users = asyncio.run(fetch_all_users(range(1, 11)))
    elapsed = time.perf_counter() - start
    for u in users:
        print(f"{u['id']}: {u['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f}s")
```
The base_url parameter lets you use relative paths in your request calls, reducing repetition. The Timeout object sets a 10-second overall timeout and a 5-second connection timeout, overriding the httpx default of 5 seconds for all operations. The Limits object caps the connection pool at 20 total connections and 10 keep-alive connections. For reference, the httpx defaults are 100 max connections and 20 keep-alive connections -- tighter limits like those shown here are appropriate when you know you are talking to a single API host and want to prevent socket exhaustion at scale.
You can also set default headers on the client (authentication tokens, content types) so you do not have to repeat them in every request call: httpx.AsyncClient(headers={"Authorization": "Bearer ..."}).
Beyond gather: Choosing a Concurrency Strategy
asyncio.gather is the right default for the majority of migration scenarios, but Python's asyncio module offers three other concurrency primitives that solve problems gather cannot. Understanding when each one fits prevents you from engineering around limitations that a different tool solves directly.
asyncio.as_completed: Process Results as They Arrive
asyncio.gather waits for every coroutine to finish before returning anything. If you are fetching 100 resources and need to start processing each response the moment it arrives -- writing to a database, updating a progress bar, feeding into a downstream pipeline -- asyncio.as_completed is the better tool. It returns an iterator of futures in completion order, not input order.
```python
import asyncio
import httpx


async def fetch_and_process(client, user_id):
    response = await client.get(f"/users/{user_id}")
    response.raise_for_status()
    return response.json()


async def stream_results(user_ids):
    async with httpx.AsyncClient(
        base_url="https://jsonplaceholder.typicode.com"
    ) as client:
        tasks = [fetch_and_process(client, uid) for uid in user_ids]
        for coro in asyncio.as_completed(tasks):
            result = await coro
            # Process each result immediately -- don't wait for all
            print(f"Got user: {result['name']}")
```
The tradeoff: you lose ordering guarantees. Results arrive in whatever sequence the network delivers them. If you need results mapped to input IDs, you will need to include the ID in the return value.
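One way to keep that mapping is to return (id, data) tuples from the coroutine. A sketch with asyncio.sleep standing in for the HTTP call and a hypothetical payload:

```python
import asyncio

async def fetch_tagged(user_id):
    await asyncio.sleep(0.05 * (4 - user_id))   # later IDs finish first
    return user_id, {"name": f"user-{user_id}"}  # hypothetical payload

async def main():
    tasks = [fetch_tagged(uid) for uid in range(1, 4)]
    collected = {}
    for coro in asyncio.as_completed(tasks):
        uid, data = await coro  # arrives in completion order, not input order
        collected[uid] = data
    return collected

results = asyncio.run(main())
print(sorted(results))  # [1, 2, 3] -- every ID accounted for
```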
asyncio.wait: Fine-Grained Control Over Completion
asyncio.wait gives you access to both completed and still-pending tasks, letting you make decisions mid-flight. Pass return_when=asyncio.FIRST_COMPLETED to get back control as soon as any single task finishes, then decide whether to launch replacements, cancel the rest, or keep waiting.
```python
async def fetch_with_fallback(client, user_ids):
    tasks = {
        asyncio.create_task(client.get(f"/users/{uid}")): uid
        for uid in user_ids
    }
    results = {}
    while tasks:
        done, pending = await asyncio.wait(
            tasks.keys(), return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            uid = tasks.pop(task)
            try:
                response = task.result()
                results[uid] = response.json()
            except Exception as e:
                # Retry failed requests or log them
                print(f"User {uid} failed: {e}")
    return results
```
This pattern is ideal when you need to implement dynamic retry logic, circuit-breaker behavior, or progressive timeout escalation -- scenarios where a static gather call cannot adapt to runtime conditions.
When to Use Which
| Strategy | Best For | Result Order | Error Handling |
|---|---|---|---|
| `asyncio.gather` | Batch operations where you need all results at once | Input order preserved | `return_exceptions=True` |
| `asyncio.as_completed` | Streaming pipelines, progress reporting, early processing | Completion order | `try/except` per result |
| `asyncio.wait` | Dynamic task management, retries, cancellation logic | Unordered (set-based) | Manual per-task inspection |
| `asyncio.TaskGroup` | Structured concurrency with automatic cleanup (Python 3.11+) | N/A (no return list) | Cancels all on first failure |
For the migration scenario in this guide -- fetching a batch of API responses and processing them together -- asyncio.gather is the right choice. But as your async codebase grows, as_completed and wait will become essential tools for building resilient systems that respond to failures dynamically rather than batching them.
Retry and Backoff: What Production Code Needs
The code shown so far will fail in production the first time a transient network error, a 429 rate limit response, or a temporary DNS hiccup occurs. Real async HTTP code needs retry logic, and the way you implement it in async Python differs meaningfully from the synchronous approach.
The simplest pattern is a manual retry loop with exponential backoff inside the fetch function itself:
```python
import asyncio
import httpx
import random


async def fetch_with_retry(client, user_id, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = await client.get(f"/users/{user_id}")
            response.raise_for_status()
            return response.json()
        except (httpx.HTTPStatusError, httpx.TransportError):
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            wait = (2 ** attempt) + random.uniform(0, 1)
            await asyncio.sleep(wait)
```
Notice the await asyncio.sleep(wait) instead of time.sleep(wait). This is critical. Calling time.sleep() inside a coroutine blocks the entire event loop -- every other concurrent request freezes until the sleep completes. asyncio.sleep() yields control back to the event loop, letting other coroutines continue executing while this one waits to retry.
The jitter component (random.uniform(0, 1)) prevents the thundering herd problem. If 200 coroutines all hit a rate limit at the same moment and all retry after exactly 2 seconds, you have 200 requests hitting the API again simultaneously. Jitter spreads the retry wave across a time window, which is exactly the behavior rate-limited APIs are designed to encourage.
When an API returns a 429 Too Many Requests response with a Retry-After header, use that value instead of your own backoff calculation. The server is telling you exactly when to try again. Ignoring it risks getting your client blocked entirely: wait = int(response.headers.get("Retry-After", 2 ** attempt)).
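The backoff decision can be factored into a small pure function. A sketch: the parsing here assumes an integer-seconds Retry-After value and deliberately ignores the HTTP-date form that servers may also send.

```python
import random

def compute_wait(attempt, retry_after=None):
    """Seconds to sleep before the next retry attempt.

    Prefers the server's Retry-After value (integer seconds) when
    present; otherwise falls back to exponential backoff with jitter.
    """
    if retry_after is not None:
        try:
            return float(retry_after)
        except ValueError:
            pass  # e.g. an HTTP-date, which this sketch does not parse
    return (2 ** attempt) + random.uniform(0, 1)

print(compute_wait(0, retry_after="7"))  # 7.0 -- the server knows best
```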
The Full Before and After
Here is the complete transformation side by side, from the original synchronous script to the final async version.
| Change | Before (sync) | After (async) |
|---|---|---|
| Library | `import requests` | `import httpx` + `import asyncio` |
| Function definition | `def fetch_user(user_id):` | `async def fetch_user(client, user_id):` |
| HTTP call | `requests.get(url)` | `await client.get(url)` |
| Client management | None (implicit per-request) | `async with httpx.AsyncClient() as client:` |
| Multiple calls | `for` loop with sequential calls | List comprehension + `asyncio.gather(*tasks)` |
| Entry point | `fetch_all_users(ids)` | `asyncio.run(fetch_all_users(ids))` |
| Error handling | `try/except` per call | `return_exceptions=True` on `asyncio.gather` |
| Rate limiting | Not needed (sequential) | `asyncio.Semaphore(n)` wrapping each coroutine |
| Execution time (10 calls) | ~2-3 seconds | ~200-300 milliseconds |
Mixing Sync and Async in the Same Codebase
You do not have to convert everything at once. httpx supports both synchronous and asynchronous clients, so sync and async code can coexist in the same project.
```python
import asyncio
import httpx


# This function stays synchronous -- no changes needed
def fetch_config():
    response = httpx.get("https://api.example.com/config")
    return response.json()


# This function is converted to async for performance
async def fetch_all_items(item_ids):
    async with httpx.AsyncClient() as client:
        tasks = [
            client.get(f"https://api.example.com/items/{item_id}")
            for item_id in item_ids
        ]
        responses = await asyncio.gather(*tasks)
        return [r.json() for r in responses]


# The main script calls both
def main():
    config = fetch_config()  # Sync call
    items = asyncio.run(fetch_all_items(range(1, 51)))  # Async call
    print(f"Config loaded. Fetched {len(items)} items.")


if __name__ == "__main__":
    main()
```
The synchronous fetch_config() runs as normal Python. The async fetch_all_items() is called via asyncio.run(). This pattern lets you migrate hot paths (functions that make many API calls) to async first, and leave simple or infrequent calls synchronous until you are ready to convert them.
You cannot call asyncio.run() from inside an already-running event loop. If you are using a framework like FastAPI that manages its own event loop, use await directly instead of asyncio.run().
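The restriction is easy to demonstrate without any framework; the handler coroutine below plays the role of a FastAPI endpoint:

```python
import asyncio

async def inner():
    return "ok"

async def handler():
    # Inside a running event loop, asyncio.run() raises RuntimeError.
    coro = inner()
    try:
        asyncio.run(coro)
    except RuntimeError as e:
        print(f"asyncio.run refused: {e}")
        coro.close()  # silence the "never awaited" warning
    return await inner()  # the correct approach: await directly

print(asyncio.run(handler()))  # ok
```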
httpx vs aiohttp: When Migration Paths Diverge
This guide recommends httpx for sync-to-async migration because it provides a single library that works in both modes. But httpx is not the only async HTTP client, and it is not always the best one. Understanding where it fits on the spectrum prevents you from hitting performance walls later.
aiohttp is a mature, async-only HTTP client and server framework built on top of asyncio. It has been in production use since 2014 and is generally faster than httpx for high-concurrency workloads because its entire architecture is optimized for event-loop I/O. The tradeoff is API complexity -- aiohttp requires more boilerplate for simple requests and offers no synchronous mode at all.
| Factor | httpx | aiohttp |
|---|---|---|
| Sync + async in one library | Yes | No (async only) |
| requests-compatible API | Near drop-in | Different API surface |
| HTTP/2 support | Yes (native) | No |
| Raw async throughput | Good | Better for 1000+ concurrent connections |
| WebSocket support | No | Yes (built-in) |
| Server capabilities | No | Yes (full ASGI server) |
| Migration friction from requests | Low | High |
The decision framework: if you are migrating an existing synchronous codebase and need the lowest possible friction, httpx is the right choice. You can convert incrementally, function by function, without maintaining two separate HTTP libraries. If you are building a new async-first service from scratch and expect to handle thousands of concurrent connections, or if you need WebSocket support, aiohttp is worth the steeper learning curve.
There is also a third option that is often overlooked: HTTP/2 multiplexing. If the API you are calling supports HTTP/2, a single TCP connection can carry multiple request/response streams in parallel without needing asyncio.gather at all. httpx supports HTTP/2 natively (pip install httpx[http2]). This does not replace concurrency patterns -- you still want gather for managing coroutines -- but it reduces the number of TCP connections required and eliminates head-of-line blocking at the transport layer.
Key Takeaways
- **Migrate incrementally, not all at once:** Start by replacing `requests` with `httpx` synchronously. Verify everything works. Then add `async def` and `await`. Then replace sequential loops with `asyncio.gather`. Each step produces testable, working code.
- **The performance gain comes from asyncio.gather, not from async def:** Converting functions to `async def` is necessary scaffolding, but the speed improvement happens when you replace sequential loops with concurrent execution via `asyncio.gather`.
- **httpx makes the transition smooth:** Because httpx offers both `Client` (sync) and `AsyncClient` (async) with the same API, you can migrate one function at a time without maintaining two separate HTTP libraries.
- **Configure your client for production:** Once you have async code working, add `base_url`, `Timeout`, and `Limits` to your `AsyncClient`. These settings prevent timeout hangs, reduce URL repetition, and control connection pool sizing.
- **Throttle concurrent requests:** Use `asyncio.Semaphore` to limit how many requests are in flight at once. Unbounded `asyncio.gather` calls will overwhelm rate-limited APIs and exhaust connection resources.
- **Choose the right concurrency primitive:** `asyncio.gather` for batch collection, `asyncio.as_completed` for streaming pipelines, `asyncio.wait` for dynamic task management, and `asyncio.TaskGroup` (Python 3.11+) for structured concurrency with automatic cleanup.
- **Build retry logic from day one:** Use exponential backoff with jitter and `asyncio.sleep()` -- never `time.sleep()` -- inside coroutines. Respect `Retry-After` headers from rate-limited APIs.
- **Handle errors deliberately:** Pass `return_exceptions=True` to `asyncio.gather` to capture failures without losing successful results. On Python 3.11+, consider `asyncio.TaskGroup` for stricter structured concurrency.
- **Know when httpx is not enough:** For 1000+ concurrent connections or WebSocket support, aiohttp's async-only architecture outperforms httpx. For HTTP/2 multiplexing, httpx has the edge.
- **Sync and async can coexist:** Use `asyncio.run()` to call async functions from synchronous entry points. Convert high-volume API callers first and leave simple one-off calls synchronous until migration is complete.
The path from synchronous to asynchronous Python is not a cliff -- it is a staircase. Each step is small, testable, and reversible. By the time you reach asyncio.gather, you have traded minutes of sequential waiting for fractions of a second of concurrent execution, without rewriting your entire application.
Sources and Further Reading
- httpx official documentation -- Full reference for the httpx library, including async support, client configuration, timeouts, and resource limits.
- httpx: Requests Compatibility -- Official list of API differences between httpx and requests, including redirect behavior, timeout defaults, and proxy configuration.
- httpx: Async Support -- AsyncClient usage, streaming, and connection pooling guidance from the httpx maintainers.
- Python docs: asyncio.gather -- Official CPython documentation for asyncio.gather, including return order guarantees and exception handling.
- Python docs: asyncio.TaskGroup -- Official documentation for the TaskGroup context manager introduced in Python 3.11.
- Python docs: asyncio.as_completed -- Official documentation for processing awaitables in completion order.
- Python docs: asyncio.wait -- Official documentation for fine-grained control over task completion with FIRST_COMPLETED and FIRST_EXCEPTION.
- aiohttp official documentation -- Full reference for the async-only HTTP client and server framework, including session management and WebSocket support.
- httpx: HTTP/2 Support -- Guide to enabling HTTP/2 multiplexing with httpx for reduced connection overhead.
- httpx: Timeout Fine-Tuning -- Detailed reference for httpx.Timeout including connect, read, write, and pool timeout controls.
- httpx: Resource Limits -- Documentation for httpx.Limits covering max_connections, max_keepalive_connections, and keepalive_expiry defaults.