Standard Python code runs one line at a time. When your script sends an API request, it sits idle until the server responds. For a single call, that pause is barely noticeable. But when you need to hit 50 endpoints, or 500, those pauses add up fast. Async programming with asyncio and aiohttp lets Python fire off requests without waiting for each one to finish before starting the next.
This guide walks through everything from the fundamentals of async and await to running hundreds of concurrent HTTP requests. By the end, you will have working code you can adapt for your own projects, whether you are pulling data from a REST API, scraping web content, or building a service that talks to multiple backends.
Why Synchronous API Calls Create a Bottleneck
In standard Python, HTTP requests are blocking operations. When you call requests.get(), your program stops executing and waits for the server to send back a response. Nothing else happens during that wait. If the server takes 300 milliseconds to respond and you need to call it 100 times, your script spends 30 seconds doing nothing but waiting.
Here is what that looks like in code using the popular requests library:
```python
import requests
import time

def fetch_users_sync(user_ids):
    results = []
    for user_id in user_ids:
        response = requests.get(f"https://jsonplaceholder.typicode.com/users/{user_id}")
        results.append(response.json())
    return results

start = time.perf_counter()
users = fetch_users_sync(range(1, 11))
elapsed = time.perf_counter() - start
print(f"Fetched {len(users)} users in {elapsed:.2f} seconds")
```
With 10 users and a typical response time of 200-400 milliseconds per call, this function takes around 2-4 seconds to complete. Each request waits for the previous one to finish. The CPU is idle for almost the entire duration. The bottleneck is not processing power, it is waiting for network I/O.
Async programming does not make individual requests faster. The server still takes the same amount of time to respond. What changes is that your program can send request #2 while still waiting for the response to request #1. It is concurrency, not parallelism.
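To see that overlap without any network dependency, here is a minimal sketch that simulates three 200 ms "requests" with `asyncio.sleep`; the `fake_request` name and the delay values are illustrative, not part of any real API:

```python
import asyncio
import time

async def fake_request(delay):
    # Stand-in for a network call: awaiting the sleep yields control to the
    # event loop, just as awaiting an HTTP response would.
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.perf_counter()
    # Three 0.2-second waits run concurrently, so they overlap.
    await asyncio.gather(fake_request(0.2), fake_request(0.2), fake_request(0.2))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(f"Total: {elapsed:.2f}s")  # roughly 0.2s, not 0.6s
```

Run sequentially, the three sleeps would take about 0.6 seconds; concurrently, they take about 0.2.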
The Core Concepts: Coroutines, Event Loops, and await
Before writing async code, you need to understand three things that make it work.
Coroutines
A coroutine is a function defined with async def instead of def. Calling a coroutine does not execute it immediately. Instead, it returns a coroutine object that must be awaited or scheduled on an event loop.
```python
async def fetch_data():
    print("This is a coroutine")
    return {"status": "ok"}

# This does NOT run the function:
result = fetch_data()  # Returns a coroutine object

# This DOES run it:
import asyncio
result = asyncio.run(fetch_data())  # Prints and returns {"status": "ok"}
```
The Event Loop
The event loop is the engine that manages and schedules coroutines. Think of it as a task manager that keeps track of which coroutines are ready to run and which are waiting for I/O. When one coroutine hits an await statement, the event loop pauses it and runs another coroutine that is ready. The function asyncio.run() creates an event loop, runs the coroutine you pass to it, and then shuts down the loop when everything finishes.
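That switching is observable directly. In this sketch, `asyncio.sleep(0)` forces each coroutine to yield, so the loop alternates between the two before either finishes:

```python
import asyncio

order = []

async def step(name):
    order.append(f"{name} start")
    await asyncio.sleep(0)  # hand control back to the event loop
    order.append(f"{name} end")

async def main():
    # gather schedules both coroutines; the loop interleaves them at each await
    await asyncio.gather(step("A"), step("B"))

asyncio.run(main())
print(order)  # ['A start', 'B start', 'A end', 'B end']
```

Neither coroutine runs to completion in one shot: each pauses at its `await`, and the loop picks up whichever coroutine is ready next.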
The await Keyword
The await keyword tells the event loop: "This operation will take some time. Pause me here and go run something else until this is done." You can only use await inside an async def function, and you can only await objects that are "awaitable" (coroutines, tasks, and futures).
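As a quick illustration of awaitables, the sketch below awaits both a bare coroutine and a Task created with `asyncio.create_task`; the `work` function is a made-up placeholder:

```python
import asyncio

async def work(x):
    await asyncio.sleep(0.01)  # simulated I/O
    return x * 10

async def main():
    task = asyncio.create_task(work(2))  # a Task: scheduled immediately, awaitable
    direct = await work(1)               # a coroutine: runs when awaited
    from_task = await task               # awaiting the Task retrieves its result
    return direct, from_task

result = asyncio.run(main())
print(result)  # (10, 20)
```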
If you are using Python 3.11 or later, you also have access to asyncio.TaskGroup, which provides a structured way to manage concurrent tasks with better error handling. This article uses asyncio.gather because it works on all Python versions from 3.7 onward.
Your First Async API Call with aiohttp
The aiohttp library is an asynchronous HTTP client and server framework built on top of asyncio. Unlike the requests library, which blocks during HTTP calls, aiohttp uses non-blocking I/O so the event loop can manage other tasks while waiting for a server response.
First, install it:
```
pip install aiohttp
```
Here is a basic async API call:
```python
import asyncio
import aiohttp

async def fetch_user(session, user_id):
    url = f"https://jsonplaceholder.typicode.com/users/{user_id}"
    async with session.get(url) as response:
        return await response.json()

async def main():
    async with aiohttp.ClientSession() as session:
        user = await fetch_user(session, 1)
        print(user["name"])

asyncio.run(main())
```
Let's walk through what is happening here, because every line matters.
The aiohttp.ClientSession() object manages a pool of connections. Creating it with async with ensures it is properly closed when you are finished. This is important because an unclosed session will leak connections and eventually cause errors. The session should be created once and reused across all your requests, not created fresh for every call.
Inside fetch_user, the session.get(url) call returns an awaitable response. The async with block ensures the response is released back to the connection pool when you are done reading it. The await response.json() call reads and parses the response body asynchronously.
Never create a new ClientSession inside a loop. A session per request defeats the purpose of connection pooling and will slow down your code instead of speeding it up.
Running Multiple API Requests Concurrently with asyncio.gather
A single async request is not much faster than a synchronous one. The real power shows up when you run many requests at the same time. The asyncio.gather function takes multiple coroutines and runs them concurrently, returning their results in the same order you passed them in.
```python
import asyncio
import aiohttp
import time

async def fetch_user(session, user_id):
    url = f"https://jsonplaceholder.typicode.com/users/{user_id}"
    async with session.get(url) as response:
        data = await response.json()
        return data

async def fetch_all_users(user_ids):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_user(session, uid) for uid in user_ids]
        results = await asyncio.gather(*tasks)
        return results

async def main():
    start = time.perf_counter()
    users = await fetch_all_users(range(1, 11))
    elapsed = time.perf_counter() - start
    for user in users:
        print(f"  {user['id']}: {user['name']}")
    print(f"\nFetched {len(users)} users in {elapsed:.2f} seconds")

asyncio.run(main())
```
The list comprehension [fetch_user(session, uid) for uid in user_ids] creates 10 coroutine objects without starting them. The asyncio.gather(*tasks) call schedules all 10 on the event loop simultaneously. While one request is waiting for a server response, the event loop switches to another coroutine that is ready to do work. The result: all 10 requests complete in roughly the time of the slowest single request.
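The ordering guarantee is worth a quick demonstration. In this sketch (simulated with sleeps, no real HTTP), the later coroutines finish first, yet `gather` still returns results in argument order:

```python
import asyncio

async def fetch(i):
    # Higher indices sleep less, so they finish first...
    await asyncio.sleep(0.05 - i * 0.01)
    return i

async def main():
    # ...but gather returns results in the order the coroutines were passed.
    return await asyncio.gather(*(fetch(i) for i in range(5)))

results = asyncio.run(main())
print(results)  # [0, 1, 2, 3, 4]
```

This is what lets you zip results back to your input IDs without extra bookkeeping.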
Sync vs Async: A Side-by-Side Comparison
The following table compares the synchronous approach using requests with the asynchronous approach using aiohttp across several dimensions that matter in practice.
| Characteristic | requests (Synchronous) | aiohttp (Asynchronous) |
|---|---|---|
| 10 API calls at 200ms each | ~2,000ms total (sequential) | ~200ms total (concurrent) |
| Execution model | Blocking, one at a time | Non-blocking, concurrent |
| Connection management | Session object (optional) | ClientSession (required, with pooling) |
| Learning curve | Low, familiar Python | Moderate, requires understanding async/await |
| Best for | Simple scripts, few requests | High-volume requests, real-time apps |
| Error handling | Standard try/except | try/except inside coroutines, plus task-level errors |
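On the error-handling point: by default, asyncio.gather propagates the first exception it encounters. Passing `return_exceptions=True` delivers exceptions as ordinary entries in the result list instead, as this simulated sketch shows (the `flaky` function stands in for a request that can fail; no real HTTP is involved):

```python
import asyncio

async def flaky(i):
    await asyncio.sleep(0.01)  # simulated I/O
    if i == 2:
        raise ValueError(f"request {i} failed")
    return i

async def main():
    # Exceptions are returned in place rather than raised,
    # so one failed request does not discard the other results.
    return await asyncio.gather(*(flaky(i) for i in range(4)), return_exceptions=True)

results = asyncio.run(main())
print(results)  # [0, 1, ValueError('request 2 failed'), 3]
```

After the call, you can filter the list with `isinstance(r, Exception)` to separate failures from successes.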
Common Mistakes and How to Avoid Them
Forgetting to await a coroutine
If you call a coroutine without await, Python will not raise an error immediately. Instead of the result you expected, you get a coroutine object. Python may print a runtime warning about the coroutine never being awaited, but this is easy to miss.
```python
# Wrong: missing await
async def main():
    async with aiohttp.ClientSession() as session:
        result = fetch_user(session, 1)  # This is a coroutine object, not data
        print(result)  # Prints: <coroutine object fetch_user at 0x...>

# Correct: include await
async def main():
    async with aiohttp.ClientSession() as session:
        result = await fetch_user(session, 1)  # This is the actual JSON data
        print(result)
```
Creating a session inside a loop
Each ClientSession creates its own connection pool, DNS cache, and cookie jar. Opening and closing one per request adds overhead that eliminates the performance gains of async code.
```python
# Wrong: new session per request
async def fetch_all_bad(user_ids):
    results = []
    for uid in user_ids:
        async with aiohttp.ClientSession() as session:  # Wasteful
            async with session.get(f"https://api.example.com/users/{uid}") as resp:
                results.append(await resp.json())
    return results

# Correct: one session, many requests
async def fetch_all_good(user_ids):
    async with aiohttp.ClientSession() as session:  # Created once
        tasks = [fetch_user(session, uid) for uid in user_ids]
        return await asyncio.gather(*tasks)
```
Sending too many requests at once
If you launch 10,000 concurrent requests, you will likely overwhelm the server, trigger rate limits, or exhaust your system's file descriptors. Use an asyncio.Semaphore to cap the number of concurrent requests.
```python
import asyncio
import aiohttp

async def fetch_with_limit(session, url, semaphore):
    async with semaphore:
        async with session.get(url) as response:
            return await response.json()

async def main():
    semaphore = asyncio.Semaphore(20)  # Max 20 concurrent requests
    urls = [f"https://jsonplaceholder.typicode.com/posts/{i}" for i in range(1, 101)]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_with_limit(session, url, semaphore) for url in urls]
        results = await asyncio.gather(*tasks)
        print(f"Fetched {len(results)} posts")

asyncio.run(main())
```
The semaphore acts as a gatekeeper. Only 20 coroutines can enter the async with semaphore: block at any given time. The rest wait until a slot opens up. This keeps your request volume at a level that servers and your network can handle.
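You can verify the cap empirically. This sketch tracks the peak number of coroutines holding the semaphore at once, using sleeps in place of real requests (the `worker` and `counter` names are illustrative):

```python
import asyncio

async def worker(semaphore, counter):
    async with semaphore:
        counter["active"] += 1
        counter["peak"] = max(counter["peak"], counter["active"])
        await asyncio.sleep(0.01)  # simulated request
        counter["active"] -= 1

async def main():
    semaphore = asyncio.Semaphore(3)
    counter = {"active": 0, "peak": 0}
    # 10 workers compete for 3 slots; at most 3 are ever inside at once.
    await asyncio.gather(*(worker(semaphore, counter) for _ in range(10)))
    return counter["peak"]

peak = asyncio.run(main())
print(f"Peak concurrency: {peak}")  # never exceeds 3
```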
Key Takeaways
- **Async solves the waiting problem:** Standard Python blocks during API calls. Using `asyncio` and `aiohttp`, your code sends multiple requests concurrently while waiting for responses, dramatically reducing total execution time for I/O-bound operations.
- **Reuse your ClientSession:** Create one `aiohttp.ClientSession` and pass it to all your request functions. This enables connection pooling and avoids the overhead of establishing new connections for every request.
- **Use asyncio.gather for concurrent execution:** Build a list of coroutines and pass them to `asyncio.gather()` to run them at the same time. Results come back in the same order you passed the coroutines in.
- **Control concurrency with semaphores:** Sending thousands of requests simultaneously can overwhelm servers and your own system. Use `asyncio.Semaphore` to set a maximum number of concurrent requests.
- **async and await are not optional keywords:** Every coroutine must be awaited. Every function that uses `await` must be declared with `async def`. Missing either one produces silent bugs that are difficult to trace.
Async programming introduces new patterns, but the core concept is straightforward: instead of waiting in line, your code takes a number and moves on. Once you internalize that mental model, asyncio and aiohttp become natural tools for any Python project that talks to APIs, databases, or external services over a network.