There's a particular moment in every Python developer's journey where they encounter async and await for the first time and think, "This can't be that different from regular code." Then they try to call an async function from a synchronous one and discover that, actually, it's an entirely different universe with its own rules, its own gotchas, and its own decade-long design history that explains why everything works the way it does.
This article walks through all of it. The mental model, the real history, the PEPs that shaped the feature, and the practical code patterns that separate someone who understands async/await from someone who's just copying Stack Overflow snippets and hoping for the best.
The Core Mental Model: Cooperative Multitasking
Before touching a single keyword, you need to understand what problem async programming solves and how it solves it differently from threads.
A traditional synchronous program does one thing at a time. When it makes a network request, it waits. When it reads a file, it waits. The CPU sits idle while the operating system handles I/O. If you need to do multiple things at once, you spawn threads -- but threads come with their own problems: race conditions, locks, and the Global Interpreter Lock (GIL) that limits true CPU parallelism in CPython.
Async programming takes a different approach. Instead of multiple threads blocking on I/O independently, you have a single thread running an event loop. When a coroutine hits an await, it doesn't block the thread -- it yields control back to the event loop, which can then run other coroutines that are ready. When the I/O completes, the event loop resumes the original coroutine right where it left off.
This is cooperative multitasking. Each coroutine voluntarily gives up control at await points. No preemption, no race conditions on shared state between await points, no locks needed for most operations. The trade-off is that every await is an explicit marker in your code where control might switch to something else.
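To make the hand-off concrete, here's a minimal sketch (names are illustrative) in which two coroutines share one thread and interleave only at their await points, using asyncio.sleep(0) as a pure yield-to-the-loop:

```python
import asyncio

order = []

async def worker(name, steps):
    for i in range(steps):
        order.append(f"{name}{i}")
        # The await below is the only point where control can switch
        await asyncio.sleep(0)

async def main():
    # Both workers run concurrently on a single thread
    await asyncio.gather(worker("A", 3), worker("B", 3))

asyncio.run(main())
print(order)  # ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
```

Between appends, each worker runs uninterrupted; the deterministic alternation comes entirely from the explicit awaits.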
During the python-dev mailing list discussions around PEP 492, Guido van Rossum explicitly rejected approaches where any function call could silently suspend execution -- designs like gevent or Stackless Python where suspension is invisible. He wanted await to be a syntactically visible suspension point, making async code readable and auditable.
The Long Road to async/await: A History in PEPs
Python's async story didn't start with async and await. It was built in layers over more than a decade, each PEP addressing limitations of the previous approach.
PEP 342 -- Coroutines via Enhanced Generators (Python 2.5, 2005)
The foundation. PEP 342 transformed Python's generators from simple data producers into two-way communication channels by turning yield into an expression and adding send(), throw(), and close() methods to generator objects. This made it possible to write generator-based coroutines: functions that could receive values and be resumed, forming the primitive building blocks of cooperative multitasking.
But using generators as coroutines was awkward. There was no syntactic distinction between a generator producing data and a generator being used as a coroutine. You couldn't tell by looking at a function whether it was meant to be iterated over or awaited.
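You can see both the power and the awkwardness in a few lines. A sketch of a PEP 342-style generator coroutine, here a running-average accumulator (names are illustrative):

```python
def averager():
    # A generator used as a coroutine: values arrive via send()
    total, count = 0, 0
    average = None
    while True:
        value = yield average   # yield as an expression (PEP 342)
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)            # "prime" the coroutine to its first yield
print(avg.send(10))  # 10.0
print(avg.send(30))  # 20.0
avg.close()          # close() and throw() were also added by PEP 342
```

Nothing about the definition says "coroutine" rather than "iterator" -- that ambiguity is exactly what later PEPs set out to remove.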
PEP 380 -- Syntax for Delegating to a Subgenerator (Python 3.3, 2009)
PEP 380 added yield from, which allowed a generator to delegate to another generator. This was essential for composable coroutines -- without it, every intermediate coroutine had to manually loop over and re-yield values from sub-coroutines. yield from handled all of that boilerplate automatically, including forwarding send() and throw() calls.
This was the key piece that made generator-based coroutines practical for real-world async frameworks. But the fundamental confusion between generators-as-iterators and generators-as-coroutines remained.
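A sketch of what yield from actually does (illustrative names): the outer generator transparently forwards next() and send() to the inner one, and receives the inner generator's return value as the value of the yield from expression:

```python
def inner():
    received = yield "ready"
    return f"inner got {received}"

def outer():
    # yield from forwards next()/send()/throw() to inner and
    # binds inner's return value to result when it finishes
    result = yield from inner()
    yield result

g = outer()
print(next(g))       # 'ready' -- yielded by inner, passed through outer
print(g.send("hi"))  # 'inner got hi' -- inner's return value, re-yielded
```

Before PEP 380, every line of that forwarding logic had to be written by hand in each intermediate generator.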
PEP 3156 -- Asynchronous IO Support Rebooted: the asyncio Module (Python 3.4, 2012)
This was Guido van Rossum's personal project. PEP 3156 introduced the asyncio module (originally code-named "Tulip"), providing a standard event loop, transport and protocol abstractions inspired by Twisted, and a higher-level scheduler built on yield from. For the first time, Python had a standard library module for asynchronous I/O. But the syntax was still generator-based:
# The old way (Python 3.4 style)
import asyncio

@asyncio.coroutine
def fetch_data():
    yield from asyncio.sleep(1)
    return "data"
That @asyncio.coroutine decorator and yield from syntax worked, but it was clunky. You could accidentally use a regular generator where a coroutine was expected, and the error messages were often unhelpful. Worse, you couldn't use yield from inside with statements or for loops in a way that would allow the context manager or iterator to perform asynchronous operations.
PEP 492 -- Coroutines with async and await Syntax (Python 3.5, 2015)
This is the PEP that changed everything. Authored by Yury Selivanov, a CPython core developer and one of the main developers behind asyncio, PEP 492 introduced async def and await as first-class language syntax.
The PEP moved remarkably fast. The ideas were first raised by Selivanov on the python-ideas mailing list in mid-April 2015. By May 5, Guido van Rossum had accepted it. The implementation was committed on May 11 -- barely a month from proposal to merged code.
The new syntax replaced the old approach entirely:
# The new way (Python 3.5+)
import asyncio

async def fetch_data():
    await asyncio.sleep(1)
    return "data"
Critically, PEP 492 made coroutines their own distinct type, completely separate from generators. After feedback from the Tornado web framework's developers during the Python 3.5 beta, the implementation was redesigned so that native coroutines were no longer a special kind of generator but an entirely new object type with their own CO_COROUTINE flag. This separation eliminated an entire class of bugs where generators and coroutines could be confused.
PEP 525 -- Asynchronous Generators (Python 3.6, 2016)
Also authored by Yury Selivanov, PEP 525 brought yield into the async world:
async def ticker(delay, to):
    for i in range(to):
        yield i
        await asyncio.sleep(delay)
The PEP noted that in testing, asynchronous generators were twice as fast as the equivalent implemented as an asynchronous iterator class. Before this PEP, creating an async iterable required defining a class with __aiter__ and __anext__ methods -- verbose boilerplate that discouraged async iteration patterns.
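For contrast, here's a sketch of the same ticker written the pre-PEP-525 way, as a class implementing the async iterator protocol by hand (illustrative, slightly simplified):

```python
import asyncio

class Ticker:
    """Pre-3.6 style: an async iterable written as a class."""
    def __init__(self, delay, to):
        self.delay = delay
        self.to = to
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.to:
            raise StopAsyncIteration
        value = self.i
        self.i += 1
        await asyncio.sleep(self.delay)
        return value

async def collect():
    values = []
    async for i in Ticker(0.01, 3):
        values.append(i)
    return values

print(asyncio.run(collect()))  # [0, 1, 2]
```

Four lines of generator become roughly twenty lines of class -- the boilerplate the PEP was written to eliminate.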
PEP 530 -- Asynchronous Comprehensions (Python 3.6, 2016)
Yet another Selivanov PEP, this one brought comprehension syntax to async code:
# Async comprehension
result = [i async for i in async_iterator() if i % 2]

# Await in comprehensions
result = [await func() for func in coroutine_list]
The Fundamentals: What Actually Happens When You Write async def
An async def function is a coroutine function. Calling it doesn't execute the function body -- it returns a coroutine object:
import asyncio

async def greet(name):
    await asyncio.sleep(0.1)
    return f"Hello, {name}"

# This does NOT run the function
coro = greet("Alice")
print(type(coro))
# <class 'coroutine'>
# You'll also get a warning:
# RuntimeWarning: coroutine 'greet' was never awaited
This is the single most common source of confusion for newcomers. The coroutine object is inert until something drives it -- either await, asyncio.run(), or the event loop's task scheduler.
To actually run it:
# Option 1: asyncio.run() -- the standard entry point
result = asyncio.run(greet("Alice"))
print(result)  # Hello, Alice

# Option 2: await from within another coroutine
async def main():
    result = await greet("Alice")
    print(result)

asyncio.run(main())
asyncio.run() creates an event loop, runs the coroutine to completion, and then closes the loop. It's the bridge between the synchronous world and the async world, and it can only be called from synchronous code. Trying to call asyncio.run() from inside an already-running event loop raises a RuntimeError.
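A small sketch demonstrating that boundary -- the nested call is deliberately wrong, and the coroutine is closed explicitly to suppress the "never awaited" warning:

```python
import asyncio

async def main():
    coro = asyncio.sleep(0)
    try:
        asyncio.run(coro)  # illegal: a loop is already running here
    except RuntimeError as exc:
        print(f"RuntimeError: {exc}")
    finally:
        coro.close()  # avoid the "never awaited" warning

asyncio.run(main())
```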
The "Colored Functions" Problem
In February 2015, just months before PEP 492 was accepted, Bob Nystrom of Google's Dart team published a widely discussed blog post titled "What Color is Your Function?" In it, he argued that async/await creates two incompatible kinds of functions -- "red" (async) and "blue" (sync) -- with restrictive rules about how they can call each other.
The core complaint: you can call a sync function from anywhere, but you can only await an async function from inside another async function. This "infects" your codebase. If one function deep in your call stack needs to become async, every function above it in the chain must also become async. Libraries need async variants. Testing gets more complex. The world splits in two.
This criticism has real merit, and Python developers feel it daily. You can't use requests.get() inside an async function without blocking the event loop. You need aiohttp or httpx instead. Your SQLAlchemy queries need an async driver. Your file I/O needs aiofiles or asyncio.to_thread().
The Python community has largely come to terms with the colored-functions divide. The explicit suspension points that await provides are also its greatest strength: you can look at any async function and know exactly where it might yield control, which makes reasoning about concurrent state far easier than in systems where any function call might secretly suspend your thread.
Concurrency in Practice: gather(), TaskGroup, and Real Patterns
Running Coroutines Concurrently
The whole point of async is running multiple I/O operations concurrently. Here's the difference between sequential and concurrent execution:
import asyncio
import time

async def fetch(name, delay):
    print(f"Starting {name}")
    await asyncio.sleep(delay)
    print(f"Finished {name}")
    return f"{name}: {delay}s"

async def sequential():
    start = time.perf_counter()
    a = await fetch("A", 2)
    b = await fetch("B", 1)
    c = await fetch("C", 1.5)
    elapsed = time.perf_counter() - start
    print(f"Sequential: {elapsed:.1f}s")  # ~4.5s

async def concurrent():
    start = time.perf_counter()
    a, b, c = await asyncio.gather(
        fetch("A", 2),
        fetch("B", 1),
        fetch("C", 1.5),
    )
    elapsed = time.perf_counter() - start
    print(f"Concurrent: {elapsed:.1f}s")  # ~2.0s

asyncio.run(sequential())
asyncio.run(concurrent())
The sequential version takes 4.5 seconds. The concurrent version takes 2 seconds -- the duration of the longest single task -- because all three are running "at the same time" on the event loop.
TaskGroup: The Modern Approach (Python 3.11+)
Python 3.11 introduced asyncio.TaskGroup, contributed by Yury Selivanov and others. For new code, the official Python documentation now recommends TaskGroup over calling create_task() and gather() directly. The reason is structured concurrency: TaskGroup guarantees that all tasks are finished (or cancelled) when the async with block exits, and it handles exceptions more safely than gather().
import asyncio

async def process_item(item_id):
    await asyncio.sleep(0.5)
    if item_id == 3:
        raise ValueError(f"Item {item_id} is invalid")
    return f"Processed {item_id}"

async def main():
    try:
        async with asyncio.TaskGroup() as tg:
            tasks = [
                tg.create_task(process_item(i))
                for i in range(5)
            ]
    except ExceptionGroup as eg:
        for exc in eg.exceptions:
            print(f"Error: {exc}")

asyncio.run(main())
When any task in a TaskGroup raises an exception, all remaining tasks are automatically cancelled. This is a fundamental improvement over gather(), where a failing task doesn't cancel siblings by default, potentially leaving orphaned tasks running in the background indefinitely.
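If you deliberately want gather()'s let-the-siblings-finish behavior, return_exceptions=True makes failures come back as result values instead of propagating; a sketch with illustrative names:

```python
import asyncio

async def job(i):
    await asyncio.sleep(0.01)
    if i == 1:
        raise ValueError(f"job {i} failed")
    return f"job {i} ok"

async def main():
    # Failures come back as exception objects; sibling tasks
    # are not cancelled and run to completion
    return await asyncio.gather(*(job(i) for i in range(3)),
                                return_exceptions=True)

results = asyncio.run(main())
print(results)  # ['job 0 ok', ValueError('job 1 failed'), 'job 2 ok']
```

The trade-off: you must remember to inspect every element with isinstance checks, which TaskGroup's ExceptionGroup handling does for you structurally.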
Common Traps and How to Avoid Them
Trap 1: Blocking the Event Loop
This is the single most damaging mistake in async Python. If you call a synchronous, blocking function inside a coroutine, you freeze the entire event loop:
import asyncio
import time

async def bad_example():
    # This blocks the ENTIRE event loop for 3 seconds
    time.sleep(3)  # WRONG -- synchronous sleep
    return "done"

async def good_example():
    # This yields control to the event loop
    await asyncio.sleep(3)  # RIGHT -- async sleep
    return "done"
The fix for unavoidable blocking calls is asyncio.to_thread(), which offloads the blocking work to a thread pool:
import asyncio

def cpu_intensive_work(data):
    # Imagine this is a heavy computation or blocking I/O
    return sum(x * x for x in range(data))

async def main():
    # Run the blocking function in a thread without freezing the loop.
    # (For pure-Python CPU work the GIL still limits how much this helps;
    # it shines for blocking I/O and GIL-releasing C extensions.)
    result = await asyncio.to_thread(cpu_intensive_work, 10_000_000)
    print(f"Result: {result}")

asyncio.run(main())
Trap 2: Creating Coroutines Without Awaiting Them
async def send_notification(message):
    await asyncio.sleep(0.1)
    print(f"Sent: {message}")

async def main():
    # WRONG -- creates the coroutine but never runs it
    send_notification("hello")

    # RIGHT -- actually runs the coroutine
    await send_notification("hello")

asyncio.run(main())
Python will warn you with RuntimeWarning: coroutine 'send_notification' was never awaited, but in a busy codebase these warnings can slip through. Pay attention to them -- they always indicate a real bug.
Trap 3: Fire-and-Forget Tasks Getting Garbage Collected
async def background_work():
    await asyncio.sleep(5)
    print("Background work done")

async def main():
    # WRONG -- task may be garbage collected before it completes
    asyncio.create_task(background_work())

    # RIGHT -- keep a reference
    task = asyncio.create_task(background_work())
    # ... do other things ...
    await task  # or store in a set/list
If you don't hold a reference to a task, Python's garbage collector can destroy it before it finishes. The CPython documentation explicitly warns about this. Always store task references.
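For true fire-and-forget tasks, the idiom the asyncio docs suggest is a set that holds each task until its done-callback discards it; a sketch with illustrative names:

```python
import asyncio

background_tasks = set()

async def background_work(n):
    await asyncio.sleep(0.01)
    return n * 2

def spawn(coro):
    # Keep a strong reference until the task finishes, then drop it
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task

async def main():
    tasks = [spawn(background_work(i)) for i in range(3)]
    return await asyncio.gather(*tasks)

print(asyncio.run(main()))  # [0, 2, 4]
```

The set keeps each task alive while it runs; the done-callback removes it so the set doesn't grow without bound.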
Trap 4: Trying to await from Synchronous Code
# This DOES NOT work
def process_request():
    result = await fetch_data()  # SyntaxError!
    return result
You can only use await inside an async def function. If you need to call async code from sync code, you have limited options: asyncio.run() (if no event loop is already running), or restructure your code so the async boundary is at the top level.
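A sketch of the repaired version of the broken function above, with the async boundary pushed into a sync wrapper (illustrative names):

```python
import asyncio

async def fetch_data():
    await asyncio.sleep(0.01)
    return {"status": "ok"}

def process_request():
    # Valid only when called from plain sync code (no loop running)
    return asyncio.run(fetch_data())

print(process_request())  # {'status': 'ok'}
```

This works at a true entry point (a CLI command, a cron job), but if process_request() might ever be called from inside async code, the whole chain should become async instead.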
A Real-World Pattern: Concurrent API Requests
Here's a practical example that ties the concepts together -- fetching data from multiple API endpoints concurrently with proper error handling and timeouts:
import asyncio
import time

async def fetch_api(endpoint, delay, should_fail=False):
    """Simulates an API call with variable latency."""
    await asyncio.sleep(delay)
    if should_fail:
        raise ConnectionError(f"Failed to reach {endpoint}")
    return {"endpoint": endpoint, "status": 200, "data": f"Response from {endpoint}"}

async def fetch_with_timeout(endpoint, delay, timeout_seconds=3.0, should_fail=False):
    """Wraps a fetch with a timeout."""
    try:
        async with asyncio.timeout(timeout_seconds):
            return await fetch_api(endpoint, delay, should_fail)
    except TimeoutError:
        return {"endpoint": endpoint, "status": "timeout", "data": None}
    except ConnectionError as e:
        return {"endpoint": endpoint, "status": "error", "data": str(e)}

async def main():
    endpoints = [
        ("users", 0.5, False),
        ("orders", 1.2, False),
        ("inventory", 0.3, False),
        ("analytics", 5.0, False),  # Will time out
        ("payments", 0.8, True),    # Will fail
    ]
    start = time.perf_counter()
    results = await asyncio.gather(*[
        fetch_with_timeout(ep, delay, timeout_seconds=2.0, should_fail=fail)
        for ep, delay, fail in endpoints
    ])
    elapsed = time.perf_counter() - start
    for result in results:
        status = result["status"]
        ep = result["endpoint"]
        print(f"  {ep}: {status}")
    print(f"\nAll requests completed in {elapsed:.1f}s")

asyncio.run(main())
All five requests run concurrently. The total time is roughly 2 seconds (the timeout duration) rather than the 7.8 seconds it would take sequentially. Failures and timeouts are handled gracefully per-request, and no single failure crashes the entire batch.
What Async Is NOT For
Async/await is designed for I/O-bound concurrency: network requests, database queries, file operations, websocket connections. It is explicitly not designed for CPU-bound work.
If you need to run heavy computation in parallel, use multiprocessing or concurrent.futures.ProcessPoolExecutor. The event loop runs on a single thread. A CPU-bound coroutine that does heavy number crunching without any await points will monopolize the event loop and starve all other coroutines.
import asyncio

async def cpu_bound_bad():
    # This monopolizes the event loop -- no await points
    total = sum(i * i for i in range(50_000_000))
    return total

def crunch():
    # A module-level function: required if you swap in a
    # ProcessPoolExecutor, since its arguments must be picklable
    return sum(i * i for i in range(50_000_000))

async def cpu_bound_good():
    # Offload to a thread (or process) pool
    loop = asyncio.get_running_loop()
    total = await loop.run_in_executor(None, crunch)  # None = default thread pool
    return total
The State of Async in 2025
Python's async ecosystem has matured significantly since the early days of PEP 492. asyncio.TaskGroup and asyncio.timeout() (both introduced in Python 3.11) brought structured concurrency into the standard library. Python 3.12 added asyncio.eager_task_factory(), which can substantially speed up async-heavy workloads by running short coroutines immediately instead of scheduling them for the next event loop iteration. Python 3.13 further improved TaskGroup's handling of simultaneous cancellations.
Third-party libraries like Trio (which pioneered structured concurrency in Python) and anyio (which provides a compatibility layer between asyncio and Trio) continue to push the ecosystem forward. Web frameworks like FastAPI, Starlette, and Litestar are built async-first. Database drivers like asyncpg and databases provide native async support.
The legacy @asyncio.coroutine decorator for generator-based coroutines was removed entirely in Python 3.11. async/await is the way.
Key Takeaways
- Async is cooperative, not preemptive. Every await is an explicit yield point. Between awaits, your coroutine runs uninterrupted on a single thread -- no race conditions, no locks needed for local state.
- Calling a coroutine function does not run it. It returns a coroutine object. You must await it, pass it to asyncio.run(), or schedule it as a task for anything to happen.
- The colored-functions divide is real but manageable. await can only be used inside async def. Plan your async boundary at the top level of your application and work down from there.
- Use TaskGroup for structured concurrency. It's the modern, safer alternative to raw gather() for new Python 3.11+ code -- exceptions cancel siblings automatically and task lifetimes are scoped.
- Async is for I/O, not CPU. For CPU-bound work, reach for multiprocessing or run_in_executor(). A blocking computation inside a coroutine freezes your entire event loop.
Python's async/await is not syntactic sugar sprinkled over threads. It's a fundamentally different concurrency model built deliberately over a decade through PEPs 342, 380, 3156, 492, 525, and 530, shaped by Guido van Rossum, Yury Selivanov, and the broader Python community. Don't just learn the syntax. Learn the machinery. That's the difference between writing async code and understanding it.