There's a particular moment in every Python developer's journey where they encounter async and await for the first time and think, "This can't be that different from regular code." Then they try to call an async function from a synchronous one and discover that, actually, it's an entirely different universe with its own rules, its own gotchas, and its own decade-long design history that explains why everything works the way it does.
This article walks through all of it. The mental model, the real history, the PEPs that shaped the feature, and the practical code patterns that separate someone who understands async/await from someone who's just copying Stack Overflow snippets and hoping for the best.
The Core Mental Model: Cooperative Multitasking
Before touching a single keyword, you need to understand what problem async programming solves and how it solves it differently from threads.
A traditional synchronous program does one thing at a time. When it makes a network request, it waits. When it reads a file, it waits. The CPU sits idle while the operating system handles I/O. If you need to do multiple things at once, you spawn threads -- but threads come with their own problems: race conditions, locks, and the Global Interpreter Lock (GIL) that limits true CPU parallelism in CPython.
Async programming takes a different approach. Instead of multiple threads blocking on I/O independently, you have a single thread running an event loop. When a coroutine hits an await, it doesn't block the thread -- it yields control back to the event loop, which can then run other coroutines that are ready. When the I/O completes, the event loop resumes the original coroutine right where it left off.
This is cooperative multitasking. Each coroutine voluntarily gives up control at await points. No preemption, no race conditions on shared state between await points, no locks needed for most operations. The trade-off is that every await is an explicit marker in your code where control might switch to something else.
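To make the hand-off concrete, here's a minimal sketch (names are illustrative) in which two coroutines share one thread and interleave only at their await points, using asyncio.sleep(0) as a pure yield-to-the-loop:

```python
import asyncio

order = []

async def worker(name, steps):
    for i in range(steps):
        order.append(f"{name}{i}")
        # The await below is the only point where control can switch
        await asyncio.sleep(0)

async def main():
    # Both workers run concurrently on a single thread
    await asyncio.gather(worker("A", 3), worker("B", 3))

asyncio.run(main())
print(order)  # ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
```

Between appends, each worker runs uninterrupted; the deterministic alternation comes entirely from the explicit awaits.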
During the python-dev mailing list discussions around PEP 492, Guido van Rossum explicitly rejected approaches where any function call could silently suspend execution -- designs like gevent or Stackless Python where suspension is invisible. He wanted await to be a syntactically visible suspension point, making async code readable and auditable.
The Long Road to async/await: A History in PEPs
Python's async story didn't start with async and await. It was built in layers over more than a decade, each PEP addressing limitations of the previous approach.
PEP 342 -- Coroutines via Enhanced Generators (Python 2.5, 2005)
The foundation. PEP 342 transformed Python's generators from simple data producers into two-way communication channels by turning yield into an expression and adding send(), throw(), and close() methods to generator objects. This made it possible to write generator-based coroutines: functions that could receive values and be resumed, forming the primitive building blocks of cooperative multitasking.
But using generators as coroutines was awkward. There was no syntactic distinction between a generator producing data and a generator being used as a coroutine. You couldn't tell by looking at a function whether it was meant to be iterated over or awaited.
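You can see both the power and the awkwardness in a few lines. A sketch of a PEP 342-style generator coroutine, here a running-average accumulator (names are illustrative):

```python
def averager():
    # A generator used as a coroutine: values arrive via send()
    total, count = 0, 0
    average = None
    while True:
        value = yield average   # yield as an expression (PEP 342)
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)            # "prime" the coroutine to its first yield
print(avg.send(10))  # 10.0
print(avg.send(30))  # 20.0
avg.close()          # close() and throw() were also added by PEP 342
```

Nothing about the definition says "coroutine" rather than "iterator" -- that ambiguity is exactly what later PEPs set out to remove.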
PEP 380 -- Syntax for Delegating to a Subgenerator (Python 3.3, 2009)
PEP 380 added yield from, which allowed a generator to delegate to another generator. This was essential for composable coroutines -- without it, every intermediate coroutine had to manually loop over and re-yield values from sub-coroutines. yield from handled all of that boilerplate automatically, including forwarding send() and throw() calls.
This was the key piece that made generator-based coroutines practical for real-world async frameworks. But the fundamental confusion between generators-as-iterators and generators-as-coroutines remained.
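A sketch of what yield from actually does (illustrative names): the outer generator transparently forwards next() and send() to the inner one, and receives the inner generator's return value as the value of the yield from expression:

```python
def inner():
    received = yield "ready"
    return f"inner got {received}"

def outer():
    # yield from forwards next()/send()/throw() to inner and
    # binds inner's return value to result when it finishes
    result = yield from inner()
    yield result

g = outer()
print(next(g))       # 'ready' -- yielded by inner, passed through outer
print(g.send("hi"))  # 'inner got hi' -- inner's return value, re-yielded
```

Before PEP 380, every line of that forwarding logic had to be written by hand in each intermediate generator.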
PEP 3156 -- Asynchronous IO Support Rebooted: the asyncio Module (Python 3.4, 2012)
This was Guido van Rossum's personal project. PEP 3156 introduced the asyncio module (originally code-named "Tulip"), providing a standard event loop, transport and protocol abstractions inspired by Twisted, and a higher-level scheduler built on yield from. For the first time, Python had a standard library module for asynchronous I/O. But the syntax was still generator-based:
# The old way (Python 3.4 style)
import asyncio

@asyncio.coroutine
def fetch_data():
    yield from asyncio.sleep(1)
    return "data"
That @asyncio.coroutine decorator and yield from syntax worked, but it was clunky. You could accidentally use a regular generator where a coroutine was expected, and the error messages were often unhelpful. Worse, you couldn't use yield from inside with statements or for loops in a way that would allow the context manager or iterator to perform asynchronous operations.
PEP 492 -- Coroutines with async and await Syntax (Python 3.5, 2015)
This is the PEP that changed everything. Authored by Yury Selivanov, a CPython core developer and one of the main developers behind asyncio, PEP 492 introduced async def and await as first-class language syntax.
The PEP moved remarkably fast. The ideas were first raised by Selivanov on the python-ideas mailing list in mid-April 2015. By May 5, Guido van Rossum had accepted it. The implementation was committed on May 11 -- barely a month from proposal to merged code.
The new syntax replaced the old approach entirely:
# The new way (Python 3.5+)
import asyncio

async def fetch_data():
    await asyncio.sleep(1)
    return "data"
Critically, PEP 492 made coroutines their own distinct type, completely separate from generators. After feedback from the Tornado web framework's developers during the Python 3.5 beta, the implementation was redesigned so that native coroutines were no longer a special kind of generator but an entirely new object type with their own CO_COROUTINE flag. This separation eliminated an entire class of bugs where generators and coroutines could be confused.
PEP 525 -- Asynchronous Generators (Python 3.6, 2016)
Also authored by Yury Selivanov, PEP 525 brought yield into the async world:
async def ticker(delay, to):
    for i in range(to):
        yield i
        await asyncio.sleep(delay)
The PEP noted that in testing, asynchronous generators were twice as fast as the equivalent implemented as an asynchronous iterator class. Before this PEP, creating an async iterable required defining a class with __aiter__ and __anext__ methods -- verbose boilerplate that discouraged async iteration patterns.
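For contrast, here's a sketch of the same ticker written the pre-PEP-525 way, as a class implementing the async iterator protocol by hand (illustrative, slightly simplified):

```python
import asyncio

class Ticker:
    """Pre-3.6 style: an async iterable written as a class."""
    def __init__(self, delay, to):
        self.delay = delay
        self.to = to
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.to:
            raise StopAsyncIteration
        value = self.i
        self.i += 1
        await asyncio.sleep(self.delay)
        return value

async def collect():
    values = []
    async for i in Ticker(0.01, 3):
        values.append(i)
    return values

print(asyncio.run(collect()))  # [0, 1, 2]
```

Four lines of generator become roughly twenty lines of class -- the boilerplate the PEP was written to eliminate.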
PEP 530 -- Asynchronous Comprehensions (Python 3.6, 2016)
Yet another Selivanov PEP, this one brought comprehension syntax to async code:
# Async comprehension
result = [i async for i in async_iterator() if i % 2]

# Await in comprehensions
result = [await func() for func in coroutine_list]
The Fundamentals: What Actually Happens When You Write async def
An async def function is a coroutine function. Calling it doesn't execute the function body -- it returns a coroutine object:
import asyncio

async def greet(name):
    await asyncio.sleep(0.1)
    return f"Hello, {name}"

# This does NOT run the function
coro = greet("Alice")
print(type(coro))
# <class 'coroutine'>
# You'll also get a warning:
# RuntimeWarning: coroutine 'greet' was never awaited
This is the single most common source of confusion for newcomers. The coroutine object is inert until something drives it -- either await, asyncio.run(), or the event loop's task scheduler.
To actually run it:
# Option 1: asyncio.run() -- the standard entry point
result = asyncio.run(greet("Alice"))
print(result)  # Hello, Alice

# Option 2: await from within another coroutine
async def main():
    result = await greet("Alice")
    print(result)

asyncio.run(main())
asyncio.run() creates an event loop, runs the coroutine to completion, and then closes the loop. It's the bridge between the synchronous world and the async world, and it can only be called from synchronous code. Trying to call asyncio.run() from inside an already-running event loop raises a RuntimeError.
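A small sketch demonstrating that boundary -- the nested call is deliberately wrong, and the coroutine is closed explicitly to suppress the "never awaited" warning:

```python
import asyncio

async def main():
    coro = asyncio.sleep(0)
    try:
        asyncio.run(coro)  # illegal: a loop is already running here
    except RuntimeError as exc:
        print(f"RuntimeError: {exc}")
    finally:
        coro.close()  # avoid the "never awaited" warning

asyncio.run(main())
```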
The "Colored Functions" Problem
In February 2015, just months before PEP 492 was accepted, Bob Nystrom of Google's Dart team published a widely discussed blog post titled "What Color is Your Function?" In it, he argued that async/await creates two incompatible kinds of functions -- "red" (async) and "blue" (sync) -- with restrictive rules about how they can call each other.
The core complaint: you can call a sync function from anywhere, but you can only await an async function from inside another async function. This "infects" your codebase. If one function deep in your call stack needs to become async, every function above it in the chain must also become async. Libraries need async variants. Testing gets more complex. The world splits in two.
This criticism has real merit, and Python developers feel it daily. You can't use requests.get() inside an async function without blocking the event loop. You need aiohttp or httpx instead. Your SQLAlchemy queries need an async driver. Your file I/O needs aiofiles or asyncio.to_thread().
The Python community has largely come to terms with the colored-functions divide. The explicit suspension points that await provides are also its greatest strength: you can look at any async function and know exactly where it might yield control, which makes reasoning about concurrent state far easier than in systems where any function call might secretly suspend your thread.
Concurrency in Practice: gather(), TaskGroup, and Real Patterns
Running Coroutines Concurrently
The whole point of async is running multiple I/O operations concurrently. Here's the difference between sequential and concurrent execution:
import asyncio
import time

async def fetch(name, delay):
    print(f"Starting {name}")
    await asyncio.sleep(delay)
    print(f"Finished {name}")
    return f"{name}: {delay}s"

async def sequential():
    start = time.perf_counter()
    a = await fetch("A", 2)
    b = await fetch("B", 1)
    c = await fetch("C", 1.5)
    elapsed = time.perf_counter() - start
    print(f"Sequential: {elapsed:.1f}s")  # ~4.5s

async def concurrent():
    start = time.perf_counter()
    a, b, c = await asyncio.gather(
        fetch("A", 2),
        fetch("B", 1),
        fetch("C", 1.5),
    )
    elapsed = time.perf_counter() - start
    print(f"Concurrent: {elapsed:.1f}s")  # ~2.0s

asyncio.run(sequential())
asyncio.run(concurrent())
The sequential version takes 4.5 seconds. The concurrent version takes 2 seconds -- the duration of the longest single task -- because all three are running "at the same time" on the event loop.
TaskGroup: The Modern Approach (Python 3.11+)
Python 3.11 introduced asyncio.TaskGroup, contributed by Yury Selivanov and others. For new code, the official Python documentation now recommends TaskGroup over calling create_task() and gather() directly. The reason is structured concurrency: TaskGroup guarantees that all tasks are finished (or cancelled) when the async with block exits, and it handles exceptions more safely than gather().
import asyncio

async def process_item(item_id):
    await asyncio.sleep(0.5)
    if item_id == 3:
        raise ValueError(f"Item {item_id} is invalid")
    return f"Processed {item_id}"

async def main():
    try:
        async with asyncio.TaskGroup() as tg:
            tasks = [
                tg.create_task(process_item(i))
                for i in range(5)
            ]
    except ExceptionGroup as eg:
        for exc in eg.exceptions:
            print(f"Error: {exc}")

asyncio.run(main())
When any task in a TaskGroup raises an exception, all remaining tasks are automatically cancelled. This is a fundamental improvement over gather(), where a failing task doesn't cancel siblings by default, potentially leaving orphaned tasks running in the background indefinitely.
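If you deliberately want gather()'s let-the-siblings-finish behavior, return_exceptions=True makes failures come back as result values instead of propagating; a sketch with illustrative names:

```python
import asyncio

async def job(i):
    await asyncio.sleep(0.01)
    if i == 1:
        raise ValueError(f"job {i} failed")
    return f"job {i} ok"

async def main():
    # Failures come back as exception objects; sibling tasks
    # are not cancelled and run to completion
    return await asyncio.gather(*(job(i) for i in range(3)),
                                return_exceptions=True)

results = asyncio.run(main())
print(results)  # ['job 0 ok', ValueError('job 1 failed'), 'job 2 ok']
```

The trade-off: you must remember to inspect every element with isinstance checks, which TaskGroup's ExceptionGroup handling does for you structurally.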
Common Traps and How to Avoid Them
Trap 1: Blocking the Event Loop
This is the single most damaging mistake in async Python. If you call a synchronous, blocking function inside a coroutine, you freeze the entire event loop:
import asyncio
import time

async def bad_example():
    # This blocks the ENTIRE event loop for 3 seconds
    time.sleep(3)  # WRONG -- synchronous sleep
    return "done"

async def good_example():
    # This yields control to the event loop
    await asyncio.sleep(3)  # RIGHT -- async sleep
    return "done"
The fix for unavoidable blocking calls is asyncio.to_thread(), which offloads the blocking work to a thread pool:
import asyncio

def cpu_intensive_work(data):
    # Imagine this is a heavy computation or blocking I/O
    return sum(x * x for x in range(data))

async def main():
    # Run the blocking function in a thread without freezing the loop.
    # (For pure-Python CPU work the GIL still limits how much this helps;
    # it shines for blocking I/O and GIL-releasing C extensions.)
    result = await asyncio.to_thread(cpu_intensive_work, 10_000_000)
    print(f"Result: {result}")

asyncio.run(main())
Trap 2: Creating Coroutines Without Awaiting Them
async def send_notification(message):
    await asyncio.sleep(0.1)
    print(f"Sent: {message}")

async def main():
    # WRONG -- creates the coroutine but never runs it
    send_notification("hello")

    # RIGHT -- actually runs the coroutine
    await send_notification("hello")

asyncio.run(main())
Python will warn you with RuntimeWarning: coroutine 'send_notification' was never awaited, but in a busy codebase these warnings can slip through. Pay attention to them -- they always indicate a real bug.
Trap 3: Fire-and-Forget Tasks Getting Garbage Collected
async def background_work():
    await asyncio.sleep(5)
    print("Background work done")

async def main():
    # WRONG -- task may be garbage collected before it completes
    asyncio.create_task(background_work())

    # RIGHT -- keep a reference
    task = asyncio.create_task(background_work())
    # ... do other things ...
    await task  # or store in a set/list
If you don't hold a reference to a task, Python's garbage collector can destroy it before it finishes. The CPython documentation explicitly warns about this. Always store task references.
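For true fire-and-forget tasks, the idiom the asyncio docs suggest is a set that holds each task until its done-callback discards it; a sketch with illustrative names:

```python
import asyncio

background_tasks = set()

async def background_work(n):
    await asyncio.sleep(0.01)
    return n * 2

def spawn(coro):
    # Keep a strong reference until the task finishes, then drop it
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task

async def main():
    tasks = [spawn(background_work(i)) for i in range(3)]
    return await asyncio.gather(*tasks)

print(asyncio.run(main()))  # [0, 2, 4]
```

The set keeps each task alive while it runs; the done-callback removes it so the set doesn't grow without bound.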
Trap 4: Trying to await from Synchronous Code
# This DOES NOT work
def process_request():
    result = await fetch_data()  # SyntaxError!
    return result
You can only use await inside an async def function. If you need to call async code from sync code, you have limited options: asyncio.run() (if no event loop is already running), or restructure your code so the async boundary is at the top level.
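A sketch of the repaired version of the broken function above, with the async boundary pushed into a sync wrapper (illustrative names):

```python
import asyncio

async def fetch_data():
    await asyncio.sleep(0.01)
    return {"status": "ok"}

def process_request():
    # Valid only when called from plain sync code (no loop running)
    return asyncio.run(fetch_data())

print(process_request())  # {'status': 'ok'}
```

This works at a true entry point (a CLI command, a cron job), but if process_request() might ever be called from inside async code, the whole chain should become async instead.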
A Real-World Pattern: Concurrent API Requests
Here's a practical example that ties the concepts together -- fetching data from multiple API endpoints concurrently with proper error handling and timeouts:
import asyncio
import time

async def fetch_api(endpoint, delay, should_fail=False):
    """Simulates an API call with variable latency."""
    await asyncio.sleep(delay)
    if should_fail:
        raise ConnectionError(f"Failed to reach {endpoint}")
    return {"endpoint": endpoint, "status": 200, "data": f"Response from {endpoint}"}

async def fetch_with_timeout(endpoint, delay, timeout_seconds=3.0, should_fail=False):
    """Wraps a fetch with a timeout."""
    try:
        async with asyncio.timeout(timeout_seconds):
            return await fetch_api(endpoint, delay, should_fail)
    except TimeoutError:
        return {"endpoint": endpoint, "status": "timeout", "data": None}
    except ConnectionError as e:
        return {"endpoint": endpoint, "status": "error", "data": str(e)}

async def main():
    endpoints = [
        ("users", 0.5, False),
        ("orders", 1.2, False),
        ("inventory", 0.3, False),
        ("analytics", 5.0, False),  # Will time out
        ("payments", 0.8, True),    # Will fail
    ]
    start = time.perf_counter()
    results = await asyncio.gather(*[
        fetch_with_timeout(ep, delay, timeout_seconds=2.0, should_fail=fail)
        for ep, delay, fail in endpoints
    ])
    elapsed = time.perf_counter() - start
    for result in results:
        status = result["status"]
        ep = result["endpoint"]
        print(f"  {ep}: {status}")
    print(f"\nAll requests completed in {elapsed:.1f}s")

asyncio.run(main())
All five requests run concurrently. The total time is roughly 2 seconds (the timeout duration) rather than the 7.8 seconds it would take sequentially. Failures and timeouts are handled gracefully per-request, and no single failure crashes the entire batch.
What Async Is NOT For
Async/await is designed for I/O-bound concurrency: network requests, database queries, file operations, websocket connections. It is explicitly not designed for CPU-bound work.
If you need to run heavy computation in parallel, use multiprocessing or concurrent.futures.ProcessPoolExecutor. The event loop runs on a single thread. A CPU-bound coroutine that does heavy number crunching without any await points will monopolize the event loop and starve all other coroutines.
import asyncio

async def cpu_bound_bad():
    # This monopolizes the event loop -- no await points
    total = sum(i * i for i in range(50_000_000))
    return total

def crunch():
    # A module-level function: required if you swap in a
    # ProcessPoolExecutor, since its arguments must be picklable
    return sum(i * i for i in range(50_000_000))

async def cpu_bound_good():
    # Offload to a thread (or process) pool
    loop = asyncio.get_running_loop()
    total = await loop.run_in_executor(None, crunch)  # None = default thread pool
    return total
The State of Async in 2025
Python's async ecosystem has matured significantly since the early days of PEP 492. asyncio.TaskGroup and asyncio.timeout() (both introduced in Python 3.11) brought structured concurrency into the standard library. Python 3.12 added asyncio.eager_task_factory(), which can substantially speed up async-heavy workloads by running short coroutines immediately instead of scheduling them for the next event loop iteration. Python 3.13 further improved TaskGroup's handling of simultaneous cancellations.
Third-party libraries like Trio (which pioneered structured concurrency in Python) and anyio (which provides a compatibility layer between asyncio and Trio) continue to push the ecosystem forward. Web frameworks like FastAPI, Starlette, and Litestar are built async-first. Database drivers like asyncpg and databases provide native async support.
The legacy @asyncio.coroutine decorator for generator-based coroutines was removed entirely in Python 3.11. async/await is the way.
Key Takeaways
- Async is cooperative, not preemptive. Every await is an explicit yield point. Between awaits, your coroutine runs uninterrupted on a single thread -- no race conditions, no locks needed for local state.
- Calling a coroutine function does not run it. It returns a coroutine object. You must await it, pass it to asyncio.run(), or schedule it as a task for anything to happen.
- The colored-functions divide is real but manageable. await can only be used inside async def. Plan your async boundary at the top level of your application and work down from there.
- Use TaskGroup for structured concurrency. It's the modern, safer alternative to raw gather() for new Python 3.11+ code -- exceptions cancel siblings automatically and task lifetimes are scoped.
- Async is for I/O, not CPU. For CPU-bound work, reach for multiprocessing or run_in_executor(). A blocking computation inside a coroutine freezes your entire event loop.
Python's async/await is not syntactic sugar sprinkled over threads. It's a fundamentally different concurrency model built deliberately over a decade through PEPs 342, 380, 3156, 492, 525, and 530, shaped by Guido van Rossum, Yury Selivanov, and the broader Python community. Don't just learn the syntax. Learn the machinery. That's the difference between writing async code and understanding it.