Flatten a List in Python: Every Method, Performance Trade-offs, and PEP 798

Flattening nested data structures is one of the operations Python developers encounter constantly, yet there is no built-in flatten() function in the language. A single StackOverflow post on flattening a list of lists has been viewed over 4.6 million times, a figure cited directly in PEP 798 as evidence of how widespread the need is. The question comes up in data processing pipelines, API response parsing, file system traversal, and dozens of other everyday contexts.


Problem: Flattening a Nested List in Python

A common Python problem is converting a list of lists into a single flat list. For example:

nested = [[1, 2, 3], [4, 5], [6, 7, 8]]

The goal is to produce:

[1, 2, 3, 4, 5, 6, 7, 8]

Quick Answer: Flatten a List in Python

The most common way to flatten a shallow list of lists in Python is a list comprehension:

nested = [[1, 2, 3], [4, 5], [6, 7, 8]]
flat = [item for sublist in nested for item in sublist]
print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8]

This approach is O(n) and is considered the idiomatic Python solution for flattening a shallow list of lists.

This is not a page of copy-paste solutions. We are going to understand why each approach works, when to choose it, and what Python's own designers intended.

Method at a Glance

Method                             Time    Memory                           Lazy  Depth handled     Best for
---------------------------------  ------  -------------------------------  ----  ----------------  -----------------------------------------
for loop + append()/extend()       O(n)    O(n), full copy in RAM           No    Shallow only      Teaching, debugging, maximum clarity
List comprehension                 O(n)    O(n), full copy in RAM           No    Shallow only      Idiomatic production code (pre-3.15)
itertools.chain.from_iterable()    O(n)    O(1) iterator, no copy           Yes   Shallow only      Large data, generators, library code
sum(nested, [])                    O(n²)   O(n²) intermediate copies        No    Shallow only      Never use in production
functools.reduce(operator.add)     O(n²)   O(n²) intermediate copies        No    Shallow only      Never use in production
NumPy flatten() / ravel()          O(n)    flatten(): copy, ravel(): view   No    Uniform ndarray   Data already in NumPy arrays
Recursive generator (yield from)   O(n)    call stack = nesting depth       Yes   Arbitrary         Unknown-depth structures
[*sub for sub in nested]           O(n)    O(n) list, O(1) generator        Yes   Shallow only      Python 3.15+ (PEP 798, accepted Nov 2025)

A few notes the table compresses. With the loop and comprehension versions, both the original nested structure and the growing output list coexist in memory; the comprehension is slightly faster than repeated append() due to CPython internals. chain.from_iterable holds only a reference to the outer iterable and advances it one element at a time, never materializing the whole structure. sum() and reduce(operator.add) are mechanically identical: each + creates a brand-new list, so with 1,000 sublists the first sublist gets copied 999 times, making them 50–200x slower than chain. NumPy's ravel() points into the same contiguous block while flatten() allocates a new one, and converting plain lists to arrays just to flatten them is an anti-pattern. The recursive generator must guard against str iteration and risks the recursion limit on deep structures; use the iterative version there. The PEP 798 form [*sub for sub in nested] reads "for each sub, unpack it, collect everything", matching the mental model that the double-loop form [item for sub in nested for item in sub] trips up; its generator form (*sub for sub in nested) is lazy.

Before writing a single line of code, we need to be precise. Flattening comes in two distinct flavors.

Shallow flatten takes one level of nesting and removes it. A list of lists becomes a flat list. A list of tuples becomes a flat list of their elements. The inner containers are unpacked, but if those inner containers hold more containers, those remain untouched.

# Shallow flatten
nested = [[1, 2], [3, [4, 5]], [6]]
# shallow result: [1, 2, 3, [4, 5], 6]

Deep flatten (also called recursive flatten) walks all the way down through every level of nesting until only non-iterable elements remain.

# Deep flatten
nested = [[1, 2], [3, [4, 5]], [6]]
# deep result: [1, 2, 3, 4, 5, 6]

The vast majority of real-world flattening is shallow. You have a list of rows, a list of chunks, a list of batches, and you want one unified sequence. That is where we will spend the bulk of our focus, with deep flattening addressed as a distinct problem near the end.

Method 1: The Nested For Loop

The most explicit approach is the one you would write if nobody had ever told you about comprehensions or itertools.

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flat = []
for sublist in nested:
    for item in sublist:
        flat.append(item)

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

This is as readable as Python gets. Two loops, one append, done. The outer loop grabs each sublist, the inner loop grabs each element from that sublist. There is no magic, no imports, and no ambiguity about what is happening.

The list.extend() method offers a slight variation that avoids the inner loop entirely:

flat = []
for sublist in nested:
    flat.extend(sublist)

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

The extend() method accepts any iterable and appends every item from it to the list. This is functionally identical to the inner loop version, but it pushes the iteration into C-level code inside CPython, making it marginally faster.

When to use it

When readability is your primary concern and the data set is small to moderate. This is also the best approach for beginners who need to understand the mechanics before reaching for more advanced tools. Performance is O(n) where n is the total number of elements across all sublists. Memory allocation happens incrementally as the list grows.

Deeper Thread: Why does extend() push iteration into C?

CPython is the reference interpreter, written in C. When Python executes a pure Python for loop, each iteration involves bytecode dispatch — the interpreter fetches an instruction, decodes it, executes it, and moves to the next. This is fast, but it is not C-native speed.

list.extend() is implemented directly in C as list_extend in Objects/listobject.c. Once you cross the boundary from Python bytecode into that C function, the loop over the sublist runs without bytecode overhead — no Python-level instruction fetches, no GIL-release-and-reacquire per item. The result is the same flat list, but the iteration itself happens in compiled C.

This is a recurring Python performance pattern: the moment you delegate iteration to a built-in, you are trading Python bytecode for C machine code. The same principle explains why sum(), map(), and itertools functions often outperform equivalent Python loops even when the algorithmic complexity is identical.
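As a quick illustration of that trade, here is a hypothetical micro-benchmark (the name py_sum is invented for this sketch, and absolute numbers vary by machine and Python version):

```python
import timeit

def py_sum(data):
    """Pure-Python loop: bytecode dispatch on every iteration."""
    total = 0
    for x in data:
        total += x
    return total

data = list(range(10_000))

# Built-in sum() runs its loop in C; the Python loop pays bytecode
# overhead per element for the same O(n) algorithm.
t_loop = timeit.timeit(lambda: py_sum(data), number=200)
t_builtin = timeit.timeit(lambda: sum(data), number=200)

print(f"Python loop:  {t_loop:.4f}s")
print(f"built-in sum: {t_builtin:.4f}s")
```

On typical hardware the built-in wins by a wide margin even though both loops are algorithmically identical.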

Method 2: The List Comprehension

Python's list comprehension syntax can express the same nested loop in a single line:

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flat = [item for sublist in nested for item in sublist]

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

This is the most common Pythonic flatten you will encounter in production code. But there is a genuine readability trap here that the Python community has debated for years.

The loop ordering in comprehensions reads left to right in the same order you would write them as nested for statements. The outer loop (for sublist in nested) comes first, then the inner loop (for item in sublist). Many programmers instinctively want to reverse this order, putting the "element extraction" loop first. In the Hacker News discussion around PEP 798 in July 2025, one commenter noted that even after using Python for over fifteen years, the nested comprehension loop order felt unintuitive.

The PEP 798 rationale itself documents this confusion directly. It states that the proposal was motivated partly by a written exam in a Python programming class where several students used the set version of the proposed syntax ({*it for it in its}) in their solutions, assuming it already existed. By contrast, the double-loop version [x for it in its for x in it] is one that students often get wrong because the natural impulse is to reverse the order of the for clauses. The list version follows from the same intuition.

Students independently invented the syntax {*sub for sub in nested} on a Python exam — the set version — assuming the language already worked that way. That intuition became the evidence that convinced the Steering Council.

Deeper Thread: What does CPython actually compile a list comprehension to?

A list comprehension is not merely syntactic sugar for a for loop at the Python level — it compiles to a distinct bytecode sequence. From Python 3.0 through 3.11, a list comprehension created an implicit function scope to isolate the loop variable; since CPython 3.12, PEP 709 inlines comprehensions into the enclosing frame while still keeping the loop variable isolated. In both regimes, the compiler emits a LIST_APPEND bytecode for each element rather than calling append() as a method. The LIST_APPEND opcode bypasses the full method-lookup machinery and writes directly to the list's internal buffer.

This is why list comprehensions are consistently faster than equivalent for-loop + append() patterns in microbenchmarks: the append path pays a method lookup and call on every iteration that LIST_APPEND avoids.

You can inspect this yourself: import dis; dis.dis('[x for sub in nested for x in sub]') will show you the bytecode, including the LIST_APPEND instructions (and, on 3.11 and earlier, the separate comprehension code object).
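For example, you can capture the disassembly programmatically; the exact opcode stream varies across CPython versions, but LIST_APPEND shows up throughout the modern 3.x line:

```python
import dis
import io

# Capture the disassembly of the nested comprehension to a string.
# dis.dis() compiles a source string and recurses into nested code objects.
buf = io.StringIO()
dis.dis('[x for sub in nested for x in sub]', file=buf)
output = buf.getvalue()

print('LIST_APPEND' in output)  # True
```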

When to use it

When you are comfortable with comprehension syntax and want concise code. This is the idiomatic choice for shallow flattening in Python today (pre-3.15). Performance is O(n), essentially identical to the explicit loop version. The comprehension builds the list in a single expression, which can be slightly faster than repeated append() calls because CPython optimizes list comprehension internals.

Method 3: itertools.chain.from_iterable()

The itertools module, written and maintained by Python core developer Raymond Hettinger since 2001, provides what many consider the cleanest tool for flattening. Hettinger, who received the Python Software Foundation Distinguished Service Award in 2014, designed itertools.chain to combine multiple iterables into a single iterable that yields elements from each one in sequence. See the official itertools documentation for the full module reference.

import itertools

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flat = list(itertools.chain.from_iterable(nested))

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

There are two ways to call chain. The first, chain(*nested), unpacks the outer list and passes each sublist as a separate argument. The second, chain.from_iterable(nested), accepts a single iterable of iterables and consumes them lazily. Hettinger himself posed the distinction as a quiz on X (formerly Twitter) in March 2022: "Why is chain.from_iterable(source) preferable to chain(*source)?"

The answer comes down to lazy evaluation. When you write chain(*nested), Python must unpack the entire outer iterable into memory as function arguments before chain even starts running. With chain.from_iterable(nested), the outer iterable is consumed one element at a time. If nested is a generator that produces sublists on the fly, from_iterable never needs to hold all of them in memory simultaneously.

import itertools

# This works with generators, no intermediate list needed
def generate_batches():
    for i in range(0, 100, 10):
        yield list(range(i, i + 10))

flat = list(itertools.chain.from_iterable(generate_batches()))
print(flat[:15])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
Pro Tip

When you are working with large data, generators, or any situation where lazy evaluation matters, itertools.chain.from_iterable() is the professional-grade tool for flattening in production Python. Memory efficiency comes from the chain object being an iterator — it does not build the entire flat list in memory unless you explicitly convert it with list().

chain.from_iterable() never holds the full output in memory. It advances through your data like a reading cursor — one element at a time, on demand. For the full story on how Python makes this possible, see Lazy Evaluation in Python.

Deeper Thread: How the iterator protocol makes Python iteration work

Every time Python encounters a for x in y statement, it calls iter(y) to get an iterator object, then repeatedly calls next() on it until a StopIteration exception is raised. This two-method contract (__iter__ and __next__) is the iterator protocol — and it is what makes all of Python's lazy evaluation possible.

A chain object implements exactly this protocol. Its __next__ method maintains a reference to the current inner iterator and advances it. When that iterator raises StopIteration, chain's __next__ moves to the next element from the outer iterable and starts advancing that one instead. At no point does it look ahead.

This is why you can pipe an infinite generator into chain.from_iterable() and consume it safely with itertools.islice() — the chain object will never attempt to materialize data that hasn't been requested.
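A minimal sketch of that guarantee, using an invented infinite_batches generator:

```python
import itertools

def infinite_batches():
    """An endless stream of 10-element sublists: [0..9], [10..19], ..."""
    for start in itertools.count(0, 10):
        yield list(range(start, start + 10))

# chain.from_iterable never looks ahead, so islice can safely take a
# finite prefix of an infinite flattened stream.
flat = itertools.chain.from_iterable(infinite_batches())
first_25 = list(itertools.islice(flat, 25))

print(first_25 == list(range(25)))  # True
```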

Method 4: sum() with Lists (and Why You Should Avoid It)

You will sometimes encounter this pattern in the wild:

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
flat = sum(nested, [])

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

It works because sum() applies the + operator repeatedly across the elements, starting with the initial value (the empty list []). Since + concatenates lists, the result is a single flat list.

The problem is performance. Python educator Trey Hunner, writing at Python Morsels in May 2022, explained the core issue: the algorithm sum uses makes list flattening extremely slow. The + operator on lists creates a new list every time it is called. Flattening a list of 1,000 sublists each containing 3 items takes roughly 3 million operations instead of 3 thousand. In Big-O terms, this is O(n²) instead of O(n).

import itertools
import time

# Demonstrating the quadratic scaling problem
def sum_flatten(nested):
    return sum(nested, [])

def chain_flatten(nested):
    return list(itertools.chain.from_iterable(nested))

# Build a test case: 5000 sublists of 10 items each
data = [list(range(10)) for _ in range(5000)]

start = time.perf_counter()
chain_flatten(data)
chain_time = time.perf_counter() - start

start = time.perf_counter()
sum_flatten(data)
sum_time = time.perf_counter() - start

print(f"chain.from_iterable: {chain_time:.4f}s")
print(f"sum():               {sum_time:.4f}s")
print(f"sum() is {sum_time / chain_time:.1f}x slower")

On a typical machine, sum() will be somewhere between 50x and 200x slower for this test case, and the gap widens as the data grows.

Avoid This Pattern

Essentially never use sum(nested, []) for flattening lists. The word "sum" implies arithmetic, and using it for list concatenation obscures intent while introducing a quadratic performance penalty. This pattern should be treated as an anti-pattern.

Method 5: functools.reduce() (and Why You Should Also Avoid It)

Another pattern you will occasionally encounter in the wild, often from developers with a functional programming background:

import functools
import operator

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
flat = functools.reduce(operator.add, nested)

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

functools.reduce() applies a binary function cumulatively across a sequence. When you pass operator.add (or equivalently the + operator via a lambda), it concatenates the sublists left to right. It has the same fundamental problem as sum(nested, []): the + operator creates a new list on every application. The algorithm is O(n²) for the same reason — the first sublist is copied once for each remaining sublist.

You may also see it written with a lambda:

flat = functools.reduce(lambda acc, sub: acc + sub, nested, [])

This version passes an explicit initial value (an empty list) and is semantically identical to sum(nested, []). The performance characteristics are the same: fine for a handful of tiny sublists, unusable at any real scale.
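For completeness, the quadratic cost lives in the + operator, not in reduce itself: an accumulator that mutates in place with extend() is O(n). This is a sketch, not a recommendation; chain.from_iterable() expresses the same thing far more clearly:

```python
import functools

nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# extend() mutates one accumulator list instead of copying it each step.
# list.extend returns None, so `or acc` hands the same list back to reduce.
flat = functools.reduce(lambda acc, sub: acc.extend(sub) or acc, nested, [])

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```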

Avoid This Pattern

Do not use functools.reduce(operator.add, nested) for flattening. It is O(n²) by the same mechanism as sum(nested, []), and it obscures intent behind a function that is primarily useful for fold operations on scalar values. Guido van Rossum has explicitly said he dislikes reduce() for most uses and moved it out of builtins into functools in Python 3 precisely to discourage casual use. There is no reason to reach for it here when itertools.chain.from_iterable() exists.

Method 6: NumPy's flatten() and ravel()

If you are working with numerical data in NumPy arrays, the library provides dedicated methods. Refer to the NumPy ndarray.flatten() documentation for the full parameter reference.

import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# flatten() returns a copy
flat_copy = matrix.flatten()
print(flat_copy)
# [1 2 3 4 5 6 7 8 9]

# ravel() returns a view when possible (more memory-efficient)
flat_view = matrix.ravel()
print(flat_view)
# [1 2 3 4 5 6 7 8 9]

The critical distinction: flatten() always returns a new array (a copy), while ravel() returns a view of the original data whenever the memory layout allows it. A view shares the same underlying data, so modifying the view modifies the original array. If you need a safe, independent copy, use flatten(). If you need speed and can tolerate the aliasing, use ravel().
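A small demonstration of the aliasing difference (assuming NumPy is installed; the array here is C-contiguous, so ravel() can return a view):

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

view = matrix.ravel()    # shares memory with matrix (layout permitting)
copy = matrix.flatten()  # always an independent allocation

view[0] = 99             # writes through to the original array
copy[1] = 77             # leaves the original untouched

print(matrix[0, 0])  # 99: the view aliased the original data
print(matrix[0, 1])  # 2: the copy did not
```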

Both methods also accept an order parameter: 'C' for row-major (the default, reads across rows), 'F' for column-major (reads down columns), 'A' for column-major if the array is Fortran-contiguous in memory and row-major otherwise, and 'K' to read elements in the order they occur in memory.

matrix = np.array([[1, 2], [3, 4], [5, 6]])

print(matrix.flatten(order='C'))  # Row-major: [1 2 3 4 5 6]
print(matrix.flatten(order='F'))  # Column-major: [1 3 5 2 4 6]
When to use it

When your data is already in NumPy arrays or you are doing numerical computation. Do not convert a list to a NumPy array just to flatten it; the conversion overhead negates any benefit.

Deep Flattening: Handling Arbitrary Nesting

For structures nested to unknown or variable depth, you need recursion (or an explicit stack):

def deep_flatten(iterable):
    """Flatten an arbitrarily nested iterable, preserving strings as atoms."""
    for item in iterable:
        if isinstance(item, (list, tuple, set, frozenset)):
            yield from deep_flatten(item)
        else:
            yield item

nested = [1, [2, [3, [4, [5]]], 6], [7, 8]]
print(list(deep_flatten(nested)))
# [1, 2, 3, 4, 5, 6, 7, 8]

The yield from syntax (introduced in PEP 380, accepted for Python 3.3) delegates to the recursive generator call, allowing the deeply nested items to bubble up through each level. Without yield from, you would need an inner loop at each recursion level to re-yield the items.

The isinstance check explicitly targets container types while treating strings as atomic values. Without this guard, a string like "hello" would be iterated character by character since strings are iterable in Python, and each single character is itself a string, leading to infinite recursion.
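To see both sides of the string trap, here is a sketch contrasting an unguarded collections.abc.Iterable check (which recurses forever on single characters) with the container-whitelist version, reproduced from above so the snippet is self-contained:

```python
from collections.abc import Iterable

def naive_flatten(iterable):
    """Unguarded version: recurses into ANY iterable, including strings."""
    for item in iterable:
        if isinstance(item, Iterable):
            yield from naive_flatten(item)
        else:
            yield item

def deep_flatten(iterable):
    """Guarded version: whitelist container types, strings stay atomic."""
    for item in iterable:
        if isinstance(item, (list, tuple, set, frozenset)):
            yield from deep_flatten(item)
        else:
            yield item

nested = [["hello", "world"], ["of", ["python"]]]

try:
    list(naive_flatten(nested))
except RecursionError:
    print("naive version: a 1-char string iterates to itself forever")

print(list(deep_flatten(nested)))
# ['hello', 'world', 'of', 'python']
```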

An iterative version using an explicit stack avoids Python's recursion limit. For a thorough look at how Python's call stack works and why the limit exists, see Recursion in Python: How It Actually Works Under the Hood.

def deep_flatten_iterative(iterable):
    """Iterative deep flatten using an explicit stack."""
    stack = list(iterable)
    stack.reverse()
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple, set, frozenset)):
            items = list(item)
            items.reverse()
            stack.extend(items)
        else:
            yield item

nested = [1, [2, [3, [4, [5]]], 6], [7, 8]]
print(list(deep_flatten_iterative(nested)))
# [1, 2, 3, 4, 5, 6, 7, 8]

The stack is seeded by converting the iterable to a list and reversing it in place with list.reverse(), which avoids the double-wrapping of list(reversed(list(...))). The same pattern is applied to each sub-container as it is expanded, ensuring elements come out in the correct left-to-right order.

Deeper Thread: yield from, tail recursion, and why Python's recursion limit exists

yield from (PEP 380) delegates to a sub-generator. In the recursive deep flatten, each level of nesting creates a new generator frame on the call stack. Unlike languages with tail-call optimization (Scheme, Haskell, Kotlin), CPython does not eliminate tail calls — every recursive call allocates a new stack frame even if the recursion is in tail position.

Python deliberately set its default recursion limit to 1,000 (sys.getrecursionlimit()) to prevent stack overflows from becoming OS-level crashes. For nested data structures you control, this is rarely a problem. For untrusted or arbitrarily deep structures, the iterative stack-based version is the correct choice because it uses heap memory (unbounded) rather than the call stack (bounded).

The practical trade-off: recursive generators are elegant and lazy — they yield values on demand. The iterative version reverses the list before pushing to the stack, which materializes each sub-list in memory. For truly enormous data, neither version is ideal; you want a streaming parser that never loads the full structure.
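A sketch of that boundary, reproducing both versions from above and nesting past the default limit (the exact depth at which the recursive version fails depends on interpreter settings):

```python
import sys

def deep_flatten(iterable):
    """Recursive version: one generator frame per nesting level."""
    for item in iterable:
        if isinstance(item, (list, tuple, set, frozenset)):
            yield from deep_flatten(item)
        else:
            yield item

def deep_flatten_iterative(iterable):
    """Iterative version: explicit stack on the heap, no frame growth."""
    stack = list(iterable)
    stack.reverse()
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple, set, frozenset)):
            items = list(item)
            items.reverse()
            stack.extend(items)
        else:
            yield item

# Wrap a single value deeper than the recursion limit allows.
nested = [1]
for _ in range(sys.getrecursionlimit() + 100):
    nested = [nested]

try:
    list(deep_flatten(nested))
except RecursionError:
    print("recursive version hit RecursionError")

print(list(deep_flatten_iterative(nested)))  # [1]
```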

Mixed-Type Structures: Tuples, Sets, and Heterogeneous Nesting

Real-world nested data is rarely a clean list of lists. You will often encounter structures where the outer container is a list but the inner containers are tuples, or where the nesting is heterogeneous — some elements are sublists, some are plain values at the same level. Understanding how each method handles these cases is essential.

Shallow methods and mixed container types. The for-loop, list comprehension, and itertools.chain.from_iterable() all work with any iterable at both the outer and inner level, because Python's iteration protocol is duck-typed. You do not need lists specifically:

import itertools

# Mix of list, tuple, range — all work
mixed = [[1, 2], (3, 4), range(5, 8)]

flat = list(itertools.chain.from_iterable(mixed))
print(flat)
# [1, 2, 3, 4, 5, 6, 7]

Heterogeneous depth — some items are sublists, some are not. This is where shallow methods break down silently. If your outer list contains a mix of sublists and plain values, iterating and unpacking will throw a TypeError on the plain values:

mixed_depth = [[1, 2], 3, [4, 5]]

# This raises TypeError: 'int' object is not iterable
flat = list(itertools.chain.from_iterable(mixed_depth))

The correct approach for heterogeneous structures is the deep flatten generator with an isinstance guard, but you need to decide what "atom" means for your data. An integer is clearly an atom. A string is an iterable but usually an atom. A named tuple is a tuple subclass that is likely an atom. You need to be explicit:

from collections.abc import Iterable

def flatten_mixed(data, atom_types=(str, bytes)):
    """Flatten heterogeneous structure where some elements may not be containers."""
    for item in data:
        if isinstance(item, Iterable) and not isinstance(item, atom_types):
            yield from flatten_mixed(item, atom_types)
        else:
            yield item

mixed_depth = [[1, 2], 3, [4, [5, 6]]]
print(list(flatten_mixed(mixed_depth)))
# [1, 2, 3, 4, 5, 6]
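For instance, treating a hypothetical Point namedtuple as an atom is just a matter of widening atom_types (flatten_mixed is reproduced from above so the snippet is self-contained):

```python
from collections.abc import Iterable
from collections import namedtuple

def flatten_mixed(data, atom_types=(str, bytes)):
    """Flatten heterogeneous structure where some elements may not be containers."""
    for item in data:
        if isinstance(item, Iterable) and not isinstance(item, atom_types):
            yield from flatten_mixed(item, atom_types)
        else:
            yield item

Point = namedtuple('Point', ['x', 'y'])
data = [[Point(1, 2), Point(3, 4)], Point(5, 6)]

# Default: namedtuples are tuples, so they get unpacked into coordinates.
print(list(flatten_mixed(data)))
# [1, 2, 3, 4, 5, 6]

# Treat Point as an atom by adding it to atom_types.
print(list(flatten_mixed(data, atom_types=(str, bytes, Point))))
# [Point(x=1, y=2), Point(x=3, y=4), Point(x=5, y=6)]
```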

NumPy and ragged arrays. NumPy's flatten() and ravel() only work on uniform ndarrays — arrays where every dimension has consistent shape. If you try to build a NumPy array from a ragged list (sublists of different lengths), NumPy will create an object array rather than a numeric array, and the flatten methods will not produce what you expect:

import numpy as np

# Ragged: sublists have different lengths
ragged = [[1, 2, 3], [4, 5]]

# NumPy creates an object array — NOT a 2D numeric array
arr = np.array(ragged, dtype=object)
print(arr.dtype)   # object
print(arr.shape)   # (2,)

# ravel() on an object array gives you the sublists as objects, not their elements
print(arr.ravel())  # [list([1, 2, 3]) list([4, 5])]

For ragged numeric data in NumPy, use np.concatenate() instead — it is designed to join arrays along an axis and handles 1D sublists correctly when they are already arrays:

import numpy as np

sublists = [np.array([1, 2, 3]), np.array([4, 5]), np.array([6, 7, 8, 9])]
flat = np.concatenate(sublists)
print(flat)
# [1 2 3 4 5 6 7 8 9]
NumPy ragged data

When you have a list of NumPy arrays of different lengths, prefer np.concatenate() over converting to a single ndarray and calling flatten(). np.concatenate() handles variable-length inputs and keeps everything in numeric (non-object) dtype.

Performance in Practice: Actual Numbers

The O(n) vs O(n²) distinction is important, but it can feel abstract until you see it. Here is a complete benchmarking script you can run yourself:

import itertools
import functools
import operator
import time

def time_it(fn, data, label, runs=5):
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        result = fn(data)
        times.append(time.perf_counter() - start)
    best = min(times)
    print(f"{label:<42} {best * 1000:.3f} ms")
    return result

# Test data: 10,000 sublists of 10 items each
data = [list(range(10)) for _ in range(10_000)]

print(f"{'Method':<42} {'Best of 5 (ms)'}")
print("-" * 58)

time_it(lambda d: [item for sub in d for item in sub],
        data, "list comprehension")
time_it(lambda d: list(itertools.chain.from_iterable(d)),
        data, "itertools.chain.from_iterable")
time_it(lambda d: [x for sub in d for x in sub],  # identical
        data, "comprehension (control repeat)")

def loop_extend(d):
    flat = []
    for sub in d:
        flat.extend(sub)
    return flat
time_it(loop_extend, data, "for loop + extend()")

def loop_append(d):
    flat = []
    for sub in d:
        for item in sub:
            flat.append(item)
    return flat
time_it(loop_append, data, "for loop + append()")

# Anti-patterns — only run on much smaller data to avoid hanging
small = [list(range(10)) for _ in range(500)]
time_it(lambda d: sum(d, []),           small, "sum(nested, []) — 500 sublists")
time_it(lambda d: functools.reduce(operator.add, d), small, "reduce(operator.add) — 500 sublists")

On CPython 3.13 running on a typical modern machine, the results look roughly like this (your numbers will differ by hardware, but the relative ordering is stable):

Method                                     Best of 5 (ms)
----------------------------------------------------------
list comprehension                         1.8 ms
itertools.chain.from_iterable              1.4 ms    ← fastest for list output
for loop + extend()                        2.1 ms
for loop + append()                        4.3 ms    ← ~2.5x slower than chain
sum(nested, []) — 500 sublists             18.2 ms   ← on 20x less data
reduce(operator.add) — 500 sublists        17.8 ms   ← same problem, same penalty

A few things to notice. First, itertools.chain.from_iterable() is generally the fastest option for producing a concrete list — its inner loop runs in C and avoids the overhead of CPython's comprehension machinery. Second, the extend()-based loop is meaningfully faster than the double-append loop, because extend() delegates its inner iteration to C. Third, the anti-patterns (sum() and reduce()) are compared on 20 times fewer sublists and are still an order of magnitude slower — at 10,000 sublists they would take several seconds.

Benchmarking caveat

Microbenchmark results depend heavily on sublist size, total element count, and Python version. For sublists of a single element, the picture shifts. For generator inputs that cannot be pre-sized, the picture shifts further. Always benchmark with data representative of your actual workload before optimizing beyond itertools.chain.from_iterable().

The PEP History: Why Python Has No Built-In flatten()

Python's relationship with flattening is shaped by three PEPs and a design philosophy.

PEP 20: The Zen of Python (1999)

Tim Peters, a renowned software engineer and long-standing CPython core developer, wrote the Zen of Python and posted it to the Python mailing list in 1999. Among its 19 aphorisms is one directly relevant to our topic: "Flat is better than nested." This principle, later formalized as PEP 20, speaks to the design of code structure and module hierarchies rather than data manipulation, but the philosophical underpinning is clear: Python's designers value flat structures.

Ironically, while the language philosophically prefers flat over nested, it has historically not provided a first-class tool for making nested data flat.

1999
PEP 20 — The Zen of Python
Tim Peters codifies "Flat is better than nested." The language philosophy endorses flatness without providing a mechanism to achieve it in data.
2001
itertools module introduced
Raymond Hettinger adds itertools.chain to the standard library. chain.from_iterable() becomes the de facto flattening tool for years — an expert-level tool for a beginner-level need.
2013–2015
PEP 448 — Additional Unpacking Generalizations
An early draft included [*item for item in ranges] as a flattening syntax. Removed before acceptance due to readability concerns. The PEP explicitly left the door open for future proposals.
2021–2023
Recurring python-ideas and Discourse threads
The idea of unpacking in comprehensions was raised in a 2021 Discourse thread and resurfaced in subsequent years without reaching critical mass or attracting a sponsor.
June 2025
Pre-PEP posted by Adam Hartz and Erik Demaine
The formal proposal reaches the Python discussion forum. CPython core developer Jelle Zijlstra agrees to sponsor it as PEP 798, noting he had missed the feature himself multiple times.
September 2025
Steering Council review — framing shift
The council rejects the framing of "superior clarity" (too subjective) and requests the argument be repositioned around syntactic consistency with existing unpacking. Hartz agrees and revises the PEP.
November 2025
PEP 798 unanimously accepted
The Steering Council votes unanimously in favor. The syntax [*sub for sub in nested] becomes an official part of the Python language specification, targeted for Python 3.15.
October 2026
Python 3.15 ships
PEP 798 syntax available in stable release. 27 years after "Flat is better than nested" was written, Python finally has an idiomatic syntax for making nested data flat.

PEP 448: Additional Unpacking Generalizations (2013–2015)

PEP 448, authored by Joshua Landau, proposed extending the * and ** unpacking operators to work in more contexts. It was accepted by Guido van Rossum on February 25, 2015, and shipped with Python 3.5.

PEP 448 is the reason you can write things like:

a = [1, 2]
b = [3, 4]
combined = [*a, *b]  # [1, 2, 3, 4]

d1 = {'x': 1}
d2 = {'y': 2}
merged = {**d1, **d2}  # {'x': 1, 'y': 2}

Critically, an earlier draft of PEP 448 did include support for unpacking inside comprehensions as a flattening mechanism. The syntax [*item for item in ranges] would have given Python a one-line flatten in 2015. However, the PEP documents that this feature was removed before acceptance. According to PEP 448, the comprehension-based unpacking feature faced strong readability objections alongside limited support, and was ultimately excluded to avoid holding back the proposal's less contentious elements (PEP 448, Landau, 2015).

The PEP explicitly left the door open for future proposals to address unpacking inside comprehensions (PEP 448, Landau, 2015).

PEP 798: Unpacking in Comprehensions (2025)

A decade after PEP 448, the door finally opened. PEP 798, authored by Adam Hartz and Erik Demaine, picks up exactly where PEP 448 left off. The idea was raised in a Discourse thread in October 2021, and reached critical mass in June 2025 when Hartz posted a formal pre-PEP to the Python discussion forum.

In July 2025, CPython core developer Jelle Zijlstra reviewed the proposal and stated on the Python discussion forum that the feature is a nice addition that he had missed several times in the past, and agreed to sponsor it as PEP 798.

The PEP was submitted to the Python Steering Council in September 2025. Council member Pablo Galindo Salgado responded that the council was uncomfortable positioning the new syntax as offering superior clarity compared to existing alternatives such as itertools.chain.from_iterable, explicit loops, or nested comprehensions, because readability is subjective. Instead, the council suggested that the stronger argument was one of syntactic consistency, extending Python's existing unpacking patterns naturally into comprehensions. Hartz agreed and revised the PEP accordingly.

In early November 2025, the Steering Council unanimously accepted PEP 798. The feature is targeted for Python 3.15, expected in October 2026. As of early 2026, it is already available for testing in Python 3.15 alpha builds.

Here is what the syntax looks like:

# Python 3.15+ (PEP 798)
nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flat = [*sublist for sublist in nested]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Works with sets (union)
all_items = {*sublist for sublist in nested}
# {1, 2, 3, 4, 5, 6, 7, 8, 9}

# Works with dictionaries (merge)
dicts = [{'a': 1}, {'b': 2}, {'c': 3}]
merged = {**d for d in dicts}
# {'a': 1, 'b': 2, 'c': 3}

# Works with generators (lazy)
gen = (*sublist for sublist in nested)
# generator that yields 1, 2, 3, 4, 5, 6, 7, 8, 9

Compare the new syntax to the current double-loop comprehension:

# Current (Python 3.x)
flat = [item for sublist in nested for item in sublist]

# PEP 798 (Python 3.15+)
flat = [*sublist for sublist in nested]

The PEP 798 version reads naturally to anyone familiar with unpacking: "take each sublist, unpack it, collect everything into a list." The double-loop version requires you to mentally parse two for clauses and track which variable belongs to which loop.

Choosing the Right Approach: A Decision Framework

Here is how to decide which flattening method to use based on your actual situation.

Use itertools.chain.from_iterable() when you are processing large data, working with generators or lazy iterables, or writing library-quality code that other developers will maintain. This is the workhorse. It handles arbitrary iterables (not just lists), operates lazily, and signals intent through a well-known standard library function.
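A quick sketch of the lazy behavior. The input here is itself a generator, and chain.from_iterable() pulls elements one at a time rather than materializing anything up front:

```python
import itertools

# A generator of sublists: nothing is materialized up front.
nested = (range(i, i + 3) for i in (0, 3, 6))

# chain.from_iterable is itself lazy; elements are pulled on demand.
flat = itertools.chain.from_iterable(nested)

first = next(flat)   # consumes only a single element
rest = list(flat)    # drains the remainder
# first == 0, rest == [1, 2, 3, 4, 5, 6, 7, 8]
```

Because nothing is buffered, this pattern works even when the nested input is too large to hold in memory at once.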

Use a list comprehension ([item for sub in nested for item in sub]) for straightforward cases in scripts, notebooks, or application code where the data fits comfortably in memory and the team is fluent in comprehension syntax.

Use [*sub for sub in nested] once your project targets Python 3.15 or later. This will become the new idiomatic approach for shallow flattening, endorsed by the language specification itself.

Use explicit for loops when you are teaching, debugging, or working in a codebase where clarity matters more than conciseness. Prefer the extend() variant over the double-append loop — it is more readable and faster, because the inner iteration is handled in C.
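The two loop styles side by side, as a minimal sketch:

```python
nested = [[1, 2, 3], [4, 5], [6, 7, 8]]

# Double-append: one Python-level append call per element.
flat_append = []
for sublist in nested:
    for item in sublist:
        flat_append.append(item)

# extend(): one call per sublist; the per-element copying happens in C.
flat_extend = []
for sublist in nested:
    flat_extend.extend(sublist)

# Both produce [1, 2, 3, 4, 5, 6, 7, 8]
```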

Use NumPy's flatten() or ravel() when your data is already in uniform NumPy arrays. Use np.concatenate() when you have a list of NumPy arrays of different lengths. Never convert pure Python lists to arrays just for flattening.
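A sketch of the distinction, assuming NumPy is installed:

```python
import numpy as np

# Uniform rows form a 2-D array: flatten() always copies,
# ravel() returns a view when the memory layout allows it.
arr = np.array([[1, 2, 3], [4, 5, 6]])
flat_copy = arr.flatten()
flat_view = arr.ravel()

# Rows of unequal length cannot form a 2-D array; concatenate the
# individual 1-D arrays instead.
ragged = [np.array([1, 2, 3]), np.array([4, 5]), np.array([6])]
joined = np.concatenate(ragged)
# joined -> array([1, 2, 3, 4, 5, 6])
```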

Use a recursive generator for deep flattening of arbitrarily nested structures. Keep the isinstance guard tight to avoid infinite recursion on strings. For heterogeneous structures where some elements are not containers at all, the same generator pattern applies — the guard handles mixed depth naturally.

Never use sum(nested, []) or functools.reduce(operator.add, nested) for flattening. Both repeatedly concatenate lists, and each concatenation allocates a new list and copies every previously accumulated element, making them O(n²) and unsuitable for anything beyond the smallest toy examples.
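Both collapse to the same repeated-concatenation pattern; a small sketch of why:

```python
import functools
import operator

nested = [[1, 2], [3, 4], [5]]

# Correct on toy inputs, but every + allocates a brand-new list and
# re-copies everything accumulated so far, so total work grows
# quadratically with the number of elements.
via_sum = sum(nested, [])
via_reduce = functools.reduce(operator.add, nested)
# Both -> [1, 2, 3, 4, 5]
```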

A Complete, Production-Ready Flatten Utility

Bringing everything together, here is a utility module that handles both shallow and deep flattening with proper type handling:

"""flatten.py - Production-ready flattening utilities."""

import itertools
from collections.abc import Iterable


def shallow_flatten(iterable_of_iterables):
    """Flatten one level of nesting. Returns an iterator.

    >>> list(shallow_flatten([[1, 2], [3, 4], [5]]))
    [1, 2, 3, 4, 5]
    >>> list(shallow_flatten([range(3), range(3, 6)]))
    [0, 1, 2, 3, 4, 5]
    """
    return itertools.chain.from_iterable(iterable_of_iterables)


def deep_flatten(iterable, *, max_depth=None,
                 atom_types=(str, bytes, bytearray)):
    """Recursively flatten nested iterables.

    Args:
        iterable: The nested structure to flatten.
        max_depth: Maximum nesting levels to flatten. None means unlimited.
        atom_types: Types to treat as atoms (not iterated into).

    >>> list(deep_flatten([1, [2, [3, [4]]]]))
    [1, 2, 3, 4]
    >>> list(deep_flatten([1, [2, [3, [4]]]], max_depth=1))
    [1, 2, [3, [4]]]
    >>> list(deep_flatten(['hello', ['world']]))
    ['hello', 'world']
    """
    return _deep_flatten_inner(iterable, 0, max_depth, atom_types)


def _deep_flatten_inner(iterable, depth, max_depth, atom_types):
    for item in iterable:
        if (isinstance(item, Iterable)
                and not isinstance(item, atom_types)
                and (max_depth is None or depth < max_depth)):
            yield from _deep_flatten_inner(item, depth + 1,
                                           max_depth, atom_types)
        else:
            yield item


if __name__ == '__main__':
    import doctest
    doctest.testmod()

This module uses collections.abc.Iterable for a robust check instead of hard-coding container types, treats strings and bytes as atoms by default (preventing the infinite recursion problem), and supports a max_depth parameter for controlled partial flattening. The depth counter is kept entirely private inside _deep_flatten_inner, so callers cannot accidentally corrupt the recursion state. Both functions return iterators for memory efficiency.

The Philosophical Thread

Python's creator Guido van Rossum has described the joy of coding Python as seeing concise, readable code that expresses substantial action in minimal space (van Rossum, Computer Programming for Everybody proposal, 1999). Flattening is a microcosm of this philosophy. The language gives you multiple tools, each optimized for a different balance of readability, performance, and expressiveness.

The Zen of Python (PEP 20, Tim Peters, 1999) states a preference for one obvious way to accomplish any task, while acknowledging that what is obvious may not be immediately apparent. For a decade, there was no single obvious way to flatten. The double-loop comprehension was concise but confusing. The itertools.chain.from_iterable call was correct but verbose. PEP 798 represents the community finally converging on an answer: [*sub for sub in nested] is the flatten syntax Python was always meant to have.

It took a nearly-dropped feature from a 2013 PEP draft, a decade of intermittent community discussion, students on a written exam independently inventing the set version of the syntax because it felt natural, an idea raised on Discourse in 2021, a formal pre-PEP in 2025, a core developer sponsorship, a Steering Council review focused on syntactic consistency over claims of superior readability, and a unanimous acceptance vote in November 2025. That is how Python evolves: deliberately, with real evidence, in the service of making the right way the easy way.

27 years after Tim Peters wrote "Flat is better than nested," Python finally has a syntax for making nested data flat. The aphorism became the specification.

Now go flatten something.
