Python’s zip() looks simple, but it hides decades of design decisions: how it replaced map(None, ...), why it truncates by default, what changed in Python 3, and how strict=True finally made length mismatches fail loudly.
You’ve probably used zip() dozens of times. The difference between “works” and “correct” often comes down to whether you understand truncation, laziness, and how strict=True fails (often after partial iteration).
Before zip() Existed
Before zip() arrived in Python 2.0, lockstep iteration over multiple sequences was commonly written with map(None, ...), an idiom that kept working throughout Python 2:
# Python 2-era pattern (historical)
a = (1, 2, 3)
b = (4, 5, 6)
for pair in map(None, a, b):
    print(pair)
# (1, 4)
# (2, 5)
# (3, 6)
The biggest pitfall was behavior on unequal lengths: the shorter inputs were padded with None.
# Historical Python 2 behavior
a = (1, 2, 3)
c = (4, 5, 6, 7)
print(map(None, a, c))
# [(1, 4), (2, 5), (3, 6), (None, 7)]
PEP 201 explained why this idiom was problematic: it was non-obvious to programmers without a functional-programming background, and passing None as a “function” was confusing. The silent None padding also made length mismatches easy to miss.
PEP 201: The Birth of zip()
PEP 201 (Barry Warsaw, July 2000) introduced zip() in Python 2.0 as a built-in for “lockstep iteration.” The key semantics were intentionally simple:
- Accept one or more inputs
- Produce tuples of corresponding items
- Stop when the shortest input is exhausted
Guido van Rossum preferred the name zip() (borrowing from Haskell), rejected optional padding for KISS reasons, and rejected the original suggestion that zip() be lazy in its initial implementation.
How zip() Works (Iterator Semantics)
zip() aggregates the i-th element from each input iterable into the i-th output tuple:
names = ["Alice", "Bob", "Charlie"]
scores = [95, 87, 92]
grades = ["A", "B+", "A-"]
print(list(zip(names, scores, grades)))
# [('Alice', 95, 'A'), ('Bob', 87, 'B+'), ('Charlie', 92, 'A-')]
A good mental model is matrix transposition (rows ⇄ columns):
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
print([list(row) for row in zip(*matrix)])
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Iterator vs Snapshot
zip() does not snapshot values. It stores iterators and pulls items at iteration time. If the underlying iterable is mutable (like a list), element mutations can be observed during iteration:
names = ["Alice", "Bob", "Charlie"]
ids = [1, 2, 3]
z = zip(names, ids)
print(next(z)) # ('Alice', 1)
names[1] = "MODIFIED"
print(next(z)) # ('MODIFIED', 2)
In other words: the iterators are created when zip() is called, but element values are read when the iterator is consumed.
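If mutation during iteration is a concern, one option is to materialize the pairs before anything else can touch the inputs. The following is a minimal sketch that reuses the lists from the example above; the variable name snapshot is introduced here purely for illustration.

```python
names = ["Alice", "Bob", "Charlie"]
ids = [1, 2, 3]

# list() drains the zip iterator immediately, so every tuple is built now.
snapshot = list(zip(names, ids))

# Replacing a list slot afterwards does not change the stored tuples.
names[1] = "MODIFIED"

print(snapshot[1])  # ('Bob', 2)
```

The tuples still hold references to the original objects, so this guards against replaced or reordered elements, not against in-place mutation of a mutable element.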
Python 3: zip() Gets Lazy
Python 3 changed several built-ins (including zip()) to return iterators instead of materialized lists (see PEP 3100).
# Python 3
z = zip([1, 2, 3], [4, 5, 6])
print(z) # <zip object at 0x...>
print(list(z)) # [(1, 4), (2, 5), (3, 6)]
print(list(z)) # [] (consumed)
This matters for performance: it avoids allocating a list of tuples up front and allows streaming consumption.
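To make the streaming behavior visible, the sketch below wraps each input in a small generator that records what has been pulled. The helper tracked and the list pulled are hypothetical names used only for this demonstration.

```python
pulled = []  # records every item actually requested from the inputs


def tracked(items):
    """Yield items one by one, logging each as it is pulled."""
    for item in items:
        pulled.append(item)
        yield item


z = zip(tracked("abc"), tracked([1, 2, 3]))
pulled_before = list(pulled)  # creating the zip object pulls nothing

first = next(z)  # pulls exactly one item from each input
pulled_after = list(pulled)

print(pulled_before)  # []
print(first)          # ('a', 1)
print(pulled_after)   # ['a', 1]
```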
No-Argument Behavior and the Unzip Edge Case
In early Python 2, zip() with no arguments raised TypeError. This was changed in Python 2.4 to return an empty result, enabling the inverse “unzip” pattern to behave sensibly on empty inputs:
print(list(zip())) # []
However, unzipping an empty list of pairs still fails at the unpacking layer:
pairs = []
a, b = zip(*pairs)
# ValueError: not enough values to unpack (expected 2, got 0)
Safer pattern:
pairs = []
if pairs:
    a, b = zip(*pairs)
else:
    a, b = (), ()
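If this guard is needed in several places, it could be wrapped in a small helper. The sketch below assumes the caller knows how many columns to expect; unzip and width are illustrative names, not a standard-library API.

```python
def unzip(rows, width):
    """Split rows into `width` tuples of columns, even when rows is empty."""
    columns = tuple(zip(*rows))
    # zip(*[]) yields nothing, so fall back to `width` empty tuples.
    return columns if columns else tuple(() for _ in range(width))


names, scores = unzip([("Alice", 95), ("Bob", 87)], width=2)
empty_a, empty_b = unzip([], width=2)

print(names, scores)     # ('Alice', 'Bob') (95, 87)
print(empty_a, empty_b)  # () ()
```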
Default Truncation and Infinite Iterables
By design, zip() stops when the shortest input is exhausted:
names = ["Alice", "Bob", "Charlie", "Diana"]
scores = [95, 87, 92]
print(list(zip(names, scores)))
# [('Alice', 95), ('Bob', 87), ('Charlie', 92)]
This behavior is essential when combining finite and infinite iterables:
from itertools import count
for i, x in zip(count(), ["a", "b", "c"]):
    print(i, x)
# 0 a
# 1 b
# 2 c
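One consequence of truncation is worth seeing with plain iterators: zip() pulls from its arguments left to right, so it may consume one item from a longer input before it discovers that a later input is exhausted. The sketch below illustrates this with two throwaway iterators; all variable names are illustrative.

```python
# Longer input first: zip pulls 'c' before finding that numbers is empty.
letters = iter("abcd")
numbers = iter([1, 2])
pairs = list(zip(letters, numbers))
leftover = list(letters)

print(pairs)     # [('a', 1), ('b', 2)]
print(leftover)  # ['d']  ('c' was pulled and discarded)

# Shorter input first: zip stops before touching the longer input again.
letters = iter("abcd")
numbers = iter([1, 2])
pairs_short_first = list(zip(numbers, letters))
leftover_short_first = list(letters)

print(leftover_short_first)  # ['c', 'd']
```

If the leftover items matter, putting the shorter (or finite) input first, or avoiding reuse of the iterators afterwards, sidesteps the surprise.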
PEP 618: strict (Python 3.10)
Silent truncation is also a common source of bugs when lengths are expected to match. PEP 618 added strict=True to make mismatches fail loudly in Python 3.10+:
names = ["Alice", "Bob", "Charlie"]
scores = [95, 87]
print(list(zip(names, scores)))
# [('Alice', 95), ('Bob', 87)]
print(list(zip(names, scores, strict=True)))
# ValueError: zip() argument 2 is shorter than argument 1
How strict Works Under the Hood
strict=True does not pre-check len(). It detects mismatches at iteration time:
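- zip advances all iterators in lockstep, pulling from them left to right.
- When an iterator other than the first is exhausted, the earlier inputs have already produced an item for that step, so ValueError is raised.
- When the first iterator is exhausted, zip probes each of the others one more step. If any of them still yields an item, ValueError is raised.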
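The steps above can be approximated in pure Python. This is only a sketch: the real built-in is implemented in C, the name strict_zip_sketch is hypothetical, and the error messages are simplified rather than copied from CPython.

```python
def strict_zip_sketch(*iterables):
    """Pure-Python approximation of zip(..., strict=True)."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    sentinel = object()  # unique marker meaning "this iterator is exhausted"
    while True:
        items = []
        for index, iterator in enumerate(iterators):
            value = next(iterator, sentinel)
            if value is sentinel:
                if index > 0:
                    # Earlier inputs already produced an item this round.
                    raise ValueError(f"argument {index + 1} is shorter")
                # First input is exhausted: probe the rest one more step.
                for other, rest in enumerate(iterators[1:], start=2):
                    if next(rest, sentinel) is not sentinel:
                        raise ValueError(f"argument {other} is longer")
                return
            items.append(value)
        yield tuple(items)


seen = []
error = None
try:
    for pair in strict_zip_sketch(["Alice", "Bob"], [95, 87, 92]):
        seen.append(pair)
except ValueError as exc:
    error = str(exc)

print(seen)   # [('Alice', 95), ('Bob', 87)]
print(error)  # argument 2 is longer
```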
This means the loop body can run for matching pairs before failing (important if you have side effects):
def gen_names():
    yield "Alice"
    yield "Bob"
def gen_scores():
    yield 95
    yield 87
    yield 92
for name, score in zip(gen_names(), gen_scores(), strict=True):
    print(f"{name}: {score}")
# Alice: 95
# Bob: 87
# ValueError (after partial processing)
If you need all-or-nothing semantics, validate lengths or counts explicitly before processing, or stage results before side effects.
The zip() Family: Picking the Right Tool
Python gives you three primary strategies:
from itertools import zip_longest
short = [1, 2]
long = [10, 20, 30, 40]
print(list(zip(short, long)))
# [(1, 10), (2, 20)]
# list(zip(short, long, strict=True))
# ValueError
print(list(zip_longest(short, long, fillvalue=0)))
# [(1, 10), (2, 20), (0, 30), (0, 40)]
- Use default zip() when truncation is intentional (including finite + infinite scenarios).
- Use strict=True when mismatch indicates a bug.
- Use zip_longest() when you need full coverage with explicit padding.
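On interpreters older than Python 3.10, where strict=True is unavailable, a similar check can be approximated by combining zip_longest() with a private sentinel. This is a sketch under that assumption; zip_checked and _MISSING are hypothetical names, and the error message is not the one the built-in produces.

```python
from itertools import zip_longest

_MISSING = object()  # unique marker that cannot appear in real data


def zip_checked(*iterables):
    """Yield tuples like zip(), raising ValueError on a length mismatch."""
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        # A sentinel in the tuple means at least one input ran out early.
        if any(item is _MISSING for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


matched = list(zip_checked([1, 2], [10, 20]))
print(matched)  # [(1, 10), (2, 20)]

mismatch = None
try:
    list(zip_checked([1, 2], [10, 20, 30]))
except ValueError as exc:
    mismatch = str(exc)
print(mismatch)  # iterables have different lengths
```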
Real Patterns That Matter (Expanded)
Dictionary Construction
keys = ["name", "age", "city"]
values = ["Alice", 30, "Portland"]
user = dict(zip(keys, values, strict=True))
print(user)
# {'name': 'Alice', 'age': 30, 'city': 'Portland'}
Unzip (Transpose Back)
pairs = [("Alice", 95), ("Bob", 87), ("Charlie", 92)]
names, scores = zip(*pairs)
print(names) # ('Alice', 'Bob', 'Charlie')
print(scores) # (95, 87, 92)
Sliding Windows
Classic pattern (creates slices / copies):
data = [10, 20, 30, 40, 50]
print(list(zip(data, data[1:])))
# [(10, 20), (20, 30), (30, 40), (40, 50)]
Modern alternative (Python 3.10+), clearer and avoids slicing copies:
from itertools import pairwise
data = [10, 20, 30, 40, 50]
print(list(pairwise(data)))
# [(10, 20), (20, 30), (30, 40), (40, 50)]
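For windows wider than two, the same zip() idea extends to any iterable by staggering independent copies of it. The sketch below uses itertools.tee and itertools.islice; the helper name windows is illustrative and is not an itertools function.

```python
from itertools import islice, tee


def windows(iterable, size):
    """Return an iterator of overlapping tuples of `size` consecutive items."""
    copies = tee(iterable, size)
    # Skip the first i items of the i-th copy so the copies are staggered.
    staggered = [islice(copy, i, None) for i, copy in enumerate(copies)]
    # zip truncates to the shortest copy, which ends the windows cleanly.
    return zip(*staggered)


triples = list(windows([10, 20, 30, 40, 50], 3))
print(triples)
# [(10, 20, 30), (20, 30, 40), (30, 40, 50)]
```

Here the default truncation is exactly what is wanted: the most-advanced copy runs out first and stops the output.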
Ragged Matrix Transposition
zip(*matrix) truncates to the shortest row. If rows are ragged and you want padding, use zip_longest:
from itertools import zip_longest
matrix = [
    [1, 2, 3],
    [4, 5],
    [6, 7, 8],
]
print([list(r) for r in zip(*matrix)])
# [[1, 4, 6], [2, 5, 7]] (truncated)
print([list(r) for r in zip_longest(*matrix, fillvalue=None)])
# [[1, 4, 6], [2, 5, 7], [3, None, 8]]
Common Mistakes and Reliability Tips
- Forgetting that zip() is single-pass in Python 3. Once consumed, it’s exhausted.
- Ignoring silent data loss. If lengths should match, use strict=True.
- Relying on pairing from unordered iterables. Sets are not aligned; pairing two sets produces arbitrary matches.
- Mutating inputs mid-iteration. Lazy iteration means you may observe mutations unexpectedly.
- Assuming strict=True prevents partial work. It can raise after some pairs have already been processed.
Timeline
PEP 201 (2000) — Introduced zip() (Python 2.0), list-returning, truncating semantics.
Python 2.4 (2004) — zip() with no arguments returns an empty result.
PEP 3100 (2008) — Python 3.0: zip() becomes a lazy iterator.
PEP 618 (2020) — Added strict parameter (shipped in Python 3.10, 2021).
Python 3.10 (2021) — itertools.pairwise() added (useful companion to zip() windowing patterns).