Looping is the heartbeat of programming. Every program that processes data, interacts with users, or automates tasks relies on loops to repeat operations. Python provides several distinct approaches to iteration—each with its own purpose, performance profile, and place in idiomatic code.
This article doesn't rehash the basics you've seen in a thousand tutorials. We're going to trace the evolution of looping in Python through the actual Python Enhancement Proposals (PEPs) that shaped them, surface quotes from the language designers who debated these features, and demonstrate real code that solves real problems.
The for Loop: Python's Workhorse
The problem it solved: Before PEP 234, iterating a dictionary required calling .keys() to materialize an entire list in memory first. File objects and non-sequential containers couldn't be looped over at all without that workaround. The for loop as it exists today was redesigned to eliminate that constraint entirely.
The for loop in Python is fundamentally different from what you'd find in C, Java, or JavaScript. There is no index initialization, no condition check, no manual increment. Python's for iterates directly over the items of any iterable object—a list, a string, a dictionary, a file, a generator, or any custom object that implements the iterator protocol.
server_logs = ["INFO: request received", "WARN: high latency", "ERROR: timeout"]
for entry in server_logs:
    if entry.startswith("ERROR"):
        print(f"Alert triggered: {entry}")
This directness was no accident. Guido van Rossum, Python's creator, has long argued that the programmer's time matters more than the machine's. In a 2020 interview published on the Dropbox Blog, he stated that in Python, "every symbol you type is essential." That economy of expression shows up most clearly in the for loop.
Before Python 2.2, the for loop relied on the sequence protocol—objects needed to implement __getitem__() and accept sequential integer indices. The interpreter would call __getitem__(0), then __getitem__(1), and so on, until it caught an IndexError. It worked, but it was a hack. Dictionaries, file objects, and other non-sequential containers couldn't be iterated directly. You had to call .keys() on a dictionary first, producing an entire list in memory, and then iterate over that list.
PEP 234, authored by Ka-Ping Yee and Guido van Rossum and implemented for Python 2.2, changed everything. It introduced the iterator protocol: any object can define an __iter__() method that returns an iterator, and that iterator provides a __next__() method that yields one item at a time. The PEP reshaped how the for loop works at the bytecode level. According to PEP 234, the bytecode generated for for loops was updated to use two new opcodes—GET_ITER and FOR_ITER—that drive iteration through the iterator protocol rather than the old sequence protocol. This change made it possible to iterate over dictionaries, files, generators, and any custom iterable with a simple for loop—no intermediate list needed.
Iterable vs. Iterator: A Distinction Worth Getting Right
PEP 234 draws a clear line between two things that beginners frequently conflate. An iterable is any object that implements __iter__() and returns an iterator. A list, a string, a dictionary—these are iterables. An iterator is the object that actually does the traversal: it implements both __iter__() (returning itself) and __next__(), and it maintains state between calls. Iterators are one-shot. Once they're exhausted, they're done.
The practical consequence is this: you can loop over a list as many times as you like because each for loop calls __iter__() and gets a fresh iterator. But if you hold onto an iterator directly, a second loop over it produces nothing.
Think of an iterable as a recipe book and an iterator as your finger tracking the current line. The book (iterable) doesn't move — you can open it again and start a fresh read anytime. The finger (iterator) is one-shot: it tracks where you are, but once it falls off the last page, it can't reset itself. Calling iter() on the book creates a new finger each time.
"Iterable" and "iterator" are two distinct things. A list is iterable but not an iterator. A generator is both. When people say "you can loop over it," they mean it's an iterable — not necessarily that it implements __next__(). Conflating the two leads to subtle bugs when passing objects to functions that call next() directly.
numbers = [10, 20, 30]
# Iterable: reusable, each loop gets a fresh iterator
for n in numbers:
    print(n)  # 10 20 30
for n in numbers:
    print(n)  # 10 20 30 --- works again, no problem
# Iterator: one-shot
it = iter(numbers)
for n in it:
    print(n)  # 10 20 30
for n in it:
    print(n)  # --- nothing. The iterator is exhausted.
Generators are iterators, not reusable iterables. If you pass a generator to two functions that each try to iterate it, only the first gets the data. This is a common source of bugs when passing generator expressions around. When in doubt, convert to a list first—but only if memory allows it.
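Here is a minimal sketch of that bug (the `squares` generator and both consumers are illustrative): the first consumer drains the generator, and the second sees nothing.

```python
# A generator is an iterator: whoever iterates it first consumes it
def squares(n):
    for i in range(n):
        yield i * i

gen = squares(4)
first = sum(gen)   # consumes every value: 0 + 1 + 4 + 9
second = sum(gen)  # the generator is already exhausted
print(first, second)  # 14 0
```

If both consumers genuinely need the data, materialize once with `list(squares(4))` and pass the list to each.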
# Post PEP 234: iterate directly over dictionary keys
config = {"host": "localhost", "port": 8080, "debug": True}
for key in config:
    print(f"{key} = {config[key]}")
The old __getitem__() fallback still works for backward compatibility, but every modern Python object that supports iteration implements the iterator protocol from PEP 234.
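As a sketch of that fallback (the `Countdown` class is illustrative), a class that defines only `__getitem__()` is still iterable: Python calls it with 0, 1, 2, ... until it raises IndexError.

```python
# Legacy sequence protocol: no __iter__() defined, yet iterable
class Countdown:
    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Python calls __getitem__(0), __getitem__(1), ... in order
        if index >= self.start:
            raise IndexError  # signals the end of iteration
        return self.start - index

print(list(Countdown(3)))  # [3, 2, 1]
```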
Under the hood: what the bytecode shows
PEP 234 changed the bytecode Python generates for for loops, replacing the old sequence-protocol stepping with two new opcodes: GET_ITER (calls __iter__() once at loop start) and FOR_ITER (calls __next__() on each step, catching StopIteration to exit). You can see this directly with dis:
import dis
def loop_example():
    items = [1, 2, 3]
    for x in items:
        pass
dis.dis(loop_example)
# ...
# GET_ITER <-- calls items.__iter__(), returns an iterator
# FOR_ITER <-- calls iterator.__next__() each iteration
# ... <-- StopIteration caught here, jumps to end of loop
The FOR_ITER opcode handles StopIteration internally at the C level — it never surfaces to Python-level exception handling unless you catch it explicitly, which is why a for loop exits cleanly when an iterator is exhausted.
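The same machinery can be spelled out in Python itself. This is a rough desugaring of a for loop, not what the interpreter literally runs, but it mirrors what GET_ITER and FOR_ITER do:

```python
items = ["a", "b", "c"]

# Roughly what `for x in items: print(x)` does under the hood
iterator = iter(items)      # GET_ITER: calls items.__iter__()
while True:
    try:
        x = next(iterator)  # FOR_ITER: calls iterator.__next__()
    except StopIteration:
        break               # iterator exhausted -- exit cleanly
    print(x)
```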
for with range(): Counted Iteration
When you do need a numeric counter, range() is the idiomatic tool. It generates a lazy sequence of integers without building a list in memory (in Python 3, range() returns a range object, not a list).
# Simulate retry logic for a network request
import time
max_retries = 5
for attempt in range(1, max_retries + 1):
    print(f"Connection attempt {attempt}...")
    # Simulated success on attempt 3
    if attempt == 3:
        print("Connected.")
        break
    time.sleep(0.5)
else:
    print("All retries exhausted. Connection failed.")
Note the else clause on that loop. The else block executes only if the loop completes normally—that is, without hitting a break. It's perfect for search-and-fail patterns and retry logic.
The else on a loop does not mean "run this if the loop was empty" or "run this if the condition was false." It means "run this if the loop was not exited via break." An empty loop's else still runs. A loop that finishes all iterations without break runs the else. Only a break prevents it. The naming causes genuine confusion — many experienced Python developers avoid for/else entirely for this reason, preferring a boolean flag instead.
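For comparison, here is the boolean-flag spelling of the same search-and-fail pattern (values are illustrative). It is more verbose, but the intent survives a careless refactor better than for/else does:

```python
# Search-and-fail without for/else: an explicit flag
needles = [4, 8, 15]
target = 8

found = False
for n in needles:
    if n == target:
        found = True
        break

if found:
    print("target found")
else:
    print("target not found")
```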
Iterating in Reverse and with a Step
A complete treatment of iteration has to cover two cases that often go unaddressed: what if you need to go backwards, or skip elements? Python handles both cleanly without forcing you to manually compute indices.
To iterate a sequence in reverse, use reversed(). It returns a reverse iterator without copying the sequence, making it memory-efficient for lists and other sequences that support __reversed__(). For numeric ranges, pass a negative step to range(start, stop, step).
# Iterate a list in reverse
alerts = ["low", "medium", "high", "critical"]
for alert in reversed(alerts):
    print(alert)
# critical, high, medium, low

# Countdown with range(start, stop, step)
for i in range(10, 0, -1):
    print(i)
# 10 9 8 7 6 5 4 3 2 1

# Every other element (step of 2)
ports = [80, 443, 8080, 8443, 22, 3389]
for port in ports[::2]:
    print(port)
# 80, 8080, 22
reversed() requires the object to support either __reversed__() or both __len__() and __getitem__(). It does not work on arbitrary iterators or generators. If you need to reverse a generator's output, you have to materialize it first: list(gen)[::-1].
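A quick sketch of both halves of that claim (the generator here is illustrative): reversed() raises TypeError on a generator, and materializing first sidesteps it.

```python
# reversed() rejects generators -- materialize first (costs memory)
gen = (n * n for n in range(5))
try:
    reversed(gen)
except TypeError:
    # The failed reversed() call consumed nothing, so the
    # generator still holds all its values
    backwards = list(gen)[::-1]
print(backwards)  # [16, 9, 4, 1, 0]
```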
break, continue, and Loop Control
Two keywords this article has touched on but not yet addressed directly: break exits the nearest enclosing loop immediately. continue skips the rest of the current iteration and jumps straight to the next one. Both work identically in for and while loops.
log_entries = [
    "INFO: service started",
    "INFO: request received",
    "ERROR: null pointer exception",
    "INFO: retrying",
    "ERROR: disk full",
]

# continue: skip non-errors, process only errors
for entry in log_entries:
    if not entry.startswith("ERROR"):
        continue
    print(f"Escalating: {entry}")

# break: stop at the first critical condition
for entry in log_entries:
    if "disk full" in entry:
        print("Storage alert. Halting ingestion.")
        break
In nested loops, break exits only the inner loop it belongs to, not all enclosing loops. Python has no labeled break. If you need to exit multiple levels at once, the cleanest approach is to refactor the inner loop into a function and use return, or raise an exception and catch it outside the loops.
# break only exits the inner loop
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
target = 5
def find_in_matrix(matrix, target):
    """Return index when found, or None. Uses return to exit both loops at once."""
    for row_idx, row in enumerate(matrix):
        for col_idx, val in enumerate(row):
            if val == target:
                return row_idx, col_idx
    return None
print(find_in_matrix(matrix, target)) # (1, 1)
for with enumerate(): Index + Value
One of the most common antipatterns in Python is writing for i in range(len(my_list)) to get both an index and a value. Raymond Hettinger's PEP 279 (created January 30, 2002, accepted for Python 2.3) introduced enumerate() specifically to kill this pattern. The PEP's rationale is direct: just as zip() solves the problem of looping over multiple sequences, enumerate() "solves the loop counter problem."
The PEP also reveals something interesting about the naming process. The function was nearly called indexed(), iterindexed(), or count(). Hettinger noted in PEP 279 that all names involving "count" had "the further disadvantage of implying that the count would begin from one instead of zero," while names involving "index" clashed with database terminology. The community response, as documented in the PEP, was "close to 100% favorable" for what became enumerate().
# Parse a config file and report line-specific errors
config_lines = [
    "host=localhost",
    "port=8080",
    "debug",  # Missing value---this is a problem
    "timeout=30",
]
for line_num, line in enumerate(config_lines, start=1):
    if "=" not in line:
        print(f"Syntax error on line {line_num}: '{line}' (missing '=' separator)")
The start parameter wasn't part of the original implementation. Guido van Rossum pointed out during the PEP 279 review that enumerate(seqn, 4, 6) could be ambiguously interpreted as a slice. The optional start parameter was added later in Python 2.6.
for with zip(): Parallel Iteration
When you need to walk through two or more iterables in lockstep, zip() pairs them element by element.
students = ["Alice", "Bob", "Cara"]
scores = [92, 87, 95]
grades = ["A", "B+", "A"]
for student, score, grade in zip(students, scores, grades):
    print(f"{student}: {score} ({grade})")
In Python 3.10+, zip() gained a strict=True parameter (PEP 618) that raises a ValueError if the iterables have different lengths—eliminating a subtle class of bugs where zip() silently truncated to the shortest iterable.
The while Loop: Conditional Repetition
The problem it solved: Some iteration problems don't have a known endpoint. You can't write for i in range(n) when n is "however many times until the user types quit" or "however many steps until convergence." The while loop exists because those problems are real and common.
The while loop is Python's tool for indefinite iteration—when you don't know in advance how many times you need to repeat. It keeps executing as long as its condition evaluates to True.
# Binary search implementation
def binary_search(sorted_list, target):
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
data = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search(data, 23)) # Output: 5
The while loop also supports the else clause, which runs when the condition becomes False (but not when the loop exits via break). And like for, it works with break and continue for fine-grained control.
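A minimal sketch of while/else (the counter values are illustrative): the else runs because the condition went False without a break, exactly as with for/else.

```python
# while/else: the else clause runs only when the condition
# becomes False naturally -- a break would skip it
attempts, limit = 0, 3
while attempts < limit:
    attempts += 1
    if attempts == 99:  # never true in this sketch
        break
else:
    print("loop ended normally; else clause ran")
```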
The Walrus Operator and while Loops
Python 3.8 introduced the walrus operator (:=) via PEP 572, authored by Chris Angelico, and this had a significant impact on how while loops are written. The walrus operator allows you to assign a value and test it in the same expression, eliminating the common pattern of duplicating a function call before and inside the loop.
PEP 572 was one of the most contentious proposals in Python's history. The debate was so heated that Guido van Rossum stepped down as Python's Benevolent Dictator for Life shortly after accepting the PEP in July 2018. The Wikipedia article on the Python programming language confirms that van Rossum "resigned as Benevolent Dictator for Life after conflict about adding the assignment expression operator in Python 3.8."
Despite the controversy, the walrus operator produces notably cleaner while loops:
# Without walrus operator: duplicated input() call
command = input("Enter command: ")
while command != "quit":
    print(f"Executing: {command}")
    command = input("Enter command: ")
# With walrus operator: single input() call
while (command := input("Enter command: ")) != "quit":
    print(f"Executing: {command}")
The second version eliminates the redundant input() call and keeps the loop's logic concentrated in one place. The walrus operator is also powerful when reading data in chunks:
# Read a file in fixed-size chunks
with open("large_dataset.bin", "rb") as f:
    while (chunk := f.read(4096)):
        process(chunk)
Think of the walrus operator as a sticky note placed on the result before you test it. Without :=, you write the result on a note, check it, then write it again before the next loop. With :=, you write it once and the note stays visible inside the loop body — same information, half the work.
As PEP 572 explains, the precedence of := was deliberately chosen to be "just lower than bitwise OR" so that common patterns inside while and if conditions can be written without parentheses in the simplest cases, though wrapping the entire expression in parentheses is the recommended practice for clarity.
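That precedence rule has a practical consequence worth seeing once (the variables are illustrative): without parentheses, the walrus captures the result of the whole comparison, not the value you probably wanted.

```python
data = [1, 2, 3, 4, 5]

# Parenthesized: n is bound to the length, then compared
if (n := len(data)) > 3:
    print(f"List is long enough ({n} items)")

# Unparenthesized walrus binds the *comparison result*:
# here x becomes True, not 5
result = (x := len(data) > 3)
print(x)  # True
```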
Comprehensions: Loops as Expressions
The problem it solved: Before comprehensions, building a filtered or transformed list required three lines of boilerplate — declare an empty list, write a loop, append inside it. PEP 202 asked: what if building a list could be a single readable expression instead of a procedure?
Comprehensions are Python's declarative approach to iteration. Instead of writing a loop that appends to a list, you express the transformation in a single expression. They were introduced in Python 2.0 via PEP 202, authored by Barry Warsaw.
PEP 202 is refreshingly concise. It states that list comprehensions offer a more concise way to build lists in situations where map(), filter(), or nested loops would otherwise be required. The nesting behavior mirrors standard for loops—comprehension clauses nest in the same order as the equivalent for loops and if statements would.
Guido van Rossum himself eventually came to prefer comprehensions over the functional alternatives. In a discussion cited on Real Python, van Rossum described using map with lambda as one of his "regrets" and recommended using a list comprehension instead, adding that "a for loop is clearer" than reduce().
List Comprehensions
# Extract all IP addresses from log entries that indicate failed logins
logs = [
    "192.168.1.10 - failed login",
    "10.0.0.5 - successful login",
    "192.168.1.10 - failed login",
    "172.16.0.3 - failed login",
    "10.0.0.5 - failed login",
]
failed_ips = [line.split()[0] for line in logs if "failed" in line]
print(failed_ips)
# ['192.168.1.10', '192.168.1.10', '172.16.0.3', '10.0.0.5']
Dictionary and Set Comprehensions
Dictionary comprehensions were proposed in PEP 274 and arrived, together with set comprehensions, in Python 3.0; both were also backported to Python 2.7. These follow the same pattern but produce different data structures:
# Build a word frequency map
text = "the quick brown fox jumps over the lazy brown dog"
words = text.split()
word_freq = {word: words.count(word) for word in set(words)}
print(word_freq)
# {'the': 2, 'quick': 1, 'brown': 2, 'fox': 1, ...}
# Get unique file extensions from a directory listing
import os
files = ["report.pdf", "data.csv", "image.png", "backup.csv", "notes.txt"]
extensions = {os.path.splitext(f)[1] for f in files}
print(extensions)
# {'.pdf', '.csv', '.png', '.txt'}
Nested Comprehensions
Comprehensions can nest, with the last index varying fastest—just like nested for loops:
# Flatten a matrix
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
flat = [num for row in matrix for num in row]
print(flat) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
Nested comprehensions can become unreadable fast. If you find yourself nesting more than two levels deep, switch to explicit loops. The Zen of Python (PEP 20) reminds us that "readability counts" — and that flat structures are preferable to nested ones.
List comprehensions are often described as "syntactic sugar" for a loop, but that framing understates an important difference: [x for x in items if condition] is a filter expression that returns a value, not a statement. You can pass it to a function, assign it, or nest it. A for loop can't do that — it's a statement, not an expression. This distinction matters once you start composing them.
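A short illustration of that composability (words are illustrative): the comprehension slots directly into argument positions and larger expressions, where a for statement simply cannot go.

```python
# Because a comprehension is an expression, it can be passed,
# nested, and assigned -- a for statement cannot
words = ["alpha", "Beta", "gamma", "Delta"]

# Passed directly as a function argument (generator expression)
longest = max(len(w) for w in words)

# Nested inside another expression
capitalized = sorted(w for w in words if w[0].isupper())

print(longest, capitalized)  # 5 ['Beta', 'Delta']
```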
Never Mutate a List While Iterating Over It
This is not a theoretical edge case. It's one of the first bugs you write when you're comfortable with Python but not yet careful. Modifying a list while a for loop is running over it produces unpredictable behavior: items get skipped, indices shift, and the loop may terminate early without warning.
# WRONG: removing items while iterating skips elements
numbers = [1, 2, 3, 4, 5, 6]
for n in numbers:
    if n % 2 == 0:
        numbers.remove(n)
print(numbers)  # [1, 3, 5] -- looks right, but only by luck: 3 and 5 were skipped without ever being checked
numbers = [2, 4, 6]
for n in numbers:
    if n % 2 == 0:
        numbers.remove(n)
print(numbers)  # [4] -- 4 was skipped entirely

# CORRECT: iterate over a copy, or build a new list
numbers = [2, 4, 6]
for n in numbers[:]:  # iterate over a shallow copy
    if n % 2 == 0:
        numbers.remove(n)
print(numbers)  # []

# Better still: use a comprehension to build the filtered result
numbers = [2, 4, 6]
numbers = [n for n in numbers if n % 2 != 0]
print(numbers)  # []
The same rule applies to dictionaries. In Python 3, iterating over a dictionary and modifying its size raises a RuntimeError: dictionary changed size during iteration. Iterate over a copy of the keys instead: for key in list(d.keys()):.
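A sketch of both the failure and the fix (the config dict is illustrative):

```python
# Deleting keys while iterating a dict raises RuntimeError
config = {"host": "localhost", "debug": True, "trace": True}
try:
    for key in config:
        if key.startswith("t"):
            del config[key]
except RuntimeError as exc:
    print(f"Caught: {exc}")

# Fix: iterate over a snapshot of the keys, mutate the dict freely
config = {"host": "localhost", "debug": True, "trace": True}
for key in list(config.keys()):
    if key.startswith("t"):
        del config[key]
print(config)  # {'host': 'localhost', 'debug': True}
```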
Generator Expressions and Generator Functions: Lazy Iteration
The problem it solved: List comprehensions are expressive, but they build the entire result in memory before you can use any of it. For a million-row file, that's a problem. Generators were designed to decouple producing values from consuming them — generating exactly one item at a time, on demand.
Generators are a fundamentally different approach to looping. Instead of building an entire collection in memory and then iterating over it, generators produce values one at a time, on demand. This "lazy evaluation" makes them essential when working with large datasets or infinite sequences.
Generator Expressions
PEP 289 (authored by Raymond Hettinger, accepted for Python 2.4) introduced generator expressions as a "high performance, memory efficient generalization of list comprehensions." The syntax is identical to a list comprehension, but with parentheses instead of brackets:
# Sum of squares without materializing a list
total = sum(x**2 for x in range(1_000_000))
# Memory comparison:
import sys
list_comp = [x**2 for x in range(10_000)]
gen_expr = (x**2 for x in range(10_000))
print(f"List: {sys.getsizeof(list_comp):,} bytes") # ~87,624 bytes
print(f"Generator: {sys.getsizeof(gen_expr)} bytes") # ~200 bytes
The memory difference is dramatic. PEP 289 notes that early benchmarks showed generators holding a clear performance edge over list comprehensions. While Python 2.4 closed the speed gap for small-to-mid-sized data, the PEP explains that at larger data volumes generators tend to outperform list comprehensions because they avoid exhausting cache memory and allow Python to reuse objects between iterations.
The syntax difference is a single character — brackets vs. parentheses — but the behavior is fundamentally different. Here they are side by side:
# Brackets: all 10,000 values computed immediately
# and stored in memory as a list object
squared = [x**2 for x in range(10_000)]
# You can index, slice, len(), iterate multiple times
print(squared[0]) # 0 -- random access OK
print(len(squared)) # 10000
for n in squared: pass # iterates fine
for n in squared: pass # iterates again, fine
# Parentheses: a generator object, ~200 bytes
# Values computed only as next() is called
squared = (x**2 for x in range(10_000))
# Cannot index, cannot len(), one-shot only
# print(squared[0]) -- TypeError
# print(len(squared)) -- TypeError
for n in squared: pass # works once
for n in squared: pass # empty — exhausted
Same logic, opposite tradeoffs: a list buys random access and reusability at the cost of memory; a generator gives sequential, one-shot access with minimal memory.
A list comprehension is like a batch oven — it bakes all the cookies before anyone eats. A generator is like a toaster — it makes one piece the moment you put it in, and does nothing until you do. If you only need a few items, or if the dataset is too large to hold all at once, the toaster wins.
Under the hood: how yield suspends execution
When Python compiles a generator function, it sets a flag (CO_GENERATOR) on the code object. Calling the function doesn't execute a single line — it returns a generator object. The first next() call runs the body until yield, at which point the frame (local variables, instruction pointer, evaluation stack) is frozen and the yielded value is returned. The next next() call thaws the frame exactly where it paused. You can inspect this:
import dis
def count_up(n):
    for i in range(n):
        yield i
dis.dis(count_up)
# RESUME (opcode names shown are from Python 3.11+)
# ...
# YIELD_VALUE <-- suspends here, returns i to caller
# RESUME <-- caller calls next() again, resumes here
# ...
The YIELD_VALUE opcode is the mechanism behind lazy evaluation. No value is computed until the frame is explicitly resumed by a next() call — which is exactly what a for loop does on every iteration.
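The suspend-and-resume behavior is easy to observe from Python by stepping a generator manually (the `noisy_gen` function is illustrative):

```python
# No body code runs until next() is called -- lazy by construction
def noisy_gen():
    print("computing 1")
    yield 1
    print("computing 2")
    yield 2

g = noisy_gen()  # prints nothing: the body has not started
print("generator created")
print(next(g))   # prints "computing 1", then 1
print(next(g))   # prints "computing 2", then 2
```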
Generator Functions with yield
PEP 255 (written by Neil Schemenauer, Tim Peters, and Magnus Lie Hetland, implemented for Python 2.2) introduced generator functions. The core idea is a function that can pause mid-execution and return an intermediate value to its caller, while keeping all local state intact so execution can resume exactly where it left off.
The PEP drew inspiration from generators in the Icon programming language. Guido van Rossum settled the syntax debate—whether generators should use a new keyword instead of def—with characteristic finality. Writing in PEP 255, he acknowledged that neither side had a fully convincing argument, then deferred to his language designer's intuition, concluding that the proposed syntax was exactly right.
Here's a practical generator for reading large CSV files without loading them entirely into memory:
def read_csv_rows(filepath, delimiter=","):
    """Yield one parsed row at a time from a CSV file."""
    with open(filepath, "r") as f:
        headers = f.readline().strip().split(delimiter)
        for line in f:
            values = line.strip().split(delimiter)
            yield dict(zip(headers, values))

# Process a million-row file with constant memory usage
for row in read_csv_rows("transactions.csv"):
    if float(row["amount"]) > 10000:
        print(f"Large transaction: {row['id']} - ${row['amount']}")
When read_csv_rows() is called, no code executes immediately. Instead, it returns a generator-iterator object. Each time next() is called on it (implicitly by the for loop), execution resumes from where it last yielded. Per PEP 255, when a generator yields, its entire execution state is frozen—local variable bindings, the instruction pointer, and the internal evaluation stack are all preserved until the next next() call thaws them.
yield from: Delegating to Sub-generators
PEP 380 (Python 3.3) added yield from, which simplifies delegation to sub-generators:
def flatten(nested):
    """Recursively flatten nested iterables."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item
data = [1, [2, 3, [4, 5]], 6, [7, [8, [9]]]]
print(list(flatten(data))) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
Without yield from, you'd need to write an explicit inner loop: for x in flatten(item): yield x. The yield from syntax is more efficient and readable.
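For comparison, here is that pre-3.3 spelling written out (a sketch; the function name is illustrative). It produces the same output as the yield from version above:

```python
# Manual delegation: the explicit loop that yield from replaces
def flatten_explicit(nested):
    for item in nested:
        if isinstance(item, (list, tuple)):
            for x in flatten_explicit(item):  # forward each sub-item by hand
                yield x
        else:
            yield item

data = [1, [2, 3, [4, 5]], 6]
print(list(flatten_explicit(data)))  # [1, 2, 3, 4, 5, 6]
```

Beyond brevity, yield from also forwards send() and throw() to the sub-generator, which the manual loop does not.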
Iteration with itertools: The Power Tools
The problem it solved: Developers kept reinventing the same iteration patterns — chaining sequences, slicing iterators, generating combinations. Raymond Hettinger's itertools module collected these patterns into a single, fast, composable toolkit so they'd never need to be rewritten again.
Python's itertools module (standard library) provides a collection of fast, memory-efficient iteration building blocks. These are implemented in C for performance and follow the iterator protocol, meaning they compose naturally with for loops and other iteration constructs.
import itertools
# Chain multiple iterables seamlessly
logs_monday = ["event_a", "event_b"]
logs_tuesday = ["event_c", "event_d"]
logs_wednesday = ["event_e"]
for event in itertools.chain(logs_monday, logs_tuesday, logs_wednesday):
    print(event)
# Generate all 2-character combinations from a set
chars = "ABCD"
for combo in itertools.combinations(chars, 2):
    print(combo)
# ('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')
# Group sorted data
data = [
    {"dept": "engineering", "name": "Alice"},
    {"dept": "engineering", "name": "Bob"},
    {"dept": "marketing", "name": "Cara"},
    {"dept": "marketing", "name": "Dan"},
]
for dept, members in itertools.groupby(data, key=lambda x: x["dept"]):
    print(f"{dept}: {[m['name'] for m in members]}")
Key tools worth knowing: chain() for treating multiple iterables as one, islice() for slicing iterators (which don't support standard slicing), combinations() and permutations() for combinatorics, groupby() for grouping sorted data, and count() for infinite counting sequences.
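Two of those combine particularly well (a sketch; the event-ID values are illustrative): count() produces an infinite stream, and islice() bounds it without ever materializing a list.

```python
import itertools

# count() is infinite; islice() takes a bounded slice of it
event_ids = itertools.count(start=1000)  # 1000, 1001, 1002, ...
first_five = list(itertools.islice(event_ids, 5))
print(first_five)  # [1000, 1001, 1002, 1003, 1004]
```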
Think of itertools as the plumbing fittings of iteration. Your generators are the pipes — they carry data. itertools provides the connectors, splitters, valves, and flow meters. chain() joins two pipes end-to-end. islice() is a valve that lets only the first N gallons through. groupby() is a sorting chamber. You don't build these from scratch — you snap them together.
itertools.groupby() does not work like SQL GROUP BY. It groups consecutive identical keys — if the same key appears in two non-adjacent positions, you get two separate groups. Always sort on the same key before calling groupby(), or you'll get silently fragmented results that look correct until the data changes order.
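The fragmentation is easy to demonstrate (a minimal sketch with illustrative tags):

```python
import itertools

tags = ["a", "b", "a", "a", "b"]

# Unsorted input: consecutive runs only, so keys repeat
unsorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(tags)]
print(unsorted_groups)  # [('a', 1), ('b', 1), ('a', 2), ('b', 1)]

# Sort on the same key first to get SQL-style groups
sorted_groups = [(k, len(list(g))) for k, g in itertools.groupby(sorted(tags))]
print(sorted_groups)    # [('a', 3), ('b', 2)]
```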
itertools vs. a Generator Function: How to Choose
Both itertools and generator functions produce lazy iterators, so when do you reach for one over the other? The answer comes down to whether the problem is a composition of known building blocks or custom logic.
Use itertools when you're combining, slicing, grouping, or applying standard combinatorial operations to existing iterables. These functions are implemented in C, well-tested, and communicate intent with a vocabulary that experienced Python developers recognize instantly. itertools.chain(a, b) says exactly what it means.
Use a generator function when the logic requires state, conditionals, or transformations that don't fit a standard pattern. If you find yourself stringing together several itertools calls and it's becoming hard to read, that's usually the signal to write an explicit generator with yield instead.
import itertools
# Prefer itertools: standard composition, no custom logic
log_batches = [["a", "b"], ["c", "d"], ["e"]]
for event in itertools.chain.from_iterable(log_batches):
    print(event)
# Prefer a generator: custom logic that doesn't map to itertools
def parse_log_lines(filepath):
    """Yield structured records, skipping comments and blank lines."""
    with open(filepath) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            level, _, message = line.partition(":")
            yield {"level": level.strip(), "message": message.strip()}
Functional-Style Iteration: map(), filter(), and functools.reduce()
The problem it solved: Before comprehensions existed, map() and filter() were the primary tools for applying a function across an iterable without writing an explicit loop. They're inherited from functional programming traditions and remain useful when you already have a named function to apply.
Python supports functional-style iteration through map() and filter() (builtins) and functools.reduce(). These apply functions across iterables without explicit loops.
from functools import reduce
temperatures_f = [32, 68, 72, 85, 100, 212]
# Convert Fahrenheit to Celsius
temperatures_c = list(map(lambda f: round((f - 32) * 5/9, 1), temperatures_f))
print(temperatures_c) # [0.0, 20.0, 22.2, 29.4, 37.8, 100.0]
# Filter to only comfortable temperatures
comfortable = list(filter(lambda c: 18 <= c <= 26, temperatures_c))
print(comfortable) # [20.0, 22.2]
# Sum all comfortable temperatures
total = reduce(lambda a, b: a + b, comfortable)
print(total) # 42.2
For the same reasons van Rossum expressed, many Python developers prefer comprehensions and explicit loops over map() with lambda and filter() with lambda. The list comprehension equivalents are typically more readable:
Many developers assume map() and filter() are "more functional" or inherently faster than comprehensions. Neither is true in modern Python. Both return lazy iterators in Python 3, and performance is comparable. The real distinction is readability: map() with a named function reads cleanly (map(str.upper, words)), but map() with a lambda is almost always harder to read than the equivalent comprehension. Reach for map() when you already have the function; write a comprehension when you'd need a lambda.
temperatures_c = [round((f - 32) * 5/9, 1) for f in temperatures_f]
comfortable = [c for c in temperatures_c if 18 <= c <= 26]
However, when you already have a named function, map() can be cleaner than a comprehension: list(map(str.upper, words)) is arguably more readable than [w.upper() for w in words].
Custom Iterators: Building Your Own Loop Targets
The problem it solved: Generators cover most lazy iteration needs, but some objects need to carry their own iteration state — think a database cursor class, a paginated API wrapper, or a sensor stream with internal buffering. The iterator protocol gives you a direct path to make any class a first-class citizen in a for loop.
Any Python class can be made iterable by implementing the iterator protocol defined in PEP 234. You need two methods: __iter__() on the container to return an iterator, and __next__() on the iterator to produce the next value and raise StopIteration when done.
The naming of __next__() has its own history. PEP 3114 (accepted by Guido van Rossum on March 6, 2007) renamed the original next() method to __next__() for consistency with Python's double-underscore conventions. PEP 234 itself includes a retrospective note acknowledging that using __next__() from the beginning—paired with a next() built-in that called it—would have been the cleaner design, but by the time this was recognized the Python 2.2 implementation had already shipped. Python 3 finally made this correction.
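The relationship PEP 3114 established is easy to see directly: the next() built-in simply delegates to the iterator's __next__() method, so the two calls below consume the same stream.

```python
it = iter([1, 2, 3])

# The built-in delegates to the dunder method...
print(next(it))       # 1

# ...so calling __next__() directly advances the same iterator
print(it.__next__())  # 2
print(next(it))       # 3
```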
```python
class SensorStream:
    """Iterate over sensor readings, skipping values above a threshold."""

    def __init__(self, readings, threshold):
        self.readings = readings
        self.threshold = threshold
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        while self.index < len(self.readings):
            value = self.readings[self.index]
            self.index += 1
            if value <= self.threshold:
                return value
            # Skip readings above the threshold
        raise StopIteration

readings = [22.1, 23.5, 45.0, 21.8, 99.9, 24.3, 22.0]
for temp in SensorStream(readings, threshold=30.0):
    print(f"Normal reading: {temp}C")
```
For simpler cases, a generator function achieves the same result with far less boilerplate—but understanding the protocol gives you the power to build complex, stateful iterators when generators aren't sufficient.
A generator function is a shortcut to the iterator protocol — Python writes the __iter__() and __next__() methods for you behind the scenes. A custom iterator class is the same thing written out by hand. You'd choose the class when the iterator needs to hold complex state across pauses, accept method calls between iterations, or be subclassed. You'd choose the generator when none of that is needed.
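To make the comparison concrete, here is the same skip-above-threshold logic from SensorStream rewritten as a generator function (sensor_stream is just an illustrative name):

```python
def sensor_stream(readings, threshold):
    """Yield readings at or below the threshold.

    Python supplies __iter__() and __next__() automatically;
    StopIteration is raised when the function returns.
    """
    for value in readings:
        if value <= threshold:
            yield value

readings = [22.1, 23.5, 45.0, 21.8, 99.9, 24.3, 22.0]
print(list(sensor_stream(readings, threshold=30.0)))
# [22.1, 23.5, 21.8, 24.3, 22.0]
```

Five lines of logic replace the whole class, which is why generators are the default choice when no extra state or methods are needed.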
Async Iteration: async for
The problem it solved: A regular for loop over a network source would block the entire thread while waiting for each item. async for allows the event loop to do other work while waiting — so a single thread can service hundreds of I/O-bound operations concurrently.
Python 3.5 (PEP 492) introduced async for to support asynchronous iteration. This is critical for I/O-bound applications that use asyncio—web scrapers, API clients, database query streams, and real-time data processors.
```python
import asyncio

async def fetch_pages(urls):
    """Simulate fetching pages asynchronously."""
    for url in urls:
        await asyncio.sleep(0.1)  # Simulated network delay
        yield f"Content from {url}"

async def main():
    urls = ["https://api.example.com/1", "https://api.example.com/2"]
    async for content in fetch_pages(urls):
        print(content)

asyncio.run(main())
```
Async generators (async def with yield) were added in Python 3.6 via PEP 525, and async comprehensions ([x async for x in aiter]) arrived in the same release via PEP 530.
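A minimal sketch combining both features (the countdown generator name is illustrative, not from a real library):

```python
import asyncio

async def countdown(n):
    """A PEP 525 async generator."""
    for i in range(n, 0, -1):
        await asyncio.sleep(0)  # cede control to the event loop
        yield i

async def main():
    # A PEP 530 async comprehension
    values = [i async for i in countdown(3)]
    print(values)  # [3, 2, 1]

asyncio.run(main())
```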
Think of async for as working the way a hospital triage dispatcher does. A regular loop is one nurse who knocks on door 1, waits for an answer, then knocks on door 2. An async loop is a dispatcher who knocks on all the doors at once, then handles each room as it responds: same one person, far less waiting. The dispatcher doesn't do more work; they just stop standing idle between knocks.
Performance: Choosing the Right Loop
Not all loops are equal in performance. Here is the general hierarchy from fastest to slowest for simple transformations:
Built-in functions like sum(), min(), max() operate in C and are the fastest. Generator expressions and comprehensions are compiled to specialized bytecode and avoid function call overhead. PEP 709 (Python 3.12) further improved comprehension performance by inlining them—the PEP reports comprehensions becoming "up to 2x faster" by eliminating the creation of a nested function object. Explicit for loops are slower because each iteration involves Python-level bytecode execution. while loops carry similar overhead but add the cost of re-evaluating the condition each iteration.
```python
# Performance comparison (conceptual ranking)
import timeit

numbers = list(range(100_000))

# Fastest: C-level built-in
t1 = timeit.timeit(lambda: sum(numbers), number=100)

# Fast: generator expression (no intermediate list)
t2 = timeit.timeit(lambda: sum(x for x in numbers), number=100)

# Moderate: list comprehension (builds list, then sums)
t3 = timeit.timeit(lambda: sum([x for x in numbers]), number=100)

# Slower: explicit for loop
def loop_sum():
    total = 0
    for x in numbers:
        total += x
    return total

t4 = timeit.timeit(loop_sum, number=100)

print(f"sum(): {t1:.4f}s")
print(f"Generator expr: {t2:.4f}s")
print(f"List comp: {t3:.4f}s")
print(f"For loop: {t4:.4f}s")
```
Use the most Pythonic construct that fits your problem. Write for clarity first, then optimize the bottlenecks you measure.
Quick Reference: Which Loop When?
When in doubt: write for clarity first, then measure. A comprehension that's hard to read is worse than a for loop that's easy to follow. The constructs above are tools, not status symbols.
The PEPs That Shaped Python Loops
Here is a consolidated reference of the PEPs discussed in this article, in chronological order:
| PEP | Year / Author(s) | What it introduced | Python version |
|---|---|---|---|
| PEP 202 | July 2000 — Barry Warsaw | List Comprehensions. Declarative loop-as-expression syntax. | 2.0 |
| PEP 234 | 2001 — Ka-Ping Yee, Guido van Rossum | Iterators. The __iter__()/__next__() protocol the for loop uses today. | 2.2 |
| PEP 255 | May 2001 — Schemenauer, Peters, Hetland | Simple Generators. Introduced yield and generator functions. | 2.2 |
| PEP 274 | 2001 — Barry Warsaw | Dict Comprehensions. | 2.7 / 3.0 |
| PEP 279 | January 2002 — Raymond Hettinger | The enumerate() built-in. Solved the loop counter problem. | 2.3 |
| PEP 289 | 2002 — Raymond Hettinger | Generator Expressions. Memory-efficient alternative to list comprehensions. | 2.4 |
| PEP 3114 | 2007 — Ka-Ping Yee | Renamed iterator.next() to iterator.__next__(). | 3.0 |
| PEP 380 | 2009 — Gregory Ewing | yield from for delegating to sub-generators. | 3.3 |
| PEP 572 | February 2018 — Chris Angelico | Assignment Expressions (the walrus operator :=). | 3.8 |
| PEP 618 | 2020 — Brandt Bucher | zip() with strict parameter. | 3.10 |
| PEP 709 | 2023 — Carl Meyer | Inlined Comprehensions. Performance optimization. | 3.12 |
Looping in Python has never been about raw syntax. It's about expressing intent clearly. From the foundational iterator protocol in PEP 234, to the elegance of comprehensions, to the memory efficiency of generators, to the modern convenience of the walrus operator—each tool exists because Python's designers identified a real problem and crafted a targeted solution for it.
The next time you write a loop, you're building on over two decades of deliberate language design. Write it well.
Loop Pitfalls Reference
A consolidated list of common loop mistakes in Python, for quick reference:
- Mutating a list during iteration causes skipped elements. Iterate over a copy (seq[:]) or build a new collection with a comprehension.
- Treating an exhausted iterator as reusable produces empty results silently. Generators, file objects, and zip() results are one-shot. Convert to a list if you need to iterate more than once.
- Expecting break to exit nested loops leads to bugs. break exits only the innermost loop. Refactor into a function and use return.
- Using range(len(seq)) to get index and value when enumerate(seq) exists. This is the pattern enumerate() was specifically designed to replace.
- Filtering on truthiness when you meant to filter out only None. [x for x in items if x] drops every falsy value: empty strings, zeros, empty lists. Use an explicit comparison if those values are valid: if x is not None.
- Forgetting to sort before groupby(). itertools.groupby() groups consecutive identical keys. If your data isn't sorted, groups will be fragmented. Always sort on the same key before calling groupby().
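Two of these pitfalls are easy to demonstrate in a few lines:

```python
from itertools import groupby

# Pitfall: an exhausted generator yields nothing, silently
squares = (n * n for n in range(5))
print(list(squares))  # [0, 1, 4, 9, 16]
print(list(squares))  # [] (already consumed, no error raised)

# Pitfall: groupby() without sorting fragments the groups
words = ["apple", "avocado", "banana", "blueberry", "apricot"]
fragmented = [(k, list(g)) for k, g in groupby(words, key=lambda w: w[0])]
print(fragmented)
# [('a', ['apple', 'avocado']), ('b', ['banana', 'blueberry']), ('a', ['apricot'])]

# Fix: sort on the same key first
words.sort(key=lambda w: w[0])
grouped = [(k, list(g)) for k, g in groupby(words, key=lambda w: w[0])]
print(grouped)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```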