Dictionary comprehensions let you build, transform, and filter Python dictionaries in a single expression. They were formally proposed back in 2001 through PEP 274, authored by Barry Warsaw, but did not land in the language until Python 2.7 and Python 3.0. Today they are one of Python's signature tools for writing concise, readable code — and understanding exactly how they work, including the edge cases that bite developers in production, will make you a sharper programmer.
If you have used list comprehensions before, dictionary comprehensions will feel immediately familiar. The core idea is the same: replace a multi-line for loop with a compact expression that produces a collection in one shot. The difference is that instead of generating a list of single values, you generate a dictionary of key-value pairs. As the official Python tutorial states, dict comprehensions can create dictionaries from arbitrary key and value expressions.
The Zen of Python, authored by Tim Peters and codified in PEP 20, includes the principle "Readability counts." Dictionary comprehensions embody this principle by collapsing verbose initialization logic into a clean, declarative statement. But as with all one-liner constructs, readability has an upper limit. This article covers the full range of dictionary comprehension usage — from fundamental syntax to nested structures — and explains exactly when comprehensions improve your code and when they start to hurt it.
The Syntax and How It Works
A dictionary comprehension is enclosed in curly braces {} and contains a key-value expression separated by a colon, followed by a for clause and an optional if clause. The general form looks like this:
{key_expression: value_expression for item in iterable}
The Python interpreter evaluates this expression by iterating over every element in the iterable. For each element, it computes the key expression and the value expression, then inserts the resulting pair into a new dictionary. The entire dictionary is returned once the iteration completes.
Here is a minimal example that maps integers to their squares:
squares = {x: x ** 2 for x in range(1, 6)}
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
The equivalent code using a traditional for loop would require four lines:
squares = {}
for x in range(1, 6):
    squares[x] = x ** 2
print(squares)
# Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
Both approaches produce identical results. The comprehension version is more compact, and for simple mappings like this one, arguably easier to parse at a glance.
Curly braces {} are shared between dictionaries and sets in Python. An empty {} always creates a dictionary, not a set. To create an empty set, you must use set(). A comprehension with a colon in the expression ({k: v ...}) produces a dictionary; one without a colon ({x ...}) produces a set.
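The brace rules above are easy to confirm for yourself. The following is a minimal sketch; the variable names are illustrative only.

```python
# Empty braces create a dict; an empty set needs set().
empty_braces = {}
empty_set = set()

# A colon in the expression makes a dict comprehension ...
as_dict = {x: x ** 2 for x in range(3)}
# ... and no colon makes a set comprehension.
as_set = {x ** 2 for x in range(3)}

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(as_dict)                      # {0: 0, 1: 1, 2: 4}
print(as_set == {0, 1, 4})          # True
```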
Building Dictionaries from Scratch
The simplest use of a dictionary comprehension is constructing a new dictionary from a single iterable. You have already seen the squares example. Here are other common patterns.
From Two Separate Lists Using zip()
When you have parallel lists of keys and values, zip() pairs them together element by element, and a dictionary comprehension captures each pair:
countries = ["Japan", "Brazil", "Germany", "Kenya"]
capitals = ["Tokyo", "Brasilia", "Berlin", "Nairobi"]
country_capitals = {country: capital for country, capital in zip(countries, capitals)}
print(country_capitals)
# Output: {'Japan': 'Tokyo', 'Brazil': 'Brasilia', 'Germany': 'Berlin', 'Kenya': 'Nairobi'}
This pattern is useful when data arrives in columnar form, such as when reading headers and row values from a CSV file. The comprehension replaces both a loop and a manual call to dict(zip(keys, values)), though the latter is also a valid and readable alternative.
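One detail worth checking when you pair parallel lists: zip() stops at the shorter input, so a missing value drops a key without raising an error. The sketch below assumes a deliberately shortened capitals list to show the effect, plus one possible guard.

```python
countries = ["Japan", "Brazil", "Germany", "Kenya"]
capitals = ["Tokyo", "Brasilia", "Berlin"]  # one value missing on purpose

# Both forms build the same dictionary from parallel sequences.
via_comprehension = {country: capital for country, capital in zip(countries, capitals)}
via_constructor = dict(zip(countries, capitals))
print(via_comprehension == via_constructor)  # True

# zip() stops at the shorter input, so "Kenya" is dropped without any error.
print("Kenya" in via_comprehension)  # False

# A length check before building turns the silent truncation into a loud failure.
try:
    if len(countries) != len(capitals):
        raise ValueError("countries and capitals differ in length")
    checked = dict(zip(countries, capitals))
except ValueError as exc:
    print(f"Refusing to build mapping: {exc}")
```

On Python 3.10 and later, zip(countries, capitals, strict=True) performs the same length check for you and raises ValueError on a mismatch.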
From a Range with a Computed Value
You can apply any function or expression to produce the values. For example, mapping the positions 0 through 25 to the corresponding uppercase letters, using chr() with an offset of 65 (the ASCII code for "A"):
ascii_map = {i: chr(65 + i) for i in range(26)}
print(ascii_map[0]) # Output: A
print(ascii_map[25]) # Output: Z
This example mirrors one of the original code samples from PEP 274, written by Barry Warsaw when he proposed the feature in October 2001.
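The keys in the example above are offsets from 0 to 25, not code points. If you want the ASCII codes themselves as keys, a small variation works; this is a sketch, and the names code_to_char and char_to_code are illustrative.

```python
import string

# Keys are the actual ASCII code points 65..90, values are the characters.
code_to_char = {code: chr(code) for code in range(ord("A"), ord("Z") + 1)}
print(code_to_char[65])  # A
print(code_to_char[90])  # Z

# The reverse lookup, built from the standard library's uppercase alphabet.
char_to_code = {char: ord(char) for char in string.ascii_uppercase}
print(char_to_code["A"])  # 65
```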
From a String
Strings are iterables, so you can build dictionaries directly from individual characters. Here is a comprehension that records the position (index) of each letter in a word:
word = "python"
positions = {char: idx for idx, char in enumerate(word)}
print(positions)
# Output: {'p': 0, 'y': 1, 't': 2, 'h': 3, 'o': 4, 'n': 5}
If the string contains duplicate characters, the last occurrence wins because dictionaries cannot have duplicate keys. Each subsequent assignment for the same key silently overwrites the previous value.
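To see the overwrite in action, try a word with repeated letters. The sketch below also shows one way to keep the first occurrence instead, by iterating in reverse so the earliest index is written last.

```python
word = "banana"

# Last occurrence wins: each repeat overwrites the stored index.
last_positions = {char: idx for idx, char in enumerate(word)}
print(last_positions)
# Output: {'b': 0, 'a': 5, 'n': 4}

# First occurrence wins: walk the pairs backwards so index 1 for 'a'
# and index 2 for 'n' are the final writes.
first_positions = {char: idx for idx, char in reversed(list(enumerate(word)))}
print(first_positions)
# Output: {'a': 1, 'n': 2, 'b': 0}
```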
Filtering with Conditionals
Adding an if clause to the end of a dictionary comprehension lets you include only those items that satisfy a condition. The filtered form looks like this:
{key_expr: value_expr for item in iterable if condition}
Single Condition
Suppose you have a dictionary of exam scores and you want to keep only those students who scored above 70:
scores = {"Alice": 85, "Bob": 62, "Clara": 91, "Dan": 70, "Eve": 55}
passing = {name: score for name, score in scores.items() if score > 70}
print(passing)
# Output: {'Alice': 85, 'Clara': 91}
The .items() method returns each key-value pair as a tuple, which the comprehension unpacks into name and score. Only pairs where score > 70 evaluates to True are included in the resulting dictionary.
Multiple Conditions
You can chain multiple if clauses. They behave like a logical and — every condition must be true for the item to be included:
scores = {"Alice": 85, "Bob": 62, "Clara": 91, "Dan": 70, "Eve": 55}
honor_roll = {name: score for name, score in scores.items() if score > 70 if len(name) > 3}
print(honor_roll)
# Output: {'Alice': 85, 'Clara': 91}
Here, both score > 70 and len(name) > 3 must pass. "Bob", "Dan", and "Eve" each fail both checks: none of them scored above 70 (Dan's 70 is not strictly greater than 70), and each name is only three characters long. Only "Alice" and "Clara" meet both criteria.
Conditional Value Assignment with if/else
When you want to transform values rather than filter items out entirely, you place the if/else in the value expression itself, before the for keyword:
scores = {"Alice": 85, "Bob": 62, "Clara": 91, "Dan": 70, "Eve": 55}
result = {name: ("pass" if score >= 70 else "fail") for name, score in scores.items()}
print(result)
# Output: {'Alice': 'pass', 'Bob': 'fail', 'Clara': 'pass', 'Dan': 'pass', 'Eve': 'fail'}
The placement of if determines its purpose. An if after the for clause filters which items enter the dictionary. An if/else inside the key or value expression controls what value each item receives. Mixing up these two positions is one of the common mistakes encountered when writing comprehensions.
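The two positions can also appear together in one comprehension. In this sketch the "honors" label and the threshold of 90 are assumptions added for illustration.

```python
scores = {"Alice": 85, "Bob": 62, "Clara": 91, "Dan": 70, "Eve": 55}

# The if/else before `for` chooses the value;
# the trailing `if` decides whether the item is kept at all.
labels = {
    name: ("honors" if score >= 90 else "pass")
    for name, score in scores.items()
    if score >= 70
}
print(labels)
# Output: {'Alice': 'pass', 'Clara': 'honors', 'Dan': 'pass'}
```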
Transforming Existing Dictionaries
Dictionary comprehensions are well suited for creating new dictionaries that are modified versions of existing ones. This section covers the patterns that appear frequently in production code.
Inverting Keys and Values
Swapping keys and values is one of the canonical use cases demonstrated in PEP 274 itself. The inversion is a single expression:
original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'a', 2: 'b', 3: 'c'}
This only works correctly when all values in the original dictionary are unique and hashable (an unhashable value such as a list raises TypeError when used as a key). If two keys share the same value, the one processed last overwrites the earlier entry in the inverted dictionary. Which one is processed last is predictable, because dictionaries iterate in insertion order: this was an implementation detail in CPython 3.6 that became a guaranteed language feature in Python 3.7, as documented in the Python 3.7 release notes.
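A quick way to notice a lossy inversion is to compare sizes before and after. The sketch below assumes a small dictionary in which two keys deliberately share a value.

```python
original = {"a": 1, "b": 2, "c": 1}  # "a" and "c" share the value 1

inverted = {v: k for k, v in original.items()}
print(inverted)
# Output: {1: 'c', 2: 'b'}

# A size mismatch reveals that at least one entry was overwritten.
print(len(inverted) == len(original))  # False
```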
Transforming Keys or Values
You can apply any function to either the key, the value, or both. A common scenario is normalizing string keys:
raw_data = {"First Name": "Ada", "Last Name": "Lovelace", "FIELD": "Mathematics"}
normalized = {k.lower().replace(" ", "_"): v for k, v in raw_data.items()}
print(normalized)
# Output: {'first_name': 'Ada', 'last_name': 'Lovelace', 'field': 'Mathematics'}
Another frequent pattern involves converting data types for values, such as casting string representations of numbers into actual integers:
string_prices = {"apple": "120", "banana": "45", "cherry": "310"}
int_prices = {fruit: int(price) for fruit, price in string_prices.items()}
print(int_prices)
# Output: {'apple': 120, 'banana': 45, 'cherry': 310}
Merging and Selecting from Multiple Dictionaries
While Python 3.9 introduced the merge operator (|) for dictionaries, comprehensions give you fine-grained control when you need to merge selectively. For instance, combining two dictionaries but keeping only keys that appear in both:
inventory_a = {"widget": 50, "gadget": 30, "gizmo": 20}
inventory_b = {"widget": 10, "gadget": 5, "doohickey": 15}
combined = {k: inventory_a[k] + inventory_b[k] for k in inventory_a if k in inventory_b}
print(combined)
# Output: {'widget': 60, 'gadget': 35}
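If you want every key from either dictionary instead of only the shared ones, the key views' set operators combine well with .get(). This is a sketch using the same inventories; sorted() is an added choice so that the key order is deterministic.

```python
inventory_a = {"widget": 50, "gadget": 30, "gizmo": 20}
inventory_b = {"widget": 10, "gadget": 5, "doohickey": 15}

# Keys present in both: dict key views support set operators such as &.
shared = sorted(inventory_a.keys() & inventory_b.keys())
print(shared)  # ['gadget', 'widget']

# Keys present in either: missing counts default to 0 via .get().
# sorted() gives a stable key order, because a set union has no defined order.
totals = {
    k: inventory_a.get(k, 0) + inventory_b.get(k, 0)
    for k in sorted(inventory_a.keys() | inventory_b.keys())
}
print(totals)
# Output: {'doohickey': 15, 'gadget': 35, 'gizmo': 20, 'widget': 60}
```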
Nested Dictionary Comprehensions
Just as you can nest for loops, you can nest dictionary comprehensions to create dictionaries of dictionaries. The outer comprehension produces the top-level keys and values, and the value of each top-level key is itself another comprehension.
multiplication_table = {
    row: {col: row * col for col in range(1, 6)}
    for row in range(1, 4)
}
print(multiplication_table)
# Output:
# {1: {1: 1, 2: 2, 3: 3, 4: 4, 5: 5},
# 2: {1: 2, 2: 4, 3: 6, 4: 8, 5: 10},
# 3: {1: 3, 2: 6, 3: 9, 4: 12, 5: 15}}
When Python encounters a nested comprehension, it evaluates the outer loop first, then the inner one for each iteration of the outer loop. The inner comprehension {col: row * col for col in range(1, 6)} runs completely for every value of row, producing a new inner dictionary each time.
You can also use nested comprehensions to restructure flat data. Imagine you have a list of tuples representing student grades across subjects:
records = [
    ("Alice", "math", 92),
    ("Alice", "science", 88),
    ("Bob", "math", 76),
    ("Bob", "science", 81),
]
# Get unique student names (sorted, because set iteration order is not guaranteed)
students = sorted({name for name, _, _ in records})
# Build nested structure
gradebook = {
    student: {subject: grade for name, subject, grade in records if name == student}
    for student in students
}
print(gradebook)
# Output: {'Alice': {'math': 92, 'science': 88}, 'Bob': {'math': 76, 'science': 81}}
Nested comprehensions can quickly become difficult to read. If the expression spans more than about 80 characters or requires mental effort to trace the data flow, consider refactoring into a helper function or a traditional loop. The Zen of Python reminds us that "Flat is better than nested" and "If the implementation is hard to explain, it's a bad idea."
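As one example of such a refactor, the gradebook can be built with a plain loop. This sketch uses dict.setdefault(), which is one option among several; it also visits each record once, whereas the nested comprehension rescans the full record list for every student.

```python
records = [
    ("Alice", "math", 92),
    ("Alice", "science", 88),
    ("Bob", "math", 76),
    ("Bob", "science", 81),
]

gradebook = {}
for name, subject, grade in records:
    # setdefault() creates the inner dict the first time a student appears.
    gradebook.setdefault(name, {})[subject] = grade

print(gradebook)
# Output: {'Alice': {'math': 92, 'science': 88}, 'Bob': {'math': 76, 'science': 81}}
```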
The Walrus Operator Inside Comprehensions
Python 3.8 introduced the walrus operator (:=), which assigns a value to a variable and returns that value in the same expression. Inside a comprehension, this matters when your key or value expression is computationally expensive and you need to reference the result twice — once in the value and once in a filter condition — without computing it twice.
Consider a case where you are processing a list of strings and want to keep only those whose uppercase version starts with a specific letter, storing the uppercase version as the value:
words = ["banana", "cherry", "avocado", "blueberry", "apricot"]
# Without walrus: .upper() is called twice per item
result = {w: w.upper() for w in words if w.upper().startswith("B")}
# With walrus: .upper() is called once, stored as upper_w
result = {w: upper_w for w in words if (upper_w := w.upper()).startswith("B")}
print(result)
# Output: {'banana': 'BANANA', 'blueberry': 'BLUEBERRY'}
The walrus version avoids a redundant call by assigning the uppercase string to upper_w in the filter clause and then reusing that variable in the value expression. For cheap operations like .upper(), the difference is negligible. For expensive operations — a database call, a regex match, an API round-trip wrapped in a function — the walrus operator can meaningfully reduce the work your comprehension does on each iteration.
The walrus operator leaks its variable into the enclosing scope (unlike loop variables in comprehensions, which are properly scoped). In a dict comprehension, a name assigned with := will exist in the outer function or module scope after the comprehension completes. This is expected behavior per PEP 572, but it can surprise developers who assume all comprehension variables are local. Name your walrus variables carefully to avoid collisions.
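The leak is easy to observe inside a function. In this sketch, starts_with_b is a hypothetical helper written for the demonstration; it returns the walrus-assigned name after the comprehension has finished.

```python
def starts_with_b(words):
    # `w` is the comprehension's loop variable; `upper_w` is bound with :=.
    result = {w: upper_w for w in words if (upper_w := w.upper()).startswith("B")}
    # upper_w is now an ordinary local of this function, holding the value
    # from the final iteration. (With an empty `words` it would be unbound.)
    return result, upper_w

result, leaked = starts_with_b(["banana", "cherry", "avocado", "blueberry", "apricot"])
print(result)  # {'banana': 'BANANA', 'blueberry': 'BLUEBERRY'}
print(leaked)  # APRICOT
```

Note that the leaked value comes from the last item tested, not the last item kept, which is another reason to treat it as scratch state.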
Duplicate Keys: What Actually Happens
The duplicate key behavior of Python dictionaries is well known in the abstract — later values overwrite earlier ones — but the implications inside a comprehension deserve more careful attention than they typically receive.
The simplest case: if your iterable produces the same key twice, the second value silently wins.
pairs = [("a", 1), ("b", 2), ("a", 99)]
result = {k: v for k, v in pairs}
print(result)
# Output: {'a': 99, 'b': 2}
No error. No warning. The first ("a", 1) pair is simply overwritten. In small examples this is obvious, but in real code where the iterable is a database result set, a CSV with user-supplied data, or a joined list from two sources, silent key collisions can produce subtly wrong outputs that are difficult to debug.
The deeper question is: do you know whether your data can produce duplicate keys? If you are not certain, you may want to check explicitly rather than relying on last-write-wins behavior:
from collections import Counter
pairs = [("a", 1), ("b", 2), ("a", 99)]
# Check for duplicates before building
key_counts = Counter(k for k, _ in pairs)
duplicates = [k for k, count in key_counts.items() if count > 1]
if duplicates:
    raise ValueError(f"Duplicate keys detected: {duplicates}")
result = {k: v for k, v in pairs}
Alternatively, if you intentionally want to collect all values for each key rather than overwrite, a comprehension is the wrong structure. Use collections.defaultdict or a groupby pattern instead:
from collections import defaultdict
pairs = [("a", 1), ("b", 2), ("a", 99)]
# Collect all values per key
grouped = defaultdict(list)
for k, v in pairs:
    grouped[k].append(v)
print(dict(grouped))
# Output: {'a': [1, 99], 'b': [2]}
Understanding this distinction prevents a class of bug that comprehension-heavy code is particularly susceptible to: the assumption that a one-to-one mapping exists in data that is actually one-to-many.
Comprehension vs. dict.fromkeys()
A question that does not come up often enough: when should you use dict.fromkeys() instead of a comprehension? The two approaches overlap for certain cases, but they are not equivalent.
dict.fromkeys(iterable, value) creates a dictionary from an iterable of keys, assigning every key the same value. It is the right tool when you want a uniform default across all keys:
# Initialize all counters to zero
fields = ["views", "clicks", "conversions", "bounces"]
analytics = dict.fromkeys(fields, 0)
print(analytics)
# Output: {'views': 0, 'clicks': 0, 'conversions': 0, 'bounces': 0}
A comprehension produces the same result here, but with unnecessary ceremony:
# Equivalent but more verbose
analytics = {field: 0 for field in fields}
Use dict.fromkeys() for uniform initialization. The intent is clearer and the code is shorter.
However, dict.fromkeys() has a critical gotcha with mutable default values. It assigns the same object to every key, not independent copies:
# DANGER: every key points to the same list object
students = ["Alice", "Bob", "Clara"]
gradebook = dict.fromkeys(students, [])
gradebook["Alice"].append(95)
print(gradebook)
# Output: {'Alice': [95], 'Bob': [95], 'Clara': [95]}
# Appending to one list modified all of them!
A comprehension creates a fresh object on each iteration, so it does not have this problem:
# CORRECT: each student gets their own independent list
gradebook = {student: [] for student in students}
gradebook["Alice"].append(95)
print(gradebook)
# Output: {'Alice': [95], 'Bob': [], 'Clara': []}
Use dict.fromkeys(iterable, value) for immutable defaults (integers, strings, None, tuples). Use a comprehension for mutable defaults (lists, dicts, sets, or any object you want to be independent per key). Getting this wrong produces one of Python's more confusing runtime bugs.
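You can verify the difference directly with an identity check. A minimal sketch, with illustrative variable names:

```python
students = ["Alice", "Bob", "Clara"]

shared = dict.fromkeys(students, [])
independent = {student: [] for student in students}

# `is` compares object identity, not equality.
print(shared["Alice"] is shared["Bob"])            # True: one list, three keys
print(independent["Alice"] is independent["Bob"])  # False: separate lists

# The number of distinct objects makes the same point.
print(len({id(v) for v in shared.values()}))       # 1
print(len({id(v) for v in independent.values()}))  # 3
```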
Performance: Comprehension vs. For Loop vs. dict()
Performance is a practical question that every Python developer should understand, even though readability should come first for all but the tightest inner loops.
Sebastian Witowski ran benchmarks comparing three approaches to building a dictionary that maps numbers to their squares for 1,000 elements. His results, published on switowski.com, showed that a dictionary comprehension and a traditional for loop performed nearly identically (roughly 31-32 microseconds per call on Python 3.11), while the dict() constructor fed a list of tuples was approximately 60% slower.
The reason comprehensions and direct for loops are close in speed is that both use the same underlying iteration mechanism. The reason dict(list_of_tuples) is slower is that it forces Python to first build an entire list of tuples in memory before passing them to the constructor, adding both memory overhead and extra function call costs. As Witowski explains, the core advantage is that comprehensions avoid creating an intermediate data structure — they build the dictionary directly.
A related factor is per-iteration overhead. In CPython, a comprehension inserts each pair with the specialized MAP_ADD opcode, while an explicit loop loads the dictionary name and stores through the general-purpose STORE_SUBSCR opcode on every pass; a loop that calls a method such as dict.update() additionally pays for an attribute lookup and a function call. The gains are real but modest, which is why benchmarks consistently show comprehensions at parity with, rather than dramatically ahead of, explicit loops.
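If you are curious which bytecode each form compiles to, the standard dis module can show you. The sketch below is an illustration only: opcode names are a CPython implementation detail and may change between versions, and opnames is a hypothetical helper that also walks nested code objects, because versions before 3.12 compile a comprehension into its own code object.

```python
import dis
import types

def opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = set()
    for instruction in dis.get_instructions(code):
        names.add(instruction.opname)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

comprehension_src = "result = {x: x * x for x in range(3)}"
loop_src = "result = {}\nfor x in range(3):\n    result[x] = x * x\n"

comprehension_ops = opnames(compile(comprehension_src, "<comprehension>", "exec"))
loop_ops = opnames(compile(loop_src, "<loop>", "exec"))

# Opcodes that appear in only one of the two forms.
print(sorted(comprehension_ops - loop_ops))
print(sorted(loop_ops - comprehension_ops))
```

On the CPython versions this was written against, the first list includes MAP_ADD and the second includes STORE_SUBSCR.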
One development worth knowing: starting with Python 3.12, list, dict, and set comprehensions are inlined by the interpreter, eliminating the function call overhead that previous versions incurred when creating the implicit inner scope. This means the performance characteristics described by Witowski's Python 3.11 benchmarks are still a valid baseline, but Python 3.12 and later versions are slightly faster still for comprehension-heavy code. The practical implication: if you are on Python 3.12+, there is even less reason to avoid comprehensions on performance grounds.
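Benchmark numbers depend on hardware and interpreter version, so it is worth measuring on your own machine. The following is a rough sketch with timeit; it is not Witowski's original harness, and the function names and repeat count are assumptions chosen for illustration.

```python
import timeit

N = 1_000          # dictionary size, matching the 1,000-element scenario
REPEATS = 2_000    # calls per measurement; an arbitrary choice

def with_comprehension():
    return {x: x ** 2 for x in range(N)}

def with_loop():
    squares = {}
    for x in range(N):
        squares[x] = x ** 2
    return squares

def with_constructor():
    # Builds an intermediate list of tuples before calling dict().
    return dict([(x, x ** 2) for x in range(N)])

# All three must agree before timing them is meaningful.
assert with_comprehension() == with_loop() == with_constructor()

for func in (with_comprehension, with_loop, with_constructor):
    seconds = timeit.timeit(func, number=REPEATS)
    print(f"{func.__name__:>20}: {seconds / REPEATS * 1e6:.1f} microseconds per call")
```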
In a framing attributed to Guido van Rossum, creator of Python, the language is an experiment in finding the right level of programmer freedom: too much and code becomes unreadable, too little and expressiveness suffers.
That tension is precisely what dictionary comprehensions navigate. They grant expressive power to condense dictionary creation, but overusing that freedom with deeply nested or heavily conditional comprehensions can make your code unreadable.
When Not to Use Dictionary Comprehensions
Dictionary comprehensions are not always the right tool. Here are specific situations where a traditional loop or a different approach serves you better.
When the logic requires side effects. If your loop body needs to print, log, write to a file, or modify external state, a comprehension is the wrong structure. Comprehensions are designed to produce a value, not to execute a sequence of actions.
When the expression is too complex. If the key or value expression involves multiple function calls, nested ternary operators, or long method chains, the one-liner will be harder to debug and harder for other developers to understand. As Python core developer Raymond Hettinger has frequently emphasized in his talks, the goal of Pythonic code is clarity above cleverness.
When you need to handle exceptions per item. There is no way to wrap individual iterations of a comprehension in a try/except block. If you need to catch errors while building the dictionary (for example, when parsing unreliable input data), you need either an explicit loop or a helper function that does the catching. A common approach is to extract the risky operation into a helper function that returns a sentinel value on failure, then filter that sentinel out in a second step:
def safe_parse_int(s):
    try:
        return int(s)
    except (ValueError, TypeError):
        return None
raw = {"a": "10", "b": "not_a_number", "c": "30", "d": None}
# Two-step: parse with sentinel, then filter
parsed = {k: safe_parse_int(v) for k, v in raw.items()}
clean = {k: v for k, v in parsed.items() if v is not None}
print(clean)
# Output: {'a': 10, 'c': 30}
This keeps the comprehension structure while handling the exception logic in a testable, named function. If the per-item error handling is complex, however, an explicit loop remains the cleaner choice.
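For comparison, here is what the explicit-loop version might look like when you also want to know why an item was rejected. This is a sketch; the errors dictionary is an addition for illustration.

```python
raw = {"a": "10", "b": "not_a_number", "c": "30", "d": None}

clean = {}
errors = {}
for key, value in raw.items():
    try:
        clean[key] = int(value)
    except (ValueError, TypeError) as exc:
        # Record why the item was rejected instead of discarding it silently.
        errors[key] = type(exc).__name__

print(clean)   # {'a': 10, 'c': 30}
print(errors)  # {'b': 'ValueError', 'd': 'TypeError'}
```

The loop costs a few more lines, but the rejected items stay visible, which the sentinel-and-filter version does not give you.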
When memory is a concern for very large datasets. A dictionary comprehension constructs the entire dictionary in memory at once. If you are processing millions of records and only need to access them sequentially, consider a generator expression or an iterative approach that processes items one at a time.
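As a sketch of the sequential alternative, the example below streams key-value pairs through a generator and aggregates them without ever materializing a dictionary. The parse_records helper and the "key=value" line format are assumptions made up for the demonstration.

```python
def parse_records(lines):
    """Yield (key, value) pairs one at a time instead of storing them all."""
    for line in lines:
        key, _, value = line.partition("=")
        yield key, int(value)

lines = ["a=10", "b=20", "c=30"]  # stand-in for a large file or stream

# Generator expression: each value is consumed and discarded as the sum advances.
total = sum(value for _, value in parse_records(lines))
print(total)  # 60
```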
When you already have a clean alternative. Python's dict(zip(keys, values)) is perfectly readable for the simple case of combining two parallel sequences. Not every dictionary creation needs to be a comprehension.
In another remark attributed to Van Rossum, the ideal Python coding experience is one where concise, readable classes do real work, not one where trivial code bores the reader.
Key Takeaways
- Syntax follows a clear pattern: {key: value for item in iterable} is the base form. Add if after the for clause to filter, or use if/else in the expression to assign conditional values.
- Dictionary comprehensions landed in Python 2.7 and 3.0: They were originally proposed in PEP 274 by Barry Warsaw in October 2001, initially intended for Python 2.3 but withdrawn, and then implemented years later alongside set comprehensions. The PEP was formally accepted on April 9, 2012 to reflect the already-shipped implementation.
- Common patterns include inverting dictionaries, normalizing keys, filtering entries, and combining parallel lists: Each of these replaces a multi-line loop with a single expression.
- Duplicate keys are silently overwritten: The last value for a given key wins. If your data can produce duplicate keys unexpectedly, validate before building the dictionary or use collections.defaultdict to collect all values.
- The walrus operator (:=) lets you avoid recomputing expensive expressions: Assign a value in the filter clause and reuse it in the key or value expression. Be aware that walrus-assigned names leak into the enclosing scope.
- Use dict.fromkeys() for immutable uniform defaults, a comprehension for mutable ones: dict.fromkeys(iterable, []) gives every key the same list object. A comprehension gives every key a fresh one.
- Performance is comparable to a for loop and faster than dict(list_of_tuples): Verified benchmarks on Python 3.11 by Sebastian Witowski show comprehension and for loop at roughly 31–32 microseconds per call for 1,000 elements, while dict(list_of_tuples) runs approximately 60% slower. Python 3.12 and later inline comprehensions in the interpreter, reducing overhead further.
- Nesting is possible but should be used sparingly: Nested comprehensions can create dictionaries of dictionaries, but readability degrades rapidly once the expression grows beyond a single line.
- Readability is the deciding factor: If a comprehension is hard to explain or takes more than a few seconds to parse visually, refactor it into a loop or a helper function. The Zen of Python's guidance — "Simple is better than complex" — applies directly.
Dictionary comprehensions are one of Python's tools for writing expressive, compact code. They trace their origins to PEP 274 and the broader comprehension syntax that Python borrowed from mathematical set-builder notation. Understanding them fully means knowing not just the syntax, but also the gotchas: how duplicate keys behave, when the walrus operator earns its place, why dict.fromkeys() is dangerous with mutable defaults, and how Python 3.12 changed the interpreter's treatment of comprehension scopes. When used appropriately — for straightforward mappings, transformations, and filters — they produce code that is both faster to write and faster to read. When overused or over-nested, they produce the opposite. The goal is always the same: write code that you and your teammates can understand at a glance, six months from now.