Creating Dictionaries from Iterables in Python

Python's dictionary is one of the language's defining data structures: fast, flexible, and deeply integrated into the runtime itself. Knowing how to build dictionaries efficiently from iterables—rather than assembling them one assignment at a time—is one of the clearest signals that a developer has moved past the basics and started writing genuinely idiomatic Python.

This article covers every major technique for building a Python dictionary from an iterable source: the dict() constructor, zip(), dictionary comprehensions, enumerate(), dict.fromkeys(), and the merging operators introduced in Python 3.9. Along the way it addresses what actually happens under the hood, what the edge cases are, and when to reach for one approach over another. All examples are verified against Python 3.13.

What Makes a Dictionary and What Makes an Iterable

Before building dictionaries from iterables, it helps to be precise about what both terms mean in Python. A dictionary (dict) is a mutable mapping type that stores key-value pairs. Since Python 3.7, dictionaries maintain insertion order as a language guarantee—not just an implementation detail. The official documentation states:

"Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6." — Python 3 Documentation, Built-in Types: dict

An iterable is any object that can return its elements one at a time. Formally, an object is iterable if it implements __iter__() or __getitem__(). Lists, tuples, strings, sets, generators, files, and range objects are all iterables. The key constraint when building a dictionary from an iterable is that you need a way to produce pairs: one element becomes the key, and another becomes the value.

Python provides several distinct mechanisms to satisfy that constraint, each suited to a different starting shape of data.
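As a minimal sketch of that constraint, any object whose iteration yields two-item elements can feed dict(). The Inventory class and its contents below are hypothetical and exist only for illustration.

```python
class Inventory:
    """Hypothetical container; iterating it yields (name, quantity) pairs."""

    def __init__(self):
        self._rows = [("bolts", 120), ("nuts", 75), ("washers", 300)]

    def __iter__(self):
        # Hand back an iterator over the stored (key, value) pairs
        return iter(self._rows)


stock = dict(Inventory())
print(stock)
# {'bolts': 120, 'nuts': 75, 'washers': 300}
```

Because Inventory defines no keys() method, dict() treats it as a plain iterable of pairs rather than as a mapping.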

Note

Dictionary keys must be hashable. This means they must implement both __hash__() and __eq__(), and their hash value must not change over their lifetime. Integers, floats, strings, and tuples of hashables are all valid keys. Lists, sets, and other dictionaries are not.
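The hashability rule can be checked directly. The short sketch below uses made-up keys to show a tuple key succeeding, a list key failing, and a tuple that contains a list failing as well.

```python
# Tuples of hashables are valid keys
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])
# point

# A list is mutable and unhashable, so using one as a key raises TypeError
list_key = [0, 0]
try:
    bad = {list_key: "origin"}
except TypeError as exc:
    error_name = type(exc).__name__
else:
    error_name = None
print(error_name)
# TypeError

# A tuple is only hashable if everything inside it is hashable
try:
    hash((1, [2, 3]))
    nested_ok = True
except TypeError:
    nested_ok = False
print(nested_ok)
# False
```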

The dict() Constructor and Key-Value Pairs

The most direct route is the dict() constructor. According to the Python documentation, dict() accepts three kinds of input: another mapping, an iterable of key-value pairs, or keyword arguments. When you pass an iterable, each element must itself be an iterable of exactly two items—a two-element sequence that Python unpacks into a key and a value.

"If a positional argument is given and it is a mapping object, a dictionary is created with the same key-value pairs as the mapping object. Otherwise, the positional argument must be an iterable object. Each item in the iterable must itself be an iterable with exactly two objects. The first object of each item becomes a key in the new dictionary, and the second object the corresponding value." — Python 3 Documentation, Built-in Types: dict

The most common source is a list of two-element tuples, sometimes called an association list in other language communities:

# List of (key, value) tuples passed directly to dict()
pairs = [("name", "Alice"), ("age", 30), ("city", "Boston")]
profile = dict(pairs)
print(profile)
# {'name': 'Alice', 'age': 30, 'city': 'Boston'}

Two-element lists work equally well, because the constructor only requires that each inner element be a two-item iterable:

# Inner items can be lists instead of tuples
pairs = [["name", "Alice"], ["age", 30], ["city", "Boston"]]
profile = dict(pairs)
print(profile)
# {'name': 'Alice', 'age': 30, 'city': 'Boston'}
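The "exactly two items" requirement is enforced when the dictionary is built. The sketch below, using throwaway data, shows a three-item element being rejected and, as a less obvious consequence of the same rule, a two-character string being accepted as a pair.

```python
# A three-item element cannot be split into one key and one value
try:
    dict([("a", 1), ("b", 2, 3)])
    raised = False
except ValueError:
    raised = True
print(raised)
# True

# A two-character string is itself a two-item iterable
odd = dict(["ab", "cd"])
print(odd)
# {'a': 'b', 'c': 'd'}
```

The second case is rarely what anyone intends, which is a good reason to validate input shape before handing loosely structured data to dict().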

You can also pass a generator expression directly, which avoids materializing the intermediate list in memory—a meaningful advantage when the source data is large:

# Generator expression passed to dict()
# Squares of numbers 0 through 4, keyed by the number itself
squares = dict((n, n ** 2) for n in range(5))
print(squares)
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Pro Tip

Passing a generator to dict() is often more memory-efficient than building a list of pairs first, especially when transforming large datasets. The generator produces pairs on demand and dict() consumes them one at a time without storing the full sequence.
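One place this pattern tends to show up is parsing line-oriented text. The following is a sketch under the assumption that every line has the form key=value; the sample lines are invented for illustration.

```python
# Hypothetical "key=value" lines, as might be read from a settings file
lines = ["host=localhost", "port=5432", "user=admin"]

# str.split("=", 1) returns a two-item list, which dict() accepts as a pair
settings = dict(line.split("=", 1) for line in lines)
print(settings)
# {'host': 'localhost', 'port': '5432', 'user': 'admin'}
```

Note that the values stay strings; any type conversion would need to happen inside the generator expression. A line without an "=" would produce a one-item list and make dict() raise ValueError.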

The keyword argument form of dict() is a related but distinct technique. It produces the same mapping type, but the keys are restricted to valid Python identifiers and are always strings:

# Keyword arguments: keys must be valid identifiers
config = dict(host="localhost", port=5432, user="admin")
print(config)
# {'host': 'localhost', 'port': 5432, 'user': 'admin'}

You can combine both forms. The positional iterable is consumed first, and then keyword arguments are applied. If the same key appears in both, the keyword argument wins:

base = [("host", "10.0.0.1"), ("port", 5432)]
config = dict(base, port=9999)
print(config)
# {'host': '10.0.0.1', 'port': 9999}

Combining Two Iterables with zip()

A very common real-world scenario is that keys and values live in two separate iterables—perhaps two columns in a CSV file, or two lists returned from an API. The built-in zip() function solves this by lazily pairing corresponding elements from two or more iterables into tuples, which can then be fed directly to dict().

keys   = ["name", "age", "city"]
values = ["Alice", 30, "Boston"]

profile = dict(zip(keys, values))
print(profile)
# {'name': 'Alice', 'age': 30, 'city': 'Boston'}

In Python 3, zip() returns an iterator rather than a list. This is a deliberate efficiency choice: dict() consumes the iterator one pair at a time, so the full sequence of pairs never needs to exist in memory at once.

Note

zip() stops at the shortest iterable. If your keys list has five elements and your values list has three, you get a dictionary with three entries and no error. If you need mismatched lengths to raise an exception instead, use zip(keys, values, strict=True), available from Python 3.10 onward.

The strict=True parameter was introduced in Python 3.10 as part of PEP 618. It makes length mismatches explicit rather than silent:

# strict=True raises ValueError if lengths differ (Python 3.10+)
keys   = ["a", "b", "c"]
values = [1, 2]

try:
    result = dict(zip(keys, values, strict=True))
except ValueError as e:
    print(e)
# zip() argument 2 is shorter than argument 1
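For comparison, the sketch below shows the default silent truncation next to itertools.zip_longest(), which is one possible choice when missing values should be padded rather than dropped or rejected.

```python
from itertools import zip_longest

keys = ["a", "b", "c"]
values = [1, 2]

# Default zip(): the unmatched key "c" is dropped without warning
truncated = dict(zip(keys, values))
print(truncated)
# {'a': 1, 'b': 2}

# zip_longest(): the shorter input is padded with fillvalue
padded = dict(zip_longest(keys, values, fillvalue=None))
print(padded)
# {'a': 1, 'b': 2, 'c': None}
```

Padding only makes sense when the values are the shorter side. If the keys run out first, every surplus value is paired with the same fill key and all but the last are overwritten.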

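zip() is not limited to two sequences. When you pass three or more iterables, each resulting tuple has one element per source. Such tuples cannot go straight to dict(), which requires exactly two items per element, so nest a second zip() to bundle the extra columns into a single value. This is useful for building dictionaries where values are themselves tuples of multiple attributes: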

names   = ["Alice", "Bob", "Carol"]
scores  = [92, 85, 97]
grades  = ["A", "B", "A+"]

# Values are (score, grade) tuples
report = dict(zip(names, zip(scores, grades)))
print(report)
# {'Alice': (92, 'A'), 'Bob': (85, 'B'), 'Carol': (97, 'A+')}
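An alternative to nesting is a single three-way zip() unpacked inside a dictionary comprehension. This sketch reuses the same sample data and produces the same result; which form reads better is a matter of taste.

```python
names = ["Alice", "Bob", "Carol"]
scores = [92, 85, 97]
grades = ["A", "B", "A+"]

# Each zip() tuple has three items, unpacked by name in the for clause
report = {name: (score, grade) for name, score, grade in zip(names, scores, grades)}
print(report)
# {'Alice': (92, 'A'), 'Bob': (85, 'B'), 'Carol': (97, 'A+')}
```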

Dictionary Comprehensions

Dictionary comprehensions were introduced in Python 2.7 and Python 3.0 and have been idiomatic Python ever since. They follow the same syntax as list comprehensions but use curly braces and require a colon to separate key and value expressions:

# General form: {key_expr: value_expr for item in iterable if condition}

# From a list of words, map each word to its character count
words = ["python", "dict", "iterable", "comprehension"]
lengths = {word: len(word) for word in words}
print(lengths)
# {'python': 6, 'dict': 4, 'iterable': 8, 'comprehension': 13}

Comprehensions are especially powerful when you need to transform or filter an existing iterable at the same time as building the dictionary. The optional if clause acts as a guard, including only elements that satisfy a condition:

# Only include words longer than 5 characters
long_words = {word: len(word) for word in words if len(word) > 5}
print(long_words)
# {'python': 6, 'iterable': 8, 'comprehension': 13}

You can also swap keys and values from an existing dictionary in a single expression. This is a common interview question and a genuinely practical operation when you need a reverse-lookup table:

original = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in original.items()}
print(inverted)
# {1: 'a', 2: 'b', 3: 'c'}

Warning

Inverting a dictionary is only safe when values are unique. If two keys share the same value, the last key to be processed wins, and data is silently lost. Always verify uniqueness before inverting, or use a different structure such as a dict of lists to handle collisions.
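The sketch below makes the data loss concrete with a small made-up dictionary in which two keys share a value, then shows one way to build the dict-of-lists alternative using dict.setdefault().

```python
original = {"a": 1, "b": 2, "c": 1}

# Naive inversion: "a" is silently overwritten by "c"
naive = {v: k for k, v in original.items()}
print(naive)
# {1: 'c', 2: 'b'}

# Collision-safe inversion: each value maps to a list of its keys
grouped = {}
for k, v in original.items():
    grouped.setdefault(v, []).append(k)
print(grouped)
# {1: ['a', 'c'], 2: ['b']}

# A quick uniqueness check before inverting
print(len(set(original.values())) == len(original))
# False
```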

Dictionary comprehensions can iterate over any iterable that yields items, not just sequences. Iterating over a string produces characters, which can be used as keys:

# Character-to-position mapping (last occurrence wins for duplicates)
text = "hello"
positions = {char: idx for idx, char in enumerate(text)}
print(positions)
# {'h': 0, 'e': 1, 'l': 3, 'o': 4}

Nested comprehensions are syntactically legal but should be used with restraint. A two-level nest is often readable; three or more levels usually warrant a refactor into named variables or helper functions.
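As an illustration, here is a two-level nest that builds a small multiplication table, followed by the same result with the inner level moved into a helper. The helper name times_row is hypothetical.

```python
# Two-level nest: outer keys are rows, inner keys are columns
table = {row: {col: row * col for col in range(1, 4)} for row in range(1, 4)}
print(table[2])
# {1: 2, 2: 4, 3: 6}


def times_row(row, size=3):
    """Return one row of the table as a column -> product mapping."""
    return {col: row * col for col in range(1, size + 1)}


# Same result, with only one comprehension level visible at the call site
table_flat = {row: times_row(row) for row in range(1, 4)}
print(table_flat == table)
# True
```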

"Flat is better than nested." — Tim Peters, The Zen of Python (PEP 20)

Using enumerate() to Generate Index-Based Dictionaries

enumerate() is a built-in that pairs each element of an iterable with its zero-based index (or a custom start value), yielding two-tuples of (index, value). Passing its output to dict() produces an integer-keyed dictionary:

fruits = ["apple", "banana", "cherry"]
indexed = dict(enumerate(fruits))
print(indexed)
# {0: 'apple', 1: 'banana', 2: 'cherry'}

The optional start parameter shifts the index base. This is useful when your domain convention uses 1-based numbering, such as question numbers in a quiz or row numbers in a spreadsheet:

questions = [
    "What is a variable?",
    "What is a loop?",
    "What is a function?"
]
quiz = dict(enumerate(questions, start=1))
print(quiz)
# {1: 'What is a variable?', 2: 'What is a loop?', 3: 'What is a function?'}

More often you will want the element itself as the key and its index, or something computed from it, as the value. That is a job for a dictionary comprehension that calls enumerate() internally:

# Map each fruit to its 1-based rank
rank_map = {fruit: idx for idx, fruit in enumerate(fruits, start=1)}
print(rank_map)
# {'apple': 1, 'banana': 2, 'cherry': 3}

This pattern appears constantly in natural language processing when building vocabulary mappings, where each unique word in a corpus needs to map to an integer token ID. It also shows up in any situation where you need fast O(1) reverse-lookup from a pre-ordered list.
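A minimal sketch of such a vocabulary mapping follows. The sample sentence is invented, and real tokenizers are considerably more involved; the point is only the enumerate() pattern, combined with dict.fromkeys() (covered in the next section) to drop repeated tokens while keeping first-seen order.

```python
tokens = "the cat sat on the mat".split()

# dict.fromkeys() removes repeated tokens and preserves first-seen order
unique_tokens = dict.fromkeys(tokens)

vocab = {word: token_id for token_id, word in enumerate(unique_tokens)}
print(vocab)
# {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4}

# Encoding is then one dictionary lookup per token
encoded = [vocab[word] for word in tokens]
print(encoded)
# [0, 1, 2, 3, 0, 4]
```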

dict.fromkeys(): Shared Default Values

dict.fromkeys(iterable, value=None) is a class method that creates a dictionary where every key from the iterable maps to the same value. When no value is given, all keys map to None. This is the right tool when you need to initialize a set of known keys before populating values later:

# Initialize all fields to None
fields = ["username", "email", "role", "active"]
user_record = dict.fromkeys(fields)
print(user_record)
# {'username': None, 'email': None, 'role': None, 'active': None}

# Initialize all fields to a specific default
counters = dict.fromkeys(["hits", "misses", "errors"], 0)
print(counters)
# {'hits': 0, 'misses': 0, 'errors': 0}

Warning

Never pass a mutable default like an empty list [] or empty dict {} as the value argument to dict.fromkeys(). All keys will share a reference to the same single object, so mutating one entry's value mutates all of them. Use a dictionary comprehension with a fresh literal if you need independent mutable defaults: {k: [] for k in fields}.
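The shared-reference problem is easy to reproduce. This sketch appends to one entry and shows the change appearing under every key, then repeats the experiment with a comprehension for contrast.

```python
fields = ["a", "b"]

# Every key points at the same list object
shared = dict.fromkeys(fields, [])
shared["a"].append(1)
print(shared)
# {'a': [1], 'b': [1]}
print(shared["a"] is shared["b"])
# True

# A comprehension evaluates [] once per key, so each list is independent
independent = {k: [] for k in fields}
independent["a"].append(1)
print(independent)
# {'a': [1], 'b': []}
```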

The iterable passed to fromkeys() can be any iterable of hashable values. A string is iterable, so passing a string produces a dictionary keyed by individual characters—often not what you intend, but occasionally exactly what you need:

# String characters become keys
vowel_flags = dict.fromkeys("aeiou", False)
print(vowel_flags)
# {'a': False, 'e': False, 'i': False, 'o': False, 'u': False}

Because dictionary keys are unique, passing a sequence with repeated elements produces one entry per distinct key. Each key keeps the position of its first occurrence, and since every key maps to the same value, the repeats have no other effect:

dupes = ["x", "y", "x", "z", "y"]
d = dict.fromkeys(dupes, 0)
print(d)
# {'x': 0, 'y': 0, 'z': 0}
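A common use of this property is order-preserving deduplication: build the dictionary, then read the keys back as a list. The sketch below applies it to the same sample data.

```python
dupes = ["x", "y", "x", "z", "y"]

# Keys keep first-seen order, so this removes repeats without reordering
unique = list(dict.fromkeys(dupes))
print(unique)
# ['x', 'y', 'z']
```

Unlike list(set(dupes)), this keeps the elements in the order they first appeared, though it still requires every element to be hashable.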

Merging and Updating from Iterables

Python 3.9 introduced two new operators for dictionary merging: | (union) and |= (in-place union), described in PEP 584. These operators work on existing dictionaries, but they complement the iterable-construction techniques above by providing a clean way to combine multiple sources after initial construction.

# Python 3.9+: merge with the | operator (non-destructive)
defaults = {"timeout": 30, "retries": 3, "verbose": False}
overrides = {"retries": 5, "verbose": True}

config = defaults | overrides
print(config)
# {'timeout': 30, 'retries': 5, 'verbose': True}

# In-place update with |=
defaults |= overrides
print(defaults)
# {'timeout': 30, 'retries': 5, 'verbose': True}
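The two operators differ in what they accept on the right-hand side. As specified in PEP 584, |= takes any iterable of key-value pairs, while | requires a dict on both sides. The sketch below demonstrates both behaviors with made-up configuration values.

```python
config = {"host": "localhost"}

# |= accepts an iterable of key-value pairs
config |= [("port", 5432), ("user", "admin")]
print(config)
# {'host': 'localhost', 'port': 5432, 'user': 'admin'}

# | requires a dict on both sides; a list of pairs raises TypeError
try:
    merged = config | [("debug", True)]
except TypeError:
    merged = None
print(merged)
# None
```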

The dict.update() method predates these operators and does the same in-place operation. Crucially, update() accepts not just another dictionary but any iterable of key-value pairs—making it a live channel for feeding iterables into an existing dictionary:

config = {"host": "localhost", "port": 5432}

# Feed additional pairs from a list of tuples
config.update([("user", "admin"), ("password", "secret")])
print(config)
# {'host': 'localhost', 'port': 5432, 'user': 'admin', 'password': 'secret'}

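# Feed pairs from a generator. list() snapshots the items first, because
# inserting keys while iterating config.items() directly raises RuntimeError
config.update((k.upper(), v) for k, v in list(config.items()))
# config now also holds an uppercase copy of every key; the lowercase keys remain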

For more complex aggregations—building a dictionary where each key maps to a list of values rather than a single value—the standard library's collections.defaultdict is the appropriate tool. It eliminates the need to check whether a key already exists before appending:

from collections import defaultdict

# Group words by their first letter
words = ["apple", "apricot", "banana", "blueberry", "cherry", "avocado"]
by_letter = defaultdict(list)

for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'apricot', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

Handling Duplicate Keys and Ordering Guarantees

All dictionary-construction mechanisms in Python follow the same rule for duplicate keys: the last value assigned to a key wins. This is not an error; it is specified behavior. When building from an iterable, whichever pair appears latest in the iteration sequence determines the stored value:

# Last occurrence of a duplicate key wins
pairs = [("status", "pending"), ("status", "active"), ("status", "closed")]
d = dict(pairs)
print(d)
# {'status': 'closed'}

# Same behavior with a comprehension
d2 = {k: v for k, v in pairs}
print(d2)
# {'status': 'closed'}

This behavior is sometimes exploited deliberately. When you need to apply a sequence of overrides from multiple configuration files, constructing a dictionary from the concatenated list of pairs naturally gives precedence to later files.
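A sketch of that layering idea follows, using itertools.chain() to concatenate the pair sequences lazily. The three lists stand in for settings parsed from hypothetical configuration files, ordered from lowest to highest priority.

```python
from itertools import chain

# Hypothetical layers, lowest priority first
defaults = [("timeout", 30), ("retries", 3)]
site = [("retries", 5)]
user = [("timeout", 10), ("verbose", True)]

# Later pairs overwrite earlier ones, so later layers take precedence
config = dict(chain(defaults, site, user))
print(config)
# {'timeout': 10, 'retries': 5, 'verbose': True}
```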

On ordering: as noted above, dictionaries in Python 3.7+ preserve insertion order as a language guarantee. When building from an iterable, insertion order matches iteration order of the source. When using zip(), pairs are inserted in the order that zip() produces them, which is left-to-right positional. Dictionary comprehensions follow the iteration order of the for clause.
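One detail connects the ordering rule to the duplicate-key rule: overwriting an existing key changes its value but not its position. The short sketch below shows a repeated key keeping the slot of its first insertion.

```python
pairs = [("a", 1), ("b", 2), ("a", 3)]
d = dict(pairs)

# "a" holds the last value assigned to it
print(d)
# {'a': 3, 'b': 2}

# but it stays in the position where it was first inserted
print(list(d))
# ['a', 'b']
```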

Note

When iterating over a set to build a dictionary, the order is arbitrary, and for sets of strings or bytes it changes from one Python run to the next because hash randomization is enabled by default (since Python 3.3). If order matters in the output dictionary, do not use a bare set as the source iterable.
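One simple way to get a reproducible order from a set is to sort it before iterating. The sketch below assumes the elements are mutually comparable, which holds for the made-up strings used here.

```python
tags = {"python", "dict", "iterable"}

# sorted() returns a list, so the iteration order is deterministic
by_length = {tag: len(tag) for tag in sorted(tags)}
print(list(by_length))
# ['dict', 'iterable', 'python']
```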

Performance Considerations

All of the techniques covered here eventually reduce to the same underlying CPython dictionary insertion loop—they differ in how much Python-level overhead they carry before reaching it. Some general benchmarking guidance:

dict() with a list of tuples vs. a comprehension: For straightforward one-to-one mappings with no transformation logic, dict(zip(keys, values)) is typically faster than an equivalent comprehension because more of the work happens in C rather than bytecode. For transformations or filters, the comprehension is more readable and the performance difference is usually negligible at the scales where Python is the right tool.

Generator expressions vs. list comprehensions as arguments to dict(): Passing a generator to dict() uses less peak memory than building an intermediate list, but it does not make the final dictionary smaller—the dictionary must still store all pairs. The benefit is entirely in the construction phase. For small collections this difference is immeasurable; for millions of pairs it can matter.

dict.fromkeys() for initialization: When you need a fixed set of keys all mapped to the same sentinel value, dict.fromkeys() is implemented in C and measurably faster than an equivalent comprehension like {k: None for k in keys} for large key lists.

import timeit

keys = list(range(10_000))

# fromkeys() is generally faster for same-value initialization
t1 = timeit.timeit(lambda: dict.fromkeys(keys, 0), number=1000)
t2 = timeit.timeit(lambda: {k: 0 for k in keys}, number=1000)

print(f"fromkeys: {t1:.3f}s")
print(f"comprehension: {t2:.3f}s")

The Python Wiki's TimeComplexity page lists dictionary lookups and insertions as O(1) on average. Constructing a dictionary from an iterable of n pairs is therefore O(n) overall, regardless of which construction technique you use.

"The average case assumes the keys used in parameters are selected uniformly at random from the set of all keys." — Python Wiki, TimeComplexity

In practice, hash collisions can degrade individual insertions toward O(n), but CPython's hash perturbation algorithm makes worst-case collision scenarios extremely unlikely with typical key distributions.

Choosing the Right Method: A Quick Reference

  • Iterable of two-element sequences already in hand: dict(pairs) or dict(generator)
  • Two separate parallel iterables (keys and values): dict(zip(keys, values))
  • Transformation or filtering logic needed during construction: {k: v for k, v in ...} (dictionary comprehension)
  • Index-keyed dictionary from a sequence: dict(enumerate(seq))
  • Value-to-index reverse mapping from a sequence: {v: i for i, v in enumerate(seq)}
  • All keys sharing the same default value: dict.fromkeys(keys, default)
  • Adding pairs to an existing dict from an iterable: existing.update(pairs)
  • Merging two dicts (Python 3.9+): d1 | d2

Key Takeaways

  1. The dict() constructor is the fundamental building block. Any iterable of two-element sequences can be passed directly to dict(). Generator expressions are valid and avoid materializing intermediate lists. This is the behavior specified in the official Python documentation for the dict constructor.
  2. zip() is the canonical way to pair two parallel iterables. In Python 3, zip() is lazy. Use strict=True (Python 3.10+) to catch length mismatches that would otherwise be silently truncated.
  3. Dictionary comprehensions offer the most expressive control. They support per-element transformation and filtering in a single pass and are the preferred form whenever construction logic goes beyond a direct mapping.
  4. dict.fromkeys() is the right tool for same-value initialization. It is faster than an equivalent comprehension for large key sets and makes intent clear. Never use a mutable object as the default value.
  5. Duplicate keys do not raise errors; last value wins. This is consistent behavior across all construction mechanisms. Design data pipelines around it deliberately rather than discovering it accidentally.
  6. Insertion order is guaranteed from Python 3.7 onward. The output dictionary preserves the iteration order of the source iterable, making the construction process predictable.
  7. dict.update() and the |= operator extend dictionaries from iterables after construction. Both accept any iterable of pairs, not just another dictionary, and update() is the standard mechanism for post-hoc population from iterable sources. The | operator, by contrast, requires a dict on both sides.

Knowing which construction technique to reach for in a given situation is not just about brevity. It communicates intent to the next reader of your code, and it avoids the intermediate data structures that accumulate when developers default to explicit loops. Python's dictionary construction vocabulary is intentionally rich, and each tool in it earns its place.
