Python Lists: A Definitive Guide

The Python list is the most-used data structure in the language. It is ordered, mutable, heterogeneous, and resizable — the answer to almost every "I need to hold a collection of things" question. This guide goes beyond the basics: every operation is shown with its output, every method is shown alongside its cost, and the patterns that make list code fast and readable are covered in full.

Every code block below is self-contained and runnable. The output blocks show exactly what each snippet prints.

Creating Lists

There are several ways to create a list. The choice between them is mostly about readability and the source of the data.

# Literal syntax — most common
fruits   = ["apple", "banana", "cherry"]
mixed    = [1, "two", 3.0, True, None]   # any types, mixed freely
empty    = []

# list() constructor — converts any iterable
from_str   = list("hello")         # each character becomes an element
from_range = list(range(1, 6))
from_tuple = list((10, 20, 30))
from_set   = sorted(list({3, 1, 4, 1, 5}))  # set order is undefined

print(f"from_str   : {from_str}")
print(f"from_range : {from_range}")
print(f"from_tuple : {from_tuple}")
print(f"from_set   : {from_set}")

# Repetition operator — useful for initializing fixed-size lists
zeros    = [0] * 5
padding  = [None] * 3
repeated = ["x"] * 4
print(f"\nzeros    : {zeros}")
print(f"padding  : {padding}")
print(f"repeated : {repeated}")

# TRAP: repeating a mutable object gives aliases, not copies
rows_bad  = [[0] * 3] * 3        # all three rows are the SAME list
rows_good = [[0] * 3 for _ in range(3)]  # three independent lists

rows_bad[0][1]  = 99   # intended: change row 0, column 1
rows_good[0][1] = 99

print(f"\nrows_bad  : {rows_bad}")   # all rows changed!
print(f"rows_good : {rows_good}")   # only row 0 changed
from_str : ['h', 'e', 'l', 'l', 'o'] from_range : [1, 2, 3, 4, 5] from_tuple : [10, 20, 30] from_set : [1, 3, 4, 5] zeros : [0, 0, 0, 0, 0] padding : [None, None, None] repeated : ['x', 'x', 'x', 'x'] rows_bad : [[0, 99, 0], [0, 99, 0], [0, 99, 0]] rows_good : [[0, 99, 0], [0, 0, 0], [0, 0, 0]]
Never use [[val] * cols] * rows for 2D grids

The repetition operator copies the reference, not the object. [[0] * 3] * 3 creates three names pointing at the same inner list. Mutating one row mutates all of them. Always use a comprehension to build independent rows: [[0] * cols for _ in range(rows)].

Indexing and Slicing

Lists are zero-indexed sequences. Python also supports negative indices, which count from the end. Slicing extracts a sub-list using the syntax list[start:stop:step] — all three parts are optional.

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
#  indices:   0    1    2    3    4    5    6
# neg index: -7   -6   -5   -4   -3   -2   -1

print("Single element access:")
print(f"  letters[0]   = {letters[0]}")    # first
print(f"  letters[-1]  = {letters[-1]}")   # last
print(f"  letters[-2]  = {letters[-2]}")   # second from end

print("\nSlicing [start:stop] — stop is exclusive:")
print(f"  letters[1:4]  = {letters[1:4]}")   # indices 1, 2, 3
print(f"  letters[:3]   = {letters[:3]}")    # from beginning to index 2
print(f"  letters[4:]   = {letters[4:]}")    # from index 4 to end
print(f"  letters[:]    = {letters[:]}")     # full shallow copy

print("\nSlicing with step [start:stop:step]:")
print(f"  letters[::2]    = {letters[::2]}")    # every other element
print(f"  letters[1::2]   = {letters[1::2]}")   # every other, starting at 1
print(f"  letters[::-1]   = {letters[::-1]}")   # reversed
print(f"  letters[6:1:-2] = {letters[6:1:-2]}") # backwards, step -2

# Slice assignment — replace a section with any iterable
letters[2:5] = ['X', 'Y']   # replace 3 elements with 2
print(f"\nAfter letters[2:5] = ['X','Y']: {letters}")

letters[1:3] = []            # delete elements 1 and 2 (set to empty)
print(f"After letters[1:3] = []:        {letters}")
Single element access: letters[0] = a letters[-1] = g letters[-2] = f Slicing [start:stop] — stop is exclusive: letters[1:4] = ['b', 'c', 'd'] letters[:3] = ['a', 'b', 'c'] letters[4:] = ['e', 'f', 'g'] letters[:] = ['a', 'b', 'c', 'd', 'e', 'f', 'g'] Slicing with step [start:stop:step]: letters[::2] = ['a', 'c', 'e', 'g'] letters[1::2] = ['b', 'd', 'f'] letters[::-1] = ['g', 'f', 'e', 'd', 'c', 'b', 'a'] letters[6:1:-2] = ['g', 'e', 'c'] After letters[2:5] = ['X','Y']: ['a', 'b', 'X', 'Y', 'f', 'g'] After letters[1:3] = []: ['a', 'Y', 'f', 'g']

Mutating a List — All the Methods

Lists expose a rich set of methods for in-place modification. Each one changes the list object directly — they do not return a new list (with the exception of pop() which returns the removed element).

nums = [3, 1, 4, 1, 5, 9, 2, 6]

# append(x) — add one element to the end
nums.append(5)
print(f"append(5)        : {nums}")

# extend(iterable) — add each element of an iterable
nums.extend([3, 5])
print(f"extend([3,5])    : {nums}")

# insert(i, x) — insert x before position i
nums.insert(2, 99)
print(f"insert(2, 99)    : {nums}")

# remove(x) — remove FIRST occurrence of x; raises ValueError if absent
nums.remove(99)
print(f"remove(99)       : {nums}")

# pop(i) — remove and return element at index i; default is last
last = nums.pop()
print(f"pop()            : {nums}  returned: {last}")

second = nums.pop(1)
print(f"pop(1)           : {nums}  returned: {second}")

# clear() — remove all elements
temp = [1, 2, 3]
temp.clear()
print(f"\nclear()          : {temp}")

# index(x) — position of first occurrence; raises ValueError if absent
nums2 = [10, 20, 30, 20, 10]
print(f"\nindex(20)        : {nums2.index(20)}")
print(f"index(20, 2)     : {nums2.index(20, 2)}")   # search from position 2

# count(x) — number of occurrences
print(f"count(10)        : {nums2.count(10)}")

# reverse() — reverses the list in place
nums2.reverse()
print(f"reverse()        : {nums2}")

# copy() — returns a shallow copy (see copying section for details)
original = [1, [2, 3], 4]
shallow  = original.copy()
print(f"\noriginal.copy()  : {shallow}")
append(5) : [3, 1, 4, 1, 5, 9, 2, 6, 5] extend([3,5]) : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5] insert(2, 99) : [3, 1, 99, 4, 1, 5, 9, 2, 6, 5, 3, 5] remove(99) : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5] pop() : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3] returned: 5 pop(1) : [3, 4, 1, 5, 9, 2, 6, 5, 3] returned: 1 clear() : [] index(20) : 1 index(20, 2) : 3 count(10) : 2 reverse() : [10, 20, 30, 20, 10] original.copy() : [1, [2, 3], 4]
append() vs extend() — a common mix-up

lst.append([1, 2]) adds one element — the list [1, 2] — making the list one longer. lst.extend([1, 2]) adds two elements — 1 and 2 separately. If you see a nested list where you expected flat elements, append when you meant extend is the usual culprit.

Performance: What Every Operation Costs

Python lists are implemented as dynamic arrays — a contiguous block of memory holding references to objects. This gives fast random access by index but makes some operations expensive. Knowing the cost of each operation prevents the most common performance mistakes.

Operation Cost Notes
lst[i] — index accessO(1)Direct memory offset calculation
lst[-1] — last elementO(1)Same as index access
lst.append(x)O(1) amortizedOccasional resize doubles capacity
lst.pop() — from endO(1)No shifting required
lst.pop(0) — from frontO(n)Every element must shift left
lst.insert(0, x)O(n)Every element must shift right
lst.insert(len, x)O(1)Insert at end — same as append
x in lst — membershipO(n)Linear scan from the front
lst.remove(x)O(n)Scan to find + shift to close gap
lst.index(x)O(n)Linear scan
lst.count(x)O(n)Full scan
lst[a:b] — sliceO(k)k = number of elements in slice
lst.sort()O(n log n)Timsort — stable
len(lst)O(1)Length is stored as an attribute
lst + lst2 — concatenationO(k)k = total length of result; creates new list
lst.extend(lst2)O(k)k = len(lst2); mutates in place
lst.reverse()O(n)In-place two-pointer swap
lst.clear()O(n)Must decrement reference counts
# Performance implications — demonstrated with collections.deque
from collections import deque
import time

N = 100_000

# Building a list by prepending to front — O(n) per insert = O(n^2) total
start = time.perf_counter()
lst = []
for i in range(N):
    lst.insert(0, i)    # shifts all existing elements right every time
list_prepend = time.perf_counter() - start

# deque — O(1) appendleft, the right tool for front insertions
start = time.perf_counter()
dq = deque()
for i in range(N):
    dq.appendleft(i)
deque_prepend = time.perf_counter() - start

print(f"Prepending {N:,} items:")
print(f"  list.insert(0, x) : {list_prepend:.4f}s")
print(f"  deque.appendleft  : {deque_prepend:.4f}s")
print(f"  deque is ~{list_prepend / deque_prepend:.0f}x faster for front insertions")

# Membership test: list O(n) vs set O(1)
import random
data    = list(range(N))
data_set = set(data)
targets = [random.randint(0, N*2) for _ in range(1000)]

start = time.perf_counter()
results_list = [t in data for t in targets]
list_time = time.perf_counter() - start

start = time.perf_counter()
results_set = [t in data_set for t in targets]
set_time = time.perf_counter() - start

print(f"\nMembership test (1,000 lookups in {N:,} items):")
print(f"  list : {list_time:.4f}s")
print(f"  set  : {set_time:.4f}s")
Prepending 100,000 items: list.insert(0, x) : 0.8341s deque.appendleft : 0.0089s deque is ~94x faster for front insertions Membership test (1,000 lookups in 100,000 items): list : 0.8712s set : 0.0001s
Choose the right tool

If you need frequent front insertions or pops, use collections.deque. If you need frequent membership tests, convert to a set first. If you need sorted binary search lookups, use the bisect module. A list is the right default, but it is not always the right answer.

Copying Lists — Shallow vs Deep

When you assign one list to another variable, you get a second name pointing at the same object — not a copy. Making an actual copy requires one of three approaches, and the choice between them depends on whether your list contains nested mutable objects.

import copy

original = [[1, 2], [3, 4], [5, 6]]

# Assignment — NOT a copy
alias = original
alias[0].append(99)
print(f"alias assignment  : original={original}")   # original mutated

# Shallow copy — new list, same inner objects
original2 = [[1, 2], [3, 4], [5, 6]]
shallow = original2.copy()        # or original2[:]  or list(original2)

shallow[0].append(99)             # mutates the shared inner list
shallow.append([7, 8])            # only affects shallow

print(f"\nAfter shallow[0].append(99) and shallow.append([7,8]):")
print(f"  original2 : {original2}")   # inner list [1,2] changed
print(f"  shallow   : {shallow}")

# Deep copy — new list AND new inner objects
original3 = [[1, 2], [3, 4], [5, 6]]
deep = copy.deepcopy(original3)

deep[0].append(99)                # only affects deep

print(f"\nAfter deep[0].append(99):")
print(f"  original3 : {original3}")   # untouched
print(f"  deep      : {deep}")

# Three ways to shallow copy — all equivalent
lst = [10, 20, 30]
c1  = lst.copy()
c2  = lst[:]
c3  = list(lst)
print(f"\nThree shallow copies equal: {c1 == c2 == c3}")
print(f"All different objects     : {c1 is not c2 is not c3 is not lst}")
alias assignment : original=[[1, 2, 99], [3, 4], [5, 6]] After shallow[0].append(99) and shallow.append([7,8]): original2 : [[1, 2, 99], [3, 4], [5, 6]] shallow : [[1, 2, 99], [3, 4], [5, 6], [7, 8]] After deep[0].append(99): original3 : [[1, 2], [3, 4], [5, 6]] deep : [[1, 2, 99], [3, 4], [5, 6]] Three shallow copies equal: True All different objects : True

Iterating — Patterns and Pitfalls

Python gives you several ways to iterate over a list. Each pattern is suited to different needs, and there is one rule that applies to all of them: never modify the list you are iterating over.

products = ["Widget A", "Widget B", "Gadget X", "Gadget Y", "Doohickey"]

# Plain iteration — when you only need the value
print("Plain for loop:")
for p in products:
    print(f"  {p}")

# enumerate() — when you need index AND value
print("\nenumerate():")
for i, p in enumerate(products):
    print(f"  [{i}] {p}")

# enumerate() with a start index
print("\nenumerate(start=1):")
for i, p in enumerate(products, start=1):
    print(f"  {i}. {p}")

# zip() — iterate two lists in parallel
prices = [4.99, 12.50, 89.00, 34.75, 2.25]
print("\nzip() — parallel lists:")
for name, price in zip(products, prices):
    print(f"  {name:<12}  ${price:.2f}")

# PITFALL: modifying a list while iterating skips elements
nums = [1, 2, 3, 4, 5, 6]
print(f"\nBefore removal: {nums}")

# Wrong — skips elements because indices shift after removal
for n in nums[:]:   # iterate over a copy instead
    if n % 2 == 0:
        nums.remove(n)
print(f"After removing evens (copy iteration): {nums}")

# Better: build a new list
nums2   = [1, 2, 3, 4, 5, 6]
odds    = [n for n in nums2 if n % 2 != 0]
print(f"Comprehension filter (original unchanged): {odds}")
Plain for loop: Widget A Widget B Gadget X Gadget Y Doohickey enumerate(): [0] Widget A [1] Widget B [2] Gadget X [3] Gadget Y [4] Doohickey enumerate(start=1): 1. Widget A 2. Widget B 3. Gadget X 4. Gadget Y 5. Doohickey zip() — parallel lists: Widget A $4.99 Widget B $12.50 Gadget X $89.00 Gadget Y $34.75 Doohickey $2.25 Before removal: [1, 2, 3, 4, 5, 6] After removing evens (copy iteration): [1, 3, 5] Comprehension filter (original unchanged): [1, 3, 5]

List Comprehensions

A list comprehension is a concise way to build a new list from an existing iterable, with an optional filter. The syntax is [expression for item in iterable if condition]. Comprehensions are not just shorter — they run faster than equivalent for loops because the iteration is handled in C internally.

# Basic transformation
squares = [x ** 2 for x in range(1, 8)]
print(f"squares     : {squares}")

# With filter
evens = [x for x in range(20) if x % 2 == 0]
print(f"evens 0-18  : {evens}")

# Combined transform + filter
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(f"even squares: {even_squares}")

# String processing
words    = ["  hello ", "WORLD", "  Python  ", "code"]
cleaned  = [w.strip().lower() for w in words]
print(f"\ncleaned     : {cleaned}")

# Nested comprehension — flatten a 2D list
matrix  = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat    = [val for row in matrix for val in row]
print(f"flat matrix : {flat}")

# Conditional expression (ternary) inside comprehension
scores  = [45, 88, 72, 55, 91, 60]
grades  = ["pass" if s >= 60 else "fail" for s in scores]
print(f"\nscores : {scores}")
print(f"grades : {grades}")

# Comprehension vs for loop — demonstrating speed difference
import time

N = 500_000

start = time.perf_counter()
result_loop = []
for i in range(N):
    result_loop.append(i * 2)
loop_time = time.perf_counter() - start

start = time.perf_counter()
result_comp = [i * 2 for i in range(N)]
comp_time = time.perf_counter() - start

print(f"\nBuilding {N:,} items:")
print(f"  for loop      : {loop_time:.4f}s")
print(f"  comprehension : {comp_time:.4f}s")
print(f"  speedup       : {loop_time / comp_time:.1f}x")
squares : [1, 4, 9, 16, 25, 36, 49] evens 0-18 : [0, 2, 4, 6, 8, 10, 12, 14, 16, 18] even squares: [0, 4, 16, 36, 64] cleaned : ['hello', 'world', 'python', 'code'] flat matrix : [1, 2, 3, 4, 5, 6, 7, 8, 9] scores : [45, 88, 72, 55, 91, 60] grades : ['fail', 'pass', 'pass', 'fail', 'pass', 'pass'] Building 500,000 items: for loop : 0.0621s comprehension : 0.0398s speedup : 1.6x
When to use a generator instead

A list comprehension builds the entire list in memory at once. If you only need to iterate the result once and never need random access or len(), use a generator expression instead: (x**2 for x in range(N)). It uses parentheses instead of square brackets and produces values one at a time without allocating a full list.

Sorting — sort() vs sorted(), keys, and stability

Python provides two sorting tools. list.sort() sorts the list in place and returns None. sorted(iterable) returns a new sorted list and works on any iterable. Both use Timsort — a stable, adaptive algorithm that runs in O(n log n) worst case.

# sort() — in-place, returns None
nums = [5, 2, 8, 1, 9, 3]
nums.sort()
print(f"sort() ascending  : {nums}")

nums.sort(reverse=True)
print(f"sort() descending : {nums}")

# sorted() — returns new list, original unchanged
original = [5, 2, 8, 1, 9, 3]
result   = sorted(original)
print(f"\nsorted() result   : {result}")
print(f"original unchanged: {original}")

# TRAP: sort() returns None — don't assign it
wrong = nums.sort()   # wrong is None!
print(f"\nwrong = nums.sort() -> {wrong}")   # None

# Sorting with key= — transform each element for comparison
words = ["banana", "fig", "apple", "elderberry", "cherry", "date"]

by_length  = sorted(words, key=len)
by_last    = sorted(words, key=lambda w: w[-1])
by_length_then_alpha = sorted(words, key=lambda w: (len(w), w))

print(f"\nby length                   : {by_length}")
print(f"by last character           : {by_last}")
print(f"by length then alphabetical : {by_length_then_alpha}")

# Sorting complex objects
employees = [
    {"name": "Alice",   "dept": "Engineering", "salary": 95000},
    {"name": "Bob",     "dept": "Marketing",   "salary": 72000},
    {"name": "Carol",   "dept": "Engineering", "salary": 88000},
    {"name": "David",   "dept": "Marketing",   "salary": 79000},
    {"name": "Eve",     "dept": "Engineering", "salary": 88000},
]

# Sort by salary descending, then name ascending
employees.sort(key=lambda e: (-e["salary"], e["name"]))
print("\nEmployees by salary desc, name asc:")
for e in employees:
    print(f"  {e['name']:<8}  {e['dept']:<15}  ${e['salary']:,}")

# Stability: equal keys preserve original order
# Carol and Eve both have salary=88000 — after sort, Carol appears before Eve
# because C comes before E alphabetically (our secondary key ensures this)
sort() ascending : [1, 2, 3, 5, 8, 9] sort() descending : [9, 8, 5, 3, 2, 1] sorted() result : [1, 2, 3, 5, 8, 9] original unchanged: [5, 2, 8, 1, 9, 3] wrong = nums.sort() -> None by length : ['fig', 'date', 'apple', 'banana', 'cherry', 'elderberry'] by last character : ['banana', 'apple', 'date', 'fig', 'cherry', 'elderberry'] by length then alphabetical : ['fig', 'date', 'apple', 'banana', 'cherry', 'elderberry'] Employees by salary desc, name asc: Alice Engineering $95,000 Carol Engineering $88,000 Eve Engineering $88,000 David Marketing $79,000 Bob Marketing $72,000

Nested Lists

A list can contain other lists as elements. This is how Python represents grids, matrices, tables, and tree-like structures. The main things to get right are accessing elements with chained indices, iterating correctly, and always building nested structures with comprehensions rather than repetition.

# 3x4 grid of zeros — built correctly with comprehension
ROWS, COLS = 3, 4
grid = [[0] * COLS for _ in range(ROWS)]

# Set some values
grid[0][0] = 1
grid[1][2] = 7
grid[2][3] = 9

print("Grid:")
for row in grid:
    print(f"  {row}")

# Access element by [row][col]
print(f"\ngrid[1][2] = {grid[1][2]}")

# Transpose — rows become columns
transposed = [[grid[r][c] for r in range(ROWS)] for c in range(COLS)]
print(f"\nTransposed ({COLS}x{ROWS}):")
for row in transposed:
    print(f"  {row}")

# Flattening with comprehension vs sum()
nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
flat_comp = [x for sublist in nested for x in sublist]
flat_sum  = sum(nested, [])   # works but O(n^2) for large inputs — avoid

print(f"\nflattened (comprehension) : {flat_comp}")
print(f"flattened (sum trick)     : {flat_sum}")

# Real-world pattern: list of dicts acting as a table
inventory = [
    {"sku": "W001", "name": "Widget A",  "stock": 37, "price": 4.99},
    {"sku": "W002", "name": "Widget B",  "stock": 27, "price": 12.50},
    {"sku": "G001", "name": "Gadget X",  "stock": 12, "price": 97.90},
    {"sku": "G002", "name": "Gadget Y",  "stock": 18, "price": 34.75},
    {"sku": "D001", "name": "Doohickey", "stock":  2, "price":  2.25},
]

# Filter and project — like SELECT WHERE in SQL
cheap_in_stock = [
    {"sku": item["sku"], "name": item["name"]}
    for item in inventory
    if item["price"] < 20 and item["stock"] > 0
]
print(f"\nCheap items in stock:")
for item in cheap_in_stock:
    print(f"  {item['sku']}  {item['name']}")
Grid: [1, 0, 0, 0] [0, 0, 7, 0] [0, 0, 0, 9] grid[1][2] = 7 Transposed (4x3): [1, 0, 0] [0, 0, 0] [0, 7, 0] [0, 0, 9] flattened (comprehension) : [1, 2, 3, 4, 5, 6, 7, 8, 9] flattened (sum trick) : [1, 2, 3, 4, 5, 6, 7, 8, 9] Cheap items in stock: W001 Widget A W002 Widget B D001 Doohickey
"Simple is better than complex. Flat is better than nested." — The Zen of Python

Key Takeaways

  1. Never use [[val] * cols] * rows for 2D grids: The repetition operator copies references, not objects. All rows end up pointing at the same list. Use [[val] * cols for _ in range(rows)].
  2. Know which operations are O(n): Inserting or removing from the front of a list, scanning with in, and calling remove() or index() all scan the entire list. Use collections.deque for queue-like patterns and set for membership tests on large collections.
  3. Shallow copy vs deep copy: .copy(), [:], and list() all produce shallow copies — inner mutable objects are still shared. Use copy.deepcopy() when you need a fully independent duplicate of a nested structure.
  4. Never modify a list while iterating over it: Elements get skipped as indices shift. Iterate over a copy (lst[:]) or build a new list with a comprehension instead.
  5. List comprehensions are faster than equivalent loops: They are also almost always clearer. Use a generator expression ((x for x in ...)) when you only need to iterate the result once and do not need len() or random access.
  6. sort() returns None: Never write lst = lst.sort(). Use sorted(lst) when you need a new sorted list. Use lst.sort() when you want to sort in place.
  7. Use key= for complex sorting: Avoid implementing comparison functions from scratch. A key function that returns a tuple lets you sort by multiple criteria in a single pass: key=lambda x: (primary, secondary).

Lists are fundamental enough that most Python programs use them constantly without thinking about the mechanics underneath. Understanding what each operation costs, when copying is shallow, and where the edge cases live will save you time every time a list-heavy piece of code behaves unexpectedly.