Python's Descriptor Protocol and Attribute Lookup Machinery

Every time you write obj.x in Python, a sophisticated lookup chain fires behind the scenes. At the heart of that chain sits the descriptor protocol — the mechanism that powers property, classmethod, staticmethod, method binding, __slots__, functools.cached_property, and the field definitions in frameworks like Django and SQLAlchemy. Understanding descriptors is what separates Python users from Python engineers.

When you access an attribute on an object in Python, the language does not simply look up a value in a dictionary and hand it back. Instead, Python runs an attribute resolution algorithm embedded in object.__getattribute__() that consults the class hierarchy, checks for specially defined protocol methods, and decides whether to invoke custom behavior or return a stored value. The descriptor protocol is the specification that makes all of this possible, and it is one of the foundations of Python's entire object model.

The official Python documentation describes descriptors as a "powerful, general purpose protocol" that serves as "the mechanism behind properties, methods, static methods, class methods, and super()."

Source: Python 3 Descriptor HowTo Guide

What Is a Descriptor?

A descriptor is any Python object that defines at least one of the following methods: __get__(self, obj, objtype=None), __set__(self, obj, value), or __delete__(self, obj). These three methods collectively form the descriptor protocol. When an instance of a class that implements any of these methods is stored as a class variable in another class, Python will invoke the appropriate protocol method instead of performing a standard dictionary lookup whenever that attribute is accessed, assigned to, or deleted.

This is a critical distinction: descriptors only work when they are assigned as class-level attributes, not instance-level attributes. If you place a descriptor object inside an instance's __dict__, Python will not invoke its protocol methods. The machinery that triggers descriptor invocation lives inside type.__getattribute__() and object.__getattribute__(), and it only scans the class and its parents in the MRO (Method Resolution Order), not the instance dictionary, when looking for descriptors. You can verify this in the CPython source at PyObject_GenericGetAttr() in Objects/object.c and _PyType_Lookup() in Objects/typeobject.c.

Here is the simplest possible descriptor. It always returns a constant value when accessed:

class ConstantTen:
    def __get__(self, obj, objtype=None):
        return 10

class MyClass:
    x = 5              # regular class attribute
    y = ConstantTen()   # descriptor instance

a = MyClass()
print(a.x)  # 5  - standard dictionary lookup
print(a.y)  # 10 - descriptor's __get__ was invoked

In this example, accessing a.y does not return the ConstantTen object itself. Python detects that ConstantTen defines __get__, so it calls ConstantTen.__get__(descriptor_instance, a, MyClass) and returns the result. This interception of attribute access is the fundamental power that descriptors provide.

Think of it this way: without descriptors, Python's attribute system would be a passive dictionary lookup. With descriptors, it becomes an active dispatch system — a single dot operator triggers method calls, validation logic, caching, logging, or any computation you want to run. This is why the Python documentation describes attribute access as having "binding behavior" when descriptors are involved.
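To make that concrete, here is a small sketch (class and attribute names invented for the example) of a descriptor that runs counting/logging logic on every read:

```python
class Audited:
    """A toy non-data descriptor: every read of the attribute is counted."""
    def __init__(self):
        self.reads = 0  # note: stored on the descriptor, shared by all instances

    def __get__(self, obj, objtype=None):
        self.reads += 1
        return f"read count: {self.reads}"

class Service:
    status = Audited()

s = Service()
print(s.status)  # read count: 1
print(s.status)  # read count: 2
```

Note that the counter lives on the descriptor object itself, so it is shared by every Service instance; proper per-instance storage is a topic of its own, covered later in this article.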

The Attribute Lookup Chain

To fully understand descriptors, you need to understand the order in which Python resolves attribute access. When you write obj.attr, the object.__getattribute__() method executes an algorithm with a specific priority sequence.

First, Python searches the class and its bases (following the MRO) for an attribute named attr. If it finds one and it is a data descriptor (an object that defines __set__ or __delete__) and it also defines __get__, Python calls its __get__ method immediately and returns the result. Data descriptors have the highest priority in the entire lookup chain.

Second, if no data descriptor is found, Python checks the instance's own __dict__ for the attribute. If the attribute exists there, Python returns it directly with no descriptor invocation.

Third, if the instance dictionary does not contain the attribute, Python searches the class hierarchy again. If it finds a non-data descriptor (one that defines only __get__), it invokes __get__ and returns the result. If it finds a plain class attribute, it returns that value directly.

Finally, if none of the above steps produce a result, Python calls __getattr__() on the instance if that method is defined. If __getattr__ is also not defined, an AttributeError is raised.

Note

This entire mechanism is embedded inside the __getattribute__ methods of object, type, and super(). Classes inherit this machinery automatically. You can disable descriptor invocation entirely by overriding __getattribute__, though doing so is rarely advisable in production code. The Python documentation states that overriding __getattribute__() "prevents automatic descriptor calls."

Source: Python 3 Descriptor HowTo Guide

Here is a pseudocode representation of the lookup algorithm for instance attribute access:

def object_getattribute(obj, name):
    # Step 1: Search the class hierarchy for the name
    cls_var = None
    for cls in type(obj).__mro__:
        if name in cls.__dict__:
            cls_var = cls.__dict__[name]
            break

    # Step 2: If found, check if it is a data descriptor
    if cls_var is not None:
        descr_get = getattr(type(cls_var), '__get__', None)
        if descr_get is not None and (
            hasattr(type(cls_var), '__set__') or
            hasattr(type(cls_var), '__delete__')
        ):
            return descr_get(cls_var, obj, type(obj))  # data descriptor

    # Step 3: Check the instance dictionary
    if hasattr(obj, '__dict__') and name in obj.__dict__:
        return obj.__dict__[name]

    # Step 4: Non-data descriptor or plain class attribute
    if cls_var is not None:
        if getattr(type(cls_var), '__get__', None) is not None:
            return type(cls_var).__get__(cls_var, obj, type(obj))
        return cls_var  # plain class attribute

    # Step 5: Fall back to __getattr__
    if hasattr(type(obj), '__getattr__'):
        return type(obj).__getattr__(obj, name)

    raise AttributeError(name)

This is a simplified representation. The actual CPython implementation lives in PyObject_GenericGetAttr() in Objects/object.c and _PyType_Lookup() in Objects/typeobject.c, but the logic follows the same precedence chain described here.

Note a subtlety: the pseudocode checks type(cls_var) — not cls_var itself — for descriptor methods. This matches the actual CPython behavior, which looks up __get__, __set__, and __delete__ on the type of the class variable, not on the instance. This distinction matters because an object could have a __get__ attribute in its instance dictionary without actually implementing the descriptor protocol at the class level.

Also note that the classification of an object as a data descriptor depends solely on whether its type defines __set__ or __delete__. A data descriptor without __get__ cannot return a value during attribute lookup (Step 4 above falls through and returns the descriptor object itself), but it still intercepts assignment, as the next section shows.

There is an important subtlety to this algorithm that catches experienced developers off guard: class-level attribute lookup follows a different path. When you access MyClass.attr rather than obj.attr, the lookup goes through type.__getattribute__() instead. This version searches the metaclass hierarchy for data descriptors first, then the class dictionary and its bases, and finally the metaclass hierarchy for non-data descriptors. The full C implementation lives in type_getattro() in Objects/typeobject.c. This two-layer lookup is what allows constructs like classmethod to work correctly when accessed from the class itself.
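You can observe the class-level path directly: when an attribute with __get__ is reached through the class itself, __get__ still runs, but receives None as its obj argument. A minimal probe (names invented):

```python
class Probe:
    """Returns the arguments __get__ receives, so we can inspect them."""
    def __get__(self, obj, objtype=None):
        return (obj, objtype)

class Owner:
    p = Probe()

inst = Owner()
print(inst.p)   # (<Owner object at 0x...>, <class 'Owner'>)
print(Owner.p)  # (None, <class 'Owner'>)
```

This is why well-behaved descriptors such as property begin __get__ with an `if obj is None: return self` guard: it lets class-level access return the descriptor object itself for introspection.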

The Attribute Assignment Protocol

The lookup chain described above only covers attribute reading. Attribute assignment follows a different and often overlooked protocol. When you write obj.attr = value, Python does not simply insert the value into obj.__dict__. Instead, object.__setattr__() first searches the class hierarchy for a data descriptor with the given name. If one is found, its __set__ method is called. Otherwise, the value is stored directly in the instance dictionary.

def object_setattr(obj, name, value):
    # Search the class hierarchy for the first matching attribute
    for cls in type(obj).__mro__:
        if name in cls.__dict__:
            attr = cls.__dict__[name]
            if hasattr(type(attr), '__set__'):
                # Data descriptor intercepts the assignment
                type(attr).__set__(attr, obj, value)
                return
            # First MRO hit is not a data descriptor: stop searching,
            # just as _PyType_Lookup() returns only the first match
            break

    # No data descriptor found: store in instance dict
    obj.__dict__[name] = value

This means that if you define a descriptor with __get__ and __set__, every assignment to that attribute name on any instance of the owning class is routed through your descriptor's __set__ method. The instance dictionary is never touched unless the descriptor explicitly writes to it. This is why property setters work: the assignment obj.x = 5 calls property.__set__, which in turn calls your setter function.

Attribute deletion follows the same pattern. When you write del obj.attr, Python's object.__delattr__() searches for a data descriptor with a __delete__ method before falling back to removing the key from the instance's __dict__.
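A minimal sketch of that path (class names invented): because the descriptor's type defines __delete__, the del statement is routed through it instead of touching the instance dictionary key directly:

```python
class Deletable:
    """A data descriptor whose __delete__ intercepts `del obj.attr`."""
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get('_val')

    def __set__(self, obj, value):
        obj.__dict__['_val'] = value

    def __delete__(self, obj):
        obj.__dict__.pop('_val', None)

class Box:
    val = Deletable()

b = Box()
b.val = 7      # routed through Deletable.__set__
print(b.val)   # 7
del b.val      # object.__delattr__ finds __delete__ and calls it
print(b.val)   # None: the underlying storage key is gone
```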

Understanding this three-way protocol (get, set, delete) is what separates a surface-level understanding of descriptors from a working command of Python's attribute system.

Data Descriptors vs Non-Data Descriptors

The distinction between data descriptors and non-data descriptors is not merely a naming convention. It determines whether a descriptor can be overridden by an entry in the instance's __dict__.

A data descriptor is any object that defines __set__ or __delete__ (or both). A data descriptor does not need to define __get__ to earn this classification, though in practice it almost always does. Data descriptors always take priority over the instance dictionary. Even if you manually insert a value into obj.__dict__ with the same name as a data descriptor, the descriptor wins. This is exactly how property works: the property object defines __get__, __set__, and __delete__, so you cannot shadow a property by planting a value in the instance dictionary; attribute reads and writes still route through the property.

A non-data descriptor defines only __get__. Non-data descriptors have lower priority than instance dictionary entries. If an attribute name exists both as a non-data descriptor on the class and as a key in the instance's __dict__, the instance dictionary value is returned. This is by design and is the mechanism that makes functools.cached_property work (discussed in its own section below).

The Python documentation makes this priority clear: if a descriptor defines __set__() or __delete__(), it is considered a data descriptor, and data descriptors take precedence over instance dictionaries.

Source: Python 3 Descriptor HowTo Guide

class NonDataDescriptor:
    def __get__(self, obj, objtype=None):
        print("Non-data __get__ called")
        return 42

class DataDescriptor:
    def __get__(self, obj, objtype=None):
        print("Data __get__ called")
        return 99
    def __set__(self, obj, value):
        print(f"Data __set__ called with {value}")

class Demo:
    non_data = NonDataDescriptor()
    data = DataDescriptor()

d = Demo()

# Non-data descriptor can be overridden by instance dict
d.__dict__['non_data'] = "instance value"
print(d.non_data)  # "instance value" - instance dict wins

# Data descriptor cannot be overridden by instance dict
d.__dict__['data'] = "instance value"
print(d.data)  # 99 - data descriptor wins

Pro Tip

To create a read-only data descriptor, define both __get__ and __set__, but have __set__ raise an AttributeError. The presence of __set__ makes it a data descriptor, preventing instance dictionary overrides, while the exception prevents any value from being assigned. Alternatively, defining __get__ and __delete__ (without __set__) also creates a data descriptor, since the presence of __delete__ alone is sufficient for data descriptor status.
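The recipe can be sketched as follows (class names invented); because __set__ exists, the object is a data descriptor, so even a manual instance-dict write cannot shadow it:

```python
class ReadOnly:
    """Read-only data descriptor: __set__ exists only to raise."""
    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        raise AttributeError("read-only attribute")

class Settings:
    version = ReadOnly("1.0")

s = Settings()
print(s.version)  # 1.0
try:
    s.version = "2.0"
except AttributeError as e:
    print(e)  # read-only attribute

# Even a direct instance-dict write cannot shadow a data descriptor
s.__dict__['version'] = "2.0"
print(s.version)  # 1.0
```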

The __set_name__ Hook

One practical challenge with descriptors has always been that the descriptor object itself did not automatically know the name of the class attribute it was assigned to. If you wrote age = Validator() inside a class body, the Validator descriptor instance had no built-in way to learn that it was bound to the name "age." This made it difficult to store per-instance data in the instance's __dict__ under a predictable key without passing the name explicitly to the constructor.

Python 3.6 solved this problem by introducing __set_name__(self, owner, name), as specified in PEP 487. When a new class is created, the type metaclass scans the class dictionary. For every entry that defines __set_name__, that method is called with two arguments: the owner class where the descriptor lives, and the name of the class variable the descriptor was assigned to.

PEP 487 describes the rationale clearly: descriptors "do not know anything about that class" and "do not even know the name they are accessed with," making it impossible to store per-instance values under predictable keys. The PEP added __set_name__ to resolve this directly within type.__new__.

Source: PEP 487 -- Simpler customisation of class creation

class ValidatedString:
    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, '')

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError(
                f'{self.public_name} must be a string, '
                f'got {type(value).__name__}'
            )
        setattr(obj, self.private_name, value)

class User:
    name = ValidatedString()
    email = ValidatedString()

    def __init__(self, name, email):
        self.name = name
        self.email = email

u = User("Alice", "alice@example.com")
print(u.name)   # Alice
print(u._name)  # Alice - stored in instance dict under _name

When Python creates the User class, it detects that ValidatedString defines __set_name__ and calls it automatically. The first descriptor receives owner=User, name='name' and the second receives owner=User, name='email'. This eliminates the need for redundant declarations like name = ValidatedString('name').

An important nuance often missed: __set_name__ is not exclusive to descriptors. The official documentation clarifies that it is called for any class attribute that defines the method, whether or not the object implements __get__, __set__, or __delete__. This means you can use __set_name__ as a general-purpose class-creation hook for any object that needs to know its assigned name.

Source: Python 3 Descriptor HowTo Guide (see also CPython Issue #89361)
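A quick sketch of this behavior (names invented): the object below defines none of __get__, __set__, or __delete__, yet __set_name__ still fires when the class is created:

```python
class NameAware:
    """Not a descriptor: no __get__, __set__, or __delete__."""
    def __set_name__(self, owner, name):
        self.owner = owner
        self.name = name

class Holder:
    field = NameAware()

# __set_name__ was called during class creation
print(Holder.field.name)   # field
print(Holder.field.owner)  # <class '__main__.Holder'>
```

Because NameAware defines no __get__, accessing Holder.field simply returns the stored object, which now knows its own name.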

Warning

The __set_name__ hook is only called during class creation by the type metaclass. If you add a descriptor to a class after the class has already been created (for example, MyClass.new_attr = SomeDescriptor()), you must call __set_name__ manually on the descriptor instance: descriptor_instance.__set_name__(MyClass, 'new_attr'). Forgetting this step is a common source of AttributeError bugs in dynamic class construction.
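A sketch of the manual call (names invented):

```python
class Labeled:
    def __set_name__(self, owner, name):
        self.label = name

    def __get__(self, obj, objtype=None):
        return f"descriptor named {self.label!r}"

class Late:
    pass

d = Labeled()
Late.attr = d                 # added after class creation: __set_name__ is NOT called
d.__set_name__(Late, 'attr')  # so we must call it ourselves
print(Late().attr)  # descriptor named 'attr'
```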

Building Real-World Descriptors

Descriptors become powerful when you build reusable validation and transformation logic that can be shared across multiple classes. Consider a scenario where you need to enforce type and range constraints on numeric fields. Without descriptors, you would need to write separate @property definitions for every validated attribute. With descriptors, you write the validation logic once and reuse it everywhere.

class RangeValidated:
    """A reusable descriptor that enforces type and range constraints."""

    def __init__(self, min_value=None, max_value=None, expected_type=int):
        self.min_value = min_value
        self.max_value = max_value
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, None)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f'{self.public_name} requires {self.expected_type.__name__}, '
                f'got {type(value).__name__}'
            )
        if self.min_value is not None and value < self.min_value:
            raise ValueError(
                f'{self.public_name} must be >= {self.min_value}'
            )
        if self.max_value is not None and value > self.max_value:
            raise ValueError(
                f'{self.public_name} must be <= {self.max_value}'
            )
        setattr(obj, self.private_name, value)

    def __delete__(self, obj):
        delattr(obj, self.private_name)


class NetworkDevice:
    port = RangeValidated(min_value=1, max_value=65535)
    timeout = RangeValidated(min_value=0, max_value=300)
    retries = RangeValidated(min_value=0, max_value=10)

    def __init__(self, port, timeout, retries):
        self.port = port
        self.timeout = timeout
        self.retries = retries

device = NetworkDevice(port=8080, timeout=30, retries=3)
print(device.port)  # 8080

try:
    device.port = 99999  # Exceeds max
except ValueError as e:
    print(e)  # port must be <= 65535

This pattern is the foundation of how ORM frameworks define model fields. In Django, for instance, when you write title = CharField(max_length=255) inside a model class, the model metaclass calls each field's contribute_to_class() hook, which records the attribute name (the same job __set_name__ performs in plain Python classes) and, in modern Django versions, installs a descriptor on the model class so that instance attribute access is routed through the field machinery and the underlying database layer.

Notice how the RangeValidated descriptor stores values in the instance's __dict__ using setattr(obj, self.private_name, value). This is a deliberate choice. Since the private name (_port) differs from the public name (port), the setattr call does not re-trigger the descriptor — it writes directly to the instance dictionary under a different key. If you accidentally used the same name (e.g., setattr(obj, 'port', value)), you would trigger infinite recursion because setattr would call __set__ again.

How Python's Builtins Use Descriptors

Several of Python's built-in constructs are implemented using the descriptor protocol. Understanding this removes much of the "magic" from the language and reveals a consistent, elegant design.

property

The property built-in is a data descriptor. It stores references to getter, setter, and deleter functions and invokes them through its __get__, __set__, and __delete__ methods. Because it defines both __get__ and __set__, it always takes priority over instance dictionary entries. A simplified pure-Python equivalent looks like this:

class Property:
    """Emulate PyProperty_Type() in Objects/descrobject.c"""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        self.__name__ = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)

Notice how the setter and deleter methods return an entirely new Property instance rather than mutating the existing one. This is why the @property decorator pattern works: @name.setter replaces the old descriptor with a new one that has both the getter and the setter registered. The __set_name__ method records the attribute name when the class is created, making it available for error messages. The actual CPython C implementation in Objects/descrobject.c mirrors this structure. The full pure-Python equivalent in the Descriptor HowTo Guide includes additional details like the getter method and full __doc__ handling.

Source: Python 3 Descriptor HowTo Guide — pure Python equivalents

staticmethod and classmethod

Both staticmethod and classmethod are non-data descriptors. They define only __get__ and use it to control how the underlying function is returned when accessed through the class or an instance.

A staticmethod descriptor simply returns the original function without binding it to anything. It strips away the automatic self or cls argument that Python normally provides:

import functools

class StaticMethod:
    """Emulate PyStaticMethod_Type() in Objects/funcobject.c"""

    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)

    def __get__(self, obj, objtype=None):
        return self.f

    def __call__(self, *args, **kwds):
        return self.f(*args, **kwds)

A classmethod descriptor captures the class (from the cls parameter of __get__) and returns a bound method that binds the underlying function to the class using MethodType:

from types import MethodType
import functools

class ClassMethod:
    """Emulate PyClassMethod_Type() in Objects/funcobject.c"""

    def __init__(self, f):
        self.f = f
        functools.update_wrapper(self, f)

    def __get__(self, obj, cls=None):
        if cls is None:
            cls = type(obj)
        return MethodType(self.f, cls)

The pure-Python equivalents above use functools.update_wrapper() to carry forward the wrapped function's __name__, __qualname__, __doc__, __annotations__, and __module__ attributes, plus a __wrapped__ attribute pointing to the original function. The __wrapped__ attribute was added in Python 3.10 for both staticmethod and classmethod. The StaticMethod equivalent also defines __call__, making the descriptor itself directly callable — this matches a Python 3.10 change that made static methods callable as regular functions. The actual C implementations in Objects/funcobject.c mirror this structure. The full pure-Python equivalents in the Descriptor HowTo Guide include a few additional details like __annotations__ property forwarding on StaticMethod.

Regular Methods and Function Binding

Regular Python functions are themselves non-data descriptors. Every function object defines a __get__ method. When you access a function through an instance (as in obj.method), Python calls function.__get__(obj, type(obj)), which returns a bound method. That bound method stores a reference to both the original function and the instance, which is where the automatic self argument comes from. When you later call obj.method(arg), Python is actually calling function(obj, arg) behind the scenes.

You can confirm this behavior directly in an interactive session:

class Example:
    def greet(self):
        return "hello"

e = Example()

# The function is a non-data descriptor
print(type(Example.__dict__['greet']))  # <class 'function'>
print(hasattr(Example.__dict__['greet'], '__get__'))  # True

# Accessing through an instance triggers __get__
bound = e.greet
print(type(bound))       # <class 'method'>
print(bound.__self__)    # <__main__.Example object ...>
print(bound.__func__)    # <function Example.greet ...>

# These two calls are equivalent
print(e.greet())                           # hello
print(Example.__dict__['greet'](e))        # hello

This is why methods do not exist as a separate concept in Python's data model. They emerge naturally from the interaction between functions (which are non-data descriptors) and the attribute lookup chain.

__slots__ as Descriptors

One of the lesser-known applications of the descriptor protocol is the __slots__ mechanism. When a class defines __slots__, Python replaces the per-instance __dict__ with a fixed-length array of slot values. Internally, each slot name declared in __slots__ becomes a member_descriptor object on the class — a data descriptor implemented in C that reads and writes directly to a pre-allocated memory offset within the instance structure.

class Point:
    __slots__ = ['x', 'y']

# Each slot is a data descriptor on the class
print(type(Point.x))  # <class 'member_descriptor'>
print(hasattr(Point.x, '__get__'))     # True
print(hasattr(Point.x, '__set__'))     # True
print(hasattr(Point.x, '__delete__'))  # True

p = Point()
p.x = 10     # calls Point.x.__set__(p, 10)
print(p.x)   # calls Point.x.__get__(p, Point) -> 10

Because member_descriptor defines __set__, it is a data descriptor with the highest lookup priority. This means slot attributes cannot be overridden by instance dictionary entries (and typically, instances of slotted classes have no __dict__ at all). In CPython, the member_descriptor uses a PyMemberDef structure that stores the attribute name, type code, and byte offset within the instance's memory layout. This allows attribute access to be resolved via direct memory address computation rather than hash table lookup, which is why slotted attribute access is faster than dictionary-based access.

Source: Python 3 Descriptor HowTo Guide — __slots__ section

This has practical implications for descriptor authors. If your custom descriptor needs to work with classes that use __slots__, you cannot assume that instances have a __dict__. The per-instance __dict__ storage pattern (used in RangeValidated above) will fail on slotted classes unless the class also includes '__dict__' in its __slots__ declaration. For slotted classes, the WeakKeyDictionary approach (discussed later) is the appropriate alternative, provided the class includes '__weakref__' among its declared slots so that instances can be weakly referenced.
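The failure is easy to reproduce with a sketch (names invented): the descriptor below stores into the instance via setattr, but the slotted class has neither a '_level' slot nor a __dict__, so the write raises AttributeError:

```python
class DictStored:
    """Stores its value under a fixed private key via setattr."""
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, '_level', None)

    def __set__(self, obj, value):
        setattr(obj, '_level', value)  # assumes the instance has a __dict__

class Slotted:
    __slots__ = ()           # no __dict__, and '_level' is not a declared slot
    level = DictStored()

s = Slotted()
try:
    s.level = 5              # DictStored.__set__ runs, then setattr fails
except AttributeError as e:
    print(type(e).__name__)  # AttributeError
```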

How cached_property Exploits Non-Data Descriptors

The functools.cached_property decorator (introduced in Python 3.8) is an elegant application of the data vs. non-data descriptor distinction. It defines only __get__, making it a non-data descriptor. On first access, __get__ computes the value, stores it in the instance's __dict__ under the same attribute name, and returns it. On every subsequent access, the instance dictionary entry takes priority over the non-data descriptor, so __get__ is never called again.

from functools import cached_property

class ExpensiveResource:
    @cached_property
    def data(self):
        print("Computing (this only runs once)...")
        return sum(range(1_000_000))

r = ExpensiveResource()
print(r.data)  # Computing (this only runs once)... -> 499999500000
print(r.data)  # 499999500000 (no recomputation)

# The computed value lives in the instance dict
print('data' in r.__dict__)  # True

# Deleting the cache forces recomputation on next access
del r.__dict__['data']
print(r.data)  # Computing (this only runs once)... -> 499999500000

The official Python documentation explains the mechanics: a cached_property "only runs on lookups and only when an attribute of the same name doesn't exist" in the instance dictionary. Once it writes a value there, "subsequent attribute reads and writes take precedence."

Source: functools documentation — cached_property

This design has an important constraint: cached_property requires the instance to have a mutable __dict__. It will not work with classes that define __slots__ without including __dict__ as one of the declared slots. This limitation is explicitly documented and is a direct consequence of the non-data descriptor mechanism — without an instance dictionary to write to, the caching step cannot occur.
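A minimal reproduction (class name invented): in current CPython, the first access raises TypeError because cached_property finds no __dict__ to write the computed value into:

```python
from functools import cached_property

class SlottedResource:
    __slots__ = ('base',)    # no '__dict__' slot declared

    def __init__(self, base):
        self.base = base

    @cached_property
    def doubled(self):
        return self.base * 2

r = SlottedResource(21)
try:
    r.doubled                # no instance __dict__ to cache into
except TypeError as e:
    print("caching failed:", e)
```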

There is also a subtle thread-safety consideration. Prior to Python 3.12, cached_property included an undocumented per-property lock (shared across all instances of the class) intended to ensure the getter ran only once per instance. However, this per-property locking caused high lock contention because the lock was class-wide rather than per-instance. Python 3.12 removed this lock entirely, meaning the getter function can now run more than once for a single instance if multiple threads race, with the last writer's result becoming the cached value. The official documentation recommends implementing your own locking inside the getter if synchronization is required.

Source: functools documentation — cached_property (Changed in version 3.12)
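What that recommendation amounts to can be sketched like this (attribute names invented); the per-instance lock serializes the computation, and the re-check inside the getter returns an already-cached value if another thread finished first:

```python
import threading
from functools import cached_property

class Resource:
    def __init__(self):
        self._lock = threading.Lock()  # one lock per instance, not per class

    @cached_property
    def data(self):
        with self._lock:
            # Another thread may have populated the cache while we waited.
            if 'data' in self.__dict__:
                return self.__dict__['data']
            return sum(range(1000))    # the expensive computation

r = Resource()
print(r.data)  # 499500
print(r.data)  # 499500 (cached; the getter does not run again)
```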

Per-Instance Storage and Memory Safety

One of the trickiest aspects of writing custom descriptors is managing per-instance storage. Since the descriptor itself is a class attribute shared across all instances, you cannot store instance-specific values as simple attributes on the descriptor without those values being shared (and overwritten) between instances.

There are four common solutions to this problem, each with different trade-offs.

Solution 1: Instance __dict__ with mangled names. The first and simplest approach (shown in the RangeValidated example above) is to store values in the instance's own __dict__ using a mangled private name. The descriptor uses __set_name__ to learn its assigned name, then stores data under a prefixed key like _port in the instance's dictionary. This approach is straightforward and works well in the majority of cases. However, it silently adds internal attributes to the instance that appear in vars(obj) and may confuse introspection tools. More critically, it fails on classes that use __slots__ without including __dict__.

Solution 2: WeakKeyDictionary. The second approach uses a WeakKeyDictionary from the weakref module. The descriptor maintains a dictionary keyed by instance objects. Because it uses weak references for keys, entries are automatically cleaned up when an instance is garbage collected, preventing memory leaks:

import weakref

class Cached:
    def __init__(self):
        self.data = weakref.WeakKeyDictionary()

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.data.get(obj, None)

    def __set__(self, obj, value):
        self.data[obj] = value

class Item:
    price = Cached()

a = Item()
b = Item()
a.price = 19.99
b.price = 29.99
print(a.price)  # 19.99
print(b.price)  # 29.99
# When 'a' is deleted, its entry is automatically removed

The WeakKeyDictionary approach is useful when you cannot modify the instance's __dict__ (for example, when the class uses __slots__) or when you want the descriptor to store data without the owning class being aware of additional internal attributes. However, it adds overhead compared to direct dictionary storage, and it constrains the owning class: instances must be hashable to serve as dictionary keys, and not all Python objects can be weakly referenced. Basic types like int, str, and tuple cannot be weak reference targets, and a class that defines __slots__ must include '__weakref__' among its slots for its instances to support weak references.

Solution 3: Identity-keyed dictionary. The third approach is to store data directly on the descriptor using instance identity (via id(obj)) as a key in a regular dictionary. This avoids the weak-reference limitation but requires manual cleanup to prevent memory leaks, making it the least recommended pattern for general use. In CPython, id() returns the memory address of an object, and addresses can be reused after an object is garbage collected, meaning stale entries could accidentally map to new objects.

Solution 4: Double-underscore name mangling. A more robust variant of Solution 1 borrows the convention behind Python's name mangling. Note that real mangling is purely syntactic: the compiler rewrites identifiers of the form __name only when they appear literally inside a class body, so a descriptor that computes its storage key at runtime via setattr gets no mangling for free. Instead, the descriptor constructs a mangled-style key itself, such as _DescriptorClass__fieldname. While this does not make the attribute truly private, it reduces the risk of accidental collisions when multiple descriptor types are used on the same class, because the key is unlikely to conflict with user-defined attributes.
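A sketch of that manual key construction (Mangled and Config are illustrative names; the f-string builds the mangled-style key by hand precisely because setattr with a string never triggers compile-time mangling):

```python
class Mangled:
    def __set_name__(self, owner, name):
        # Build a mangled-style key manually: _<DescriptorClass>__<field>
        self.key = f'_{type(self).__name__}__{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.key)

    def __set__(self, obj, value):
        # Safe: self.key differs from the managed name, so no recursion
        setattr(obj, self.key, value)

class Config:
    timeout = Mangled()

c = Config()
c.timeout = 30
print(vars(c))     # {'_Mangled__timeout': 30}
print(c.timeout)   # 30
```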

Descriptors and Metaclasses

Descriptors and metaclasses are complementary tools in Python's object model. Metaclasses control how classes are created, while descriptors control how attributes are accessed. When combined, they enable declarative class definitions similar to those found in ORMs and serialization frameworks.

Before Python 3.6 introduced __set_name__, developers relied on metaclasses to scan a new class's dictionary and inform descriptors of their assigned names. The metaclass would iterate over the class attributes in __new__ and call a custom initialization method on each descriptor it found. PEP 487 formalized this pattern by building it directly into the type metaclass, eliminating the need for a custom metaclass in many common scenarios.
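A minimal sketch of that pre-PEP 487 pattern (NamingMeta and Field are illustrative names; __set_name__ now does this scan for free):

```python
class NamingMeta(type):
    """Pre-3.6 pattern: the metaclass tells each descriptor its name."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        for attr, value in namespace.items():
            if isinstance(value, Field):
                value.name = attr   # what __set_name__ now automates
        return cls

class Field:
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

class Model(metaclass=NamingMeta):
    title = Field()

m = Model()
m.title = 'hello'
print(m.title)  # hello
```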

The interaction between descriptors and metaclasses extends further. When you access an attribute through a class rather than an instance (for example, MyClass.attr rather than obj.attr), the lookup is handled by type.__getattribute__(). This method follows a similar but not identical algorithm: it searches the metaclass hierarchy for data descriptors, then the class dictionary and its bases, and finally the metaclass hierarchy for non-data descriptors. This two-layer lookup is what allows things like classmethod descriptors to work correctly when accessed from the class itself.
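The metaclass-level lookup can be demonstrated directly: a property defined on a metaclass is a data descriptor that type.__getattribute__ finds when the attribute is read on the class itself (Meta and Service are illustrative names):

```python
class Meta(type):
    @property
    def registry(cls):
        # A data descriptor on the metaclass: found by
        # type.__getattribute__ when the attribute is read on the class
        return f'registry for {cls.__name__}'

class Service(metaclass=Meta):
    pass

print(Service.registry)   # 'registry for Service'
# Instances do not consult the metaclass hierarchy, so
# Service().registry raises AttributeError.
```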

The super() built-in also participates in the descriptor protocol. When you write super(B, obj).m, Python searches obj.__class__.__mro__ for the base class immediately following B, finds the attribute m in that class's dictionary, and if it is a descriptor, calls m.__get__(obj, B). This ensures that methods resolved through super() are properly bound to the calling instance. The CPython implementation lives in super_getattro() in Objects/typeobject.c.
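The binding that super() performs can be reproduced by hand, which makes the descriptor call visible (A and B are illustrative names):

```python
class A:
    def greet(self):
        return 'A'

class B(A):
    def greet(self):
        return 'B->' + super().greet()

b = B()
# super() finds A.greet in the MRO after B; because functions are
# descriptors, the lookup returns a method bound to b.
print(super(B, b).greet())                  # A
print(A.__dict__['greet'].__get__(b, B)())  # A -- the same binding, by hand
print(b.greet())                            # B->A
```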

class LoggedAccess:
    """Descriptor that logs every attribute access."""

    def __set_name__(self, owner, name):
        self.name = name
        self.private = f'_{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = getattr(obj, self.private, 'UNSET')
        print(f'[LOG] Read {self.name} = {value}')
        return value

    def __set__(self, obj, value):
        print(f'[LOG] Write {self.name} = {value}')
        setattr(obj, self.private, value)

class Base:
    x = LoggedAccess()

    def __init__(self):
        self.x = 0

class Child(Base):
    def increment(self):
        # both the read and the write of self.x go through LoggedAccess
        self.x = self.x + 1

c = Child()
# [LOG] Write x = 0
c.increment()
# [LOG] Read x = 0
# [LOG] Write x = 1

This demonstrates that descriptor behavior is fully inherited. Child does not define x itself, but because LoggedAccess is a data descriptor in Base, it intercepts all access to x on Child instances through the MRO.

Debugging Descriptor Behavior

Descriptors can be difficult to debug because they operate invisibly — the dot operator looks like a simple dictionary lookup but may trigger arbitrary code. Here are the specific pitfalls that cause the hardest-to-diagnose bugs, along with techniques for investigating them.

Pitfall 1: Infinite recursion in __set__. If your descriptor's __set__ method writes to the instance using the same attribute name that the descriptor manages, it triggers __set__ again, creating infinite recursion. The fix is to use a different storage name (as in self.private_name) or write directly to obj.__dict__ instead of using setattr:

# WRONG: causes infinite recursion
class Bad:
    def __set_name__(self, owner, name):
        self.name = name
    def __set__(self, obj, value):
        setattr(obj, self.name, value)  # calls __set__ again!

# RIGHT: write to __dict__ directly or use a different key
class Good:
    def __set_name__(self, owner, name):
        self.name = name
    def __set__(self, obj, value):
        obj.__dict__[self.name] = value  # bypasses descriptor

Pitfall 2: Descriptors placed on instances instead of classes. If you assign a descriptor object to an instance attribute rather than a class attribute, Python treats it as an ordinary value. The __get__ method is never called. This mistake is especially common when dynamically constructing objects:

class Desc:
    def __get__(self, obj, objtype=None):
        return "descriptor active"

class MyClass:
    pass

obj = MyClass()
obj.attr = Desc()     # instance attribute - descriptor NOT invoked
print(obj.attr)       # <__main__.Desc object ...>  (just the object)

MyClass.attr = Desc() # class attribute - descriptor IS invoked
print(MyClass().attr) # "descriptor active" (on a fresh instance)
# Note: obj.attr still returns obj's stored Desc object. Desc defines
# only __get__, so it is a non-data descriptor and the entry in
# obj.__dict__ takes priority over the class attribute.

Pitfall 3: Forgetting the if obj is None guard. When a descriptor is accessed through the class itself (e.g., MyClass.attr), __get__ receives None for the obj parameter. If your __get__ tries to access obj.__dict__ without checking for None first, it raises an AttributeError. The convention is to return self (the descriptor object) when obj is None, which allows inspection of the descriptor from the class level.
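The failure mode and the fix side by side (NoGuard, WithGuard, and C are illustrative names):

```python
class NoGuard:
    def __set_name__(self, owner, name):
        self.key = f'_{name}'

    def __get__(self, obj, objtype=None):
        # Missing "if obj is None" guard: class-level access passes
        # obj=None, and getattr(None, ...) raises AttributeError.
        return getattr(obj, self.key)

class WithGuard(NoGuard):
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self        # class-level access returns the descriptor
        return getattr(obj, self.key)

class C:
    bad = NoGuard()
    good = WithGuard()

print(C.good)   # <WithGuard object ...>
try:
    C.bad
except AttributeError as e:
    print('AttributeError:', e)
```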

Diagnostic technique: When you suspect descriptor interference, bypass the descriptor protocol entirely by accessing the class's __dict__ directly. This lets you see the raw descriptor object rather than the value it computes:

class MyClass:
    x = property(lambda self: 42)

obj = MyClass()
print(obj.x)                          # 42 (property invoked)
print(type(MyClass.__dict__['x']))    # <class 'property'> (raw descriptor)
print(MyClass.__dict__['x'].fget)     # <function ...> (the getter)
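The standard library offers a shortcut for the same diagnosis: inspect.getattr_static retrieves an attribute without triggering the descriptor protocol, so you see the raw descriptor even through inheritance (Probe is an illustrative name):

```python
import inspect

class Probe:
    x = property(lambda self: 42)

obj = Probe()
print(obj.x)                             # 42 -- the property fires
print(inspect.getattr_static(obj, 'x'))  # the raw property object, no call
```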

Key Takeaways

  1. Descriptors are the engine of Python's attribute access: The __get__, __set__, and __delete__ methods give descriptors the ability to intercept and customize attribute operations. This protocol underpins property, classmethod, staticmethod, method binding, __slots__, cached_property, and framework-level field definitions.
  2. Data descriptors take priority over instance dictionaries: If a descriptor defines __set__ or __delete__, it becomes a data descriptor and cannot be overridden by entries in obj.__dict__. Non-data descriptors (only __get__) yield to instance dictionary entries, which is the mechanism behind cached_property.
  3. Attribute assignment and deletion have their own descriptor protocols: Writing obj.x = value checks for data descriptors before writing to the instance dictionary. The assignment protocol is the hidden counterpart to the more widely discussed lookup protocol, and understanding both is necessary for complete mastery of Python's object model.
  4. __set_name__ eliminates boilerplate: Introduced in Python 3.6 via PEP 487, this hook allows descriptors to automatically learn the name of the class variable they were assigned to, removing the need for custom metaclasses in many validation and ORM patterns. It also works on non-descriptor class attributes.
  5. __slots__ are implemented as data descriptors: Each slot name becomes a member_descriptor that reads and writes to pre-allocated memory offsets, providing faster access and lower memory usage than dictionary-based storage.
  6. Per-instance storage requires careful design: Since descriptors are shared class attributes, instance-specific data should be stored either in the instance's __dict__ (using a mangled name from __set_name__), in a WeakKeyDictionary (for slotted classes), or through direct __dict__ writes to avoid memory leaks and cross-instance contamination.
  7. Descriptors and metaclasses are complementary: Metaclasses control class creation, while descriptors control attribute access. Together they enable the declarative patterns seen in Django models, SQLAlchemy mappings, and Pydantic schemas.

The descriptor protocol is not an exotic corner of the language reserved for framework authors. It is the foundational layer that makes Python's object model work the way it does. Every method call, every property access, every @staticmethod and @classmethod invocation, every __slots__ attribute, and every cached_property lookup passes through this protocol. Once you understand how descriptors resolve attribute lookups — and assignments — the behavior of Python's class system stops being mysterious and starts being predictable. The patterns shown in this article — validated fields, logged access, cached computation, reusable type constraints, slot-aware storage, and recursive-safe naming — are immediately applicable to production codebases, and they will make you a more effective Python engineer.
