Free-Threaded Python: How No-GIL Rewires the CPython Interpreter

For three decades, Python threads have been a polite fiction: you could create as many as you liked, but only one could actually run at a time. That constraint was baked into CPython's Global Interpreter Lock, and it shaped how an entire generation of Python developers wrote concurrent software. Python 3.13, released in October 2024, began to dismantle that constraint with a new optional build that removes the GIL entirely. Python 3.14, released in October 2025, promoted that build from experimental to officially supported. What follows is a precise account of what changed, how CPython's internals were rewired to make it possible, and what the road ahead looks like.

When the Python Steering Council announced in July 2023 that it intended to accept PEP 703 — "Making the Global Interpreter Lock Optional in CPython" — it described the proposal as arguably the most ambitious internal change to CPython in decades. The author, Sam Gross, a software engineer at Meta, had spent years developing a proof-of-concept no-GIL fork. Rather than simply deleting the lock, the proposal reimagined how CPython manages object lifetimes across threads. Understanding what free-threaded Python truly means requires understanding what the GIL was protecting in the first place.

What the GIL Actually Did (And Why Removing It Is Hard)

The Global Interpreter Lock is a mutex — a mutual exclusion lock — that CPython has carried since its earliest releases. Its job was narrow but load-bearing: it guaranteed that only one thread could execute Python bytecode at any given moment. This made CPython's reference counting memory model safe without any per-object locking, because no two threads could simultaneously modify an object's reference count.

Reference counting is how CPython knows when to free an object. Every Python object carries an integer counter tracking how many references point to it. When that counter hits zero, the memory is reclaimed. Without the GIL, two threads incrementing or decrementing the same object's reference count simultaneously could corrupt that count, causing either premature deallocation (a use-after-free crash) or a memory leak. The GIL prevented that race condition entirely.

The cost was parallelism. Even on a machine with dozens of cores, Python threads could not run Python code simultaneously. For I/O-bound work this rarely mattered — threads yield the GIL while waiting for the operating system, so network and disk concurrency worked fine. But for CPU-bound work — matrix operations, text processing, image manipulation, machine learning pipeline orchestration — multiple threads provided essentially no benefit over a single thread.

Note

The GIL is a CPython implementation detail, not a Python language requirement. Jython and PyPy-STM have historically operated without one. But CPython is the reference implementation that runs the overwhelming majority of production Python, which is why the GIL's presence mattered so much in practice.

Previous attempts to remove the GIL confirmed the difficulty. Greg Stein's 1999 effort replaced the lock with fine-grained per-object locking, but the added overhead made single-threaded code significantly slower — an unacceptable trade-off in an era when most machines had a single core. Larry Hastings' "Gilectomy" project, begun around 2016, encountered the same problem: removing the GIL introduced enough overhead in the common single-threaded case that the Python community declined to accept it. Sam Gross's contribution was finding a way to make thread safety cheap enough that single-threaded performance remained competitive.

PEP 703: The Three-Part Memory Management Rewrite

PEP 703 does not simply delete the GIL and hope for the best. It replaces the protection the GIL provided with three interlocking mechanisms: biased reference counting, immortal objects, and deferred reference counting. Together these cover the spectrum of objects a running Python program actually touches, each with a strategy tuned to its access pattern.

Biased Reference Counting. The key insight behind biased reference counting, first described in a 2018 paper by Jiho Choi, Thomas Shull, and Josep Torrellas, is that the vast majority of objects in a multi-threaded Python program are still accessed primarily by the thread that created them. PEP 703 exploits this by giving every object two reference count fields: ob_ref_local for the owning thread, and ob_ref_shared for all other threads. The owning thread can increment and decrement its local count without any atomic operations — the same low-cost path as the original GIL-protected count. Only when another thread needs to modify the count does the slower, atomic shared path kick in. The object's true reference count is the sum of both fields. Each object also gains a new ob_tid field recording which thread owns it, and a single-byte ob_mutex field providing a per-object lock for cases where finer synchronization is needed.

Immortal Objects. Some objects live for the entire lifetime of the interpreter: the built-in constants True, False, and None; small integers; interned strings; statically allocated type objects. Modifying their reference counts from multiple threads simultaneously would create unnecessary contention on these globally shared objects. The solution is to mark them immortal by setting ob_ref_local to UINT32_MAX. The Py_INCREF and Py_DECREF macros become no-ops for immortal objects, eliminating any reference count traffic on them entirely. As of Python 3.14, immortalization covers code constants (numeric literals, string literals, and tuple literals composed of other constants) and strings interned via sys.intern().
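The candidates for immortalization are visible from pure Python. A hedged sketch: the singleton and caching behavior below holds on any CPython, while the fixed reference count on None is observable only on builds with immortal objects (3.12 and later):

```python
import sys

# None, True, and False are singletons shared by every thread in the process.
assert (None is None) and (bool(1) is True)

# Small integers are cached, making them natural immortalization targets.
a, b = 256, 255 + 1
assert a is b

# Interned strings are shared as well; sys.intern returns a canonical object.
s1 = sys.intern("free-threaded")
s2 = sys.intern("free-" + "threaded")
assert s1 is s2

# On CPython 3.12+, immortal objects report a fixed reference count that
# never changes, no matter how many new references are created.
if sys.version_info >= (3, 12):
    before = sys.getrefcount(None)
    refs = [None] * 1000           # a thousand fresh references to None
    assert sys.getrefcount(None) == before

print("immortal-object candidates behave as described")
```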

Deferred Reference Counting. A third category sits between the two: objects like top-level functions, code objects, modules, and methods that are widely shared across threads but are not literally immortal. For these, PEP 703 uses deferred reference counting: the interpreter skips reference count updates on the hottest paths (such as loading these objects onto the evaluation stack) and computes their true reference counts during garbage collection instead, sharply reducing atomic-operation contention on objects that are read constantly but modified rarely.

"Most objects are only accessed by a single thread, even in multi-threaded programs." — PEP 703 (peps.python.org), describing the observation that motivates biased reference counting

Garbage collection required rethinking as well. CPython's cycle collector had always operated under the GIL's protection. The free-threaded build replaces that with two "stop-the-world" passes that pause all threads briefly: the first pass identifies objects eligible for collection, the second handles any finalizers that ran during the first pass. This is conceptually similar to stop-the-world pauses in JVM garbage collectors, though CPython's passes are short because Python's cycle detector handles only cyclic reference structures rather than all allocations.
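What the cycle collector does can be illustrated without any threading: reference counting alone can never reclaim a cycle, so the collector must find it. A minimal sketch using the gc and weakref modules:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.partner = None

gc.disable()                   # keep the automatic collector out of the demo
a, b = Node(), Node()
a.partner, b.partner = b, a    # a reference cycle: a -> b -> a
probe = weakref.ref(a)         # observes a's lifetime without keeping it alive

del a, b
assert probe() is not None     # refcounts never hit zero: the cycle survives

gc.collect()                   # the cycle detector finds and frees the cycle
assert probe() is None
gc.enable()
print("cycle reclaimed by the collector, not by reference counting")
```

It is this traversal of potentially cyclic objects that the free-threaded build now performs under brief stop-the-world pauses instead of under the GIL.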

The free-threaded build also adopts mimalloc, a high-performance memory allocator from Microsoft Research, which provides better per-thread allocation performance and reduces contention on the global heap.

Pro Tip

To check whether the GIL is actually disabled at runtime — not just whether the build supports it — use sys._is_gil_enabled(). A C extension that does not declare free-threading support can silently re-enable the GIL when imported, so this runtime check is more reliable than inspecting the binary name alone.

From Experimental to Officially Supported: 3.13 Through 3.14

The Python Steering Council accepted PEP 703 with a deliberate three-phase roadmap. Phase I, delivered in Python 3.13 (October 2024), made the free-threaded build available as an explicitly experimental, opt-in mode. The build used a separate binary with a "t" suffix — python3.13t on most platforms — and carried clear documentation warnings that it was not production-ready and that the surrounding ecosystem had not yet caught up.

PEP 779 — "Criteria for Supported Status for Free-Threaded Python" — was accepted by the Steering Council in June 2025. It formalized what Phase II would require: a stable API design, successfully used by a meaningful number of third-party packages, with performance overhead not prohibitive for practical use. Those criteria were met, and Python 3.14 (October 2025) entered Phase II: the free-threaded build is no longer considered experimental and is officially supported, though it remains a separate, optional build rather than the default.

Python 3.14 also brought substantive technical improvements to the free-threaded build. The specializing adaptive interpreter introduced in PEP 659 — the mechanism that hot-patches bytecode with type-specialized instructions at runtime — was enabled for the free-threaded build in 3.14 after being disabled in 3.13 due to thread safety concerns. That alone delivered meaningful performance gains for free-threaded code. Temporary workarounds in the interpreter were replaced with permanent solutions, and the C API changes described in PEP 703 were completed.

"The performance penalty on single-threaded code in free-threaded mode is now roughly 5-10%, depending on the platform and C compiler used." — Python 3.14 What's New documentation (docs.python.org)

That 5–10% single-threaded overhead figure represents a significant improvement over the early 3.13 builds, which showed higher penalties before the adaptive interpreter was enabled. The gap is expected to narrow further in Python 3.15 and beyond as the CPython team continues tuning.

Phase III — making the free-threaded build the default, and eventually the only build — has no confirmed timeline. The Steering Council has stated that this decision will depend on community adoption, demonstrated benefit across real workloads, and the maturity of the C extension ecosystem. A future PEP will govern that transition when the time comes.

Installing and Identifying a Free-Threaded Build

Free-threaded Python carries a distinct ABI tag: "t". This distinguishes the binary and its compiled extension wheel files from standard GIL-enabled builds. On Windows, the official Python.org installer places the free-threaded binary at python3.14t.exe while the standard interpreter remains python.exe. On macOS and Linux, the free-threaded executable has a python3.14t alias; whether the unqualified python3 command points to it depends on the installation method.
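The ABI tag is also queryable from a running interpreter. A small sketch using sysconfig (note that ABIFLAGS may be empty or unset on some platforms, such as Windows):

```python
import sysconfig

# The "t" ABI flag marks a free-threaded build of CPython.
abiflags = sysconfig.get_config_var("ABIFLAGS") or ""
kind = "free-threaded" if "t" in abiflags else "standard (GIL-enabled)"
print(f"ABI flags: {abiflags!r} -> {kind} build")
```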

When building CPython from source, the configure option --disable-gil produces a free-threaded build. The resulting interpreter prints free-threading build in the output of python -VV and in sys.version.

```python
# Check whether the GIL is currently enabled at runtime
import sys

if hasattr(sys, '_is_gil_enabled'):
    if sys._is_gil_enabled():
        print("GIL is enabled (standard build or GIL re-enabled by extension)")
    else:
        print("GIL is disabled — free-threaded build running without GIL")
else:
    print("This interpreter does not support free-threading (pre-3.13)")

# Also check the build configuration
import sysconfig
gil_disabled = sysconfig.get_config_var("Py_GIL_DISABLED")
print(f"Build compiled with GIL disabled: {bool(gil_disabled)}")
```

The PYTHON_GIL environment variable and the -X gil command-line flag can override GIL behavior at startup. Setting PYTHON_GIL=0 disables it; setting it to 1 forces it on even in a free-threaded build. This override matters because C extension modules that are not explicitly marked as free-threading-safe will cause the interpreter to automatically re-enable the GIL when they are imported, printing a warning. The override allows developers to suppress that automatic re-enablement for extensions they know are safe to use from a single thread.

Note

There is a known limitation in the free-threaded build: accessing frame.f_locals from a frame object that is currently executing in another thread is not safe and may crash the interpreter. Similarly, sharing a single iterator object across multiple threads concurrently is not thread-safe — threads may observe duplicate or missing elements.
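The safe pattern for the shared-iterator limitation above is to serialize access with a lock. A minimal sketch (the helper name locked_consume is illustrative, not a stdlib API):

```python
import threading

def locked_consume(it, lock, out):
    """Pull items from a shared iterator, holding a lock across each next() call."""
    while True:
        with lock:
            try:
                item = next(it)
            except StopIteration:
                return
        out.append(item)        # per-thread output list: no lock needed here

shared = iter(range(10_000))
lock = threading.Lock()
results = [[] for _ in range(4)]
threads = [threading.Thread(target=locked_consume, args=(shared, lock, results[i]))
           for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()

consumed = sorted(x for chunk in results for x in chunk)
assert consumed == list(range(10_000))   # no duplicates, no missing elements
print("shared iterator consumed safely under a lock")
```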

Performance: What the Numbers Actually Show

Benchmarks for free-threaded Python require careful interpretation because the gains and losses are workload-dependent and the numbers have shifted substantially between 3.13 and 3.14.

For CPU-bound multi-threaded code, the potential is genuine. Meta's internal benchmarks, reported by Sam Gross, have shown near-linear scaling with core count on suitable workloads. A practical illustration from JetBrains' PyCharm blog compared a prime-counting benchmark on an 8-core machine: the standard Python 3.13.5 build running with 4 threads achieved essentially the same time as a single thread (a speedup of 0.98x, reflecting GIL overhead), while the same benchmark on Python 3.13.5t showed a real parallel speedup proportional to the core count. As a rule of thumb, well-written CPU-bound code on an N-core machine can expect a speedup of roughly 0.8 × N, with the shortfall from perfect scaling going to synchronization overhead.
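A sketch of the kind of prime-counting benchmark described above, splitting the search range across a thread pool. The result is identical on any build; the difference is that on the standard build the workers serialize behind the GIL, while on a free-threaded build they can run on separate cores:

```python
from concurrent.futures import ThreadPoolExecutor

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(lo: int, hi: int) -> int:
    return sum(1 for n in range(lo, hi) if is_prime(n))

def parallel_count(limit: int, workers: int = 4) -> int:
    # Split [0, limit) into one contiguous chunk per worker thread.
    step = limit // workers
    bounds = [(i * step, limit if i == workers - 1 else (i + 1) * step)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda b: count_primes(*b), bounds))

print(parallel_count(100))   # 25 primes below 100
```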

The StaticFrame library, which provides immutable DataFrame structures, demonstrated row-wise function application dropping from 21.3 ms on the standard 3.13 build to 7.89 ms on 3.13t with four threads — a reduction of over 60%. At larger scales (100 million elements), the improvement exceeded 70% in multiple DataFrame configurations.

| Workload Type | Standard Build (GIL on) | Free-Threaded Build (GIL off) |
| --- | --- | --- |
| CPU-bound, 4 threads | ~1.0x (no speedup over single thread) | ~3.2x (near-linear multi-core scaling) |
| I/O-bound (network, disk) | Effective concurrency already works | No change; I/O yields the GIL anyway |
| Single-threaded (Python 3.14) | Baseline (1.0x) | 5–10% overhead vs. GIL build |
| Memory-bandwidth-bound | GIL limits thread concurrency | Hardware bandwidth becomes the ceiling |
| DataFrame row-wise apply (1M elements) | 21.3 ms (StaticFrame benchmark) | 7.89 ms with 4 threads (StaticFrame 3.2) |

The single-threaded overhead deserves emphasis because it affects all code running under the free-threaded build, not just code that uses threads. The 5–10% figure from Python 3.14's official documentation is a significant improvement over early 3.13t builds, where the adaptive interpreter was disabled and overheads were correspondingly higher. Memory overhead is also real: per-object atomic bookkeeping increases memory traffic, and on workloads that are already bandwidth-bound rather than compute-bound, adding more cores may produce diminishing returns rather than linear speedup.

Ecosystem Compatibility and the Road Ahead

The free-threaded build's practical usefulness depends heavily on whether the libraries a given project actually uses support it. A C extension module that does not declare free-threading support causes the interpreter to re-enable the GIL automatically when imported — a safety measure, but one that silently removes the entire benefit if even one critical dependency lacks support.

Progress has been steady. NumPy 2.3.0, released in June 2025, shipped improved compatibility with the free-threaded interpreter, and scikit-learn added support around the same period. As of early 2026, packages with confirmed free-threading support include NumPy (2.1+), pydantic (2.7+), cryptography (42+), aiohttp (3.10+), and httpx (0.27+); pandas and matplotlib remain in progress. Compatibility is tracked in a live table at py-free-threading.github.io, a community-maintained site operated primarily by Quansight Labs in collaboration with Meta's Python runtime team and updated by package maintainers.

[Figure: PEP 703 roadmap — free-threaded Python adoption phases. Phase I: Python 3.13, experimental, optional build (Oct 2024). Phase II: Python 3.14, officially supported, still optional (Oct 2025). Phase III: future version, default build, no timeline set. Sources: peps.python.org/pep-0703, peps.python.org/pep-0779]
PEP 703 three-phase roadmap as specified by the Python Steering Council. Phase II was confirmed by PEP 779 (accepted June 2025) and delivered in Python 3.14.

For developers writing pure Python code, the situation is more favorable. The free-threaded build guarantees that pure Python code is thread-safe at the same level it was under the GIL — the interpreter will not crash from concurrent pure Python execution. What it does not guarantee is that your application logic is race-condition-free: if your code uses global state for configuration, implements a cache using a plain dict accessed from multiple threads without locking, or relies on operations you assumed to be atomic, those assumptions require review.
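The cache example in the paragraph above can be made safe with a lock around the check-then-set sequence, which is a compound operation and not atomic even under the GIL. A minimal sketch (the LockedCache class is illustrative, not a stdlib API):

```python
import threading

class LockedCache:
    """A dict-backed cache whose compound check-then-set is guarded by a lock."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self.computations = 0            # counts how many times compute actually ran

    def get_or_compute(self, key, compute):
        with self._lock:                 # the read and the write happen atomically
            if key not in self._data:
                self._data[key] = compute(key)
                self.computations += 1
            return self._data[key]

cache = LockedCache()
threads = [threading.Thread(target=lambda: cache.get_or_compute("k", lambda k: k * 2))
           for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()

assert cache.get_or_compute("k", lambda k: k * 2) == "kk"
assert cache.computations == 1           # compute ran exactly once despite 8 threads
print("check-then-set race eliminated by the lock")
```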

The threading module, concurrent.futures.ThreadPoolExecutor, and asyncio all work with the free-threaded build. The difference with ThreadPoolExecutor is that CPU-bound tasks submitted to the pool now run in parallel across cores, rather than serializing behind the GIL. For web frameworks like FastAPI that already use thread pools for synchronous route handlers, the practical change is that CPU-intensive handlers that previously serialized due to the GIL can now run concurrently. The magnitude of the benefit depends heavily on the workload's compute-to-I/O ratio and the number of available cores.

The Quansight Labs and Meta collaboration that drives py-free-threading.github.io presented at PyCon US 2025 in Pittsburgh, where the overarching theme of the Python Language Summit was concurrency. Sessions covered the current state of free-threaded Python, subinterpreter APIs, and early discussions of Java-style virtual threads as a potential future direction.

PEP 803: The Stable ABI Problem and the abi3t Solution

One of the most concrete obstacles slowing ecosystem adoption is a packaging problem: extension wheels compiled for the standard GIL-enabled build cannot be loaded by the free-threaded interpreter, and vice versa. This means every library that ships compiled C extensions currently needs to build and publish two separate sets of wheels — one for the standard build and one for the free-threaded build, identified by the "t" ABI tag. For large, widely-distributed packages like NumPy, SciPy, and scikit-learn, that doubles the CI matrix and the PyPI storage burden.

Python's Stable ABI (known as abi3, defined in PEP 384) already provides a partial solution for standard builds: an extension compiled against the abi3 interface is compatible with all subsequent CPython minor versions. But as of Python 3.14, abi3 is not available to free-threaded builds. Extensions fail to compile when both Py_LIMITED_API and Py_GIL_DISABLED are defined simultaneously.

PEP 803, authored by Petr Viktorin, proposes abi3t — a Stable ABI for free-threaded builds — targeted at Python 3.15. The PEP went through two public revision rounds, with a second iteration posted in February 2026. The core mechanism involves making PyObject and its substructures opaque in the Limited API: extensions can no longer directly access struct fields like ob_type or assume the size of a PyObject. Instead, they call accessor functions. This trade-off — giving up direct struct access in exchange for multi-version and multi-build compatibility — is the same bargain that abi3 already asks of GIL-enabled extensions, extended now to cover free-threaded builds as well.

In its acceptance post for PEP 779, the Python Steering Council stated explicitly that it expects a Stable ABI for free-threading to be prepared and defined for Python 3.15. Early testing of the abi3t mechanism suggests that the binding generators PyO3, CFFI, and Cython will all work with it, which covers a large fraction of the extension ecosystem. When abi3t ships, a single compiled wheel tagged abi3.abi3t will load on both GIL-enabled and free-threaded builds of CPython 3.15 and later — a significant packaging simplification that should accelerate library adoption.

Extension Authors: What This Means Now

On Python 3.14 and earlier, free-threaded wheels require a separate build. Your module definition must set the Py_mod_gil slot to Py_MOD_GIL_NOT_USED to signal to CPython that the extension is thread-safe; without it, the interpreter re-enables the GIL on import. From Python 3.15 forward, PEP 803's abi3t aims to let a single wheel work across both build types, provided the extension uses the Limited API.

Key Takeaways

  1. The GIL is optional, not gone: Free-threaded Python is a separate CPython build, identified by the "t" ABI suffix. The standard GIL-enabled build remains the default and will continue to be until Phase III of PEP 703's roadmap is confirmed by a future PEP.
  2. Memory management was fundamentally rewritten: PEP 703 replaces GIL-protected reference counting with biased reference counting (cheap for single-owner objects, first described by Choi, Shull, and Torrellas at PACT 2018), immortal objects (no-op increments/decrements for constants and singletons), and deferred reference counting (batched updates for widely-shared objects like modules and functions).
  3. Python 3.14 crossed the experimental threshold: With PEP 779 accepted in June 2025, the free-threaded build became officially supported in Python 3.14. The specializing adaptive interpreter is now enabled in free-threaded mode, cutting single-threaded overhead to roughly 5–10% according to the official Python 3.14 documentation.
  4. Benefits are workload-dependent: CPU-bound multi-threaded code can achieve near-linear multi-core scaling. I/O-bound code sees no benefit. All code running under the free-threaded build pays the 5–10% single-threaded overhead, so the trade-off only makes sense for programs that can exploit true parallelism.
  5. Ecosystem compatibility is advancing but incomplete: NumPy, pydantic, cryptography, aiohttp, and httpx have confirmed free-threading support. Pandas and matplotlib have active tracking issues. Any C extension that has not declared free-threading support via Py_MOD_GIL_NOT_USED automatically re-enables the GIL on import.
  6. PEP 803 targets the packaging bottleneck: A proposed abi3t Stable ABI for Python 3.15 would allow a single compiled wheel to load on both GIL-enabled and free-threaded interpreters, removing the doubled build matrix that currently slows library adoption.

Free-threaded Python is not a drop-in replacement for everything that came before. It is a structural change to how CPython manages concurrent execution, one that took years of research into memory management to make viable. For the workloads that can use it — CPU-bound threading, parallel data processing, AI pipeline orchestration — the performance ceiling has moved substantially. For everything else, the standard build remains the right tool. The trajectory is clear: the free-threaded build is now officially supported, the ecosystem is catching up, the packaging story is being resolved in 3.15, and the question of when it becomes the default is a matter of community adoption rather than technical readiness.

Frequently Asked Questions

What is free-threaded Python?
Free-threaded Python is an optional CPython build that removes the Global Interpreter Lock, allowing multiple threads to execute Python bytecode simultaneously on separate CPU cores. It is identified by the "t" ABI suffix — python3.14t — and became officially supported starting in Python 3.14.
Is free-threaded Python the default in Python 3.14?
No. In Python 3.14 (Phase II of PEP 703), the free-threaded build is officially supported but remains a separate, optional build. The standard GIL-enabled build is still the default. A future PEP will govern the Phase III transition, which would make the free-threaded build the default. No timeline for Phase III has been confirmed.
What is biased reference counting and why does it matter?
Biased reference counting, first described by Jiho Choi, Thomas Shull, and Josep Torrellas in their PACT 2018 paper, splits each object's reference count into a local field for the owning thread and a shared field for all other threads. The owning thread updates its local count without any atomic instructions — the same low-cost path as the original GIL-protected count. Only when another thread modifies the count does the slower atomic path activate. This is the mechanism that makes single-threaded overhead competitive without the GIL.
What is the performance overhead of running under the free-threaded build?
According to the official Python 3.14 What's New documentation, single-threaded code in free-threaded mode carries roughly 5–10% overhead compared to the GIL-enabled build, depending on platform and C compiler. This is a substantial improvement over early Python 3.13t builds, where the specializing adaptive interpreter was disabled and overheads were higher.
How do I check at runtime whether the GIL is actually disabled?
Use sys._is_gil_enabled() in Python 3.13 and later. It returns False if the GIL is disabled and True if it is active. Importing a C extension that has not declared free-threading support via Py_MOD_GIL_NOT_USED can silently re-enable the GIL, so this runtime check is more reliable than inspecting the binary name alone.
What happens when a C extension without free-threading support is imported?
CPython automatically re-enables the GIL when it encounters a C extension that has not been marked with Py_MOD_GIL_NOT_USED. This is a safety measure to prevent memory corruption from thread-unsafe C code, but it silently removes the parallelism benefit for the entire process as long as that extension is loaded. The sys._is_gil_enabled() function will confirm the GIL was re-enabled.
What is PEP 803 and why does it matter for free-threaded Python?
PEP 803 proposes an abi3t Stable ABI for free-threaded CPython, targeted at Python 3.15. Currently, extension authors must publish separate compiled wheels for the free-threaded build (tagged "t") and the standard GIL-enabled build. PEP 803 aims to allow a single wheel tagged abi3.abi3t to load on both interpreter types, reducing the packaging burden that is currently slowing library adoption. The Python Steering Council stated in its PEP 779 acceptance post that it expects a stable ABI for free-threading to be ready for Python 3.15.
Does free-threaded Python replace multiprocessing?
No, and the two approaches serve different design goals. Free-threaded Python uses threads that share memory within a single process, which is efficient for data-sharing workloads but requires careful attention to race conditions. Multiprocessing uses separate OS processes with no shared memory by default, which avoids data-race risk but carries process-spawn overhead and requires explicit data serialization. Python 3.14 also introduced concurrent.interpreters, a third model using isolated subinterpreters within the same process, which provides parallelism with opt-in sharing rather than the thread model's default-shared memory.

Sources: PEP 703 — Making the Global Interpreter Lock Optional in CPython | PEP 779 — Criteria for Supported Status for Free-Threaded Python | PEP 803 — abi3t: Stable ABI for Free-Threaded Builds | Python 3.14 Free-Threading HOWTO (docs.python.org) | What's New in Python 3.14 (docs.python.org) | py-free-threading.github.io (Quansight Labs) | Choi, Shull, Torrellas — Biased Reference Counting (PACT 2018, ACM) | StaticFrame benchmark — Liberating Performance with Immutable DataFrames (Towards Data Science)