Python Decorator for Timing Function Execution

Knowing how long a function takes to execute is fundamental to performance work. Python's time module provides the clocks, and decorators provide the mechanism to attach timing logic to any function without modifying its source code. This article builds a timing decorator from the simplest possible version through progressively more capable implementations, covering clock selection, functools.wraps, logging integration, threshold-based reporting, cumulative statistics, async function support, and dual-use as a context manager.

How to Build It: Step by Step

Step 1: Choose time.perf_counter() as your clock -- it is monotonic, high-resolution, and includes sleep and I/O time. Step 2: Build a minimal decorator with @functools.wraps to preserve function metadata. Step 3: Add configurable options using the func=None pattern for threshold filtering and logging. Step 4: Collect cumulative statistics across calls with a report() method. Step 5: Detect async functions with inspect.iscoroutinefunction and provide an async wrapper. Step 6: Combine decorator and context manager in a single Timer class. Step 7: Add exception safety with try/finally so timing is recorded even when functions raise.

A timing decorator wraps a function with logic that records a timestamp before the function runs, records another timestamp after it returns, and reports the difference. The concept is simple, but the implementation details matter: which clock to use, how to preserve the original function's metadata, how to handle functions that accept arguments, and how to route the timing output to the right destination.

Choosing the Right Clock

Python's time module provides several clock functions. Choosing the wrong one produces unreliable measurements. Here is what each clock measures and when to use it:

time.perf_counter()
    Measures:   Elapsed wall-clock time, highest resolution
    Monotonic:  Yes
    Use case:   Benchmarking function execution

time.perf_counter_ns()
    Measures:   Elapsed wall-clock time in nanoseconds (int)
    Monotonic:  Yes
    Use case:   Sub-microsecond precision without float loss

time.monotonic()
    Measures:   Elapsed wall-clock time, lower resolution
    Monotonic:  Yes
    Use case:   Timeouts and scheduling

time.process_time()
    Measures:   CPU time only (excludes sleep and I/O)
    Monotonic:  Yes
    Use case:   Measuring CPU-bound computation

time.time()
    Measures:   System wall clock (epoch-based)
    Monotonic:  No
    Use case:   Timestamps, not benchmarking

For timing function execution, time.perf_counter() is the correct default. It uses the highest-resolution clock available on the platform, it is monotonic (it never goes backward), and it includes time spent sleeping or waiting on I/O, which reflects the real wall-clock duration a caller experiences.

import time

# Inspect the properties of each clock
for clock_name in ["perf_counter", "monotonic", "process_time", "time"]:
    info = time.get_clock_info(clock_name)
    print(f"{clock_name:.<20} resolution={info.resolution}  "
          f"monotonic={info.monotonic}  adjustable={info.adjustable}")
Warning

Do not use time.time() for benchmarking. It uses the system wall clock, which is adjustable. If the system clock is corrected by NTP or changed for daylight saving during your measurement, you can get negative elapsed times or wildly inaccurate results.

The Minimal Timing Decorator

The simplest version captures a start time, calls the function, captures an end time, prints the difference, and returns the function's result:

import functools
import time

def timer(func):
    """Print the execution time of the decorated function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f}s")
        return result
    return wrapper

@timer
def sort_data(data):
    """Sort a list using Python's built-in sort."""
    return sorted(data)

import random
data = [random.randint(0, 1_000_000) for _ in range(500_000)]
sorted_data = sort_data(data)
# sort_data took 0.218347s

Three details matter in this implementation. First, @functools.wraps(func) copies the original function's __name__, __doc__, __module__, and __qualname__ onto the wrapper. Without it, sort_data.__name__ would return "wrapper" and help(sort_data) would show the wrapper's docstring instead of the original's. Second, *args and **kwargs ensure the wrapper accepts any combination of positional and keyword arguments, making the decorator work with any function signature. Third, the wrapper returns result, preserving the original function's return value.
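A quick sketch makes the metadata difference visible. The two decorators below are identical except for @functools.wraps (the greet functions are hypothetical stand-ins):

```python
import functools

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @functools.wraps(func)  # copies __name__, __doc__, __module__, __qualname__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def greet():
    """Say hello."""
    return "hello"

@with_wraps
def greet2():
    """Say hello."""
    return "hello"

print(greet.__name__)   # wrapper -- original metadata lost
print(greet2.__name__)  # greet2  -- metadata preserved
print(greet2.__doc__)   # Say hello.
```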

Pro Tip

If you need nanosecond precision (for timing very fast operations), use time.perf_counter_ns() instead. It returns an integer in nanoseconds, avoiding the floating-point precision loss that occurs with perf_counter() at very small time scales.
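A minimal sketch of the nanosecond variant -- the arithmetic stays in exact integers until the final display conversion:

```python
import time

start = time.perf_counter_ns()
total = sum(range(1000))
elapsed_ns = time.perf_counter_ns() - start

# elapsed_ns is an exact integer nanosecond count -- no float rounding
print(f"sum(range(1000)) took {elapsed_ns} ns ({elapsed_ns / 1e9:.9f} s)")
```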

Adding Configurable Options

A production timing decorator benefits from configurable output, threshold filtering, and logging integration. This version supports all three calling conventions (@timer, @timer(), and @timer(threshold=0.5)):

import functools
import logging
import time

logger = logging.getLogger(__name__)

def timer(func=None, *, threshold=0.0, level=logging.DEBUG, label=None):
    """Time function execution and log the result.

    Args:
        func: The function to decorate (auto-filled when used without parens).
        threshold: Minimum seconds before the timing is reported.
        level: Logging level for the timing message.
        label: Custom label; defaults to the function's qualified name.
    """
    if func is None:
        return functools.partial(
            timer, threshold=threshold, level=level, label=label
        )

    display_name = label or func.__qualname__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        if elapsed >= threshold:
            logger.log(level, "%s completed in %.6fs", display_name, elapsed)
        return result
    return wrapper

This version uses functools.partial for the optional-parentheses pattern. When func is None (the decorator was called with arguments), it returns a partial application of itself with the configuration pre-filled. When func is provided (the decorator was used bare), it wraps immediately. This same pattern appears in standard library decorators like @dataclass, which accepts optional configuration arguments.

Here is how each calling form works:

logging.basicConfig(level=logging.DEBUG)

# Bare: reports all calls
@timer
def fast_function():
    return sum(range(1000))

# With threshold: only reports if execution exceeds 0.1 seconds
@timer(threshold=0.1)
def sometimes_slow(n):
    return sum(range(n))

# With custom label and log level
@timer(label="DB Fetch", level=logging.WARNING, threshold=0.5)
def fetch_records(query):
    time.sleep(0.6)  # simulated latency
    return [{"id": 1}]

fast_function()       # DEBUG: fast_function completed in 0.000023s
sometimes_slow(100)   # (silent -- under threshold)
sometimes_slow(10_000_000)  # DEBUG: sometimes_slow completed in 0.287154s
fetch_records("SELECT *")   # WARNING: DB Fetch completed in 0.601s

The threshold parameter is particularly useful in production. Instead of flooding logs with timing data for every call, you configure it to report only when a function exceeds an acceptable response time. The logging integration routes output through Python's standard logging framework, which means it respects your application's log level configuration, handlers, and formatters.

Collecting Cumulative Statistics

For profiling across many calls, a timing decorator that accumulates statistics is more useful than one that logs each call individually. This version records every call's duration and derives the count, total, mean, median, minimum, maximum, and standard deviation through a report() method:

import functools
import time
import statistics

def profiled(func):
    """Collect execution time statistics across all calls."""
    timings = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        timings.append(elapsed)
        return result

    def report():
        if not timings:
            return "No calls recorded."
        lines = [
            f"{func.__name__} stats over {len(timings)} calls:",
            f"  total   = {sum(timings):.6f}s",
            f"  mean    = {statistics.mean(timings):.6f}s",
            f"  median  = {statistics.median(timings):.6f}s",
            f"  min     = {min(timings):.6f}s",
            f"  max     = {max(timings):.6f}s",
        ]
        # stdev is only defined for two or more data points
        if len(timings) > 1:
            lines.append(f"  stdev   = {statistics.stdev(timings):.6f}s")
        return "\n".join(lines)

    def reset():
        timings.clear()

    wrapper.timings = timings
    wrapper.report = report
    wrapper.reset = reset
    return wrapper

@profiled
def process_batch(items):
    return [item ** 2 for item in items]

import random
for _ in range(100):
    batch = [random.randint(1, 1000) for _ in range(random.randint(100, 50000))]
    process_batch(batch)

print(process_batch.report())

The timings list, report(), and reset() are attached directly to the wrapper function as attributes. Because functions are objects in Python, you can add arbitrary attributes to them. The caller accesses the statistics through process_batch.report() and can clear the accumulated data with process_batch.reset().
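The same attach-state trick works for any per-function bookkeeping. A minimal sketch, using a hypothetical counted decorator that stores its counter directly on the wrapper:

```python
import functools

def counted(func):
    """Count how many times the decorated function is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1  # state lives on the wrapper function object
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@counted
def ping():
    return "pong"

ping(); ping(); ping()
print(ping.calls)  # 3
```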

Supporting Async Functions

A standard timing decorator wraps a synchronous function. If you apply it to an async def function, the wrapper will time how long it takes to create the coroutine object (nearly zero), not how long the coroutine takes to execute. The fix is to detect async functions and define an async wrapper:

import asyncio
import functools
import inspect
import time

def timer(func=None, *, threshold=0.0):
    if func is None:
        return functools.partial(timer, threshold=threshold)

    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = await func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if elapsed >= threshold:
                print(f"{func.__name__} took {elapsed:.6f}s")
            return result
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        if elapsed >= threshold:
            print(f"{func.__name__} took {elapsed:.6f}s")
        return result
    return sync_wrapper

# Works with sync functions
@timer
def compute(n):
    return sum(range(n))

# Works with async functions
@timer(threshold=0.1)
async def fetch_data(url):
    await asyncio.sleep(0.3)  # simulated network delay
    return {"url": url, "status": 200}

compute(1_000_000)
asyncio.run(fetch_data("https://api.example.com"))
# compute took 0.021842s
# fetch_data took 0.300512s

inspect.iscoroutinefunction(func) returns True when the decorated function was defined with async def. In that case, the decorator returns an async def wrapper that uses await to call the original function. The timing captures the full duration of the coroutine's execution, including any await points inside it. Note that asyncio.iscoroutinefunction was deprecated in Python 3.14 and is scheduled for removal in 3.16; inspect.iscoroutinefunction is the correct replacement and has been available since Python 3.5.
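A short sketch (with hypothetical fetch and compute functions) shows both the detection and why naive timing of an async function measures nothing useful:

```python
import inspect

async def fetch():
    return 42

def compute():
    return 42

print(inspect.iscoroutinefunction(fetch))    # True
print(inspect.iscoroutinefunction(compute))  # False

# Calling an async def function only creates a coroutine object;
# none of its body runs until the coroutine is awaited
coro = fetch()
print(type(coro).__name__)  # coroutine
coro.close()  # suppress the "never awaited" RuntimeWarning
```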

Dual-Use: Decorator and Context Manager

Sometimes you need to time an arbitrary block of code, not just a function call. By combining the decorator with a context manager, you get a single tool that works in both scenarios:

import functools
import time

class Timer:
    """A timing tool that works as both a decorator and a context manager."""

    def __init__(self, label=None, threshold=0.0):
        self.label = label
        self.threshold = threshold
        self.elapsed = None

    def __call__(self, func):
        """Use as a decorator: @Timer() or @Timer(label='db')"""
        display = self.label or func.__qualname__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if elapsed >= self.threshold:
                print(f"[{display}] {elapsed:.6f}s")
            return result
        return wrapper

    def __enter__(self):
        """Use as a context manager: with Timer('block'):"""
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.perf_counter() - self._start
        display = self.label or "block"
        if self.elapsed >= self.threshold:
            print(f"[{display}] {self.elapsed:.6f}s")
        return False

# As a decorator
@Timer(label="sorting")
def sort_numbers(data):
    return sorted(data)

# As a context manager
with Timer("data processing", threshold=0.01) as t:
    data = [i ** 2 for i in range(500_000)]
    total = sum(data)

print(f"Elapsed time accessible after: {t.elapsed:.6f}s")

sort_numbers(list(range(1_000_000, 0, -1)))

The context manager form stores the elapsed time in self.elapsed, making it available after the with block completes. The decorator form prints the timing and returns the function's result transparently. Both forms respect the threshold parameter.

Note

The codetiming package on PyPI implements a production-grade version of this dual-use pattern with additional features like named timer aggregation and customizable output callbacks. If you need a ready-made solution rather than building your own, it is worth evaluating.

Exception Safety and Edge Cases

Every decorator shown so far has the same blind spot: if the decorated function raises an exception, the timing is never recorded. The elapsed calculation and the print/log call both sit after result = func(*args, **kwargs), so an exception skips them entirely. In a production system, the calls that fail are often the ones you want timing data for -- a database query that times out or an API call that raises after 30 seconds of waiting.

The fix is try/finally:

import functools
import logging
import time

logger = logging.getLogger(__name__)

def timer(func=None, *, threshold=0.0, level=logging.DEBUG):
    if func is None:
        return functools.partial(timer, threshold=threshold, level=level)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        exc_raised = None
        try:
            result = func(*args, **kwargs)
            return result
        except Exception as e:
            exc_raised = e
            raise
        finally:
            elapsed = time.perf_counter() - start
            status = "raised" if exc_raised else "completed"
            if elapsed >= threshold:
                logger.log(level, "%s %s in %.6fs", func.__qualname__, status, elapsed)
    return wrapper

@timer
def fragile_operation():
    time.sleep(0.5)
    raise ConnectionError("server unreachable")

try:
    fragile_operation()
except ConnectionError:
    pass
# DEBUG: fragile_operation raised in 0.500312s

The finally block runs whether the function returns normally or raises. The decorator captures the exception in exc_raised to include the failure status in the log message, then re-raises it with raise so the caller still sees the original exception with its full traceback. This pattern is essential for production timing decorators because the functions that fail tend to be the slowest -- and those are exactly the calls you need visibility into.

Thread Safety

If the decorated function is called concurrently from multiple threads, the profiled decorator's shared timings list is not thread-safe. While CPython's GIL makes list.append atomic for simple cases, you should not rely on this -- use threading.Lock around shared state, or collect timings in thread-local storage. For async concurrency (a single event loop), the single-threaded nature of asyncio means shared lists are safe without locking.
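One way to harden the profiled decorator, sketched here with an explicit threading.Lock and a hypothetical snapshot() accessor (not part of the earlier version):

```python
import functools
import threading
import time

def profiled_threadsafe(func):
    """Like profiled, but safe for concurrent calls from multiple threads."""
    timings = []
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            with lock:  # serialize mutation of the shared list
                timings.append(elapsed)

    def snapshot():
        with lock:  # copy under the lock so readers see a consistent view
            return list(timings)

    wrapper.snapshot = snapshot
    return wrapper

@profiled_threadsafe
def crunch(n):
    return sum(range(n))

threads = [threading.Thread(target=crunch, args=(50_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(crunch.snapshot()))  # 8 timings, one per thread
```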

Decorator Stacking Order

When multiple decorators are applied to a single function, the order determines what gets timed. Decorators apply bottom-up, so the innermost decorator wraps the function first:

# @timer wraps the result of @lru_cache wrapping fetch_user
# Timing includes cache lookup overhead
@timer
@functools.lru_cache(maxsize=128)
def fetch_user(user_id):
    time.sleep(0.3)  # simulated DB call
    return {"id": user_id, "name": "Alice"}

# First call: fetch_user completed in 0.300421s (cache miss)
# Second call: fetch_user completed in 0.000002s (cache hit)

If you reverse the order and place @timer below @lru_cache, the cache wraps the timer instead. A cache hit then returns the stored result before the timer wrapper ever runs, so hits produce no timing output at all -- you only measure cache misses, and you lose visibility into the latency callers actually experience. When stacking, place @timer on the outside so it measures the complete call path including any middleware, caching, or retry logic applied by inner decorators.
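A short sketch of the reversed order (square is a hypothetical stand-in for any cacheable function) -- the cache intercepts repeat calls before the timer runs:

```python
import functools
import time

def timer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.6f}s")
        return result
    return wrapper

# Reversed order: the cache wraps the timer
@functools.lru_cache(maxsize=128)
@timer
def square(n):
    return n * n

square(3)  # prints a timing line -- the cache miss reaches the timer wrapper
square(3)  # prints nothing -- the cache hit returns before the timer runs
```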

Spot the Bug

This timing decorator has a subtle but critical bug. Can you find it?

import functools
import time

def timer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        elapsed = time.time() - start
        print(f"{func.__name__} took {elapsed:.6f}s")
        return result
    return wrapper
Check Your Understanding
1. What does @functools.wraps(func) do inside a decorator?
2. You apply a sync timing decorator to an async def function. What gets timed?
3. Why does the exception-safe timing decorator use try/finally instead of just try/except?

Key Takeaways

  1. Always use time.perf_counter() for benchmarking. It is monotonic, not adjustable, and provides the highest resolution available. Use perf_counter_ns() when you need nanosecond precision as an integer. Reserve process_time() for measuring CPU-bound work that should exclude sleep and I/O.
  2. Always apply @functools.wraps(func) to the wrapper. Without it, the decorated function loses its original name, docstring, module, and signature. This breaks introspection tools, documentation generators, and any code that relies on __name__ or __doc__.
  3. Use the func=None pattern for optional-parentheses support. This lets your decorator work as @timer, @timer(), and @timer(threshold=0.5) without the caller needing to remember which form is correct.
  4. Route output through logging in production. Printing to stdout is fine during development, but production code should use logging so timing output respects the application's log level, handlers, and formatters. Combine this with a threshold to log only slow calls.
  5. Handle async functions explicitly. A synchronous wrapper applied to an async def function will time coroutine creation (near-zero), not coroutine execution. Check inspect.iscoroutinefunction(func) and provide an async wrapper that uses await.
  6. Use try/finally for exception safety. Without it, a function that raises an exception produces no timing data. The finally block guarantees elapsed time is always recorded -- and in production, the calls that fail are often the slowest.
  7. Mind your stacking order. Place @timer as the outermost decorator so it measures the full call path including caching, retries, and other middleware applied by inner decorators.

A timing decorator is one of the first tools to reach for when investigating performance. Starting from the minimal version and evolving it to match your needs -- adding thresholds, logging, statistics, async support, or context manager capability -- lets you measure what matters without cluttering the functions you are measuring.

Frequently Asked Questions

Why should I use time.perf_counter() instead of time.time() in a timing decorator?

time.perf_counter() uses a monotonic clock with the highest available resolution, meaning it never goes backward and is not affected by system clock adjustments (such as NTP syncs or daylight saving changes). time.time() uses the system wall clock, which is adjustable and can produce negative elapsed times if the clock is set backward between measurements. For benchmarking function execution, perf_counter is the correct choice.

How do I build a basic timing decorator in Python?

Define a function that takes a function as its argument. Inside it, define a wrapper that calls time.perf_counter() before and after calling the original function, computes the difference, and prints or logs the elapsed time. Use @functools.wraps on the wrapper to preserve the original function's name, docstring, and signature. Return the wrapper.

What is the difference between perf_counter and process_time for timing?

perf_counter measures wall-clock time (real elapsed time including sleep and I/O waits). process_time measures only CPU time consumed by the current process, excluding sleep. Use perf_counter when you want to know how long a user waits for a function to complete. Use process_time when you want to measure how much CPU work the function performed.
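The distinction can be observed directly with a sleep, which costs wall-clock time but almost no CPU time -- a minimal sketch:

```python
import time

start_wall = time.perf_counter()
start_cpu = time.process_time()

time.sleep(0.2)          # consumes wall-clock time, almost no CPU time
_ = sum(range(100_000))  # a little CPU-bound work

wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu

print(f"wall-clock: {wall:.3f}s")  # at least 0.2s
print(f"cpu time:   {cpu:.3f}s")   # much smaller -- the sleep is excluded
```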

Can a timing decorator work with async functions?

A standard timing decorator will not work correctly with async functions because it wraps a coroutine rather than awaiting it. You need to detect whether the decorated function is a coroutine function (using inspect.iscoroutinefunction) and define an async wrapper that uses await to call the original function. The timing logic around the await captures the elapsed wall-clock time. Note that asyncio.iscoroutinefunction was deprecated in Python 3.14; inspect.iscoroutinefunction is the correct replacement.

How do I make the timing decorator only report slow functions?

Add a threshold parameter to the decorator. After measuring the elapsed time, compare it against the threshold before printing or logging. If the elapsed time is below the threshold, the decorator stays silent. This is useful in production where you want to be alerted only when a function exceeds an acceptable response time.

What happens if the decorated function raises an exception?

In a basic timing decorator without exception handling, the timing logic is skipped entirely because the exception jumps past the elapsed-time calculation. To fix this, wrap the function call in try/finally so the finally block always records and logs the elapsed time, regardless of whether the function returns normally or raises. This is especially important in production because failing calls (timeouts, connection errors) are often the slowest and are exactly the calls you need visibility into.

Is the profiled timing decorator thread-safe?

The simple profiled decorator shown with a shared timings list is not guaranteed to be thread-safe under concurrent calls. While CPython's GIL makes list.append atomic for simple cases, relying on this is an implementation detail. For production thread-safe timing, protect the shared list with threading.Lock, or collect timings in thread-local storage. In async code running on a single event loop, a shared list is safe without locking.