Every time you read a file in binary mode, download an image over HTTP, parse a network protocol, or work with a cryptographic hash, you are working with byte streams. Python's handling of binary data is one of the most important -- and frequently misunderstood -- aspects of the language.
The confusion around byte streams runs deep because Python 2 conflated text and binary data into one str type, and the clean separation that Python 3 introduced required developers to fundamentally rethink how they work with raw data. This article covers exactly how byte streams work in Python, from the foundational types (bytes, bytearray, memoryview) through the io module's stream abstractions, to the buffer protocol that makes zero-copy data sharing possible. Real code, real examples, real understanding of what is happening under the hood.
What Is a Byte Stream?
A byte stream is a sequence of bytes -- integers in the range 0 to 255 -- that flows from a source to a destination. The source might be a file on disk, a network socket, a chunk of memory, or a hardware device. The destination might be any of those same things. The key idea is that the data has no inherent meaning until you assign it one. The same byte stream could represent UTF-8 text, a JPEG image, a serialized Python object, or the firmware for a microcontroller. The interpretation is up to your code.
In Python, the fundamental unit of binary data is the bytes object. If you have worked with strings, the mental model is similar -- a bytes object is an immutable sequence -- except each element is an integer from 0 to 255 rather than a Unicode character.
# A bytes literal -- note the b prefix
data = b"Hello"
print(type(data)) # <class 'bytes'>
print(len(data)) # 5
print(data[0]) # 72 (the integer value of ASCII 'H')
print(data[1:3]) # b'el'
print(list(data)) # [72, 101, 108, 108, 111]
Indexing a bytes object returns an integer, not a single-byte bytes object. Slicing returns bytes. This is by design and reflects the fact that individual bytes are fundamentally numbers.
The Python 3 Split: How We Got Here
Understanding byte streams in Python requires a brief detour through history, because the current design was a deliberate and sometimes contentious choice.
In Python 2, the str type was a sequence of raw bytes that happened to also be used for text. The unicode type handled actual Unicode text, but many programs never touched it. This meant you could freely mix binary data and text, which was convenient right up until it caused encoding errors that were extraordinarily painful to debug.
Python 3 drew a hard line. Guido van Rossum, in his June 2007 status update on the Python 3000 project, described the change as switching to a model where immutable text strings are Unicode and binary data is represented by a separate mutable bytes data type. The initial Python 3.0 alpha shipped with a mutable bytes type. But as van Rossum recounted in PEP 3137 (September 2007), pressure mounted to add a way to represent immutable bytes. Jeffrey Yasskin and Adam Hupp prepared a patch making bytes immutable, and the test suite fallout proved that surprisingly few places actually depended on mutability. This led van Rossum to propose what we have today: an immutable bytes type and a separate mutable bytearray type.
PEP 3137 notes that immutability brings concrete benefits: bytes objects can be used in code objects, serve as dictionary keys (because they are hashable), and simplify porting Python 2 code by allowing a direct substitution of str with bytes. (Guido van Rossum, PEP 3137, September 2007, peps.python.org/pep-3137)
The Three Core Binary Types
Python provides three built-in types for working with binary data. Each serves a distinct purpose, and understanding when to use which is essential.
Think of these three types like containers in a kitchen. bytes is a sealed jar -- you can read what's inside, but you cannot change it. bytearray is a mixing bowl -- you can add, remove, and rearrange contents freely. memoryview is a window cut into the side of the mixing bowl -- you see and touch the same contents without pouring anything into a new container. Same data, no copy, no duplication of effort.
A quick guide to choosing among them: reach for bytes when the data is fixed -- protocol constants, dictionary keys, anything you hand off to other code. Reach for bytearray when you are accumulating data incrementally, filling a buffer via readinto() or recv_into(), or when you want to avoid O(n²) concatenation of immutable bytes while building a payload. Reach for memoryview when you need zero-copy slices of a large buffer -- for example, pairing it with recv_into() to advance a receive cursor without copying.
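To see why the mutable type matters for accumulation, compare building a buffer by repeated bytes concatenation against extending a bytearray. This is a sketch with sizes chosen purely for illustration; the exact timings will vary by machine:

```python
import time

N = 5_000
CHUNK = b"x" * 64

# Repeated bytes concatenation: each += copies the entire buffer so far
start = time.perf_counter()
out = b""
for _ in range(N):
    out += CHUNK
concat_time = time.perf_counter() - start

# bytearray.extend: amortized O(1) appends, one final freeze to bytes
start = time.perf_counter()
buf = bytearray()
for _ in range(N):
    buf.extend(CHUNK)
frozen = bytes(buf)
extend_time = time.perf_counter() - start

print(f"bytes +=:         {concat_time:.4f}s")
print(f"bytearray.extend: {extend_time:.4f}s")
print(out == frozen)  # True -- identical results, very different costs
```

The results are byte-for-byte identical; only the allocation behavior differs.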
bytes: Immutable Binary Data
The bytes type is the workhorse for binary data in Python. It is immutable, hashable, and supports the full range of sequence operations. You will encounter it any time you read a file in binary mode, receive data from a socket, or encode a string.
# Creating bytes objects
from_literal = b"\x48\x65\x6c\x6c\x6f" # hex escape sequences
from_string = "Hello".encode("utf-8") # encoding text to bytes
from_list = bytes([72, 101, 108, 108, 111]) # from a list of ints
from_zero = bytes(10) # 10 zero bytes
print(from_literal) # b'Hello'
print(from_string) # b'Hello'
print(from_list) # b'Hello'
print(from_zero) # b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
# bytes are immutable
try:
    from_literal[0] = 74
except TypeError as e:
    print(f"Cannot mutate: {e}")
# bytes are hashable -- can be dictionary keys
lookup = {b"GET": "retrieve", b"POST": "create", b"DELETE": "remove"}
print(lookup[b"GET"]) # "retrieve"
bytearray: Mutable Binary Data
The bytearray type has the same interface as bytes but is mutable. This makes it the right choice when you need to build up binary data incrementally, modify bytes in place, or work with buffers that receive data from I/O operations.
# Creating a bytearray
buf = bytearray(b"Hello, World!")
print(buf) # bytearray(b'Hello, World!')
# Mutation in place
buf[0] = 74 # Change 'H' (72) to 'J' (74)
print(buf) # bytearray(b'Jello, World!')
# Slice assignment
buf[7:12] = b"Python"
print(buf) # bytearray(b'Jello, Python!')
# Append and extend
buf.append(33) # ASCII '!'
buf.extend(b" Rocks")
print(buf) # bytearray(b'Jello, Python!! Rocks')
A common pattern is to use bytearray as a buffer for accumulating data, then convert to immutable bytes when you are done:
def build_packet(command, payload):
    buf = bytearray()
    buf.append(0x02)                     # STX (Start of Text)
    buf.extend(command.encode("ascii"))  # Command
    buf.append(0x1F)                     # Unit separator
    buf.extend(payload)                  # Binary payload
    buf.append(0x03)                     # ETX (End of Text)
    return bytes(buf)                    # Freeze as immutable
packet = build_packet("DATA", b"\x00\x01\x02\x03")
print(packet) # b'\x02DATA\x1f\x00\x01\x02\x03\x03'
print(type(packet)) # <class 'bytes'>
memoryview: Zero-Copy Access
The memoryview type provides a way to access the internal data of an object that supports the buffer protocol -- without copying it. This is critical for performance when working with large binary data, because slicing a bytes or bytearray normally creates a new copy of the sliced data.
data = bytearray(b"Hello, World! This is a long byte sequence.")
view = memoryview(data)
# Slicing a memoryview does NOT copy data
chunk = view[7:12]
print(bytes(chunk)) # b'World'
# Modifying via memoryview changes the underlying data
chunk[0] = ord("E") # Change 'W' to 'E'
print(data) # bytearray(b'Hello, Eorld! This is a long byte sequence.')
This zero-copy behavior is what makes memoryview essential in high-performance scenarios. If you are processing a 100 MB binary file and need to examine specific sections, slicing a memoryview gives you a window into the data without allocating 100 MB of additional memory for copies.
import time
# Create a large bytearray (10 MB)
large_data = bytearray(10_000_000)
# Approach 1: Regular slicing (copies data)
start = time.perf_counter()
for i in range(1000):
    chunk = large_data[1000:9000]  # Creates a new bytearray each time
copy_time = time.perf_counter() - start
# Approach 2: memoryview slicing (zero-copy)
mv = memoryview(large_data)
start = time.perf_counter()
for i in range(1000):
    chunk = mv[1000:9000]  # No copy -- just a view
view_time = time.perf_counter() - start
print(f"Copy slicing: {copy_time:.4f}s")
print(f"View slicing: {view_time:.4f}s")
print(f"Speedup: {copy_time / view_time:.1f}x")
The speedup from using memoryview over regular slicing is typically dramatic -- often 10x or more -- because it avoids allocating and copying memory entirely. Use it whenever you need repeated access to different sections of large binary data.
The Buffer Protocol: PEP 3118
The buffer protocol is like a lease agreement for memory. One object owns the memory and agrees to lend it out. A memoryview holds the lease -- it can read (and sometimes write) through the window into the owner's storage. The key clause: the owner cannot resize or relocate while the lease is active. This is how a bytearray can refuse to resize while a memoryview of it is live. The protocol formalizes the rules of that lease at the C level, which is what allows NumPy arrays, bytes, bytearray, mmap, and custom C types to interoperate without copying.
The reason memoryview, bytes, bytearray, and types like NumPy arrays can share memory efficiently is the buffer protocol, formalized in PEP 3118. Authored by Travis Oliphant and Carl Banks, PEP 3118 redesigned how Python objects expose their underlying memory to other objects.
The buffer protocol defines a C-level API that allows an object to expose a pointer to its internal data, along with metadata about the data's format, shape, and strides. Any object that implements this protocol can participate in zero-copy data sharing. The bytes type exposes read-only buffers; bytearray exposes writable buffers.
PEP 688, authored by Jelle Zijlstra and accepted in March 2023, later made the buffer protocol accessible from Python code (not just C) in Python 3.12. PEP 688 introduced the __buffer__ dunder method (and its paired __release_buffer__ for cleanup) so that classes written in pure Python can participate in the buffer protocol. It added collections.abc.Buffer as a proper abstract base class for type annotation and runtime checking (use isinstance(obj, collections.abc.Buffer) to test buffer protocol support at runtime), and introduced inspect.BufferFlags to expose the flags that customize buffer creation. This matters practically because many functions throughout the standard library accept "bytes-like objects" -- meaning anything that implements the buffer protocol.
import hashlib
data_bytes = b"Hello"
data_array = bytearray(b"Hello")
data_view = memoryview(b"Hello")
# All three work -- hashlib accepts any bytes-like object
print(hashlib.sha256(data_bytes).hexdigest())
print(hashlib.sha256(data_array).hexdigest())
print(hashlib.sha256(data_view).hexdigest())
# All three produce the same hash
The io Module: Stream Abstractions
Python's io module provides a layered architecture for working with streams. There are three layers: raw I/O (unbuffered), buffered I/O, and text I/O. For byte streams, the first two layers are what matter.
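You can see the layering directly on a file object: a text-mode file is a TextIOWrapper over a BufferedReader over a raw FileIO, and each layer is reachable through an attribute. A quick sketch using a temporary file:

```python
import io
import os
import tempfile

# Create a small text file to inspect
with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt", delete=False) as f:
    path = f.name
    f.write("hello")

with open(path, "r") as f:
    print(type(f))             # <class '_io.TextIOWrapper'>  (text layer)
    print(type(f.buffer))      # <class '_io.BufferedReader'> (buffered layer)
    print(type(f.buffer.raw))  # <class '_io.FileIO'>         (raw layer)
    print(isinstance(f.buffer, io.BufferedIOBase))  # True
    print(isinstance(f.buffer.raw, io.RawIOBase))   # True

os.remove(path)
```

Opening in binary mode simply skips the top layer: you get the BufferedReader directly.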
BytesIO: In-Memory Byte Streams
io.BytesIO creates an in-memory binary stream that behaves exactly like a file opened in binary mode, but exists entirely in memory. It is the binary counterpart to io.StringIO.
import io
# Create an in-memory byte stream
stream = io.BytesIO()
# Write binary data
stream.write(b"First line\n")
stream.write(b"Second line\n")
stream.write(b"\x00\x01\x02\x03") # Raw binary data
# Read it back -- must seek to the beginning first
stream.seek(0)
content = stream.read()
print(content) # b'First line\nSecond line\n\x00\x01\x02\x03'
print(len(content)) # 27
# getvalue() returns all contents regardless of position
stream.write(b"More data")
print(stream.getvalue()) # Returns everything from start to end
BytesIO is indispensable in three situations: testing code that normally reads from or writes to files without touching the filesystem, working with libraries that expect file-like objects when your data is already in memory, and building binary payloads incrementally. Here is a real-world pattern -- generating a ZIP archive entirely in memory:
import io
import zipfile
# Build a ZIP file in memory
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("readme.txt", "This is the readme.")
    zf.writestr("data.csv", "name,value\nalice,42\nbob,17")
# The ZIP file is now in buffer
zip_bytes = buffer.getvalue()
print(f"ZIP size: {len(zip_bytes)} bytes")
# Read it back
buffer.seek(0)
with zipfile.ZipFile(buffer, "r") as zf:
    print(zf.namelist())  # ['readme.txt', 'data.csv']
    print(zf.read("data.csv").decode("utf-8"))
BufferedReader and BufferedWriter
When you open a file in binary mode with open("file", "rb"), Python does not give you raw file I/O. Instead, it wraps the raw stream in a BufferedReader, which reads data in chunks (typically 8 KB) and serves it from an internal buffer. This dramatically reduces the number of system calls, improving performance for workloads that read small amounts of data at a time.
import io
# Opening a file in binary mode returns a BufferedReader
with open("/dev/urandom", "rb") as f:
    print(type(f))    # <class '_io.BufferedReader'>
    data = f.read(16) # Read 16 random bytes
    print(data.hex()) # Something like 'a3f7c2e19b...'
# 64 KB buffer for large sequential reads
with open("largefile.bin", "rb", buffering=65536) as f:
    chunk = f.read(4096)
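One practical consequence of buffering is BufferedReader.peek(), which returns upcoming bytes from the internal buffer without advancing the stream position. A sketch using a temporary file (the contents are arbitrary):

```python
import os
import tempfile

# Write a small file that starts with a PNG-style signature
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.write(b"\x89PNG\r\n\x1a\nrest-of-file")

with open(path, "rb") as f:
    head = f.peek(4)[:4]  # Look ahead without consuming (peek may return more)
    print(head)           # b'\x89PNG'
    data = f.read(4)      # The same bytes are still there to read
    print(data)           # b'\x89PNG'

os.remove(path)
```

Note that peek() may return more bytes than requested -- it exposes whatever is currently buffered -- so slicing the result is the reliable pattern.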
RawIOBase: Unbuffered Access
For situations where you need direct, unbuffered access to the underlying I/O -- real-time data from a serial port, for instance -- you can use buffering=0:
# Unbuffered binary read (returns a FileIO object, a RawIOBase subclass)
with open("somefile.bin", "rb", buffering=0) as f:
    print(type(f))  # <class '_io.FileIO'>
Encoding and Decoding: The Bridge Between Bytes and Text
Every web request, file read, and database response that contains text passes through this boundary. Getting the encoding wrong produces either a crash (UnicodeDecodeError) or silent data corruption -- garbage characters that may not be caught until they reach a user's screen, a downstream system, or a security validator. Python 3's strict separation exists precisely so that this conversion is never implicit, never hidden, and always traceable to a specific line of code.
The most common source of confusion with byte streams is the relationship between bytes and str. In Python 3, these are fundamentally different types, and converting between them always requires an explicit encoding.
# Text to bytes: encoding
text = "Caf\u00e9" # 'Café'
utf8_bytes = text.encode("utf-8") # b'Caf\xc3\xa9' (5 bytes)
latin1_bytes = text.encode("latin-1") # b'Caf\xe9' (4 bytes)
utf16_bytes = text.encode("utf-16") # 10 bytes, including the BOM
print(f"UTF-8: {utf8_bytes!r} ({len(utf8_bytes)} bytes)")
print(f"Latin-1: {latin1_bytes!r} ({len(latin1_bytes)} bytes)")
print(f"UTF-16: {utf16_bytes!r} ({len(utf16_bytes)} bytes)")
# Bytes to text: decoding
restored = utf8_bytes.decode("utf-8")
print(restored) # Café
# Using the wrong encoding produces garbage or errors
try:
    wrong = utf8_bytes.decode("ascii")
except UnicodeDecodeError as e:
    print(f"Decode error: {e}")
This explicitness is intentional. The Python 2 model of implicit conversion between bytes and text was responsible for countless bugs. Python 3 forces you to think about which encoding applies to your data, which is exactly what you want when working with network protocols, file formats, or any context where bytes cross system boundaries.
PEP 461: Formatting Bytes
One of the pain points that emerged from the Python 3 bytes/str split was working with wire protocols -- formats like HTTP, FTP, SMTP, and file formats like PDF and DBF that mix binary data with ASCII text segments. PEP 461, authored by Ethan Furman and shipped with Python 3.5, addressed this by adding %-formatting to bytes and bytearray. The PEP brought back a restricted form of %-interpolation that supports numeric formatting codes and a few string-related codes:
# Numeric formatting in bytes
status_line = b"HTTP/1.1 %d %s" % (200, b"OK")
print(status_line) # b'HTTP/1.1 200 OK'
# Hex formatting
packet_id = b"ID: %04x" % 255
print(packet_id) # b'ID: 00ff'
# Single byte insertion
separator = b"field1%cfield2" % 0x1F
print(separator) # b'field1\x1ffield2'
Working with Binary File Formats
One of the standard library's most useful modules for byte stream work is struct, which lets you pack and unpack binary data according to format strings. This is how you read and write binary file formats at the byte level.
import struct
# Pack Python values into a binary structure
# '<' = little-endian, 'I' = unsigned int, 'H' = unsigned short, 'f' = float
header = struct.pack("<IHf", 1, 42, 3.14)
print(header.hex()) # e.g., '010000002a00c3f54840' (use .hex(" ") for spaced output)
print(len(header)) # 10 bytes
# Unpack binary data back into Python values
version, count, value = struct.unpack("<IHf", header)
print(f"Version: {version}, Count: {count}, Value: {value:.2f}")
# Reading a BMP file header (first 14 bytes)
def read_bmp_header(filepath):
    with open(filepath, "rb") as f:
        header_data = f.read(14)
    magic, file_size, reserved1, reserved2, offset = struct.unpack(
        "<2sIHHI", header_data
    )
    return {
        "magic": magic,
        "file_size": file_size,
        "data_offset": offset,
    }
For more complex scenarios, you can use struct.iter_unpack to process repeated structures efficiently:
import struct
# Simulating a binary record format: 4-byte ID + 8-byte float value
records_binary = b""
for i in range(5):
    records_binary += struct.pack("<If", i + 1, (i + 1) * 1.5)
# Unpack all records in one pass
for record_id, value in struct.iter_unpack("<If", records_binary):
    print(f"Record {record_id}: {value:.1f}")
Network Byte Order and Endianness
Endianness is about where a number starts. When you write the number 256 as a 4-byte integer, you have four slots to fill. Big-endian puts the most significant byte first -- like writing a date as year/month/day, which makes it naturally sortable. Little-endian puts the least significant byte first -- like reading a number in reverse. Two machines can see the exact same four bytes and interpret them as completely different integers if they disagree on byte order. Network protocols standardize on big-endian ("network byte order") so every participant speaks the same language regardless of their CPU architecture.
When byte streams cross machine boundaries -- over a network, between architectures, or between languages -- byte order matters. A 32-bit integer like 256 is stored as \x00\x00\x01\x00 on a big-endian machine and \x00\x01\x00\x00 on a little-endian one. The struct module handles this with format prefixes, and the int type provides to_bytes and from_bytes methods:
import struct
import sys
print(f"System byte order: {sys.byteorder}")
value = 1024
# Using struct with explicit byte order
big_endian = struct.pack(">I", value) # Network byte order (big-endian)
little_endian = struct.pack("<I", value) # Little-endian
print(f"Big-endian: {big_endian.hex()}") # 00000400
print(f"Little-endian: {little_endian.hex()}") # 00040000
# Using int methods
be = value.to_bytes(4, byteorder="big")
le = value.to_bytes(4, byteorder="little")
print(f"int.to_bytes big: {be.hex()}")
print(f"int.to_bytes little: {le.hex()}")
# Parsing bytes back to int
restored = int.from_bytes(be, byteorder="big")
print(f"Restored: {restored}") # 1024
Network protocols almost universally use big-endian (also called "network byte order"), which is why struct uses > for big-endian and ! as a synonym specifically for network byte order.
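The equivalence is easy to verify: '!' and '>' produce identical packing, and both use standard sizes with no padding:

```python
import struct

port = 8080
# '!' (network byte order) and '>' (big-endian) pack identically
assert struct.pack("!H", port) == struct.pack(">H", port)
print(struct.pack("!H", port).hex())  # '1f90' -- 8080 as a big-endian short

# Both prefixes also imply standard sizes and no alignment padding
assert struct.calcsize("!IHf") == struct.calcsize(">IHf") == 10
```

Using '!' in protocol code is purely a readability signal: it tells the reader "this is wire format," even though the bytes are the same as with '>'.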
Streaming Large Data: Chunked Reading
Real byte stream work often involves data too large to fit in memory. The standard pattern is chunked reading:
import hashlib
def hash_file(filepath, chunk_size=8192):
    """Hash a file of any size without loading it all into memory."""
    hasher = hashlib.sha256()
    with open(filepath, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()
# This works on files of any size -- 1 KB or 100 GB
The readinto method is even more efficient when you want to avoid allocating new buffers on each read:
def efficient_copy(src_path, dst_path, buffer_size=65536):
    """Copy a file using a pre-allocated buffer to minimize allocations."""
    buf = bytearray(buffer_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            n = src.readinto(buf)
            if not n:
                break
            if n < buffer_size:
                dst.write(buf[:n])
            else:
                dst.write(buf)
This pattern pre-allocates a single bytearray buffer and reuses it for every read, eliminating the overhead of creating and garbage-collecting thousands of temporary bytes objects.
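The one remaining copy in that loop is the buf[:n] slice on the final short read. Writing through a memoryview eliminates it. A sketch -- the function name efficient_copy_mv is illustrative, not a standard API:

```python
def efficient_copy_mv(src_path, dst_path, buffer_size=65536):
    """Like the readinto copy loop, but writes through a memoryview
    so even the final short read avoids a slice copy."""
    buf = bytearray(buffer_size)
    view = memoryview(buf)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            n = src.readinto(buf)
            if not n:
                break
            # view[:n] is a zero-copy window, not a new bytes object
            dst.write(view[:n])
```

For a one-off copy the difference is negligible, but in a long-running service copying many streams, removing that last allocation per tail-chunk adds up.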
Practical Example: Parsing a Custom Binary Protocol
To bring everything together, here is a complete example that reads, writes, and parses a simple binary protocol -- the kind of thing you might encounter with embedded devices, IoT sensors, or custom network services:
import struct
import io
# Protocol format:
# [1 byte: version] [2 bytes: message type] [4 bytes: payload length]
# [N bytes: payload] [2 bytes: CRC-16 checksum]
HEADER_FORMAT = ">BHI" # version (B), msg_type (H), payload_len (I)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT) # 7 bytes
def build_message(msg_type, payload, version=1):
    """Build a binary protocol message."""
    header = struct.pack(HEADER_FORMAT, version, msg_type, len(payload))
    checksum = crc16(header + payload)
    return header + payload + struct.pack(">H", checksum)
def parse_message(stream):
    """Parse a binary protocol message from a stream."""
    header_bytes = stream.read(HEADER_SIZE)
    if len(header_bytes) < HEADER_SIZE:
        raise ValueError("Incomplete header")
    version, msg_type, payload_len = struct.unpack(HEADER_FORMAT, header_bytes)
    payload = stream.read(payload_len)
    if len(payload) < payload_len:
        raise ValueError("Incomplete payload")
    checksum_bytes = stream.read(2)
    received_crc = struct.unpack(">H", checksum_bytes)[0]
    expected_crc = crc16(header_bytes + payload)
    if received_crc != expected_crc:
        raise ValueError(f"CRC mismatch: expected {expected_crc:#06x}, got {received_crc:#06x}")
    return {"version": version, "type": msg_type, "payload": payload}
def crc16(data):
    """Simple CRC-16 implementation."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x0001:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc & 0xFFFF
# Build a message
sensor_data = struct.pack(">ff", 23.5, 65.2) # temperature and humidity
message = build_message(msg_type=0x0010, payload=sensor_data)
print(f"Message: {message.hex()}")
print(f"Total size: {len(message)} bytes")
# Parse it back using BytesIO as our stream
stream = io.BytesIO(message)
parsed = parse_message(stream)
temp, humidity = struct.unpack(">ff", parsed["payload"])
print(f"Version: {parsed['version']}")
print(f"Type: {parsed['type']:#06x}")
print(f"Temperature: {temp:.1f}C, Humidity: {humidity:.1f}%")  # :.1f hides float32 rounding noise
This example uses every concept covered in the article: struct for packing and unpacking, BytesIO for stream abstraction, bytearray-style iteration for the CRC calculation, explicit byte order with > for big-endian, and chunked reading from a stream.
Decoding Errors and the errors Parameter
One of the most important and under-discussed aspects of byte stream work is what happens when decoding goes wrong. The .decode() method and the built-in open() function both accept an errors parameter that controls how decoding failures are handled. The default value, "strict", raises a UnicodeDecodeError on any byte sequence that does not map to a valid character in the target encoding. But that is not always the right behavior.
data = b"caf\xe9" # 'café' in Latin-1, but invalid in UTF-8
# strict (default): raises UnicodeDecodeError
try:
    text = data.decode("utf-8")
except UnicodeDecodeError as e:
    print(f"Strict failed: {e}")
# ignore: silently drops bytes that cannot be decoded
print(data.decode("utf-8", errors="ignore")) # 'caf'
# replace: substitutes U+FFFD (the replacement character)
print(data.decode("utf-8", errors="replace")) # 'caf\ufffd'
# backslashreplace: produces a Python escape sequence
print(data.decode("utf-8", errors="backslashreplace")) # 'caf\\xe9'
# surrogateescape: low-level round-trip mode for OS filenames
print(data.decode("utf-8", errors="surrogateescape")) # 'caf\udce9'
The "ignore" error handler silently discards bytes it cannot decode. In security-sensitive code -- such as parsing HTTP headers, filenames, or JSON coming off a network socket -- this can mask malformed or adversarial input. In those contexts, prefer "strict" and handle the exception explicitly, or use "replace" only when you have a downstream plan for the replacement character.
The surrogateescape handler deserves special attention. Python uses it internally for OS-level file names, which may not be valid Unicode on every operating system. When you use os.listdir() on a directory containing a filename with bytes that are not valid in the system encoding, Python decodes those bytes using surrogateescape, producing surrogate characters (U+DC80 through U+DCFF). Pass the string back through encode("utf-8", errors="surrogateescape") and you recover the exact original bytes -- no data loss, no silent corruption. This is the mechanism that makes Python's filesystem operations lossless even in mixed-encoding environments.
import os
# Simulate a filename with non-UTF-8 bytes
raw_bytes = b"file_\xff_name.txt"
# surrogateescape round-trip
decoded = raw_bytes.decode("utf-8", errors="surrogateescape")
restored = decoded.encode("utf-8", errors="surrogateescape")
print(raw_bytes == restored) # True -- exact round-trip
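The standard library wraps this round-trip in os.fsencode and os.fsdecode, which apply the filesystem encoding together with the appropriate error handler for you. A sketch -- the round-trip guarantee described here holds on POSIX systems, where surrogateescape is the filesystem error handler:

```python
import os

raw = b"file_\xff_name.txt"
as_str = os.fsdecode(raw)   # bytes -> str via filesystem encoding + surrogateescape
back = os.fsencode(as_str)  # str -> bytes, recovering the original
print(back == raw)          # True on POSIX -- lossless round-trip
```

When you need to pass filenames between str-based APIs and byte-based ones, prefer these two functions over calling encode/decode yourself: they always use the interpreter's actual filesystem encoding settings.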
Memory-Mapped Files: mmap
Chunked reading is not always the right solution for large binary files. When your access pattern is random rather than sequential -- jumping to different sections of a large file based on data you have already read -- the mmap module offers a compelling alternative: a memory-mapped file behaves like a bytearray but is backed by a file on disk rather than RAM. The operating system handles page loading transparently; only the pages you actually touch are read from disk.
import mmap
# Memory-map a binary file for random access
with open("largefile.bin", "rb") as f:
    # Map the entire file into memory
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Access behaves like bytes -- slice, index, search
        print(mm[0:4])  # First 4 bytes
        print(len(mm))  # File size
        # Find a byte sequence anywhere in the file
        idx = mm.find(b"\x89PNG")
        if idx != -1:
            print(f"PNG header at offset {idx}")
        # Random access at an arbitrary offset via slicing -- no seek needed
        chunk = mm[1024:1280]
For writable memory maps, you can modify file contents directly through the map without needing to track a file position or call seek() manually:
import mmap
# Modify a binary file in place using mmap
with open("patchable.bin", "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        # Patch a 4-byte value at offset 128
        mm[128:132] = b"\x01\x00\x00\x00"
        mm.flush()  # Ensure changes are written to disk
mmap objects support the buffer protocol, so you can wrap them in a memoryview for zero-copy slicing. This combination -- memoryview(mmap_object) -- is the most memory-efficient way to process large binary files with random-access patterns, since you avoid both reading the entire file into RAM and allocating copies of slices.
The right mental model for mmap vs chunked reading: if you are processing a 500 MB binary log file from top to bottom, chunked reading is fine. If you are implementing a database engine, a binary search over a sorted index file, or a protocol parser that jumps backward and forward based on offset tables embedded in the file header, mmap is the sharper tool.
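Putting the last two ideas together: a memoryview over a memory-mapped file gives zero-copy slicing of on-disk data. A sketch using a small temporary file standing in for a large one (the header layout here is invented for illustration):

```python
import mmap
import os
import struct
import tempfile

# Create a small file to stand in for a large binary file:
# a 4-byte magic, a 4-byte little-endian field, then payload
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.write(b"HDR!" + struct.pack("<I", 0xDEADBEEF) + b"payload-bytes")

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        view = memoryview(mm)      # Zero-copy window over the mapped pages
        magic = bytes(view[0:4])   # Copies only these 4 bytes
        (field,) = struct.unpack_from("<I", view, 4)  # Reads in place, no slice copy
        print(magic)               # b'HDR!'
        print(hex(field))          # 0xdeadbeef
        view.release()             # Must release before the map closes

os.remove(path)
```

Note the explicit view.release(): closing an mmap while a memoryview over it is still alive raises BufferError, the same lease rule the buffer protocol enforces everywhere else.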
BytesIO.getbuffer(): Zero-Copy Access to the Internal Buffer
One underused method on io.BytesIO is .getbuffer(). It returns a memoryview over the BytesIO object's internal buffer without copying the contents. Mutating the view mutates the BytesIO transparently -- and critically, while the view is held, the underlying BytesIO object is locked against resizing operations.
import io
buf = io.BytesIO(b"Hello, World!")
view = buf.getbuffer()
# Read through the view -- no copy
print(bytes(view[7:12])) # b'World'
# Write through the view -- modifies buf directly
view[7:12] = b"Earth"
print(buf.getvalue()) # b'Hello, Earth!'
# Must release the view before resizing buf
view.release()
buf.seek(0, io.SEEK_END) # The stream position is still 0 -- move to the end before appending
buf.write(b" How are you?")
print(buf.getvalue()) # b'Hello, Earth! How are you?'
This pattern is valuable when you are passing BytesIO contents to a C extension or another library that expects a writable buffer. Instead of extracting the bytes with .getvalue() (which copies), hand the library a getbuffer() view and let it write directly into the BytesIO object's memory. The constraint -- that the BytesIO cannot be resized while the view is live -- is the price of zero-copy access.
When NOT to Use memoryview
The previous sections make memoryview sound like a pure win. The reality is more nuanced, and knowing when to skip it prevents you from adding complexity without benefit.
A memoryview object has non-trivial setup cost. Creating it involves calling into the buffer protocol machinery, which has a fixed overhead that does not exist for a plain bytes slice. For small objects or operations performed only once or twice, the overhead of creating the memoryview will exceed any savings from avoided copies.
import time
# Small data -- memoryview overhead dominates
small = b"Hello"
start = time.perf_counter()
for _ in range(100_000):
    _ = small[1:4]  # Simple slice
copy_time = time.perf_counter() - start
start = time.perf_counter()
mv = memoryview(small)
for _ in range(100_000):
    _ = mv[1:4]  # memoryview slice
view_time = time.perf_counter() - start
print(f"Direct slice: {copy_time:.4f}s")
print(f"memoryview: {view_time:.4f}s")
# On small data, direct slicing is often faster
The rule of thumb: use memoryview when all three of these are true: (1) the underlying data is large -- tens of KB or more, (2) you are performing many slice operations against the same underlying data, and (3) the code is in a hot path where allocations are measurably costly. If any of those conditions is absent, a plain bytes slice is simpler and just as fast.
There is also a readability cost. Code using memoryview requires the reader to understand the buffer protocol and to track whether the view is mutable or read-only based on the underlying type. On a team with developers who are less familiar with binary data handling, that cognitive overhead can outweigh the performance gains.
Byte Streams and asyncio
Everything discussed so far uses synchronous I/O. In an asyncio application -- an HTTP server, a WebSocket handler, or a network protocol implementation -- byte stream work is asynchronous, and the interfaces look different.
asyncio exposes binary streams through StreamReader and StreamWriter, which are the async counterparts of file objects in binary mode. The data they exchange is always bytes -- the same type you work with throughout the rest of this article, just consumed and produced through await expressions:
import asyncio
async def echo_server(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
    """A simple echo server that reads binary data and echoes it back."""
    addr = writer.get_extra_info("peername")
    print(f"Connection from {addr}")
    while True:
        # Read up to 1024 bytes -- returns b'' on EOF
        data = await reader.read(1024)
        if not data:
            break
        # Inspect the raw bytes
        print(f"Received {len(data)} bytes: {data[:20].hex()}...")
        # Write bytes back to the client
        writer.write(data)
        await writer.drain()  # Wait for the send buffer to flush
    writer.close()
    await writer.wait_closed()
async def main():
    server = await asyncio.start_server(echo_server, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()
# asyncio.run(main())
The key pattern to notice: await reader.read(n) returns bytes, and writer.write(data) accepts any bytes-like object. Everything you know about parsing with struct, computing checksums over bytearray buffers, and handling encoding errors applies directly inside async handlers. The async layer handles concurrency; the byte layer handles data. They are orthogonal concerns.
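A common composition of the two layers is length-prefixed framing: struct packs a 4-byte length header, and StreamReader.readexactly pulls exactly that many payload bytes. A self-contained sketch that runs a throwaway server on an OS-assigned port (the handle/main names and the uppercase-echo protocol are invented for illustration):

```python
import asyncio
import struct

async def handle(reader, writer):
    # Read a 4-byte big-endian length, then exactly that many payload bytes
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    payload = await reader.readexactly(length)
    # Reply with the same framing, payload uppercased
    writer.write(struct.pack(">I", len(payload)) + payload.upper())
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    msg = b"hello, frames"
    writer.write(struct.pack(">I", len(msg)) + msg)
    await writer.drain()
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    reply = await reader.readexactly(length)
    print(reply)  # b'HELLO, FRAMES'
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()

asyncio.run(main())
```

readexactly raises asyncio.IncompleteReadError if the connection closes early, which turns truncated frames into a clean, catchable error instead of silent short reads.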
For high-performance protocol implementations, asyncio's Protocol and BufferedProtocol classes expose a lower-level API. BufferedProtocol is particularly useful: the framework calls get_buffer() to request a preallocated bytearray or buffer-protocol object, writes incoming data directly into it, then calls buffer_updated(nbytes). This eliminates the allocation of a new bytes object for every network read -- the same zero-copy idea from readinto(), applied to async networking:
import asyncio
class ZeroCopyProtocol(asyncio.BufferedProtocol):
"""BufferedProtocol: the framework writes into our buffer directly."""
def __init__(self):
self._buf = bytearray(65536) # Pre-allocated receive buffer
self._transport = None
def connection_made(self, transport):
self._transport = transport
def get_buffer(self, sizehint):
# Return our pre-allocated buffer (or a slice of it)
return self._buf
def buffer_updated(self, nbytes):
# Process the first nbytes of self._buf -- no allocation occurred
data = memoryview(self._buf)[:nbytes]
print(f"Received {nbytes} bytes, first 4: {bytes(data[:4]).hex()}")
This pattern -- preallocate a bytearray, hand a view of it to the framework, process only the filled portion via memoryview -- is the same strategy used by high-performance event loop implementations such as uvloop.
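A convenient property of this callback design is that it can be exercised without a socket at all: call get_buffer() and buffer_updated() by hand, exactly as the event loop would. A sketch with a minimal protocol (the class name is illustrative):

```python
import asyncio

class CollectingProtocol(asyncio.BufferedProtocol):
    """Minimal BufferedProtocol showing the get_buffer/buffer_updated cycle."""
    def __init__(self):
        self._buf = bytearray(4096)   # pre-allocated receive buffer
        self.received = bytearray()   # everything delivered so far

    def get_buffer(self, sizehint):
        return self._buf

    def buffer_updated(self, nbytes):
        self.received += self._buf[:nbytes]

# Drive the callbacks manually -- this is what the loop does on every read
proto = CollectingProtocol()
buf = proto.get_buffer(4096)        # the loop asks for somewhere to put data
buf[:4] = b"\xde\xad\xbe\xef"       # ...writes incoming bytes straight into it...
proto.buffer_updated(4)             # ...and reports how many bytes arrived
print(proto.received.hex())         # deadbeef
```

This also makes BufferedProtocol subclasses straightforward to unit-test: no event loop or transport is required to verify the buffer handling.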
Inspecting and Debugging Byte Data
The article has covered how to read, write, and manipulate byte streams extensively. What it has not addressed is what to do when something goes wrong -- when the bytes coming off a socket do not match the protocol you expect, or when a struct unpack raises an error and you need to see exactly what landed in your buffer. Debugging byte data is a distinct skill, and Python gives you several tools for it.
The simplest is .hex(), available on any bytes or bytearray object since Python 3.5. In Python 3.8 and later, it accepts a separator argument that makes the output substantially easier to read:
data = b"\x01\x00\x10\x00\xff\xc3\xa9\x7f"
# Basic hex dump
print(data.hex()) # 01001000ffc3a97f
# With a separator (Python 3.8+)
print(data.hex(" ")) # 01 00 10 00 ff c3 a9 7f
print(data.hex(":", 2)) # 0100:1000:ffc3:a97f (groups of 2 bytes)
The binascii module provides the same capability and some additional utilities that .hex() does not cover. binascii.hexlify produces lowercase hex as a bytes object and works on any bytes-like object; binascii.unhexlify (or equivalently, bytes.fromhex()) reverses it. The bytes.fromhex() class method is the cleaner modern choice for converting a hex string back to bytes:
import binascii
data = b"\xde\xad\xbe\xef"
# hexlify produces bytes, not a str
print(binascii.hexlify(data)) # b'deadbeef'
# bytes.fromhex is the modern inverse
restored = bytes.fromhex("deadbeef")
print(restored) # b'\xde\xad\xbe\xef'
print(restored == data) # True
# Useful when parsing hex-encoded data from a config or protocol
token = bytes.fromhex("48656c6c6f")
print(token.decode("ascii")) # Hello
For serious debugging -- inspecting a binary file header, diffing two buffers, or logging what came off a socket before parsing -- a formatted hex dump that shows both the hex bytes and their ASCII representation is invaluable. Python does not have one in the standard library, but it takes about ten lines to write a serviceable one:
def hexdump(data, width=16):
"""Print a hex dump with ASCII sidecar, similar to xxd or hexdump -C."""
if isinstance(data, memoryview):
data = bytes(data)
for i in range(0, len(data), width):
chunk = data[i:i + width]
hex_part = " ".join(f"{b:02x}" for b in chunk)
ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
print(f"{i:08x} {hex_part:<{width * 3}} |{ascii_part}|")
# Example: inspecting the first 48 bytes of a response
response = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n"
hexdump(response[:48])
The output looks like this:
00000000 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d |HTTP/1.1 200 OK.|
00000010 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 61 |.Content-Type: a|
00000020 70 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d |pplication/json.|
The ASCII sidecar immediately reveals whether what you think is binary data is actually mostly printable text, which is a common discovery when first working with HTTP, SMTP, or Redis wire formats. The . placeholder for non-printable bytes shows control characters and high-byte values at a glance without hiding them.
When logging byte stream data in production, avoid logging raw bytes repr -- Python will use \x escapes for non-ASCII bytes but print ASCII bytes as their characters, which produces inconsistent, hard-to-grep output. Instead, log data.hex() -- a consistent, machine-readable hex string that you can feed back into bytes.fromhex() to reproduce the exact buffer in a test.
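In practice that looks like the following sketch (the frame value is made up for illustration):

```python
frame = b"\x02\x00\x1aOK\xff"   # a made-up protocol frame

# Inconsistent: ASCII bytes render as characters, the rest as \x escapes
print(repr(frame))        # b'\x02\x00\x1aOK\xff'
# Consistent and grep-friendly: pure hex
print(frame.hex())        # 02001a4f4bff

# The hex string round-trips exactly, so a logged frame can seed a test
restored = bytes.fromhex(frame.hex())
print(restored == frame)  # True
```
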
Partial Reads: The Socket Gotcha
This is the category of bug that ships to production, passes every test, works fine on localhost, and then fails intermittently under load or on slow connections -- the hardest kind to diagnose, because the conditions that trigger it are difficult to reproduce in a test environment. Understanding why recv(n) is not a guarantee is the difference between network code that is robust and network code that is fragile by design.
The chunked reading examples earlier in this article use f.read(chunk_size) against a file. Files behave predictably: read(n) returns exactly n bytes until end-of-file. Sockets and network streams do not. This is one of the most common bugs written by developers new to binary network programming, and the article would be incomplete without addressing it directly.
When you call socket.recv(4), you are asking for up to 4 bytes. The operating system may return 1, 2, 3, or 4 bytes depending on network conditions, kernel buffer state, and TCP segment boundaries. Calling recv(n) and assuming you got exactly n bytes is a latent bug that will behave correctly 99% of the time on localhost and fail unpredictably in production over real networks.
import socket
import struct
# THIS IS WRONG -- recv may return fewer than 4 bytes
def read_header_wrong(sock):
header_bytes = sock.recv(4) # Might get 1, 2, 3, or 4 bytes
if len(header_bytes) < 4:
        raise ValueError("Short read!")  # This branch fires in production, rarely in tests
return struct.unpack(">I", header_bytes)[0]
# THIS IS CORRECT -- read exactly n bytes, looping until satisfied
def recv_exactly(sock, n):
"""Read exactly n bytes from a socket, blocking until all arrive."""
buf = bytearray(n)
view = memoryview(buf)
received = 0
while received < n:
count = sock.recv_into(view[received:], n - received)
if count == 0:
raise EOFError("Connection closed before all bytes arrived")
received += count
return bytes(buf)
def read_header_correct(sock):
header_bytes = recv_exactly(sock, 4) # Guaranteed to be exactly 4 bytes
return struct.unpack(">I", header_bytes)[0]
The recv_into method writes directly into the pre-allocated bytearray buffer via a memoryview slice -- the same zero-copy pattern from the buffer protocol section. Each iteration advances the view's start position by however many bytes arrived, so subsequent calls fill the remaining space without touching the bytes already received.
The same issue applies to asyncio.StreamReader.read(n). The read(n) method returns up to n bytes. For exact reads in async code, use readexactly(n), which raises asyncio.IncompleteReadError if the connection closes before all bytes arrive:
import asyncio
async def parse_framed_message(reader: asyncio.StreamReader):
# readexactly raises IncompleteReadError on early EOF
length_bytes = await reader.readexactly(4)
payload_length = int.from_bytes(length_bytes, byteorder="big")
payload = await reader.readexactly(payload_length)
return payload
The partial-read bug is especially insidious because it is nearly impossible to reproduce in unit tests against BytesIO or a localhost socket. Both deliver data in a single chunk reliably. The bug only surfaces under real network conditions or artificial fuzzing. If your protocol parser has never been tested against a stream that delivers data one byte at a time, it has not been tested.
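socket.socketpair makes that one-byte-at-a-time test easy to write without any network at all. A sketch, repeating the recv_exactly helper from above so the snippet is self-contained:

```python
import socket
import struct
import threading

def recv_exactly(sock, n):
    """Read exactly n bytes, looping until the buffer is full."""
    buf = bytearray(n)
    view = memoryview(buf)
    received = 0
    while received < n:
        count = sock.recv_into(view[received:], n - received)
        if count == 0:
            raise EOFError("Connection closed before all bytes arrived")
        received += count
    return bytes(buf)

# socketpair gives two already-connected sockets -- no network required
left, right = socket.socketpair()
frame = struct.pack(">I", 0xDEADBEEF)

def drip_feed():
    # Deliver the 4 header bytes one at a time, like a congested link might
    for i in range(len(frame)):
        left.send(frame[i:i + 1])

t = threading.Thread(target=drip_feed)
t.start()
value = struct.unpack(">I", recv_exactly(right, 4))[0]
t.join()
left.close()
right.close()
print(hex(value))  # 0xdeadbeef
```

The same drip-feed technique works for exercising an asyncio parser: write the frame to a StreamReader one byte at a time via feed_data and confirm readexactly still assembles it correctly.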
struct.pack_into and unpack_from: Zero-Copy struct
The earlier sections on struct used struct.pack, which allocates and returns a new bytes object for every call. In hot paths -- writing many records into a large buffer, or building a protocol frame from multiple fields -- that allocation pressure adds up. struct.pack_into and struct.unpack_from are the buffer-aware counterparts that eliminate it.
struct.pack_into(fmt, buffer, offset, *values) writes packed bytes directly into a writable buffer at the given byte offset. The buffer must support the writable buffer protocol -- a bytearray, a writable memoryview, or a ctypes array all qualify. No new object is allocated; the data lands directly in your pre-allocated memory:
import struct
# Pre-allocate a buffer for a batch of 100 sensor records
# Each record: 4-byte uint32 ID + 4-byte float value = 8 bytes
RECORD_FORMAT = "<If"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT) # 8 bytes
num_records = 100
buf = bytearray(num_records * RECORD_SIZE)
# Write 100 records directly into the buffer -- zero allocations per record
for i in range(num_records):
offset = i * RECORD_SIZE
struct.pack_into(RECORD_FORMAT, buf, offset, i + 1, (i + 1) * 0.5)
print(f"Buffer size: {len(buf)} bytes") # 800 bytes
print(buf[:16].hex()) # First two records in hex
struct.unpack_from(fmt, buffer, offset=0) is the read-side equivalent: it unpacks fields from the buffer at the given offset without slicing or copying it first. This pairs naturally with memoryview for parsing binary files or protocol frames that have already been read into a buffer:
import struct
# Parse all 100 records back from the buffer -- no intermediate copies
RECORD_FORMAT = "<If"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)
# Works directly on the bytearray -- no slice required
for i in range(100):
offset = i * RECORD_SIZE
record_id, value = struct.unpack_from(RECORD_FORMAT, buf, offset)
# print(f"Record {record_id}: {value:.1f}")
# unpack_from also works on memoryview -- same interface
view = memoryview(buf)
record_id, value = struct.unpack_from(RECORD_FORMAT, view, 0)
print(f"First record via memoryview: id={record_id}, value={value:.1f}")
The combination of pack_into and a pre-allocated bytearray is the standard approach for building binary frames in performance-sensitive code. Compare it to the naive approach: a loop calling struct.pack and concatenating results with += creates a new bytes object on every iteration and copies all previously accumulated data into it each time, giving O(n^2) behavior as the buffer grows. pack_into into a pre-sized bytearray is O(n).
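The two approaches produce byte-identical output, which makes the claim easy to verify directly. A small sketch (N and the record format are arbitrary):

```python
import struct

RECORD = "<If"
SIZE = struct.calcsize(RECORD)   # 8 bytes per record
N = 1000

# Naive: += copies the entire accumulated buffer on every iteration
naive = b""
for i in range(N):
    naive += struct.pack(RECORD, i, float(i))

# Pre-sized: pack_into writes each record in place, one pass, no copies
fast = bytearray(N * SIZE)
for i in range(N):
    struct.pack_into(RECORD, fast, i * SIZE, i, float(i))

print(bytes(fast) == naive)  # True -- identical bytes, very different cost curve
```
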
struct.calcsize(fmt) tells you exactly how many bytes a given format string requires. Always use it to compute offsets and buffer sizes rather than hand-calculating them. Format strings do not always produce the byte count you expect: alignment padding can be inserted between fields when using native byte order (@ or no prefix), which is why network and file format code consistently uses an explicit byte order prefix (>, <, or !) that disables padding.
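calcsize makes the padding visible. A quick check -- the native result is platform-dependent, with 8 being typical on x86-64:

```python
import struct

# One unsigned byte (B) followed by one 4-byte unsigned int (I)
print(struct.calcsize("@BI"))   # native order: typically 8 (3 padding bytes added)
print(struct.calcsize("<BI"))   # 5 -- standard sizes, no padding
print(struct.calcsize("!BI"))   # 5 -- network order, also unpadded
```
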
Summary of Related PEPs
PEP 3137 (Guido van Rossum -- 2007): Established the immutable bytes and mutable bytearray types for Python 3.0. Defined the fundamental split between binary and text data that governs all byte stream work in modern Python.
PEP 3118 (Travis Oliphant, Carl Banks -- 2006): Redesigned the buffer protocol for Python 3.0, enabling zero-copy memory sharing between objects. This is the foundation that makes memoryview, NumPy array interop, and efficient I/O possible.
PEP 461 (Ethan Furman -- authored 2014, shipped Python 3.5): Added %-formatting to bytes and bytearray in Python 3.5, restoring practical support for wire format protocols that mix binary data with ASCII segments.
PEP 688 (Jelle Zijlstra -- accepted March 2023, shipped Python 3.12): Made the buffer protocol accessible from Python code via the __buffer__ and __release_buffer__ dunder methods. Added collections.abc.Buffer as a standard ABC for type annotation and runtime buffer-protocol checks, and inspect.BufferFlags to expose buffer creation flags. Custom Python classes can now fully participate in the buffer protocol without dropping into C.
The Bottom Line
bytes: Immutable, hashable, safe to pass anywhere.
bytearray: Mutable, supports append / extend / readinto. Convert to bytes to freeze when done.
memoryview: Zero-copy views over bytes, bytearray, or mmap. Each slice is another view, not a copy.
mmap: OS-managed pages. Behaves like a bytearray backed by disk. Combine with memoryview for zero-copy slicing.
io.BytesIO: Wrap any bytes-like in a stream interface. Use .getbuffer() for zero-copy write access to its internals.
Byte streams are the foundation of everything Python does with the outside world. Files, networks, hardware, serialization formats, cryptography, compression -- all of these operate on bytes. Python 3's strict separation of str and bytes can feel like friction when you first encounter it, but it eliminates an entire category of bugs that plagued Python 2 codebases for years.
- bytes is immutable and hashable: use it for data that should not change -- network responses, cryptographic digests, dictionary keys.
- bytearray is mutable: use it when you need to build or modify binary data in place, especially when accumulating data incrementally before finalizing it.
- memoryview gives you zero-copy slicing: use it when performance matters on large data and you are performing many slice operations. For small data or one-off slices, plain slicing is simpler and comparably fast.
- io.BytesIO gives you a file-like interface over in-memory bytes: use it when APIs expect files but your data is already in memory. Use .getbuffer() when you need a zero-copy writable view of its contents.
- mmap is the right tool for large files with random access: memory-mapped files behave like bytearray objects backed by disk, with the OS transparently handling page loading.
- Always be explicit about encoding when crossing the boundary between bytes and text -- and always choose an errors handler deliberately, not by accident.
- In async code, the data is still just bytes: asyncio.StreamReader, asyncio.BufferedProtocol, and every other async I/O primitive produce and consume the same types you use synchronously.
- Debug with .hex() and a proper hex dump: log data.hex() rather than raw bytes repr, and build a formatted hex dump function for inspecting binary frames during development.
- Never assume a socket read is complete: socket.recv(n) returns up to n bytes. Use a loop with recv_into, or asyncio.StreamReader.readexactly(n), to guarantee you have all the bytes you expect before parsing.
- Use struct.pack_into and struct.unpack_from for high-throughput struct work: packing directly into a pre-allocated bytearray at a given offset eliminates per-record allocations and avoids the O(n^2) cost of repeated concatenation.
Binary data does not care about your assumptions. The byte 0xC3 might be the first half of a UTF-8 encoded e-acute, a pixel value in a grayscale image, or the opcode for a return instruction on an x86 processor. Your code's job is to know which one it is and handle it correctly. That starts with understanding how Python represents, manipulates, and streams raw bytes -- and it ends with the discipline to be explicit at every boundary where interpretation happens.