Every useful program does two things: it receives data and it produces data. That exchange — taking something in and putting something out — is what programmers call I/O. Before you can write programs that interact with users or work with files, you need a clear picture of how Python handles that flow.
I/O stands for input/output. It describes the movement of data into and out of a running program. In Python, this takes several forms depending on where the data comes from and where it needs to go. At the beginner level the three forms you will work with most often are keyboard input, screen output, and file I/O. Each one uses a small set of built-in tools that Python provides out of the box.
What I/O Means and Why It Matters
A program that cannot receive data from the outside world can only ever do the same thing every time it runs. A program that cannot send data anywhere produces results nobody can see or use. I/O is what connects your code to reality.
Python organises I/O around the concept of streams. A stream is simply a sequence of data that flows in one direction. Three standard streams exist in almost every program:
- stdin (standard input) — the default channel through which a program receives data. By default this is the keyboard. The input() function reads from stdin automatically; you can also access it directly via sys.stdin.
- stdout (standard output) — the default channel through which a program sends normal output. By default this is the terminal screen. The print() function writes to stdout automatically; direct access is available via sys.stdout.
- stderr (standard error) — a separate output channel reserved for error messages and diagnostics. It also appears in the terminal by default but can be redirected independently of stdout. Pass file=sys.stderr to print() to send a message to stderr instead of stdout.
These three streams are inherited from the operating system, not invented by Python. Every major programming language exposes them. Understanding this makes it easier to move between languages later.
Standard Output: The print() Function
The most common piece of output in Python is a call to print(). It takes whatever values you pass in, converts them to strings, and writes them to stdout followed by a newline. The Python documentation defines it precisely: print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False). At its simplest:
print("Hello, world!")
# Output: Hello, world!
You can pass multiple values separated by commas. Python joins them with a space by default:
name = "Mona"
age = 28
print("Name:", name, "| Age:", age)
# Output: Name: Mona | Age: 28
Two keyword arguments give you control over how values are joined and how the line ends:
# sep changes the separator between values (default is a space)
print("red", "green", "blue", sep=", ")
# Output: red, green, blue
# end changes the line terminator (default is "\n")
print("Loading", end="...")
print("done")
# Output: Loading...done
Combining both keyword arguments behaves exactly as you would expect:
print("a", "b", "c", sep="-", end="!\n")
print("done")
# Output:
# a-b-c!
# done
f-strings (formatted string literals) are the cleanest way to embed variable values directly inside a string. Write an f before the opening quote, then place variable names or expressions inside curly braces: f"Hello, {name}!". They were introduced in Python 3.6 by PEP 498 and are the recommended approach in all modern Python code. They evaluate expressions at runtime, so you can write f"{2 + 2}" and get "4".
"The desire for a simpler way to format strings in Python drove this PEP." — PEP 498, Literal String Interpolation
To produce the output Hello, Python! with an f-string, store the value in a variable and interpolate it:
language = "Python"
print(f"Hello, {language}!")
# Output: Hello, Python!
The f prefix must appear immediately before the opening quote; without it, the braces are printed literally instead of being evaluated. A plain string "Hello, Python!" would print the same text, but it is the f-string form that lets the output change with the variable.
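Two more f-string capabilities are worth knowing early: arbitrary expressions inside the braces, and format specifiers after a colon. A small sketch (the variable names are illustrative):

```python
item = "widget"
price = 4.5
count = 3

# Expressions are evaluated inside the braces at runtime
print(f"{count} x {item} = {count * price}")   # 3 x widget = 13.5

# A format specifier after the colon controls rendering — here, two decimals
print(f"Total: {count * price:.2f}")           # Total: 13.50
```

The `:.2f` specifier uses the same mini-language as str.format(), so anything you learn here carries over to other formatting tools.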
Standard Input: The input() Function
Where print() sends data out, input() brings data in. It pauses the program, displays an optional prompt, waits for the user to type something and press Enter, then returns everything typed as a plain string.
name = input("What is your name? ")
print(f"Nice to meet you, {name}!")
The return value is always a string, even when the user types a number. If you need to do arithmetic with the result, you have to convert it:
raw = input("Enter your birth year: ")
year = int(raw) # convert string to integer
age = 2026 - year
print(f"You are approximately {age} years old.")
If the user types something that cannot be converted — for example, typing "hello" when you call int() — Python raises a ValueError and the program stops. You will learn how to handle this gracefully using try/except when you reach error handling.
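As a preview of that technique, here is a minimal sketch of absorbing the conversion failure instead of crashing. The helper name to_int is illustrative, not a standard function:

```python
def to_int(text):
    """Try to convert text to an integer; return None if it is not a number."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int("1998"))   # 1998
print(to_int("hello"))  # None
```

In a real program you would call to_int(input("Enter your birth year: ")) and re-prompt the user when the result is None.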
A classic beginner bug combines these two functions: converting one input but not the other.
a = int(input("Enter first number: "))
b = input("Enter second number: ")  # bug: b is still a string
print(a + b)  # TypeError: unsupported operand type(s) for +: 'int' and 'str'
Without int(), b is a string while a is an integer. Python cannot add an integer to a string, so the addition raises a TypeError at runtime. Both variables must be converted before they can be added together.
How to Read and Write Files in Python
Once you move beyond the terminal, the most common form of I/O is file handling. Python provides the built-in open() function along with the with statement, which handles closing the file automatically even if something goes wrong inside the block.
1. Open the file with open() and a with statement. Call open() with the file path and the mode string. Wrap it in a with block so Python handles closing the file for you. The variable after as is your reference to the open file object.
2. Choose the correct mode. Pass 'r' to read an existing file. Pass 'w' to write — this creates the file if it does not exist, and overwrites it if it does. Pass 'a' to append new content to the end of an existing file without deleting what is already there. Pass 'x' (exclusive creation) to create a new file and raise a FileExistsError if the file already exists — useful when you need to guarantee you are not silently clobbering something.
3. Call the appropriate method. For reading, call .read() to get the entire file as one string, or .readlines() to get a list where each element is one line. For writing, call .write() and pass the string to store. Note that .write() does not add a newline automatically — include \n yourself when needed.
Here is what reading and writing looks like in practice:
# Writing to a file
with open("notes.txt", "w", encoding="utf-8") as f:
f.write("Python I/O is straightforward.\n")
f.write("Use 'with' to handle files safely.\n")
# Reading from a file
with open("notes.txt", "r", encoding="utf-8") as f:
content = f.read()
print(content)
# Appending a new line without overwriting
with open("notes.txt", "a", encoding="utf-8") as f:
f.write("Appended this line later.\n")
# Reading line by line using readlines()
with open("notes.txt", "r", encoding="utf-8") as f:
lines = f.readlines()
for line in lines:
print(line, end="")
For writing, there is a counterpart to .readlines() that many beginners miss: .writelines(). It accepts an iterable of strings and writes each one to the file without adding newlines between them — you are responsible for including \n in each string if you need line breaks:
# .writelines() writes an iterable of strings — no newlines added automatically
lines = ["First line\n", "Second line\n", "Third line\n"]
with open("output.txt", "w", encoding="utf-8") as f:
f.writelines(lines)
# Also works with a generator — useful for large datasets
with open("output.txt", "w", encoding="utf-8") as f:
f.writelines(f"Item {i}\n" for i in range(1, 6))
Use .write() when you have a single string to write. Use .writelines() when you already have a list or generator of strings — it avoids constructing one large joined string in memory first. Remember that neither method adds newlines for you; include \n explicitly in each string.
There is a third approach that is more Pythonic than both .read() and .readlines() when you need to process a file line by line: iterating directly over the file object. Python's file objects are iterators, so you can use a for loop without loading the entire file into memory first:
# Iterate directly — one line at a time, no full load into memory
with open("notes.txt", "r", encoding="utf-8") as f:
for line in f:
print(line, end="") # line already contains \n
Use .read() when you need the file as one string. Use .readlines() when you need all lines in a list at once. Use for line in f: when you are processing the file sequentially and want minimal memory overhead — this pattern handles files of any size, from a few kilobytes to many gigabytes.
"Explicit is better than implicit." — The Zen of Python (PEP 20)
That principle applies directly to file I/O. The with statement makes the open and close operations visible and intentional. Leaving files open by calling open() without with and forgetting to call .close() is a real source of bugs, especially in long-running programs.
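For comparison, here is roughly what the with statement is doing for you — a sketch of the manual equivalent, using a temporary file so the example is self-contained:

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "with_demo.txt")

# Manual version: without try/finally, an exception raised between open()
# and close() would leave the file handle open
f = open(path, "w", encoding="utf-8")
try:
    f.write("manual close\n")
finally:
    f.close()

# The with version expresses the same guarantee in two lines
with open(path, "w", encoding="utf-8") as f:
    f.write("automatic close\n")

print(f.closed)  # True — the file is closed as soon as the block exits
```

The with form is shorter, harder to get wrong, and signals intent to anyone reading the code.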
Output Buffering and Flushing
One behaviour that surprises many beginners is that print() does not always write to the screen the instant it is called. Python buffers stdout when it detects that output is going to a file or a pipe rather than an interactive terminal. That means your output may sit in memory and not appear until the buffer is full or the program ends.
You can force Python to write immediately using the flush keyword argument:
import time
for step in ["Connecting", "Authenticating", "Loading"]:
print(step + "...", end="", flush=True)
time.sleep(1)
print(" done")
# Without flush=True the dots may not appear until after the sleep
You can also disable buffering entirely at startup by running Python with the -u flag (python -u script.py), or by setting the PYTHONUNBUFFERED=1 environment variable. Both approaches are common in containerised environments where log output needs to be visible in real time.
Because the first call suppresses its newline with end="", both calls land on the same line:
print("status", end="")
print(": OK")
# Output: status: OK
Replacing stdin and stdout at Runtime
Because sys.stdin and sys.stdout are just file-like objects, you can replace them at runtime. This technique is used widely in testing and scripting to redirect I/O without touching the operating system level:
import sys
import io
# Capture all print() output into a string
buffer = io.StringIO()
sys.stdout = buffer
print("This goes into the buffer, not the terminal.")
print("So does this.")
sys.stdout = sys.__stdout__ # restore the original stdout
captured = buffer.getvalue()
print("Captured:", repr(captured))
Python's standard library provides contextlib.redirect_stdout and contextlib.redirect_stderr as safer alternatives to manual sys.stdout assignment. They restore the original stream automatically, even if an exception occurs inside the block.
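A sketch of the same capture using contextlib.redirect_stdout, which restores the original stream automatically when the block ends:

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()

# Inside the block, print() writes to the StringIO buffer
with redirect_stdout(buffer):
    print("Captured inside the block.")

# Outside the block, stdout is already restored — no manual reset needed
print("Captured:", repr(buffer.getvalue()))
```

Compared with assigning to sys.stdout directly, this version cannot leave stdout pointing at the buffer if an exception escapes the block.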
Writing to stderr
The third standard stream, sys.stderr, is reserved for error messages and diagnostic output. Sending errors to stderr instead of stdout matters because the two streams can be redirected independently at the shell level. A user running your script can pipe stdout to a file and still see error messages on the terminal — something that is impossible if you print everything to stdout.
The cleanest way to write to stderr from Python is to pass sys.stderr as the file argument to print():
import sys
# Normal output goes to stdout
print("Processing started...")
# Error messages belong on stderr
print("Error: file not found", file=sys.stderr)
# You can also write directly to the stream (must be a string, not bytes)
sys.stderr.write("Critical failure\n")
At the shell, you can redirect stdout and stderr independently. Running python script.py > output.txt 2> errors.txt sends normal output to output.txt and error messages to errors.txt. This separation is what makes log files useful in production — you can filter signal from noise without mixing the two streams.
Python's own runtime writes tracebacks and syntax errors to stderr. Following this convention in your own scripts makes your programs composable — other tools and scripts can consume your stdout output while ignoring diagnostic noise from stderr.
Encoding and Text Mode
When Python opens a file in text mode (the default), it automatically encodes and decodes bytes using a character encoding. The exact default depends on your platform and Python version. On most modern Linux and macOS systems it is UTF-8. On Windows, the default has historically been determined by the system locale — on older Windows installations it may be cp1252, cp932, or another legacy encoding — which has been a persistent source of cross-platform bugs for years.
Starting with Python 3.15, UTF-8 is the default encoding regardless of the system locale, on all platforms including Windows. This was formalised in PEP 686. If you need the previous locale-based behaviour after 3.15, pass encoding="locale" (available since Python 3.10) or set PYTHONUTF8=0. You can check your current platform encoding — regardless of UTF-8 mode — with import locale; locale.getencoding() (added in Python 3.11). Before 3.11, use locale.getpreferredencoding(False). Even with 3.15 making UTF-8 the default, always specifying encoding="utf-8" explicitly remains the right practice for code that must run on Python 3.14 and earlier.
This is a frequent source of bugs when code written on one machine is run on another. The safest practice — and mandatory for code that must work on Python 3.14 and earlier — is to always specify the encoding explicitly:
# Always specify encoding when portability matters
with open("data.txt", "r", encoding="utf-8") as f:
content = f.read()
# Binary mode bypasses encoding entirely — use for images, PDFs, etc.
with open("image.png", "rb") as f:
raw_bytes = f.read()
If you need to read a file whose encoding you do not know, the third-party chardet library can detect it. For files that may contain encoding errors, the errors parameter on open() controls the behaviour: errors="ignore" silently drops undecodable bytes, while errors="replace" substitutes a replacement character.
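The same errors values apply to bytes.decode(), which makes the behaviour easy to demonstrate without creating a file — a small sketch with deliberately invalid UTF-8:

```python
# 0xFF can never appear in valid UTF-8, so this byte sequence is undecodable as-is
data = b"caf\xff latte"

# errors="ignore" silently drops the bad byte
print(data.decode("utf-8", errors="ignore"))   # caf latte

# errors="replace" substitutes the U+FFFD replacement character
print(data.decode("utf-8", errors="replace"))  # caf� latte
```

The strict default — raising UnicodeDecodeError — is usually what you want, because silent data loss is worse than a loud failure.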
"The default encoding is platform dependent." — Python documentation for open()
This is accurate through Python 3.14. From Python 3.15 onward, PEP 686 changes the default to UTF-8 on all platforms. The same advice still applies: specifying encoding="utf-8" explicitly is the portable choice that works correctly on every Python version and operating system.
Handling Missing Files
If you pass a path that does not exist to open() in read mode, Python raises a FileNotFoundError — a subclass of OSError. This is the single most common file I/O error beginners encounter, and it stops the program unless you handle it. The right tool is a try/except block:
import sys
try:
with open("config.txt", "r", encoding="utf-8") as f:
content = f.read()
except FileNotFoundError:
print("Error: config.txt not found.", file=sys.stderr)
sys.exit(1) # exit with a non-zero code to signal failure
Notice the error message goes to sys.stderr, not stdout. The sys.exit(1) call terminates the program and signals to the calling shell or process that something went wrong. Exit code 0 means success; any non-zero value means failure.
Avoid checking if a file exists with os.path.exists() before opening it. Between the check and the open() call, another process could delete the file — a race condition. The correct pattern is to attempt the open directly and catch FileNotFoundError if it fails.
Command-Line Arguments with sys.argv
The input() function handles interactive keyboard input, but many Python scripts need to accept input before they start running — through arguments typed on the command line. Python collects those arguments in sys.argv, a list of strings. The first element (sys.argv[0]) is always the script name; the rest are the arguments the user passed.
# greet.py
import sys
if len(sys.argv) < 2:
print("Usage: python greet.py <name>", file=sys.stderr)
sys.exit(1)
name = sys.argv[1]
print(f"Hello, {name}!")
Running this script with python greet.py Alice produces Hello, Alice!. Running it without an argument prints the usage message to stderr and exits with code 1.
All sys.argv values are strings, just like the return value of input(). Convert them with int() or float() when your script expects a number. For more sophisticated argument parsing — named flags, optional arguments, automatic help text — the standard library's argparse module builds on this foundation.
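A minimal sketch of the same greeting script rewritten with argparse (the --shout flag is illustrative, not part of the original greet.py):

```python
import argparse

parser = argparse.ArgumentParser(description="Greet someone by name.")
parser.add_argument("name", help="the name to greet")
parser.add_argument("--shout", action="store_true", help="print in uppercase")

# Passing an explicit list here makes the example reproducible; a real
# script would call parser.parse_args() with no arguments to read sys.argv
args = parser.parse_args(["Alice", "--shout"])

greeting = f"Hello, {args.name}!"
print(greeting.upper() if args.shout else greeting)  # HELLO, ALICE!
```

argparse also generates the usage message and --help text for free, replacing the manual len(sys.argv) check entirely.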
Common Mistakes and Their Fixes
These are the errors beginners hit most often when working with Python I/O. Each one has a clear mechanical cause and a direct fix.
Forgetting to convert input() before arithmetic
- What goes wrong: TypeError — input() returns a string; you cannot add an integer to it.
- The fix: age = int(input("Age: ")) — convert immediately at the point of input.
Opening files without a with statement
- What goes wrong: file descriptor leaks if an exception occurs before .close(). On Linux, each process has a limit (typically 1024 open file descriptors) — exhaust it and subsequent open() calls raise OSError: [Errno 24] Too many open files.
- The fix: with open("data.txt") as f: — the context manager closes the file on exit, even if an exception fires.
Using 'w' mode when you meant to append
- What goes wrong: mode 'w' truncates the file to zero bytes before your first .write() call. The truncation happens at open() time, not at write time — so even if you never write anything, the file is already empty.
- The fix: use "a" to append without truncating. Use "x" (exclusive creation) to raise FileExistsError if the file already exists — the safest option when overwriting would be a bug.
Relying on the platform default encoding
- What goes wrong: UnicodeDecodeError on machines with a non-UTF-8 locale — common on Windows, where the default may be cp1252 or cp932. Code that works on your Mac silently breaks on a colleague's Windows machine.
- The fix: open("file.txt", encoding="utf-8") — always explicit. Check your platform default with import locale; locale.getpreferredencoding().
Printing error messages to stdout
- What goes wrong: error text pollutes stdout and cannot be separated from normal output. If a caller pipes your script's output with python script.py | grep result, the error messages appear inline and corrupt the piped data.
- The fix: print("Error: ...", file=sys.stderr) — stderr is a separate stream that shells and pipelines handle independently.
Checking file existence before opening
- What goes wrong: race condition (TOCTOU — Time Of Check, Time Of Use). Another process may delete or replace the file in the window between the existence check and the open() call. The check is also redundant — open() itself will raise an exception if the file is missing.
- The fix: try: open(path) except FileNotFoundError: — catch the exception directly. This is the EAFP pattern (Easier to Ask Forgiveness than Permission) that Python favors over LBYL (Look Before You Leap).
Indexing sys.argv without checking its length
- What goes wrong: IndexError: list index out of range when the argument is missing. The traceback points at your own code rather than giving users a helpful usage message.
- The fix: check len(sys.argv) >= 2 before accessing index 1, then print a usage message to stderr and call sys.exit(1). For scripts with multiple arguments, argparse handles this automatically.
Python I/O Learning Summary
- I/O means input/output. Every program that does something useful takes data in and sends data out. Python organises this around three standard streams: stdin, stdout, and stderr.
- print() handles output. It writes to stdout by default. Use sep to control how multiple values are joined and end to control the line terminator. The flush argument (added in Python 3.3) forces immediate output. F-strings are the most readable way to embed variable values in output.
- input() handles keyboard input. It always returns a string. Convert the result with int() or float() when you need a numeric value.
- open() handles file I/O. Use 'r' to read, 'w' to write (overwrites), 'a' to append, and 'x' to create exclusively (raises FileExistsError if the file already exists). Always wrap open() in a with statement so the file is closed automatically.
- Three read methods, three use cases. Use .read() for the full file as one string, .readlines() for a list of lines, and .readline() to pull one line at a time — useful when reading large files you cannot load into memory all at once.
- Type conversion is your responsibility. Python does not convert data for you when reading from input() or reading strings from files. You need to call int(), float(), or other converters explicitly.
- Buffering affects when output appears. stdout is buffered when writing to a pipe or file. Use flush=True or the -u flag when you need output to appear in real time.
- stderr is for errors. Pass file=sys.stderr to print() to send error messages to the error stream instead of stdout. The two streams can be redirected independently at the shell level, which is what makes this distinction useful.
- Iterate over files directly. A for line in f: loop is the idiomatic way to read a file line by line without loading it all into memory. Use it instead of .readlines() when the file could be large.
- FileNotFoundError is the common file exception. Wrap open() in a try/except FileNotFoundError block rather than checking existence with os.path.exists() first. This avoids the race condition between the check and the open.
- sys.argv holds command-line arguments. Every string after the script name on the command line appears as an element of sys.argv. All values are strings; convert them as needed. For complex argument parsing, use the standard library's argparse module.
- Always specify encoding when opening files. Default encodings differ between operating systems on Python 3.14 and earlier — check your platform default with locale.getencoding() (Python 3.11+) or locale.getpreferredencoding(). Python 3.15 makes UTF-8 the default on all platforms via PEP 686, but passing encoding="utf-8" explicitly to open() remains the correct practice for code targeting any Python version.
These fundamentals cover the I/O you will use in the vast majority of beginner and intermediate Python programs. More advanced forms — network sockets, database connections, binary file reading, serialisation formats like JSON and CSV — all follow the same mental model: data comes in through some channel, your program transforms it, and data goes out through some channel.
If you are working through python tutorials and this is your first encounter with I/O, take time to experiment with each function before moving on. Run the code examples in a terminal or IDE, intentionally cause a ValueError by passing the wrong type, and try redirecting stdout to a file from the command line with python script.py > output.txt. Hands-on repetition is what moves I/O from concept to instinct.
One detail worth understanding at the systems level: print() and input() are high-level wrappers over Python's io stack, which ultimately talks to the operating system's file descriptor API. Every call to print() ultimately writes bytes to file descriptor 1 (stdout), and every call to input() reads bytes from file descriptor 0 (stdin). Understanding this helps explain why you can redirect stdin and stdout at the shell level — the operating system owns those file descriptors, not Python. This is the kind of detail that separates programmers who understand the tool from those who only use it.
Under the Hood: How Python I/O Actually Works
Most Python tutorials stop at print() and open(). This section goes further — into the CPython internals, the OS layer, and the specific behaviours that catch even experienced programmers off guard.
The CPython I/O stack has three layers
When you call print("hello"), the data does not jump directly from your Python string to the terminal. It passes through three distinct layers in CPython's implementation:
- The Python text layer (io.TextIOWrapper) handles encoding — converting the Unicode string to bytes using UTF-8, cp1252, or whatever encoding the stream uses.
- The Python buffer layer (io.BufferedWriter) holds bytes in an in-memory buffer and decides when to flush them down to the OS. This is the layer responsible for the buffering behaviour you read about earlier.
- The raw OS layer (io.FileIO) issues the actual write(2) system call — handing bytes to the kernel, which then decides when to deliver them to the terminal or flush them to disk.
You can inspect this stack yourself:
import sys
# sys.stdout is a TextIOWrapper
print(type(sys.stdout)) # <class '_io.TextIOWrapper'>
# .buffer is the BufferedWriter underneath
print(type(sys.stdout.buffer)) # <class '_io.BufferedWriter'>
# .buffer.raw is the FileIO at the bottom
print(type(sys.stdout.buffer.raw)) # <class '_io.FileIO'>
# The file descriptor number (1 = stdout, 0 = stdin, 2 = stderr)
print(sys.stdout.fileno()) # 1
Understanding the three-layer stack explains several non-obvious behaviours. flush=True only flushes the Python BufferedWriter layer — it does not force the OS kernel to flush its own page cache to disk (for that you need os.fsync()). Binary writes must go through sys.stdout.buffer.write() rather than sys.stdout.write(), because TextIOWrapper only accepts str. Replacing sys.stdout with io.StringIO works because StringIO satisfies the same TextIOBase interface. The Python documentation also explicitly notes that TextIOWrapper objects — including sys.stdout — are not thread-safe, while the underlying BufferedWriter objects are thread-safe (they use an internal lock). If you are writing to sys.stdout from multiple threads simultaneously, individual print() calls may interleave. Use a threading.Lock around output sections that must appear atomically.
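A sketch of the locking pattern for output that must appear atomically when multiple threads print at once:

```python
import threading

print_lock = threading.Lock()
finished = []

def report(worker_id):
    # Holding the lock guarantees these two lines stay together in the
    # output, never interleaved with another worker's lines
    with print_lock:
        print(f"[worker {worker_id}] starting")
        print(f"[worker {worker_id}] finished")
        finished.append(worker_id)

threads = [threading.Thread(target=report, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker's two lines form one atomic block; without the lock, a second thread's print() could land between them.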
Flushing Python vs flushing the OS: flush() vs os.fsync()
There are two entirely separate flush operations that beginners often confuse. Calling flush() (or print(..., flush=True)) only moves data from Python's BufferedWriter layer to the OS kernel's page cache. The data is still in RAM — the kernel owns it and decides when to write it to physical disk. To guarantee data reaches durable storage, you need os.fsync():
import os
with open("critical.dat", "w", encoding="utf-8") as f:
f.write("important data\n")
f.flush() # moves from Python buffer → OS kernel page cache
os.fsync(f.fileno()) # forces OS to flush kernel cache → physical disk
# After fsync(), a power failure will NOT corrupt the data
os.fsync() is a blocking syscall and is expensive — it can take tens to hundreds of milliseconds on a spinning disk. Use it only when durability is genuinely required (financial records, database write-ahead logs, configuration saves). For log files and most general output, flush=True is sufficient.
Why print() is slower than sys.stdout.write() — but only in specific cases
print() calls str() on each argument, joins them with sep, appends end, then calls sys.stdout.write() once per print call. In tight loops writing millions of lines, this overhead is measurable. The standard pattern for high-throughput output is to accumulate lines in a list and write once:
import sys
# Slow: one write() call per line, plus print() overhead
for i in range(1_000_000):
print(i)
# Fast: accumulate, then one write() call
sys.stdout.write("\n".join(str(i) for i in range(1_000_000)) + "\n")
# Fastest for very large output: write directly to the binary buffer
# (bypasses the text encoding layer — only valid for pure ASCII)
out = "\n".join(str(i) for i in range(1_000_000)) + "\n"
sys.stdout.buffer.write(out.encode("ascii"))
For typical programs — scripts, data processing tools, web applications — this difference is irrelevant. It matters only when your program's bottleneck is literally the I/O throughput of stdout, not computation.
The line-buffering mode you probably did not know existed
Python's open() accepts a buffering parameter that almost nobody discusses. The exact values and their meanings, per the Python documentation:
- buffering=0 — unbuffered (binary mode only; every write goes straight to the OS)
- buffering=1 — line-buffered (text mode only; flushes after every \n)
- buffering=N (N > 1) — fixed-size buffer of approximately N bytes (binary mode; for text mode Python ignores this and uses the system default)
- buffering=-1 — use the system default: io.DEFAULT_BUFFER_SIZE (8,192 bytes). Python first tries to read the filesystem block size via os.stat() and uses that if available; 8,192 bytes is the fallback when the block size cannot be determined.
# Line-buffered: every newline triggers a flush — useful for log files
# that need to be tail-able without losing data on crash
with open("app.log", "w", encoding="utf-8", buffering=1) as log:
log.write("Server started\n") # flushed immediately
log.write("Listening on :8080\n") # flushed immediately
The buffering=1 mode is one of the best-kept secrets in Python file I/O. It is ideal for log files in long-running processes: you get the performance benefit of not making a system call on every character, but each complete line lands on disk immediately — so if the process crashes, you do not lose the last N kilobytes of log output that were sitting in a full buffer.
What actually happens when you read() a large file
When you call f.read() on a 2 GB file, CPython issues a series of read(2) system calls, refilling its internal buffer until the file is exhausted, and assembles the results into a Python str object in memory. CPython's string representation is defined by PEP 393 — the "Flexible String Representation" — and uses 1, 2, or 4 bytes per character depending on the highest Unicode code point in the string: ASCII and Latin-1 strings (U+0000–U+00FF) use 1 byte per character; strings containing any character in the BMP range (U+0100–U+FFFF) use 2 bytes per character for every character in the string; strings containing any character beyond the BMP (U+10000 and above, such as most emoji) use 4 bytes per character throughout. A 2 GB UTF-8 file containing only ASCII text loads into roughly 2 GB of RAM. The same file with a single emoji forces the 4-byte representation for every character, potentially consuming up to 8 GB. This is why iterating line by line with for line in f: matters so much for large files.
The for line in f: iterator avoids this by maintaining only one line in memory at a time. Internally it calls readline() repeatedly, which reads up to the next \n character from the buffer, refilling from the OS when the buffer runs out. The result is constant memory usage regardless of file size.
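The same one-line-at-a-time behaviour is available explicitly through .readline(), which returns an empty string at end of file — a small sketch using a temporary file:

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "readline_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\n")

lines = []
with open(path, "r", encoding="utf-8") as f:
    while True:
        line = f.readline()
        if line == "":          # empty string signals end of file
            break
        lines.append(line.rstrip("\n"))

print(lines)  # ['alpha', 'beta']
```

In practice for line in f: is almost always preferable; the explicit loop is useful when you need fine-grained control, such as reading a header line before handing the rest of the file to other code.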
The newline translation you did not ask for
Text mode silently translates newlines. On Windows, Python converts \r\n (CRLF) to \n when reading, and converts \n back to \r\n when writing. On Linux and macOS, no translation happens. You can control this with the newline parameter:
# Read with universal newline translation (default)
with open("file.txt", "r", encoding="utf-8") as f:
content = f.read() # \r\n and \r both become \n
# Disable translation — get bytes exactly as stored on disk
with open("file.txt", "r", encoding="utf-8", newline="") as f:
content = f.read() # \r\n preserved as-is
# This matters for: CSV parsing (csv module sets newline="" internally),
# processing files from mixed-OS sources, and diff-generation tools.
The Python csv module documentation explicitly instructs you to open files with newline="". The reason: CSV fields can legally contain embedded newlines (inside quoted fields), and if Python's universal newline translation fires first, it can corrupt those embedded newlines before the CSV parser sees them. The csv module handles its own newline interpretation.
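A sketch of that csv pattern: writing a quoted field that contains an embedded newline, then reading it back intact (the file name is illustrative):

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "newline_demo.csv")

# newline="" hands all newline interpretation to the csv module,
# as the csv documentation instructs
with open(path, "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "note"])
    writer.writerow(["1", "line one\nline two"])  # embedded newline in a field

with open(path, "r", encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))

# The embedded newline survives the round trip inside the quoted field
print(repr(rows[1][1]))  # 'line one\nline two'
```

Open the same file without newline="" and the universal-newline translation can mangle the embedded newline before the parser ever sees it.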
File descriptors, file objects, and the difference between them
A file descriptor is a small non-negative integer — a kernel-level handle to an open resource. A Python file object wraps a file descriptor with buffering, encoding, and a Python API. You can obtain the file descriptor from any file object with .fileno(), and you can create a Python file object from a raw descriptor with os.fdopen():
import os

with open("notes.txt", "w", encoding="utf-8") as f:
    fd = f.fileno()  # e.g. 3, 4, 5 — always >= 3 (0, 1, 2 are stdin/stdout/stderr)
    print(f"File descriptor: {fd}")

# os.open() returns a raw fd, not a Python file object
fd2 = os.open("notes.txt", os.O_RDONLY)
f2 = os.fdopen(fd2, "r", encoding="utf-8")  # wrap it in a Python file object
try:
    print(f2.read())
finally:
    f2.close()  # closing f2 also closes fd2
Why sys.__stdout__ exists separately from sys.stdout
Python keeps two references: sys.stdout is the current stdout — the one print() uses, which you can replace freely. sys.__stdout__ is the original stdout that was in place when the interpreter started. Replacing sys.stdout never changes sys.__stdout__. This lets you always get back to the original terminal stream, regardless of what other code may have done to sys.stdout:
import sys, io
# Some library replaces sys.stdout
sys.stdout = io.StringIO()
# Your code still needs to reach the terminal
print("This is lost in the StringIO buffer")
# Emergency escape hatch — always points to the original terminal stream
sys.__stdout__.write("This reaches the terminal no matter what\n")
sys.__stderr__ is the original stderr stream, preserved exactly as sys.__stdout__ is preserved for stdout. Both originals survive any number of reassignments to sys.stdout or sys.stderr.
Sources and References
Every technical claim in this article is verifiable against the primary sources listed below. All Python documentation is published by the Python Software Foundation under the PSF License.
- Python 3 Built-in Functions: print() — Python Software Foundation. Confirms the full signature print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False) and that the flush keyword was added in Python 3.3.
- Python 3 Built-in Functions: input() — Python Software Foundation. Confirms that input() always returns a str and that the prompt is written to sys.stdout without a trailing newline.
- Python 3 Built-in Functions: open() — Python Software Foundation. Confirms file modes r, w, a, and x; the platform-dependent default encoding; and the errors parameter.
- Input and Output — Python 3 Tutorial — Python Software Foundation. The official tutorial chapter covering formatted string literals, open(), and the with statement for file handling.
- PEP 20 — The Zen of Python — Tim Peters, Python Software Foundation. Source of the "Explicit is better than implicit" principle cited in the file I/O section.
- PEP 498 — Literal String Interpolation — Eric V. Smith, Python Software Foundation. Specifies the f-string syntax and its introduction in Python 3.6.
- Python sys module — sys.stdin, sys.stdout, sys.stderr — Python Software Foundation. Confirms that these are file-like objects that can be replaced at runtime.
- Python io module — io.StringIO — Python Software Foundation. Confirms the in-memory text stream used for stdout capture in testing.
- Python contextlib — contextlib.redirect_stdout — Python Software Foundation. The safer alternative to manual sys.stdout assignment for capturing output in tests.
- Python io module — TextIOWrapper, BufferedWriter, FileIO — Python Software Foundation. Confirms the three-layer I/O stack, io.DEFAULT_BUFFER_SIZE (8,192 bytes), the buffering parameter values, and that TextIOWrapper is not thread-safe while BufferedWriter is.
- PEP 393 — Flexible String Representation — Martin von Löwis, Python Software Foundation. Defines CPython's variable-width string storage: 1 byte per character for ASCII/Latin-1 (U+0000–U+00FF), 2 bytes for BMP characters (U+0100–U+FFFF), and 4 bytes for non-BMP characters (U+10000+). This determines the RAM impact of loading large text files.
- PEP 686 — Make UTF-8 Mode Default — Inada Naoki, Python Software Foundation. Specifies that Python 3.15 enables UTF-8 as the default text encoding on all platforms, replacing the previous locale-dependent default. The opt-out is PYTHONUTF8=0 or -X utf8=0.
Frequently Asked Questions
What does I/O mean in Python?
I/O stands for input/output. In Python, input refers to any data that enters your program — from the keyboard, a file, or another program. Output refers to any data your program sends out, most commonly text printed to the screen or written to a file.

What does print() do?
print() is Python's built-in function for standard output. It converts its arguments to strings and writes them to the terminal, followed by a newline by default. You can pass multiple values separated by commas, and control the separator and end character using the sep and end keyword arguments.
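A minimal sketch of both keywords:

```python
# sep controls what goes between the values (default is a space)
print("2026", "01", "15", sep="-")

# end controls what is written after the values (default is "\n")
print("Loading", end="... ")
print("done")
```

The first call prints 2026-01-15; the second and third together print Loading... done on a single line.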
What does input() do?
input() is Python's built-in function for reading a line of text from the user at the terminal. It accepts an optional prompt string, displays it to the user, waits for them to press Enter, then returns everything typed as a plain string. You must convert that string to int or float if you need a number.

What are stdin, stdout, and stderr?
stdin (standard input) is the default channel for receiving input, usually the keyboard. stdout (standard output) is the default channel for sending normal output, usually the terminal. stderr (standard error) is a separate output channel reserved for error messages, also displayed in the terminal by default but kept separate so it can be redirected independently.

How do I read a file in Python?
Use the built-in open() function with a file path and the mode 'r' to open a file for reading. Always use a with statement so Python closes the file automatically. Call .read() to get the entire file as a string, or .readlines() to get a list of lines.
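A minimal sketch (poem.txt is an illustrative name; the first block only exists to create something to read):

```python
# Create a small file so the reads below have something to open
with open("poem.txt", "w", encoding="utf-8") as f:
    f.write("Line one\nLine two\n")

# .read(): the whole file as one string
with open("poem.txt", "r", encoding="utf-8") as f:
    content = f.read()

# .readlines(): a list of lines, each keeping its trailing newline
with open("poem.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()

print(content)
print(lines)  # ['Line one\n', 'Line two\n']
```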
How do I write to a file in Python?
Use open() with mode 'w' to create or overwrite a file, or 'a' to append to an existing one. Inside a with block, call .write() and pass the string you want to write. Unlike print(), .write() does not add a newline automatically — you must include \n if you want one.
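A minimal sketch (log.txt is an illustrative name):

```python
with open("log.txt", "w", encoding="utf-8") as f:
    f.write("first entry\n")   # .write() adds no newline by itself,
    f.write("second entry\n")  # so include \n explicitly
```

Without the explicit \n characters, both entries would run together on one line.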
Why does input() return a string instead of a number?
Because Python cannot know ahead of time whether you want a number, a name, or something else. It returns the raw text the user typed and leaves type conversion to you. Use int() to convert to an integer or float() to convert to a floating-point number.

What is the difference between 'w' and 'a' mode?
'w' (write) mode creates the file if it does not exist, and overwrites the entire file if it does. 'a' (append) mode also creates the file if needed, but leaves existing content intact and adds new content at the end.
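A short sketch showing both behaviours (greeting.txt is an illustrative name):

```python
with open("greeting.txt", "w", encoding="utf-8") as f:
    f.write("hello\n")           # file now contains: hello

with open("greeting.txt", "a", encoding="utf-8") as f:
    f.write("world\n")           # 'a' keeps "hello" and adds after it

with open("greeting.txt", "w", encoding="utf-8") as f:
    f.write("fresh start\n")     # 'w' wiped everything written above

with open("greeting.txt", "r", encoding="utf-8") as f:
    result = f.read()

print(result)  # fresh start
```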
What happens if I forget to close a file?
The file stays open, consuming a system resource called a file descriptor. On some systems this causes data not to be flushed to disk until the program ends. Using a with statement eliminates this problem because Python closes the file automatically when the block exits, even if an error occurs.

Can Python read input from sources other than the keyboard?
Yes. Python can read from files, network sockets, databases, pipes from other programs, environment variables, and command-line arguments through sys.argv. The input() function only covers keyboard input at the terminal; all other sources require different approaches.

Why does my output not appear immediately?
Python buffers stdout when output is going to a file or pipe rather than an interactive terminal. Buffered output sits in memory until the buffer fills or the program ends. Force immediate output with print('text', flush=True), by running Python with the -u flag, or by setting the PYTHONUNBUFFERED=1 environment variable.

How do I capture print() output in a test?
Replace sys.stdout with an io.StringIO() object before calling print(), then restore sys.stdout to sys.__stdout__ afterward and read the captured text with buffer.getvalue(). A safer approach is contextlib.redirect_stdout, which restores the original stream automatically even if an exception occurs.
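A minimal sketch of the contextlib approach:

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    print("captured")          # goes into the buffer, not the terminal

captured = buffer.getvalue()   # original stdout is already restored here
print(repr(captured))          # 'captured\n'
```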
Why do I get a UnicodeDecodeError when reading a file?
Python text mode uses a default encoding that varies by operating system and Python version. On most Linux and macOS systems it is UTF-8. On Windows with Python 3.14 and earlier it is the system locale encoding (such as cp1252 or cp932). From Python 3.15 onward, PEP 686 makes UTF-8 the default on all platforms. If the file's actual encoding does not match what Python expects, Python raises a UnicodeDecodeError. The portable fix — correct on all Python versions — is to specify the encoding explicitly: open('file.txt', 'r', encoding='utf-8').

What is the difference between .read(), .readlines(), and .readline()?
.read() loads the entire file into a single string. .readlines() loads the entire file and returns a list where each element is one line, including the newline character at the end. .readline() reads just one line at a time, which makes it useful for very large files that would be impractical to load into memory all at once. After the file ends, .readline() returns an empty string rather than raising an error.
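A short sketch of the .readline() loop and its empty-string end-of-file sentinel (data.txt is an illustrative name):

```python
with open("data.txt", "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\n")

lines = []
with open("data.txt", "r", encoding="utf-8") as f:
    while True:
        line = f.readline()
        if line == "":       # empty string means end of file
            break
        lines.append(line)

print(lines)  # ['alpha\n', 'beta\n']
```

A blank line in the middle of the file comes back as "\n", never as "", so the sentinel check is unambiguous.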
What does the 'x' file mode do?
The 'x' mode (exclusive creation) creates a new file for writing. Unlike 'w', it raises a FileExistsError if a file with that name already exists. This makes it the safe choice when silently overwriting an existing file would be a bug — for example, when generating unique output files in a script.
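A minimal sketch (report.txt is an illustrative name; the cleanup at the top only exists so the demo can be re-run):

```python
import os

# Demo housekeeping: remove any leftover file so the first 'x' open succeeds
if os.path.exists("report.txt"):
    os.remove("report.txt")

with open("report.txt", "x", encoding="utf-8") as f:  # creates the file
    f.write("generated once\n")

try:
    with open("report.txt", "x", encoding="utf-8") as f:
        f.write("this never runs\n")
except FileExistsError:
    print("report.txt already exists; refusing to overwrite")
```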
How do I print to stderr?
Pass file=sys.stderr as a keyword argument to print(): print("Error message", file=sys.stderr). You must import sys first. Sending errors to stderr rather than stdout keeps the two streams separate so they can be redirected independently at the shell level — for example, python script.py > output.txt 2> errors.txt.
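A minimal sketch:

```python
import sys

print("processing complete")                        # normal output -> stdout
print("warning: low disk space", file=sys.stderr)   # diagnostics -> stderr
```

Run interactively both lines appear in the terminal, but a shell redirect of stdout alone leaves the warning visible on screen.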
What is the difference between text mode and binary mode?
Text mode (the default; a plain mode like 'r' is equivalent to 'rt') automatically encodes and decodes data using a character encoding such as UTF-8, and Python translates newline characters according to the platform. Binary mode ('rb', 'wb') bypasses encoding entirely — you work directly with bytes objects. Use binary mode for images, audio, compiled files, or any data that is not plain text. Use text mode for files containing human-readable content.

How do I read a large file line by line?
Iterate directly over the file object inside a with block: for line in f:. Python file objects are iterators — each pass through the loop fetches one line without reading the rest of the file into memory. This is preferable to .readlines() when working with large files, because .readlines() loads all lines into a list at once.

What is sys.argv and how do I use command-line arguments?
sys.argv is a list of strings that Python populates with the command-line arguments passed to your script. sys.argv[0] is always the script's filename; the arguments that follow start at index 1. Running python script.py Alice 30 gives you sys.argv == ['script.py', 'Alice', '30']. All values are strings, so convert them with int() or float() as needed. For more complex argument handling, the standard library's argparse module provides named flags, optional arguments, and automatic help text generation.

What happens if I try to open a file that does not exist?
Python raises a FileNotFoundError (a subclass of OSError) and the program stops unless you handle the exception. Wrap the open() call in a try/except FileNotFoundError block to catch the error gracefully. Avoid checking for file existence with os.path.exists() before opening — a race condition can occur if another process deletes the file between the check and the open call. Catching the exception directly is the correct pattern.
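A minimal sketch (the file name is deliberately one that should not exist):

```python
try:
    with open("no_such_config_file.txt", "r", encoding="utf-8") as f:
        settings = f.read()
except FileNotFoundError:
    settings = ""   # fall back to defaults instead of crashing
    print("No config file found; using defaults")

print(repr(settings))
```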
What is the difference between print() and sys.stdout.write()?
print() is a high-level wrapper around sys.stdout.write(). The key differences: print() automatically converts its arguments to strings, adds a newline at the end by default, and accepts multiple values separated by sep. sys.stdout.write() requires a string argument, adds no newline, and returns the number of characters written. In practice, print() is the right choice for almost everything. Use sys.stdout.write() only when you need exact low-level control over what is sent to the stream — for example, when writing output character by character or when the automatic newline from print() would break a protocol.

What does .writelines() do, and how is it different from .write()?
.writelines() accepts an iterable of strings and writes each one to the file in sequence — it is the write-side counterpart to .readlines(). The important detail: .writelines() does not add newline characters between strings. You must include \n in each string yourself. Use .write() when you have a single string. Use .writelines() when you have a list or generator of strings and want to avoid joining them into one large string in memory first.
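A minimal sketch (words.txt is an illustrative name):

```python
lines = ["alpha\n", "beta\n", "gamma\n"]  # newlines supplied by us

with open("words.txt", "w", encoding="utf-8") as f:
    f.writelines(lines)   # writes the strings back to back, nothing added

with open("words.txt", "r", encoding="utf-8") as f:
    content = f.read()

print(content)
```

Drop the \n characters from the list and the file would contain the single line alphabetagamma.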
Can I write a number directly to a file?
No. The file's .write() method only accepts a string. Passing an integer or float directly raises a TypeError. Convert the value first with str(): f.write(str(42)), or use an f-string: f.write(f"{count}\n"). For structured numeric data, consider the standard library's csv module or the json module, both of which handle type conversion for you automatically.

What do the '+' modes ('r+', 'w+', 'a+') do?
Adding + to a mode string opens a file for both reading and writing. 'r+' opens an existing file for reading and writing without truncating it — the file must already exist, or Python raises FileNotFoundError. 'w+' opens for reading and writing but truncates (empties) the file first, creating it if necessary. 'a+' opens for reading and appending, with the write position fixed at the end. These modes require careful management of the file cursor using .seek(). For beginners, using separate open() calls for reading and writing is usually clearer and avoids cursor confusion.
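A minimal sketch of 'r+' with .seek() (counter.txt is an illustrative name; the replacement text is deliberately the same length as the original, which keeps the overwrite clean):

```python
with open("counter.txt", "w", encoding="utf-8") as f:
    f.write("count=1\n")

with open("counter.txt", "r+", encoding="utf-8") as f:
    original = f.read()   # the cursor is now at the end of the file
    f.seek(0)             # move back to the start before rewriting
    f.write("count=2\n")  # overwrites the old bytes in place

with open("counter.txt", "r", encoding="utf-8") as f:
    updated = f.read()

print(updated)  # count=2
```

Forgetting the .seek(0) here would append the new text after the old line instead of replacing it, which is exactly the cursor confusion the answer above warns about.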
Does input() block my program while it waits?
Yes. input() is synchronous and blocking — the program pauses completely at that line until the user presses Enter. No other code runs while it is waiting. This is fine for simple command-line scripts. If you need to accept user input while also doing other work (updating a UI, running a timer, handling network events), you need a different approach: GUI toolkits like Tkinter provide event-driven input, and Python's asyncio library supports non-blocking I/O for asynchronous programs. For the programs beginners write, synchronous input() covers the vast majority of cases.