In 2001, a physics graduate student named Fernando Pérez was supposed to be finishing his dissertation on lattice QCD at the University of Colorado, Boulder. Instead, he was productively procrastinating. Over a six-week stretch, he wrote a 259-line Python startup script — an enhanced interactive shell he called IPython. That act of thesis-avoidance would eventually become Jupyter Notebook, a tool now used by millions of people and recognized with the 2017 ACM Software System Award. (Source: UC Berkeley CDSS Interview; EarthCube / Project Jupyter History)
It helped produce the first image of a black hole. It powered the LIGO collaboration's gravitational wave analysis. It became the standard environment for data science education worldwide. But none of that explains why people actually use it. Why not just write Python scripts in a text editor? Why not use a full IDE? What is it about the notebook paradigm that resonates so deeply with the way people work with data and code? This article answers that question with specifics — real code, real workflows, real understanding of what makes Jupyter Notebook distinct as a programming environment — and goes further, into the cognitive science, security implications, and reproducibility solutions that no surface-level overview covers.
The Core Idea: Code That Tells a Story
A Jupyter Notebook is a document that interleaves executable code, rich text, mathematical equations, and visual output in a single, ordered sequence of cells. You write code in a cell, run it, see the result immediately below, then write more code or explanation in the next cell.
This sounds simple. It is transformative.
The concept comes from literate programming, a software development methodology introduced by Stanford computer scientist Donald Knuth in 1984 in his paper published in The Computer Journal (Volume 27, Issue 2). Knuth's central thesis was that programs should be written primarily for human comprehension, with executable code woven into a narrative structure. The Unidata Python Training program at UCAR describes the practical consequence of this idea: literate programming prioritizes prose-first exposition, punctuated with executable code blocks. (Source: Oxford Academic, D. E. Knuth, 1984; Unidata Python Training)
Here is what that looks like in practice:
# Cell 1 - Markdown
"""
## Exploring Sales Data
We'll load our Q4 sales data and look for regional patterns.
The dataset contains 50,000 transactions from October-December 2025.
"""
# Cell 2 - Code
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("q4_sales.csv")
print(f"Loaded {len(df):,} rows with columns: {list(df.columns)}")
Loaded 50,000 rows with columns: ['date', 'region', 'product', 'revenue', 'quantity']
# Cell 3 - Code
df.groupby("region")["revenue"].sum().sort_values().plot(kind="barh")
plt.title("Q4 Revenue by Region")
plt.xlabel("Total Revenue ($)")
plt.tight_layout()
plt.show()
Each cell builds on the last. You can see the code that produced every result. You can change a parameter and re-run a cell to see how the output changes. You can add a Markdown cell explaining why you filtered out certain records. The notebook becomes a computational narrative — not just code, but the reasoning behind the code.
This is why Jupyter Notebooks are the dominant format for data science tutorials, academic papers with reproducible analysis, and exploratory data work. The document is the analysis.
The Feedback Loop That Changes How You Think
The single most important feature of Jupyter Notebook is immediate feedback. You write a few lines, press Shift+Enter, and see the result. This changes the programming experience fundamentally.
In a traditional script, you write an entire program (or at least a substantial chunk), run it, wait for output, find an error, go back, fix it, run again. The feedback loop is measured in minutes. In a notebook, the loop is measured in seconds.
# You're exploring unfamiliar data. You don't know what's in it yet.
df.head()
You see the first five rows. There are null values in the "region" column. How many?
df["region"].isna().sum()
347
347 out of 50,000. Less than 1%. You decide to drop them.
df = df.dropna(subset=["region"])
print(f"Remaining rows: {len(df):,}")
Remaining rows: 49,653
Each step takes seconds. You are thinking with the data, not about the mechanics of running code. This rapid iteration cycle is why notebooks feel natural for exploratory work — you are having a conversation with your data, and the notebook is the medium.
In Python for Data Analysis, Wes McKinney — the creator of pandas — described his preferred workflow as IPython combined with a text editor, iteratively testing and debugging each piece of code. That statement captures what notebooks optimize for: a tight cycle of hypothesis, test, and revision. (Source: Wes McKinney, Python for Data Analysis, 3rd Ed.)
The Cognitive Science Behind the Feedback Loop
The notebook's power is not just a convenience feature. It maps to well-established principles in cognitive psychology that explain why certain learning and problem-solving environments produce better outcomes than others.
Cognitive load theory, developed by John Sweller in 1988, holds that working memory is finite — it can process only a limited number of novel information elements at once. When a programming environment forces you to hold an entire script in your head before seeing any output, it maximizes what Sweller calls extraneous cognitive load: mental effort spent on the tool rather than on the problem. Notebooks eliminate this overhead by letting you run one cell at a time, observe the result, and decide what to do next with full visual feedback.
This is also why notebooks excel as teaching tools. The cell-by-cell structure acts as a form of scaffolding — it introduces complexity incrementally, allowing learners to absorb each concept before encountering the next. A well-designed notebook mirrors the principles of active recall: the learner must predict what a cell will produce before running it, strengthening the neural pathways associated with that knowledge. Research from Roediger and Karpicke (2006), published in Psychological Science, found that students who practiced active retrieval retained 80% of material after one week, compared to 34% for those who only re-read the content.
The immediate visual feedback also exploits what psychologists call the generation effect — information you generate (or discover through interaction) is remembered more durably than information you passively receive. Every time you write a line of code in a notebook cell, predict the output, and then see the actual result, you are engaging this effect. That is why reading a data science tutorial in a notebook (where you can modify and re-run cells) produces deeper understanding than reading the same material in a blog post or textbook.
If you use notebooks for learning, try writing your prediction of what each cell will output before running it. This practice engages active recall and the generation effect simultaneously, producing significantly stronger retention than passive reading.
Rich Output: More Than Text
A Python script's output is text. A Jupyter Notebook's output is anything your browser can render.
DataFrames display as formatted HTML tables with aligned columns and proper formatting. Matplotlib and Seaborn plots render inline, directly below the code that created them. LaTeX equations render with MathJax. Interactive widgets let you adjust parameters with sliders. Images, audio, video, and HTML can all appear as cell output.
This matters because data work is inherently visual. You are looking for patterns, outliers, distributions. Seeing a histogram is categorically different from seeing summary statistics printed as text.
# This single line renders a formatted HTML table in a notebook.
# In a script, it would print an ugly truncated string.
df.describe()
Pandas recognized this early. The library implements special _repr_html_() methods on DataFrame and Series objects specifically so that Jupyter renders them as properly formatted tables. This tight integration between Python's scientific libraries and the notebook environment is not accidental — it is a design decision that makes both tools more useful together than either would be alone.
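The same hook is available to any Python object. Here is a minimal sketch (the `Money` class and its formatting are invented for illustration): when an object defines `_repr_html_()`, Jupyter's display system calls it instead of `__repr__` and renders the returned HTML as the cell's output.

```python
class Money:
    """Toy value object. When a cell's last expression is a Money
    instance, Jupyter calls _repr_html_() and renders the HTML;
    plain Python consoles fall back to __repr__."""

    def __init__(self, amount: float, currency: str = "USD"):
        self.amount = amount
        self.currency = currency

    def __repr__(self) -> str:
        return f"Money({self.amount!r}, {self.currency!r})"

    def _repr_html_(self) -> str:
        return f"<b>{self.amount:,.2f}</b>&nbsp;<i>{self.currency}</i>"


m = Money(1234.5)
print(m._repr_html_())  # <b>1,234.50</b>&nbsp;<i>USD</i>
```

This is exactly the mechanism pandas uses: the notebook asks the object how it wants to be displayed, rather than forcing everything through plain text.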
The .ipynb Format: A Document, Not a Script
A Jupyter Notebook is saved as an .ipynb file, which is JSON under the hood. Each cell — code, markdown, or raw — is a JSON object containing the cell's source text, its type, and (for code cells) its outputs, including images encoded as base64 strings.
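Because the format is plain JSON, the structure can be inspected with the standard library alone. A sketch with a hand-built two-cell notebook (real files carry more metadata, and the `nbformat` package is the proper way to read them):

```python
import json

# A hand-written minimal notebook in the nbformat-4 shape described above.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["## Exploring Sales Data"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print('hello')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hello\n"]}
            ],
        },
    ],
}

# Round-trip through JSON exactly as a .ipynb file would store it.
loaded = json.loads(json.dumps(nb))
print([cell["cell_type"] for cell in loaded["cells"]])  # ['markdown', 'code']
```

Note that outputs and execution counts live inside the file alongside the source. That single design decision drives both the benefits (shareable results) and the costs (noisy diffs) discussed next.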
This has practical consequences. On one hand, notebooks are shareable documents. You can email one to a colleague and they can see every result without running anything. GitHub renders .ipynb files directly in the browser. The nbviewer service lets anyone view a notebook as a static webpage. NBConvert can export notebooks to HTML, PDF, LaTeX, Markdown, or executable Python scripts.
On the other hand, the JSON format makes version control painful. A Git diff on a .ipynb file shows changes to execution counts, base64-encoded image data, and metadata that has nothing to do with your actual code. In a 2020 analysis of 10 million notebooks downloaded from GitHub, JetBrains found that 36% had cells executed out of order, meaning the saved notebook could not be reliably reproduced by running cells top to bottom. (Source: JetBrains Datalore Blog, December 2020)
This is the honest trade-off: notebooks are excellent for exploration and communication, but they require discipline to keep reproducible.
Where Jupyter Gets Used (and Where It Doesn't)
The JetBrains Developer Ecosystem Survey 2023 reported that approximately 40% of data science professionals use Jupyter notebooks to present their work results. Among those who use notebooks, nearly half spend between 10% and 20% of their working time in them — indicating that notebooks serve a specific, focused role rather than replacing general-purpose IDEs. The GitHub Octoverse 2024 report found a 92% year-over-year spike in Jupyter Notebook usage on GitHub, correlating with the surge in data science and machine learning projects across the platform. By 2025, the Octoverse reported that Jupyter Notebook usage inside AI-tagged repositories had nearly doubled, with approximately 403,000 repositories using the format. (Source: GitHub Octoverse 2025)
Notebooks dominate in specific workflows. Exploratory data analysis — where you are poking at a dataset, trying transformations, building visualizations — is the canonical use case. Machine learning prototyping is another: you load data, try a model, inspect the metrics, tweak hyperparameters, all in one document. Education is a third: instructors write notebooks that students can run, modify, and experiment with, turning a static textbook into an interactive one.
The LIGO Open Science Center published Jupyter Notebooks that allowed anyone to replicate the final steps of the gravitational wave detection analysis from event GW150914 — the first direct observation of gravitational waves, announced in February 2016. The Event Horizon Telescope collaboration used Python (including libraries like eht-imaging, built on NumPy) in the data processing pipeline that produced the first image of a black hole in April 2019. (Source: LIGO/GWOSC; NumPy Case Study: Black Hole Image)
Fernando Pérez, creator of IPython and co-creator of Jupyter, has explained that by building shareable, interoperable tools, the scientific community accelerates the cycle of discovery. (Source: UC Berkeley CDSS, 2021)
Notebooks are not well-suited for production software engineering. They lack proper debugging tools, make testing difficult, encourage global state, and do not support standard software engineering practices like modular imports, continuous integration, or type checking in any ergonomic way. The community broadly agrees on this boundary: notebooks for exploration and communication, scripts and packages for production.
AI and the Notebook: A Converging Future
The relationship between notebooks and artificial intelligence runs in both directions. AI researchers use notebooks to prototype models. And increasingly, AI tools are being embedded inside notebooks to accelerate the workflows they support.
Google Colab integrates Gemini-powered code suggestions directly into notebook cells. GitHub Copilot works natively inside VS Code's notebook editor. JupyterLab extensions now provide inline LLM assistance for code generation, error explanation, and data interpretation. The notebook's cell-by-cell structure turns out to be an ideal interface for AI-assisted coding: each cell is a discrete, contextually complete unit that an LLM can read, generate, or explain without needing the full project context that traditional IDEs require.
This convergence is not superficial. The 2025 GitHub Octoverse found evidence of a shift from notebook-based prototyping to production deployment: Python's growth accelerated sharply in mid-2025 while notebook growth flattened, a signal that teams were packaging experimental notebook code into production applications. Notebooks are becoming the R&D lab from which production AI systems emerge.
The notebook's cell structure gives AI tools a natural boundary for context. An LLM can generate code for a single cell, explain the output of another, and suggest the next analysis step — all without needing to parse an entire codebase. This is why AI coding assistants work better in notebooks than in many traditional IDE contexts.
The PEP Connection: Python Enhancements That Shape the Notebook Experience
Jupyter Notebook exists outside the Python standard library, but several PEPs have directly influenced how notebooks work with Python.
PEP 484 — Type Hints (Python 3.5, 2015): While notebooks are not typically type-checked, PEP 484 introduced the typing module and function annotation syntax that increasingly appears in notebook code. When you write a helper function inside a notebook, adding type hints makes it self-documenting:
def calculate_roi(investment: float, revenue: float) -> float:
    """Return the ROI as a percentage."""
    return ((revenue - investment) / investment) * 100.0
IDE-integrated notebook environments like VS Code and PyCharm can use these hints to provide autocompletion and error detection even inside .ipynb cells.
PEP 3107 — Function Annotations (2006): The predecessor to PEP 484 that established the syntactic foundation for annotations. Without PEP 3107's def f(x: int) -> str syntax, the type hints that make notebook functions self-documenting would not exist.
PEP 20 — The Zen of Python: The philosophy that "Readability counts" and "There should be one obvious way to do it" deeply shaped the notebook paradigm. Notebooks elevate readability to a first-class concern.
PEP 257 — Docstring Conventions: Notebooks benefit enormously from well-documented functions. When you call help() or use ? to inspect an object in a notebook cell, the docstrings formatted according to PEP 257 are what you see. The notebook's introspection capabilities — pressing Shift+Tab in JupyterLab to see a function's signature and docstring — are powered by the conventions PEP 257 established.
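A short sketch of the mechanism (`quarterly_growth` is an invented example): the text that Shift+Tab or the `?` operator surfaces is simply the function's `__doc__` attribute, written to PEP 257's conventions.

```python
def quarterly_growth(current: float, previous: float) -> float:
    """Return quarter-over-quarter growth as a percentage.

    PEP 257 style: a one-line summary, a blank line, then detail.
    This is the text that `quarterly_growth?` or Shift+Tab displays
    in a notebook.
    """
    return (current - previous) / previous * 100.0


# Introspection reads the docstring attribute directly.
print(quarterly_growth.__doc__.splitlines()[0])
# Return quarter-over-quarter growth as a percentage.
```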
PEP 723 — Inline Script Metadata (accepted 2023): Allows embedding dependency metadata directly in a Python script via a structured comment block. This PEP, now part of the official Python packaging specifications, is directly relevant to notebook reproducibility. When combined with tools like uv, it enables running a script (or a notebook exported to a script) with its dependencies automatically resolved — no separate requirements.txt needed. (Source: PEP 723 — peps.python.org)
Jupyter Notebook vs. JupyterLab vs. Everything Else
The original Jupyter Notebook interface (sometimes called "classic Notebook") provides a simple, linear, single-document experience. You open a notebook, you work in it, you save it. It is lightweight and fast.
JupyterLab, released as stable in 2018, is the next-generation interface. It provides a tabbed, IDE-like environment where you can have multiple notebooks open alongside terminals, text files, CSV viewers, and file browsers. From JupyterLab 4 onward, real-time collaboration is available via the jupyter-collaboration extension, allowing multiple users to edit the same notebook simultaneously.
But the notebook format has also escaped Jupyter entirely. Google Colab runs .ipynb files in the cloud with free GPU access. VS Code has native notebook support that many developers prefer because it integrates notebooks into their existing editor workflow. Amazon SageMaker, Databricks, Kaggle, and DataCamp all provide hosted notebook environments. The .ipynb format has become a lingua franca for interactive computing.
Here is how to set up the classic experience:
# Install Jupyter
pip install jupyter
# Launch the notebook server
jupyter notebook
This opens a browser tab with a file browser. Click "New > Python 3" and you are writing code.
For JupyterLab:
pip install jupyterlab
jupyter lab
Same notebooks, different interface. Your .ipynb files work in either.
Practical Patterns for Better Notebooks
Understanding why people use notebooks is only half the story. Using them well requires specific practices.
Keep cells short and focused. A cell should do one thing. Load data in one cell. Clean it in the next. Visualize in a third. This makes the narrative clear and lets you re-run individual steps. It also reduces cognitive load by keeping each decision unit small enough for working memory to handle.
Run cells top to bottom before sharing. The non-linear execution problem — where Cell 5 depends on a variable defined in Cell 8, which you happened to run first — is the most common source of unreproducible notebooks. Before sharing, restart the kernel and run all cells in order: Kernel > Restart & Run All.
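That check can even be automated. Since execution counts are stored in the .ipynb JSON, a short script can flag the out-of-order pattern (a simplified sketch; it ignores cells that were never run):

```python
def executed_in_order(nb: dict) -> bool:
    """True when the notebook's code cells were each executed once,
    top to bottom (execution counts strictly increasing)."""
    counts = [
        cell.get("execution_count")
        for cell in nb["cells"]
        if cell.get("cell_type") == "code"
    ]
    counts = [c for c in counts if c is not None]  # skip never-run cells
    return counts == sorted(set(counts))


# A notebook whose second code cell was run before the first:
messy = {"cells": [
    {"cell_type": "code", "execution_count": 5},
    {"cell_type": "markdown"},
    {"cell_type": "code", "execution_count": 2},
]}
print(executed_in_order(messy))  # False
```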
Use Markdown cells generously. Explain what you are doing and why. A notebook with only code cells is just a script with extra steps. The prose is what makes a notebook a notebook.
# Cell - Markdown
"""
### Why we exclude Q4 outliers
Three transactions exceed $500K, which is 10x our median.
Manual review confirmed these are data entry errors (duplicate decimal points).
We exclude them to avoid skewing regional averages.
"""
# Cell - Code
df = df[df["revenue"] < 500_000]
Use magic commands for common tasks. Jupyter provides "magic" commands prefixed with % or %% that streamline common workflows:
# Time a single line
%timeit df.groupby("region")["revenue"].sum()
# Time an entire cell
%%timeit
result = df.merge(other_df, on="id")
result.groupby("category").agg({"value": "mean"})
# Display matplotlib plots inline (usually automatic, but explicit is clear)
%matplotlib inline
# Load environment variables
%env API_KEY=your_key_here
# Run a shell command
!pip install seaborn
Export when done. nbconvert turns notebooks into other formats:
# To HTML (self-contained, shareable)
jupyter nbconvert --to html analysis.ipynb
# To Python script (strips markdown, keeps code)
jupyter nbconvert --to script analysis.ipynb
# To PDF (requires LaTeX)
jupyter nbconvert --to pdf analysis.ipynb
Solving the Reproducibility Problem
The 36% non-linear execution rate from JetBrains' analysis is a symptom, not a cause. The real problem is that notebooks store state in a running kernel, and that state is invisible to anyone reading the saved file. Solutions exist, and the best ones go beyond the standard advice of "restart and run all."
nbstripout removes cell outputs and execution counts before committing to Git, ensuring diffs only show meaningful code changes. Install it as a Git filter and it works automatically on every commit.
pip install nbstripout
nbstripout --install
jupytext pairs notebooks with plain-text representations (Markdown or Python scripts). You edit one format and the other stays synchronized. This gives you clean Git diffs while preserving the full notebook experience.
papermill parameterizes and executes notebooks programmatically. You define input parameters at the top of a notebook, then run it from the command line with different parameter values. This turns a notebook into a reproducible, testable pipeline:
papermill analysis.ipynb output.ipynb -p dataset "q4_sales.csv" -p threshold 500000
Containerized environments (Docker, repo2docker) go further by packaging not just the notebook but the entire execution environment — Python version, system libraries, GPU drivers — into a reproducible image. Binder builds on repo2docker to let anyone run a notebook directly from a GitHub repository without installing anything locally.
pixi and uv represent a newer generation of environment management tools. Combined with PEP 723 inline metadata, they can resolve and install dependencies from a single exported script, closing the gap between "works on my machine" and "works everywhere."
A minimum viable reproducibility setup for any notebook project: nbstripout for clean diffs, a requirements.txt or pyproject.toml pinning your dependencies, and a CI step that runs jupyter nbconvert --execute to verify the notebook runs top-to-bottom without error.
The Security Question Nobody Asks
A Jupyter Notebook is, at its core, a web application that executes arbitrary code on the host machine. This design has serious security implications that are rarely discussed in beginner tutorials.
The notebook's architecture consists of a browser-based frontend communicating over HTTP and WebSockets with a backend server that dispatches code to a kernel for execution. In the words of the Jupyter security documentation, this modular architecture provides multiple points where threat actors can intervene. NVIDIA's AI Red Team developed jupysec, a JupyterLab extension that audits Jupyter environments against nearly 100 security rules. Their analysis found common misconfigurations including servers listening on non-localhost interfaces, disabled CSRF checks, and exposed kernel connections.
Specific threats to be aware of include opening untrusted .ipynb files (which can contain JavaScript payloads in outputs), running notebook servers without authentication tokens, and exposing notebooks to the public internet without a reverse proxy and TLS. JupyterHub addresses the multi-user scenario with role-based access, but it must be configured deliberately.
For individual users, these precautions are essential: always use a notebook password or token, do not expose the notebook port to untrusted networks, keep JupyterLab and its extensions updated, and treat any downloaded .ipynb file with the same suspicion you would give a downloaded executable. When deploying notebooks in enterprise or cloud environments, KubeArmor-style kernel-level policies can enforce zero-trust execution constraints.
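A sketch of what the localhost-only precautions look like in a jupyter_server_config.py file (the option names are Jupyter Server traitlets; verify them against your installed version, as configuration keys have moved between releases):

```python
# jupyter_server_config.py -- fragment; `c` is injected by Jupyter Server.
c.ServerApp.ip = "127.0.0.1"             # listen on localhost only
c.ServerApp.allow_remote_access = False  # refuse non-local Host headers
c.ServerApp.open_browser = True
c.ServerApp.disable_check_xsrf = False   # keep CSRF protection enabled
```

These are the defaults in recent releases; the point of writing them down explicitly is that a future config change (or a copy-pasted snippet from a tutorial) cannot silently weaken them.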
The Honest Limitations
Notebooks are not without real problems, and pretending otherwise does not serve anyone.
Hidden state. If you define x = 5 in Cell 3, delete Cell 3, but Cell 7 still references x, the notebook will work until you restart the kernel. Then it breaks. The JetBrains analysis of 10 million GitHub notebooks found that 36% had non-linear execution patterns, making their reproducibility uncertain.
No real debugging. You cannot set breakpoints and step through code the way you can in a proper IDE. JupyterLab has added a visual debugger, but it is limited compared to what PyCharm or VS Code offer for standard Python files.
Difficult to test. Writing unit tests inside a notebook is awkward. There is no clean way to run pytest against notebook code without first extracting it to a module. Tools like testbook provide a workaround by letting you import and test notebook cells as functions, but the experience is not seamless.
Version control friction. The JSON format of .ipynb files produces noisy diffs. Tools like nbstripout and jupytext mitigate this, but they require setup that many teams never do.
Memory management. Each notebook runs in its own kernel, holding all variables in memory until the kernel is shut down. In enterprise environments, orphaned kernels can accumulate and consume significant resources. Long-running notebooks with large DataFrames or model training loops can silently exhaust system memory.
These are not reasons to avoid notebooks. They are reasons to use notebooks for what they are good at — exploration, prototyping, communication, education — and use proper Python modules, packages, and scripts for production code.
Why It Endures
The notebook paradigm has survived and thrived for over a decade because it matches how a particular kind of thinking works. When you are exploring data, you do not know in advance what the final script should look like. You are forming hypotheses, testing them, discarding some, refining others. You need to see intermediate results. You need to explain your reasoning, sometimes to yourself and sometimes to others.
Brian Granger, co-creator of Jupyter, has reflected that he and his collaborators could not have predicted the world's embrace of data science and machine learning — the forces that transformed a small physics computing project into a global standard. (Source: EarthCube / Project Jupyter History)
Machine learning engineer Rick Lamers, quoted on the EarthCube project page, explained the mechanism behind Jupyter's adoption: notebooks reduce the cost of experimentation to near zero by letting users run high-level code in a contextual environment focused on the specific task at hand. When trying something new costs almost nothing, people naturally become more experimental, leading to results that would be difficult to achieve through careful upfront planning alone. (Source: EarthCube / Project Jupyter)
That is the answer to "why Jupyter?" It is not the fastest environment. It is not the most rigorous. It is the one that best matches the iterative, exploratory, narrative-driven nature of working with data and code. It is the environment that aligns with how human cognition handles open-ended problems: one small, visible step at a time, with the freedom to change direction at any moment. And for that particular job, nothing else has come close.