Your classifier says the patient has a 73% chance of diabetes. But how confident is it in that 73%? That question -- the uncertainty about the uncertainty -- is where traditional machine learning goes quiet. Probabilistic modeling answers these questions by treating unknowns as distributions rather than point estimates, and two Python libraries have become the definitive tools for doing this work: pgmpy for Bayesian Networks and PyMC for general-purpose Bayesian modeling.
These are not competing libraries. They solve fundamentally different problems within the probabilistic paradigm. pgmpy gives you the machinery to build, learn, and query graphical models -- directed acyclic graphs where nodes are random variables and edges encode conditional dependencies. PyMC gives you a full probabilistic programming language where you can specify essentially any Bayesian model, fit it using state-of-the-art sampling algorithms, and interrogate the posterior distribution of every parameter. Understanding both is understanding two distinct but complementary ways of thinking about uncertainty.
The Intellectual Foundation: Why Probability Is Not Optional
The Reverend Thomas Bayes published his theorem posthumously in 1763, but it took roughly 250 years for practitioners to have the computational tools to use it routinely. The core idea is deceptively simple: update your beliefs about the world in light of new evidence. The prior distribution captures what you know before seeing data; the likelihood captures how the data relates to your parameters; the posterior distribution is the updated belief.
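The update rule can be seen in miniature with a conjugate Beta-Binomial example (a toy calculation in plain Python, independent of either library):

```python
# Prior: Beta(1, 1) -- a uniform belief about a success probability p
alpha, beta = 1, 1

# Data: 7 successes in 10 trials
successes, trials = 7, 10

# Conjugate update: the posterior is Beta(alpha + successes, beta + failures)
post_alpha = alpha + successes
post_beta = beta + (trials - successes)

# The posterior mean shifts from the prior mean (0.5) toward the data (0.7)
posterior_mean = post_alpha / (post_alpha + post_beta)
print(post_alpha, post_beta, posterior_mean)  # 8 4 0.666...
```

The posterior mean lands between prior and data; with more trials, it would move closer to the observed frequency.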
The reason this matters practically was articulated by the statistician Andrew Gelman and colleagues in their 2020 paper "Bayesian Workflow," posted to arXiv: Bayesian modeling is not a single procedure but an iterative cycle of model building, fitting, checking, and revision. The PyMC project website echoes this perspective when describing its design philosophy. Christopher Fonnesbeck, who started PyMC as a graduate student at the University of Georgia in 2003 and now serves as Principal Quantitative Analyst for the Philadelphia Phillies, has described the current era as the "golden age of probabilistic programming," noting that today's tools make it feasible to build models that would have been computationally intractable just a decade ago. He made this observation during his talk at PyMCon 2020, where he also traced PyMC's origin story back to his frustrations with WinBUGS, a closed-source Bayesian modeling tool that was, according to its own developer Andrew Thomas, "open source only in a read-only sense."
On the graphical models side, Judea Pearl's foundational work on Bayesian Networks -- formalized in his 1988 book Probabilistic Reasoning in Intelligent Systems -- established that complex joint probability distributions over many variables can be factored into local conditional distributions along the edges of a directed graph. This insight is the theoretical backbone of pgmpy.
pgmpy: Comprehensive Bayesian Network Tooling
pgmpy (Probabilistic Graphical Models using Python) was created by Ankur Ankan and is maintained by Ankan and Johannes Textor. The library was formally described in their 2024 paper in the Journal of Machine Learning Research (Vol. 25, No. 265): "pgmpy: A Python Toolkit for Bayesian Networks." Its current stable release is v1.0.0, licensed under MIT, and it has been supported in part by Google Summer of Code and the Python Software Foundation.
The library covers three core workflows: building a Bayesian Network from expert knowledge, learning its structure and parameters from data, and performing inference (both probabilistic and causal) over the resulting model.
Building a Network From Expert Knowledge
The most instructive way to understand pgmpy is to build a model from scratch. Consider a classic scenario: diagnosing whether a patient has a disease based on symptoms and test results.
from pgmpy.models import DiscreteBayesianNetwork
from pgmpy.factors.discrete import TabularCPD
# Define the network structure
model = DiscreteBayesianNetwork([
    ("Disease", "Test_Result"),
    ("Disease", "Symptom"),
    ("Age", "Disease")
])
# Define the conditional probability distributions
cpd_age = TabularCPD(
    variable="Age",
    variable_card=2,
    values=[[0.7], [0.3]],
    state_names={"Age": ["Young", "Old"]}
)
cpd_disease = TabularCPD(
    variable="Disease",
    variable_card=2,
    values=[
        [0.95, 0.70],  # P(Disease=No | Age)
        [0.05, 0.30]   # P(Disease=Yes | Age)
    ],
    evidence=["Age"],
    evidence_card=[2],
    state_names={"Disease": ["No", "Yes"], "Age": ["Young", "Old"]}
)
cpd_test = TabularCPD(
    variable="Test_Result",
    variable_card=2,
    values=[
        [0.95, 0.10],  # P(Test=Negative | Disease)
        [0.05, 0.90]   # P(Test=Positive | Disease)
    ],
    evidence=["Disease"],
    evidence_card=[2],
    state_names={"Test_Result": ["Negative", "Positive"], "Disease": ["No", "Yes"]}
)
cpd_symptom = TabularCPD(
    variable="Symptom",
    variable_card=2,
    values=[
        [0.90, 0.20],  # P(Symptom=Absent | Disease)
        [0.10, 0.80]   # P(Symptom=Present | Disease)
    ],
    evidence=["Disease"],
    evidence_card=[2],
    state_names={"Symptom": ["Absent", "Present"], "Disease": ["No", "Yes"]}
)
model.add_cpds(cpd_age, cpd_disease, cpd_test, cpd_symptom)
assert model.check_model() # Validates CPD consistency
Every number in those TabularCPD objects encodes a domain assumption. The 0.90 for test sensitivity, the 0.05 for false positives -- these come from clinical knowledge, published studies, or expert elicitation. The graph structure itself is an assumption: we are asserting that Age influences Disease, which in turn influences both the Test Result and the Symptom, but that Age does not directly influence the Test Result. If that assumption is wrong, our inferences will be wrong. pgmpy forces you to be explicit about every one of these choices.
Inference: Asking Questions of the Model
With the model built, you can perform exact probabilistic inference using Variable Elimination or Belief Propagation:
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
# What is the probability of disease given a positive test?
posterior = infer.query(
    variables=["Disease"],
    evidence={"Test_Result": "Positive"}
)
print(posterior)
# What if the test is positive AND the symptom is present?
posterior_both = infer.query(
    variables=["Disease"],
    evidence={"Test_Result": "Positive", "Symptom": "Present"}
)
print(posterior_both)
The Variable Elimination algorithm, which pgmpy implements as its default exact inference engine, works by systematically marginalizing out variables from the joint distribution. For the query above, it computes P(Disease | Test_Result=Positive) by summing over all configurations of the unobserved variables (Age, Symptom), weighted by their conditional probabilities. This is Bayes' theorem applied across the entire graph.
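The first query can be checked by hand with Bayes' theorem, using only the numbers from the CPDs defined earlier (plain Python, independent of pgmpy):

```python
# Numbers from the CPDs defined earlier
p_age = {"Young": 0.7, "Old": 0.3}
p_disease_given_age = {"Young": 0.05, "Old": 0.30}   # P(Disease=Yes | Age)
p_pos_given_disease = {"Yes": 0.90, "No": 0.05}      # P(Test=Positive | Disease)

# Marginalize Age out: P(Disease=Yes)
p_yes = sum(p_age[a] * p_disease_given_age[a] for a in p_age)  # 0.125
p_no = 1 - p_yes

# Bayes' theorem: P(Disease=Yes | Test=Positive)
numerator = p_yes * p_pos_given_disease["Yes"]
evidence = numerator + p_no * p_pos_given_disease["No"]
print(round(numerator / evidence, 3))  # 0.72
```

Despite a 90%-sensitive test, the posterior probability of disease is only 0.72, because the 12.5% prevalence keeps the prior low; Variable Elimination performs exactly this marginalize-and-renormalize computation across the whole graph.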
pgmpy also implements Belief Propagation, based on Pearl's 1982 message-passing algorithm. As described in the pgmpy JMLR paper, Belief Propagation splits query computation into two parts -- model calibration and query computation -- and can be substantially faster than Variable Elimination when you need to make multiple queries against the same model, since the calibration step only needs to run once.
Structure Learning: When You Don't Know the Graph
In many real-world situations, you do not know the causal or dependency structure in advance. pgmpy provides three families of structure learning algorithms:
import pandas as pd
import numpy as np
from pgmpy.estimators import PC, HillClimbSearch, MmhcEstimator
# Generate some synthetic data
np.random.seed(42)
n = 2000
age = np.random.choice(["Young", "Old"], n, p=[0.7, 0.3])
disease = np.array([
    np.random.choice(["No", "Yes"], p=[0.95, 0.05]) if a == "Young"
    else np.random.choice(["No", "Yes"], p=[0.70, 0.30])
    for a in age
])
test = np.array([
    np.random.choice(["Negative", "Positive"], p=[0.95, 0.05]) if d == "No"
    else np.random.choice(["Negative", "Positive"], p=[0.10, 0.90])
    for d in disease
])
symptom = np.array([
    np.random.choice(["Absent", "Present"], p=[0.90, 0.10]) if d == "No"
    else np.random.choice(["Absent", "Present"], p=[0.20, 0.80])
    for d in disease
])
data = pd.DataFrame({
    "Age": age, "Disease": disease,
    "Test_Result": test, "Symptom": symptom
})
# Constraint-based: PC algorithm (uses conditional independence tests)
pc = PC(data=data)
pc_dag = pc.estimate(ci_test="chi_square", return_type="dag")
print("PC edges:", pc_dag.edges())
# Score-based: Hill Climb Search (optimizes BIC/BDeu/K2 score)
hc = HillClimbSearch(data=data)
hc_dag = hc.estimate(scoring_method="bicscore")
print("Hill Climb edges:", hc_dag.edges())
# Hybrid: MMHC (constraint-based skeleton + score-based orientation)
mmhc = MmhcEstimator(data=data)
mmhc_dag = mmhc.estimate()
print("MMHC edges:", mmhc_dag.edges())
The PC algorithm (named after its creators Peter Spirtes and Clark Glymour) is constraint-based: it tests pairs of variables for conditional independence and removes edges where independence holds. pgmpy implements three variants -- the Original (Spirtes et al., 1993), Stable (Colombo et al., 2014), and Parallel (Le et al., 2019) -- with five available conditional independence tests including chi-squared, G-test, and partial correlation.
Hill Climb Search is score-based: it starts with an empty (or random) graph and iteratively adds, removes, or reverses edges to maximize a scoring function like the Bayesian Information Criterion (BIC). This approach is greedy and can get stuck in local optima, but tends to produce practically useful structures.
MMHC (Max-Min Hill Climbing) is a hybrid that uses constraint-based methods to learn a skeleton and then applies score-based optimization to orient the edges -- often combining the strengths of both approaches.
Parameter Learning
Once you have a structure (whether specified by hand or learned from data), pgmpy supports both Maximum Likelihood Estimation and Bayesian Estimation for the conditional probability tables:
from pgmpy.estimators import MaximumLikelihoodEstimator, BayesianEstimator
# MLE: relative frequencies from data
model_mle = DiscreteBayesianNetwork(pc_dag.edges())
model_mle.fit(data, estimator=MaximumLikelihoodEstimator)
# Bayesian estimation with Dirichlet priors (BDeu)
model_bayes = DiscreteBayesianNetwork(pc_dag.edges())
model_bayes.fit(
    data,
    estimator=BayesianEstimator,
    prior_type="BDeu",
    equivalent_sample_size=10
)
The MLE approach simply computes relative frequencies from the data. If you observe zero instances of a particular state combination, MLE assigns it zero probability, which can be disastrously overconfident. Bayesian estimation addresses this by adding pseudo-counts from a Dirichlet prior (the equivalent_sample_size parameter controls how much weight the prior receives relative to the data).
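The difference is easy to demonstrate in isolation. The sketch below is plain Python, not pgmpy internals (real BDeu also divides the equivalent sample size across parent configurations); it contrasts MLE with a smoothed estimate for a state that never appears in the data:

```python
# Observed counts for Disease in one parent configuration: 40 "No", 0 "Yes"
counts = {"No": 40, "Yes": 0}
n = sum(counts.values())

# MLE: relative frequencies -- unseen states get probability exactly 0
mle = {s: c / n for s, c in counts.items()}

# BDeu-style smoothing: spread an equivalent sample size of 10
# uniformly over the 2 states as pseudo-counts of 10 / 2 = 5 each
ess, n_states = 10, 2
pseudo = ess / n_states
smoothed = {s: (c + pseudo) / (n + ess) for s, c in counts.items()}

print(mle["Yes"])       # 0.0 -- dangerously overconfident
print(smoothed["Yes"])  # 0.1 -- small but nonzero
```

As the data grows, the pseudo-counts are swamped by real counts and the two estimates converge; the prior only matters where the data is thin.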
Beyond Discrete Variables: Gaussian and Functional Networks
Recent versions of pgmpy have expanded well beyond discrete Bayesian Networks. The library now supports Gaussian Bayesian Networks for continuous variables and, as of version 1.0.0, Functional Bayesian Networks that allow arbitrary mixtures of discrete and continuous variables using PyTorch-backed distributions:
from pgmpy.utils import get_example_model
# Load a pre-built Gaussian Bayesian Network
gaussian_bn = get_example_model("ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=1000)
This expansion reflects the library's evolution from a pure probabilistic graphical models toolkit into what its GitHub page now describes as "a Python library for causal and probabilistic modeling using graphical models."
PyMC: General-Purpose Bayesian Modeling
Where pgmpy gives you structured graphical models with discrete (and now continuous) variables, PyMC gives you something fundamentally more flexible: a full probabilistic programming language embedded in Python. You define a generative model -- a story of how your data came to be, expressed as a sequence of probability distributions and deterministic transformations -- and PyMC automatically derives the posterior distribution of every unknown quantity.
PyMC was started by Chris Fonnesbeck in 2003 and has grown to nearly 400 contributors. The current version is PyMC 5.28.1, built on the PyTensor backend (successor to Theano and Aesara). The foundational reference is Salvatier, Wiecki, and Fonnesbeck's 2016 paper in PeerJ Computer Science, "Probabilistic programming in Python using PyMC3," which has accumulated over 2,100 citations. The updated reference for the current version is Abril-Pla et al.'s 2023 paper in PeerJ Computer Science, "PyMC: A Modern and Comprehensive Probabilistic Programming Framework in Python."
PyMC is a non-profit project under the NumFOCUS umbrella, and its commercial consulting arm, PyMC Labs, is led by Thomas Wiecki (co-creator of PyMC and former quantitative analyst at Quantopian).
A First Model: Bayesian Linear Regression
The best way to understand PyMC's power is to see how it turns a familiar model into something richer:
import pymc as pm
import numpy as np
import arviz as az
# Generate some data with known parameters
np.random.seed(42)
n = 200
true_slope = 2.5
true_intercept = -1.0
true_sigma = 1.5
X = np.random.uniform(0, 10, n)
Y = true_intercept + true_slope * X + np.random.normal(0, true_sigma, n)
# Define the probabilistic model
with pm.Model() as linear_model:
    # Priors for unknown parameters
    intercept = pm.Normal("intercept", mu=0, sigma=10)
    slope = pm.Normal("slope", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=5)
    # Expected value of the outcome
    mu = intercept + slope * X
    # Likelihood: how the data relates to the parameters
    likelihood = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=Y)
    # Sample from the posterior
    trace = pm.sample(2000, tune=1000, cores=4, random_seed=42)

# Examine the results
summary = az.summary(trace, var_names=["intercept", "slope", "sigma"])
print(summary)
Every line inside the with pm.Model() block is doing something that ordinary regression cannot. The pm.Normal("intercept", mu=0, sigma=10) is not assigning a value -- it is declaring that the intercept is a random variable with a prior distribution. The pm.HalfNormal("sigma", sigma=5) constrains the noise parameter to be positive. And pm.sample() does not compute a point estimate; it draws thousands of samples from the joint posterior distribution of all three parameters simultaneously.
The result is not three numbers but three distributions. You get the most probable values, yes, but also credible intervals, correlations between parameters, and a full picture of what the data can and cannot tell you. The az.summary() output includes the posterior mean, standard deviation, and 94% Highest Density Interval for each parameter.
The NUTS Sampler: Why PyMC Works
The engine that makes this feasible is the No-U-Turn Sampler (NUTS), introduced by Hoffman and Gelman in 2014. NUTS is a self-tuning variant of Hamiltonian Monte Carlo (HMC) that uses gradient information from the log-posterior to propose efficient moves through parameter space. As Salvatier, Wiecki, and Fonnesbeck wrote in their 2016 paper, this class of samplers "works well on high dimensional and complex posterior distributions and allows many complex models to be fit without specialized knowledge about fitting algorithms."
PyMC computes the required gradients automatically via its PyTensor backend, which supports compilation to C, JAX, and Numba. This means you write your model in pure Python and PyMC handles the automatic differentiation and compilation transparently.
Hierarchical Models: PyMC's Signature Strength
The model type where PyMC truly shines -- and where frequentist alternatives struggle -- is hierarchical (multilevel) modeling. Suppose you have test scores from students across multiple schools, and you want to estimate both school-level and student-level effects:
import pandas as pd
# Simulated data: 20 schools, varying numbers of students
np.random.seed(42)
n_schools = 20
school_effects = np.random.normal(0, 2, n_schools) # true school effects
school_ids = []
scores = []
for i in range(n_schools):
    n_students = np.random.randint(15, 60)
    school_ids.extend([i] * n_students)
    scores.extend(
        70 + school_effects[i] + np.random.normal(0, 5, n_students)
    )
school_ids = np.array(school_ids)
scores = np.array(scores)
with pm.Model() as hierarchical_model:
    # Hyperpriors: the distribution OF school effects
    mu_school = pm.Normal("mu_school", mu=70, sigma=10)
    sigma_school = pm.HalfNormal("sigma_school", sigma=5)
    # School-level effects (partially pooled)
    school_effect = pm.Normal(
        "school_effect", mu=mu_school, sigma=sigma_school,
        shape=n_schools
    )
    # Student-level noise
    sigma_student = pm.HalfNormal("sigma_student", sigma=10)
    # Likelihood
    y_obs = pm.Normal(
        "y_obs", mu=school_effect[school_ids],
        sigma=sigma_student, observed=scores
    )
    trace = pm.sample(2000, tune=1000, cores=4, random_seed=42)
The key line is pm.Normal("school_effect", mu=mu_school, sigma=sigma_school, shape=n_schools). This says: each school has its own effect, but those effects are drawn from a common distribution whose mean and spread are themselves unknown. PyMC estimates everything jointly. Schools with little data get "shrunk" toward the group mean (partial pooling), while schools with abundant data retain their individual estimates. Frequentist mixed models can approximate this shrinkage, but jointly estimating the group-level mean and spread -- and propagating their uncertainty into every school's estimate -- requires the full Bayesian posterior.
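The shrinkage has a closed form in the simplest setting. If the variances were known, each school's conditional posterior mean would be a precision-weighted average of its sample mean and the group mean -- a toy calculation, not PyMC output:

```python
# Precision-weighted partial pooling for one school with known variances
def shrunk_mean(school_mean, n, sigma_student, group_mean, sigma_school):
    data_precision = n / sigma_student**2      # precision of the school's sample mean
    prior_precision = 1 / sigma_school**2      # precision of the group-level prior
    weight = data_precision / (data_precision + prior_precision)
    return weight * school_mean + (1 - weight) * group_mean

group_mean, sigma_school, sigma_student = 70.0, 2.0, 5.0

# A school observed at 76 with only 2 students is pulled strongly toward 70
print(round(shrunk_mean(76.0, 2, sigma_student, group_mean, sigma_school), 2))   # 71.45

# The same observed mean with 50 students barely moves
print(round(shrunk_mean(76.0, 50, sigma_student, group_mean, sigma_school), 2))  # 75.33
```

In the real model neither the group mean nor either variance is known, which is exactly why PyMC samples them all jointly rather than plugging in point estimates.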
Variational Inference: When MCMC Is Too Slow
For large datasets or complex models where MCMC is computationally prohibitive, PyMC offers Automatic Differentiation Variational Inference (ADVI), including a minibatch variant for scaling to datasets that do not fit in memory:
with linear_model:
    approx = pm.fit(
        method="advi",
        n=30000,
        random_seed=42
    )
    vi_trace = approx.sample(2000)
ADVI approximates the posterior with a simpler distribution (typically a multivariate Gaussian) and optimizes the approximation using gradient descent. It trades accuracy for speed -- the result is more biased than MCMC but can be orders of magnitude faster, making it practical for models with many thousands of parameters.
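The nature of that trade-off can be illustrated without PyMC. Below, a skewed Beta posterior is replaced by a moment-matched Gaussian -- the same family ADVI typically uses. This is a conceptual sketch only (ADVI actually fits its Gaussian by optimizing a KL divergence, not by moment matching):

```python
from math import sqrt

# Target posterior: Beta(8, 4), e.g. after observing 7 successes in 10 trials
a, b = 8, 4
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))

# Moment-matched Gaussian approximation N(mu, sigma^2)
mu, sigma = beta_mean, sqrt(beta_var)
print(round(mu, 3), round(sigma, 3))  # 0.667 0.131

# The approximation is cheap but imperfect: the Gaussian is symmetric and
# puts probability mass outside [0, 1], which the Beta never does.
```

The mismatch here is mild; for strongly skewed or multimodal posteriors it can be severe, which is why ADVI results should be checked against MCMC on a subset of the data when feasible.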
The PyMC Ecosystem
PyMC has become a hub for an expanding ecosystem of specialized libraries. ArviZ (Kumar et al., 2019) provides library-agnostic tools for posterior diagnostics, model comparison, and visualization. Bambi (Capretto et al., 2022) offers a formula-based interface for Bayesian regression that mirrors R's syntax. PyMC-Marketing brings Bayesian methods to marketing mix modeling and customer lifetime value estimation. CausalPy applies Bayesian techniques to causal inference in quasi-experimental settings. And PyMC-BART implements Bayesian Additive Regression Trees as composable model components.
As the 2023 PeerJ Computer Science paper notes, PyMC has served as an "incubator" for these tools, with each library able to focus on its domain while leveraging PyMC's core inference engine.
pgmpy vs. PyMC: Different Tools, Different Questions
The distinction between these libraries maps onto a fundamental distinction in probabilistic modeling.
pgmpy answers structural questions. What variables influence what? What is the probability of X given that we observe Y and Z? If we intervene on variable A, what happens to variable B? The graph is the model. The conditional probability tables (or their continuous equivalents) are the parameters. The inference algorithms exploit the graph structure for efficiency.
PyMC answers parametric questions. What are the plausible values of this coefficient? How uncertain are we about this effect size? What does the posterior predictive distribution look like? The generative model is the model. The unknown quantities are distributions. The inference algorithms (MCMC, VI) sample from or approximate the posterior.
You can use both together. pgmpy can learn a Bayesian Network structure from data, revealing the dependency relationships. PyMC can then be used to build a more nuanced parametric model informed by that structure, with hierarchical priors and flexible functional forms that go beyond conditional probability tables.
# pgmpy discovers the structure
from pgmpy.estimators import HillClimbSearch
hc = HillClimbSearch(data)
learned_dag = hc.estimate(scoring_method="bicscore")
print("Discovered edges:", learned_dag.edges())
# PyMC builds a parametric model using the discovered structure
# (Here, knowing that Age -> Disease -> Test_Result)
# Encode the categorical columns numerically for the regression
data["Age_numeric"] = (data["Age"] == "Old").astype(int)
data["Disease_binary"] = (data["Disease"] == "Yes").astype(int)

with pm.Model() as informed_model:
    beta_age = pm.Normal("beta_age", mu=0, sigma=2)
    intercept = pm.Normal("intercept", mu=0, sigma=5)
    # Logistic regression for Disease, informed by pgmpy's structure
    p_disease = pm.math.invlogit(intercept + beta_age * data["Age_numeric"].values)
    disease_obs = pm.Bernoulli("disease", p=p_disease, observed=data["Disease_binary"].values)
    trace = pm.sample(2000, tune=1000, cores=4)
| Dimension | pgmpy | PyMC |
|---|---|---|
| Primary paradigm | Graphical models (DAGs, Bayesian Networks) | Probabilistic programming (generative models) |
| Core question | What depends on what? P(X \| Y=y)? | What are the plausible parameter values? |
| Inference methods | Variable Elimination, Belief Propagation | NUTS (MCMC), ADVI (variational) |
| Structure learning | PC, Hill Climb, MMHC algorithms | Not applicable (user specifies the model) |
| Causal inference | Built-in do-calculus, counterfactuals | Via CausalPy ecosystem library |
| Variable types | Discrete, Gaussian, Functional (PyTorch) | Any continuous or discrete distribution |
| Backend | NumPy, PyTorch (for Functional BNs) | PyTensor (C, JAX, Numba compilation) |
| Current version | 1.0.0 | 5.28.1 |
Related PEPs and the Python Ecosystem
Several Python Enhancement Proposals have shaped the foundation these libraries build on.
PEP 450 (Final, Python 3.4) added the statistics module to the standard library. While its mean, median, and variance functions are primitive compared to Bayesian posteriors, PEP 450 established that statistical computation belongs in Python's core. Its author argued that requiring third-party packages for basic statistics was an unreasonable barrier -- a philosophy of accessibility that both pgmpy and PyMC have embraced in their API designs.
PEP 484 (Final, Python 3.5) introduced type hints. PyMC leverages type annotations extensively through PyTensor, where tensor types carry shape and dtype information that enables compile-time verification of model specifications. pgmpy uses type hints across its API to clarify expected inputs for functions like TabularCPD and VariableElimination.query().
PEP 3118 (Final, Python 3.0) defined the buffer protocol that enables zero-copy data sharing between NumPy arrays and C extensions. This is critical infrastructure for both libraries: pgmpy's inference algorithms operate on NumPy arrays internally, and PyMC's PyTensor backend compiles mathematical expressions into optimized C or machine code that communicates with Python through the buffer protocol.
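The zero-copy behavior PEP 3118 enables can be demonstrated in a few lines (a minimal illustration of the protocol itself, not of either library's internals):

```python
import numpy as np

arr = np.arange(5, dtype=np.float64)
view = memoryview(arr)  # shares the array's buffer via PEP 3118 -- no copy

arr[0] = 99.0           # mutate through the array...
print(view[0])          # ...and the view sees the change: 99.0
```

This is the same mechanism that lets compiled inference code read and write model data in place instead of serializing it across the Python/C boundary.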
PEP 517/518 (Final, Python 3.7+) modernized Python packaging with pyproject.toml. PyMC adopted this standard as part of its migration from Theano to Aesara to PyTensor, ensuring clean dependency management across a complex build chain that includes optional JAX and Numba backends.
PEP 734 (Python 3.14, October 2025) introduced multiple sub-interpreters for concurrent computation. Both libraries stand to benefit: pgmpy's parallel PC algorithm variant already targets multi-core execution, and PyMC's multi-chain MCMC sampling (the cores=4 parameter in pm.sample()) could potentially leverage sub-interpreters for true parallel execution without the GIL constraints that currently require multiprocessing.
PEP 784 (Python 3.14) added Zstandard compression to the standard library. While not directly used by either library today, efficient compression is relevant for serializing large Bayesian Network models and MCMC trace files -- pgmpy models with hundreds of variables produce substantial CPD tables, and PyMC traces with thousands of samples across many parameters can consume gigabytes of storage.
When to Reach for Each Library
Reach for pgmpy when you need to model and query a system of discrete (or Gaussian) random variables with known or discoverable dependency structure. Medical diagnosis systems, fault detection in industrial processes, risk assessment networks, and any domain where expert knowledge can be encoded as "A causes B, B causes C" are pgmpy's sweet spot. If you need to learn the structure itself from data -- discovering which variables depend on which -- pgmpy's PC, Hill Climb, and MMHC algorithms are purpose-built for this task. pgmpy also handles causal inference directly, implementing do-calculus for interventional queries and counterfactual reasoning.
Reach for PyMC when you need to estimate continuous parameters with full uncertainty quantification. Regression with credible intervals, hierarchical models with partial pooling, time series with latent states, Gaussian processes for nonparametric regression, mixture models, survival analysis -- PyMC handles all of these through a unified interface. If your question is "what are the plausible values of this parameter, and how confident should I be?", PyMC is your tool. Its ecosystem extensions (PyMC-Marketing, CausalPy, Bambi) also make it the natural choice for domain-specific Bayesian applications in marketing science, causal inference, and accessible model specification.
Use both when your problem has both structural and parametric uncertainty. pgmpy can discover or validate the dependency structure; PyMC can then fit a rich parametric model conditioned on that structure. This combination is particularly powerful in domains like epidemiology, where the causal graph is partially known and the effect sizes need careful estimation with uncertainty.
Key Takeaways
- pgmpy is for structural questions: Build, learn, and query Bayesian Networks where the graph encodes conditional dependencies. Its PC, Hill Climb, and MMHC algorithms can discover structure from data when expert knowledge is unavailable.
- PyMC is for parametric questions: Specify any generative Bayesian model in Python and get full posterior distributions via NUTS sampling or ADVI. Hierarchical models with partial pooling are its signature strength.
- They are complementary, not competing: Use pgmpy to discover dependency structure, then use PyMC to fit rich parametric models informed by that structure. Together, they cover both structural and parametric uncertainty.
- Uncertainty quantification matters: A Bayesian posterior tells you not just the best guess but how much you should trust it. Both libraries make this kind of honest uncertainty practical and accessible in Python.
Both pgmpy and PyMC embody a philosophical commitment that runs counter to the dominant culture of machine learning: the commitment to saying what you do not know. A neural network's softmax output of 0.73 tells you nothing about the model's epistemic state. A Bayesian posterior of Beta(73, 27) tells you the model's best guess is 0.73, but that it has only seen about 100 relevant observations and the true probability could plausibly range from 0.64 to 0.81. The data science industry spent the last decade optimizing for prediction accuracy. These libraries are part of a broader shift toward optimizing for honest uncertainty -- and in a world where decisions have consequences, honesty about what you do not know is not a luxury. It is the entire point.
All code examples in this article were tested against pgmpy 1.0.0 and PyMC 5.27.1 running on Python 3.11. ArviZ 0.19+ is recommended for posterior diagnostics with PyMC.