Python powers everything from small automation scripts to global financial systems — and that reach makes security non-negotiable. GitGuardian detected nearly 23.8 million new hardcoded secrets in public GitHub repositories in 2024, a 25% year-over-year increase. The OWASP Top 10 added two entirely new risk categories in its 2025 edition. NIST rewrote its password guidelines to ban composition rules that had been industry standard for a decade. This guide walks through the secure coding practices that matter now, with copy-paste-ready code examples that reflect how real production systems should be built — and how the standards that govern them have changed.
Security vulnerabilities in Python applications rarely come from exotic attack techniques. They come from predictable mistakes: passwords stored in plaintext, secrets committed to version control, user input passed directly to a database query, or an outdated dependency with a known CVE. The good news is that Python's standard library and ecosystem offer excellent tools to fix every one of these problems. The challenge is knowing which tool to use, and how to use it correctly.
This article focuses on practical, implementable patterns. Each section addresses a specific risk category, explains why it matters, and provides working code you can adapt immediately.
Think Like an Attacker
Before writing a single line of defensive code, you need a mental model for how attackers actually approach your application. Security is not a checklist — it is a way of reasoning about trust, data flow, and failure conditions. The techniques in this guide are far more effective when applied through the lens of structured threat analysis rather than as isolated fixes.
The STRIDE framework, developed at Microsoft and widely adopted in application security, classifies threats into six categories: Spoofing (impersonating something or someone), Tampering (modifying data or code), Repudiation (denying actions without accountability), Information Disclosure (exposing data to unauthorized parties), Denial of Service (making a system unavailable), and Elevation of Privilege (gaining unauthorized access levels). Every section in this guide addresses at least one STRIDE category, and understanding which one changes how you prioritize and implement the fix.
Consider a practical example: a user registration endpoint. What are the trust boundaries? The user supplies a username, email, password, and age. You trust none of it. The username could contain SQL injection payloads (Tampering). The password could be "password123" (enabling Spoofing via credential stuffing). The email could be fabricated (Spoofing again). The age could be -1 or 999 (Tampering). If your error messages differ between "invalid username" and "invalid password," you have enabled Information Disclosure by allowing account enumeration. If the endpoint has no rate limit, you have enabled Denial of Service. If an exception in the registration flow silently grants a partial session, you have enabled Elevation of Privilege.
Every code example that follows in this guide was written with this kind of reasoning behind it. The question is never just "does this code work?" — it is "what happens to this code when someone actively tries to make it fail in a useful way?" Train yourself to ask three questions about every function you write that touches external data: What am I trusting here? What is the worst thing that happens if that trust is violated? And does my code fail safely when it does?
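To make the enumeration and fail-safe points concrete, here is a minimal sketch of a login check that returns one generic error for every failure mode and performs the same hash comparison whether or not the user exists, so neither messages nor timing reveal which part failed. The `_USERS` store is a hypothetical stand-in, and the scrypt call is illustrative only; a real system would use bcrypt or Argon2 as covered later in this guide.

```python
import hashlib
import hmac
import secrets

_SALT = secrets.token_bytes(16)

def _hash(pw: str) -> bytes:
    # Illustrative only — use bcrypt or Argon2 in real systems
    return hashlib.scrypt(pw.encode(), salt=_SALT, n=2**14, r=8, p=1)

# Hypothetical user store: username -> password hash
_USERS = {"alice": _hash("plume-gravel-otter-sunrise-42")}

# Dummy hash compared when the user does not exist, so the
# "unknown user" path costs the same as the "wrong password" path
_DUMMY_HASH = _hash(secrets.token_hex(16))

def login(username: str, password: str) -> bool:
    stored = _USERS.get(username, _DUMMY_HASH)
    ok = hmac.compare_digest(stored, _hash(password))
    if username not in _USERS:
        ok = False  # the comparison above still ran at full cost
    if not ok:
        # One generic message for every failure: no account enumeration
        raise ValueError("Invalid username or password.")
    return True
```

The single error string and the always-executed comparison are the point: an attacker probing the endpoint cannot distinguish "no such user" from "wrong password" by message content or response time.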
Security is a process, not a product. The goal is not to be unhackable, but to raise the cost of attack above the value of the target. — Bruce Schneier, security researcher and author (Paraphrased from "Secrets and Lies," 2000)
Managing Secrets and Environment Variables
Hardcoded credentials are among the more commonly exploited vulnerabilities in real-world applications. The Open Web Application Security Project (OWASP) lists "Security Misconfiguration" among its Top 10 risks year after year, and exposed secrets consistently appear in that category. According to GitGuardian's 2025 State of Secrets Sprawl report, nearly 23.8 million new hardcoded secrets were detected in public GitHub repositories in 2024 alone — a 25% increase over the previous year. Critically, 70% of secrets that leaked in 2022 remain active today, meaning developers often rotate credentials far too slowly after a known exposure event.
Attackers do not need sophisticated tools to exploit leaked credentials. A single exposed secret can grant unrestricted access to critical systems. — Eric Fourrier, CEO of GitGuardian (Paraphrased from the 2025 State of Secrets Sprawl report announcement)
The OWASP Secrets Management Cheat Sheet is clear: secrets must never exist in source code. They belong in environment variables, dedicated secret managers, or encrypted vaults. — OWASP Secrets Management Cheat Sheet (Paraphrased)
The correct approach is to keep all secrets outside your codebase entirely, loading them at runtime from environment variables or a secrets management service. Python's python-dotenv package makes this straightforward for local development, while production systems should use services like AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault.
# requirements: pip install python-dotenv
import os
from dotenv import load_dotenv

# Load variables from a .env file (local development only)
# Your .env file should NEVER be committed to version control
load_dotenv()

DATABASE_URL = os.environ.get("DATABASE_URL")
SECRET_KEY = os.environ.get("SECRET_KEY")
API_KEY = os.environ.get("THIRD_PARTY_API_KEY")

# Fail loudly and early if a required secret is missing
if not DATABASE_URL:
    raise EnvironmentError(
        "DATABASE_URL environment variable is not set. "
        "Check your .env file or deployment configuration."
    )
if not SECRET_KEY:
    raise EnvironmentError("SECRET_KEY environment variable is not set.")
Add .env to your .gitignore file immediately when creating a project. Once a secret is pushed to a public repository, you must treat it as compromised and rotate it, even if you delete the commit later. Git history is preserved and scrapeable by automated bots within seconds of a push.
For production deployments, use a proper secrets manager rather than flat environment variable files. Here is a pattern for AWS Secrets Manager using boto3:
# requirements: pip install boto3
import json
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name: str, region_name: str = "us-east-1") -> dict:
    """
    Retrieve a secret from AWS Secrets Manager.
    Returns the secret as a dictionary.
    Raises an exception if the secret cannot be retrieved.
    """
    client = boto3.session.Session().client(
        service_name="secretsmanager",
        region_name=region_name
    )
    try:
        response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        error_code = e.response["Error"]["Code"]
        if error_code == "ResourceNotFoundException":
            raise ValueError(f"Secret '{secret_name}' not found.") from e
        elif error_code == "AccessDeniedException":
            raise PermissionError(f"No permission to access secret '{secret_name}'.") from e
        else:
            raise
    secret_string = response.get("SecretString")
    if secret_string:
        return json.loads(secret_string)
    raise ValueError(f"Secret '{secret_name}' has no string value.")

# Usage
db_credentials = get_secret("prod/myapp/database")
db_host = db_credentials["host"]
db_password = db_credentials["password"]
Password Hashing with bcrypt and Argon2
Storing passwords in plaintext is catastrophic. Storing them with MD5 or SHA-1 is only marginally better. These hashing algorithms are designed for speed, which means attackers can run billions of hash attempts per second using commodity hardware. The correct approach is to use a slow, purpose-built password hashing algorithm: bcrypt, scrypt, or Argon2.
NIST SP 800-63B-4, the U.S. government's revised digital identity guidelines (finalized July 2025), recommends using memory-hard hashing functions precisely because they make brute-force and dictionary attacks computationally expensive. Argon2 won the Password Hashing Competition in 2015, has been standardized as RFC 9106, and is now the first-choice recommendation for new applications. NIST explicitly identifies Argon2id as the preferred variant. The updated guidance also raises the minimum supported password length to 15 characters for single-factor authentication accounts and to 8 characters when paired with multi-factor authentication — a shift away from complexity rules and toward length and screening against known-breached credential lists. Crucially, the revision now mandates that verifiers shall not impose composition rules such as requiring a mix of character types — a reversal of years of industry practice that research has shown produces predictable, weaker passwords.
NIST SP 800-63B-4 mandates that stored passwords use salted, one-way key derivation functions resistant to offline attacks. — NIST SP 800-63B-4, Section 3.1.1 (Paraphrased)
Using bcrypt
# requirements: pip install bcrypt
import bcrypt

def hash_password(plaintext_password: str) -> bytes:
    """
    Hash a password using bcrypt with an automatically generated salt.
    The work factor (rounds) defaults to 12. Increase for higher security
    at the cost of longer hashing time. Never go below 10 in production.
    """
    password_bytes = plaintext_password.encode("utf-8")
    salt = bcrypt.gensalt(rounds=12)
    hashed = bcrypt.hashpw(password_bytes, salt)
    return hashed

def verify_password(plaintext_password: str, hashed_password: bytes) -> bool:
    """
    Verify a plaintext password against a stored bcrypt hash.
    Uses a constant-time comparison to prevent timing attacks.
    """
    password_bytes = plaintext_password.encode("utf-8")
    return bcrypt.checkpw(password_bytes, hashed_password)

# Example usage
stored_hash = hash_password("correct-horse-battery-staple")

# Correct password
print(verify_password("correct-horse-battery-staple", stored_hash))  # True

# Wrong password
print(verify_password("wrongpassword123", stored_hash))  # False
Using Argon2 (recommended for new projects)
# requirements: pip install argon2-cffi
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError, VerificationError, InvalidHashError

# Configure Argon2 with memory-hard parameters
# time_cost: number of iterations
# memory_cost: in kibibytes (64 MB here)
# parallelism: number of parallel threads
ph = PasswordHasher(
    time_cost=3,
    memory_cost=65536,  # 64 MB
    parallelism=2,
    hash_len=32,
    salt_len=16
)

def hash_password_argon2(plaintext_password: str) -> str:
    """Hash a password using Argon2id. Returns a string containing
    the algorithm parameters and the hash — safe to store directly."""
    return ph.hash(plaintext_password)

def verify_password_argon2(stored_hash: str, plaintext_password: str) -> bool:
    """
    Verify a password. Returns True on success.
    Returns False on mismatch. Raises on malformed hash.
    """
    try:
        return ph.verify(stored_hash, plaintext_password)
    except VerifyMismatchError:
        return False
    except (VerificationError, InvalidHashError) as e:
        raise ValueError(f"Hash verification error: {e}") from e

def needs_rehash(stored_hash: str) -> bool:
    """
    Check if a hash was generated with outdated parameters.
    Rehash on next successful login if this returns True.
    """
    return ph.check_needs_rehash(stored_hash)

# Example usage
h = hash_password_argon2("correct-horse-battery-staple")
print(verify_password_argon2(h, "correct-horse-battery-staple"))  # True
print(verify_password_argon2(h, "wrongpassword"))  # False
Implement a rehash-on-login strategy. When a user logs in successfully, check whether their stored hash was created with older parameters (lower work factor or a deprecated algorithm). If so, rehash their password immediately with current parameters and update the stored value. This lets you migrate security parameters without requiring a forced password reset.
Input Validation and Sanitization
Never trust data that originates from outside your application. This includes form fields, query parameters, API payloads, file uploads, HTTP headers, and environment variables read from untrusted sources. Input validation is the single highest-impact defensive measure you can implement.
Python's pydantic library is the industry standard for data validation in modern Python applications. It uses Python type annotations to define schemas and validates data automatically, with clear, structured error reporting.
# requirements: pip install "pydantic[email]" httpx
from pydantic import BaseModel, EmailStr, field_validator, ValidationError
from typing import Optional
import re
import hashlib
import httpx

class UserRegistrationInput(BaseModel):
    username: str
    email: EmailStr
    password: str
    age: Optional[int] = None

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        v = v.strip()
        if not re.match(r"^[a-zA-Z0-9_]{3,32}$", v):
            raise ValueError(
                "Username must be 3-32 characters and contain only "
                "letters, numbers, and underscores."
            )
        return v

    @field_validator("password")
    @classmethod
    def password_policy(cls, v: str) -> str:
        """
        NIST SP 800-63B-4 (July 2025) password policy:
        - Minimum 15 characters for single-factor authentication
          (8 characters when paired with MFA)
        - SHALL NOT impose composition rules (no forced uppercase,
          digits, or special characters)
        - SHALL support at least 64 characters
        - SHALL screen against known-breached password lists
        - SHALL allow all printable ASCII, Unicode, and spaces
        """
        if len(v) < 15:
            raise ValueError(
                "Password must be at least 15 characters. "
                "NIST recommends length over complexity."
            )
        if len(v) > 64:
            raise ValueError("Password must not exceed 64 characters.")
        # Screen against known-breached passwords via the
        # Have I Been Pwned k-anonymity API (no full hash sent)
        if _is_breached_password(v):
            raise ValueError(
                "This password has appeared in a known data breach. "
                "Please choose a different password."
            )
        return v

    @field_validator("age")
    @classmethod
    def age_must_be_reasonable(cls, v: Optional[int]) -> Optional[int]:
        if v is not None and (v < 13 or v > 120):
            raise ValueError("Age must be between 13 and 120.")
        return v

def _is_breached_password(password: str) -> bool:
    """
    Check if a password appears in the Have I Been Pwned database
    using k-anonymity: only the first 5 characters of the SHA-1 hash
    are sent, so the full password is never exposed to the API.
    Returns True if the password has been seen in a breach.
    """
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    try:
        resp = httpx.get(
            f"https://api.pwnedpasswords.com/range/{prefix}",
            timeout=3.0
        )
        return suffix in resp.text
    except httpx.HTTPError:
        # Fail open on API error — do not block registration
        # if the breach-check service is unreachable.
        # Log this for monitoring.
        return False

# Valid input
# (Note the passphrase is deliberately NOT "correct-horse-battery-staple":
# that famous example appears in breach lists and would fail the HIBP check.)
try:
    user = UserRegistrationInput(
        username="alice_dev",
        email="[email protected]",
        password="plume-gravel-otter-sunrise-42",
        age=28
    )
    print(f"Valid user: {user.username}, {user.email}")
except ValidationError as e:
    print(f"Validation failed: {e}")

# Invalid input — triggers multiple validation errors
try:
    bad_user = UserRegistrationInput(
        username="a",
        email="not-an-email",
        password="weak",
        age=5
    )
except ValidationError as e:
    for error in e.errors():
        print(f"Field '{error['loc'][0]}': {error['msg']}")
Validation and sanitization are different things. Validation rejects bad input. Sanitization transforms input into a safe form. You should validate first, and sanitize only when you have a legitimate reason to accept and clean imperfect input. For user-facing fields like names, strip whitespace and normalize Unicode. For HTML content, use the nh3 library (a Rust-backed HTML sanitizer) to allow only a safe subset of tags. The previously common bleach library has been deprecated since January 2023 and should not be used in new projects.
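For the name-field case, sanitization (as opposed to validation) can be sketched with the standard library alone. NFKC normalization collapses visually equivalent Unicode forms, and control characters, which have no business in a display name, are stripped. The function name here is a hypothetical helper, not a library API.

```python
import unicodedata

def sanitize_display_name(raw: str) -> str:
    # Normalize Unicode so visually identical strings compare equal
    # (e.g. the "fi" ligature becomes the two letters f and i)
    name = unicodedata.normalize("NFKC", raw.strip())
    # Drop control characters (Unicode category "Cc")
    name = "".join(ch for ch in name if unicodedata.category(ch) != "Cc")
    # Collapse internal runs of whitespace to single spaces
    return " ".join(name.split())
```

This keeps legitimate international names intact while removing the characters attackers use for homoglyph tricks and log injection.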
Notice the password validator above does not enforce composition rules like "must contain an uppercase letter and a symbol." This is deliberate. NIST SP 800-63B-4 (July 2025) now explicitly states that verifiers shall not impose arbitrary composition requirements. Research consistently shows these rules lead to predictable patterns — users append "1!" to satisfy requirements, which provides negligible additional entropy. Instead, enforce a minimum length of 15 characters for single-factor accounts (8 when MFA is in use), screen every new password against known-breached credential lists using a service like the Have I Been Pwned k-anonymity API, and encourage natural-language passphrases. This shift in guidance represents one of the more impactful practical changes in password security in the last decade.
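Since the guidance favors passphrases, generating one server-side is a one-liner with the secrets module. The twelve-word list below is a hypothetical placeholder; a real deployment would draw from a large list such as the EFF diceware list (7,776 words).

```python
import secrets

# Hypothetical mini wordlist for illustration only — use a large
# published wordlist in practice
WORDS = [
    "otter", "plume", "gravel", "sunrise", "maple", "quartz",
    "harbor", "tundra", "velvet", "cinder", "meadow", "falcon",
]

def generate_passphrase(n_words: int = 5, sep: str = "-") -> str:
    # secrets.choice draws from a CSPRNG, unlike random.choice
    return sep.join(secrets.choice(WORDS) for _ in range(n_words))
```

With a 7,776-word list, five words give roughly 64 bits of entropy, which comfortably exceeds what typical "complex" 8-character passwords provide.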
Preventing SQL Injection
SQL injection has appeared in every edition of the OWASP Top 10 since its inception. In the 2025 edition (announced November 2025), Injection is ranked A05 and carries the greatest number of CVEs of any category — more than 14,000 for SQL injection alone. The persistence of this vulnerability is a testament not to its complexity, but to the ease of making the mistake. The fix is always the same: use parameterized queries. Never build SQL from untrusted input.
# DANGEROUS — Never do this
username = request.args.get("username")
query = f"SELECT * FROM users WHERE username = '{username}'"
# An attacker passes: ' OR '1'='1
# Query becomes: SELECT * FROM users WHERE username = '' OR '1'='1'
# This returns every row in the users table.

# SAFE — Parameterized queries with sqlite3 (standard library)
import sqlite3
from typing import Optional

def get_user_by_username(db_path: str, username: str) -> Optional[dict]:
    """
    Safely fetch a user by username using a parameterized query.
    The database driver handles escaping; injection is not possible.
    """
    with sqlite3.connect(db_path) as conn:
        conn.row_factory = sqlite3.Row
        cursor = conn.cursor()
        cursor.execute(
            "SELECT id, username, email, created_at FROM users WHERE username = ?",
            (username,)  # Parameters are always passed as a tuple
        )
        row = cursor.fetchone()
        return dict(row) if row else None

def create_user(db_path: str, username: str, email: str, password_hash: str) -> int:
    """
    Safely insert a new user. Returns the new row ID.
    """
    with sqlite3.connect(db_path) as conn:
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO users (username, email, password_hash) VALUES (?, ?, ?)",
            (username, email, password_hash)
        )
        conn.commit()
        return cursor.lastrowid

# SAFE — SQLAlchemy ORM (recommended for larger applications)
# requirements: pip install sqlalchemy
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb")

def get_user_by_email(session: Session, email: str) -> Optional[dict]:
    """
    Parameterized query using SQLAlchemy's text() with bound parameters.
    SQLAlchemy's ORM query API is also safe and preferred over raw SQL.
    """
    result = session.execute(
        text("SELECT id, username, email FROM users WHERE email = :email"),
        {"email": email}
    )
    row = result.fetchone()
    return row._asdict() if row else None
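One gap parameterization cannot cover: placeholders bind values, not identifiers. A user-supplied sort column or table name can never be passed as a `?` parameter, so the safe pattern is to map the input onto a fixed allowlist. A sketch with sqlite3, assuming a hypothetical users table:

```python
import sqlite3

# Identifiers cannot be bound as parameters, so user-supplied sort
# fields are mapped onto a fixed allowlist — never interpolated raw
SORTABLE_COLUMNS = {"username": "username", "created": "created_at"}

def list_users_sorted(conn: sqlite3.Connection, sort_key: str) -> list[dict]:
    column = SORTABLE_COLUMNS.get(sort_key)
    if column is None:
        raise ValueError(f"Unsupported sort key: {sort_key!r}")
    conn.row_factory = sqlite3.Row
    # Safe: `column` can only ever be one of our own literals
    rows = conn.execute(
        f"SELECT id, username FROM users ORDER BY {column}"
    ).fetchall()
    return [dict(r) for r in rows]
```

Any input not in the mapping is rejected outright, so even a payload like `1; DROP TABLE users` never reaches the SQL string.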
Deserialization and Object Injection
Insecure deserialization is one of the more quietly dangerous vulnerabilities in Python applications. Python's built-in pickle module can execute arbitrary code during deserialization — any object that defines a __reduce__ method controls what happens when it is unpickled. This means that loading a pickle file from an untrusted source is functionally identical to running exec() on attacker-controlled input. The OWASP 2025 Top 10 places this under A08: Software or Data Integrity Failures.
This is not a theoretical concern. Pickle-based remote code execution has been exploited in real-world attacks against machine learning pipelines, cached session stores, and inter-service message queues. If any of your systems accept serialized Python objects from external sources — including model files from public repositories, cached data from shared storage, or messages from untrusted queues — they are vulnerable.
# DANGEROUS — pickle can execute arbitrary code on load
import pickle

# An attacker could craft a pickle payload that runs os.system("rm -rf /")
# Never unpickle data from untrusted or unverified sources.
data = pickle.loads(untrusted_bytes)  # Remote code execution risk

# SAFE — use JSON for data interchange; use HMAC to verify integrity
# when you must use pickle (e.g., internal caching with trusted data)
import json
import hmac
import hashlib

SIGNING_KEY = b"your-secret-signing-key"  # Load from secrets manager

def serialize_signed(data: dict) -> tuple[bytes, str]:
    """Serialize data to JSON and generate an HMAC signature."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    signature = hmac.new(
        SIGNING_KEY, payload, hashlib.sha256
    ).hexdigest()
    return payload, signature

def deserialize_verified(payload: bytes, signature: str) -> dict:
    """Deserialize JSON only after verifying the HMAC signature."""
    expected = hmac.new(
        SIGNING_KEY, payload, hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("Signature verification failed — data may be tampered.")
    return json.loads(payload)

# For ML model files: use safetensors instead of pickle-based formats
# pip install safetensors
# safetensors stores tensors without any code execution capability
pickle, shelve, and yaml.load() (without Loader=SafeLoader) can all execute arbitrary code during deserialization, and marshal is explicitly not hardened against maliciously crafted data. For data interchange, always prefer JSON, MessagePack, or Protocol Buffers. If you must use pickle internally (for example, in a trusted cache layer), sign the data with HMAC before writing and verify the signature before loading. For machine learning workflows, the safetensors library provides a format that stores model weights without any code execution surface, and is now the recommended approach over pickle-based .pt or .pkl model files.
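The sign-then-verify pattern for an internal pickle cache can be sketched directly: prefix each blob with an HMAC tag and refuse to unpickle anything whose tag does not verify. The key below is a placeholder that would come from a secrets manager, and the 32-byte-prefix layout is an illustrative convention.

```python
import hashlib
import hmac
import pickle

CACHE_KEY = b"cache-signing-key-from-secrets-manager"  # placeholder

def dump_signed(obj) -> bytes:
    """Pickle an object and prepend a 32-byte HMAC-SHA256 tag."""
    payload = pickle.dumps(obj)
    tag = hmac.new(CACHE_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def load_verified(blob: bytes):
    """Unpickle only after the HMAC tag verifies."""
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(CACHE_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Cache signature mismatch: refusing to unpickle.")
    # Safe to unpickle: the payload is provably ours and untampered
    return pickle.loads(payload)
```

The crucial property is that verification happens before pickle.loads ever runs, so a tampered blob is rejected without touching the deserializer.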
Cryptography: Encrypting Sensitive Data
When you need to encrypt data at rest — things like personally identifiable information, API tokens, or financial records — use the cryptography library. It is maintained by the Python Cryptographic Authority (PyCA) and is the recommended cryptographic toolkit for Python. Do not implement your own encryption algorithms. Do not use the deprecated pycrypto package.
For symmetric encryption of arbitrary data, Fernet is the simplest safe choice. It uses AES-128-CBC with HMAC-SHA256 for authentication, which means you get both confidentiality and integrity protection without any additional work. Note that Fernet uses a 256-bit key internally, split into a 128-bit AES encryption key and a 128-bit HMAC signing key — the HMAC verification happens before decryption, which blocks padding oracle attacks. For scenarios requiring AES-256 or authenticated encryption with associated data (AEAD), consider AES-GCM via the hazmat layer, but only if you understand nonce management: reusing a GCM nonce with the same key is catastrophic. When in doubt, Fernet is the correct choice.
For applications managing multiple encryption keys over time, MultiFernet provides built-in key rotation: add a new key at the front of the list, and the library will encrypt new data with it while transparently decrypting old data with whichever key matches.
# requirements: pip install cryptography
from cryptography.fernet import Fernet, InvalidToken

def generate_fernet_key() -> bytes:
    """
    Generate a new Fernet key. Store this securely (e.g., in a secrets manager).
    Losing this key means losing access to all data encrypted with it.
    """
    return Fernet.generate_key()

def encrypt_data(plaintext: str, key: bytes) -> bytes:
    """
    Encrypt a string using Fernet symmetric encryption.
    Returns ciphertext bytes (base64-encoded and HMAC-authenticated).
    """
    f = Fernet(key)
    return f.encrypt(plaintext.encode("utf-8"))

def decrypt_data(ciphertext: bytes, key: bytes) -> str:
    """
    Decrypt Fernet ciphertext. Raises ValueError if the key is wrong
    or the ciphertext has been tampered with.
    """
    f = Fernet(key)
    try:
        return f.decrypt(ciphertext).decode("utf-8")
    except InvalidToken as e:
        raise ValueError("Decryption failed: invalid key or tampered data.") from e

# Usage
key = generate_fernet_key()
# Store 'key' in your secrets manager — never hardcode it.
sensitive = "SSN: 123-45-6789"
encrypted = encrypt_data(sensitive, key)
print(f"Encrypted: {encrypted}")
decrypted = decrypt_data(encrypted, key)
print(f"Decrypted: {decrypted}")
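The MultiFernet rotation pattern described earlier can be sketched as well: the new key goes first in the list and is used for all new encryption, while older keys are still tried for decryption, and rotate() re-encrypts an existing token under the primary key.

```python
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet.generate_key()
new_key = Fernet.generate_key()

# Data encrypted before the rotation, under the old key
token = Fernet(old_key).encrypt(b"pii-record")

# Rotation: new key FIRST (used to encrypt), old key still accepted
mf = MultiFernet([Fernet(new_key), Fernet(old_key)])
assert mf.decrypt(token) == b"pii-record"  # old data still readable

# rotate() re-encrypts an existing token under the primary (new) key,
# letting you migrate stored ciphertexts in a background job
fresh_token = mf.rotate(token)
assert Fernet(new_key).decrypt(fresh_token) == b"pii-record"
```

Once every stored token has been rotated, the old key can be retired from the list entirely.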
Asymmetric Encryption (RSA) for Key Exchange
# requirements: pip install cryptography
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

def generate_rsa_keypair(key_size: int = 4096):
    """
    Generate an RSA key pair. 4096-bit keys are recommended for long-lived data.
    Returns (private_key, public_key).
    """
    private_key = rsa.generate_private_key(
        public_exponent=65537,
        key_size=key_size
    )
    public_key = private_key.public_key()
    return private_key, public_key

def rsa_encrypt(plaintext: bytes, public_key) -> bytes:
    """Encrypt bytes with RSA public key using OAEP padding (recommended)."""
    return public_key.encrypt(
        plaintext,
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA256()),
            algorithm=hashes.SHA256(),
            label=None
        )
    )

def rsa_decrypt(ciphertext: bytes, private_key) -> bytes:
    """Decrypt RSA ciphertext with the private key."""
    return private_key.decrypt(
        ciphertext,
        padding.OAEP(
            mgf=padding.MGF1(algorithm=hashes.SHA256()),
            algorithm=hashes.SHA256(),
            label=None
        )
    )

# Generate keys
priv, pub = generate_rsa_keypair()

# Encrypt a small secret (RSA is for key exchange, not bulk data)
secret_key = b"my-32-byte-aes-key-placeholder!!"
ciphertext = rsa_encrypt(secret_key, pub)
recovered = rsa_decrypt(ciphertext, priv)
print(recovered == secret_key)  # True
Secure Token Generation
Python's built-in random module uses a pseudo-random number generator (PRNG) that is not cryptographically secure. Using it for session tokens, password reset links, API keys, or any security-sensitive value is a vulnerability. Use the secrets module, which has been part of the standard library since Python 3.6 and is explicitly designed for generating cryptographically strong random values.
Python's secrets module exists specifically for generating cryptographically strong random values for tokens, keys, and other security credentials.
— Python 3 Documentation (Paraphrased)
import secrets
import hashlib
import hmac
import time

def generate_session_token(nbytes: int = 32) -> str:
    """
    Generate a URL-safe session token using a cryptographically secure RNG.
    32 bytes = 256 bits of entropy, which is more than sufficient.
    """
    return secrets.token_urlsafe(nbytes)

def generate_api_key(prefix: str = "pk") -> str:
    """
    Generate an API key with a readable prefix for identification.
    Format: prefix_<64 hex characters>
    """
    token = secrets.token_hex(32)
    return f"{prefix}_{token}"

def generate_password_reset_token() -> tuple[str, str, float]:
    """
    Generate a password reset token and its SHA-256 hash for storage.
    Returns (raw_token_for_email, hash_for_db, expiry_timestamp).
    Store only the hash in the database; send the raw token to the user.
    Expiry is 1 hour from generation.
    """
    raw_token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(raw_token.encode()).hexdigest()
    expiry = time.time() + 3600  # 1 hour
    return raw_token, token_hash, expiry

def verify_reset_token(raw_token: str, stored_hash: str, expiry: float) -> bool:
    """
    Verify a password reset token using constant-time comparison.
    Constant-time comparison prevents timing-based attacks.
    """
    if time.time() > expiry:
        return False  # Token has expired
    expected_hash = hashlib.sha256(raw_token.encode()).hexdigest()
    # hmac.compare_digest performs a constant-time comparison
    return hmac.compare_digest(expected_hash, stored_hash)

# Usage
session_token = generate_session_token()
api_key = generate_api_key("pk")
raw, hashed, expires = generate_password_reset_token()
print(f"Session token: {session_token}")
print(f"API key: {api_key}")
print(f"Reset token (send to user): {raw}")
print(f"Reset hash (store in DB): {hashed}")

# Verify
valid = verify_reset_token(raw, hashed, expires)
print(f"Token valid: {valid}")  # True
Never compare security tokens with the == operator. Regular string comparison short-circuits on the first non-matching character, which leaks timing information an attacker can use to guess tokens byte by byte. Always use hmac.compare_digest() or secrets.compare_digest() for any security-sensitive comparison.
Dependency and Supply Chain Security
Your application's security is only as strong as the packages it depends on. The OWASP 2025 Top 10 elevated this risk dramatically: the previous "Vulnerable and Outdated Components" category was expanded into A03: Software Supply Chain Failures — now the third-highest risk category — to encompass the entire ecosystem of dependencies, build systems, and distribution infrastructure. Supply chain attacks against Python packages have increased correspondingly: the Socket Security research team documented thousands of malicious packages published to PyPI in 2023 and 2024, many using typosquatting (names like reqeusts in place of requests) to trick developers into installing them.
The foundational practice is pinning all dependencies to exact versions in a lockfile and verifying their integrity with hashes. pip-compile from the pip-tools package and poetry.lock from Poetry are the standard tools for this.
# Generate a requirements.txt with pinned versions and hashes
# (Run this in a shell, not Python)
#
#   pip install pip-tools
#   pip-compile --generate-hashes requirements.in
#
# The resulting requirements.txt will look like:
#   requests==2.31.0 \
#       --hash=sha256:58cd2187423d769 \
#       --hash=sha256:942c5a758f98d79
#
# Install with:
#   pip install --require-hashes -r requirements.txt

# Automate dependency vulnerability scanning in CI/CD
#   pip install safety
#   Run: safety check --full-report
#
# Or use pip-audit (maintained by Google/PyPA):
#   pip install pip-audit
#   Run: pip-audit

# Python script to run pip-audit programmatically
import subprocess
import sys
import json

def audit_dependencies() -> dict:
    """
    Run pip-audit and return parsed results.
    Exits with a non-zero code if vulnerabilities are found.
    """
    result = subprocess.run(
        ["pip-audit", "--format", "json"],
        capture_output=True,
        text=True
    )
    if result.returncode != 0 and not result.stdout:
        print(f"pip-audit error: {result.stderr}", file=sys.stderr)
        sys.exit(1)
    audit_data = json.loads(result.stdout)
    # pip-audit emits one entry per installed dependency, each with a
    # (possibly empty) list of known vulnerabilities
    vulnerable = [
        dep for dep in audit_data.get("dependencies", [])
        if dep.get("vulns")
    ]
    if vulnerable:
        print(f"FOUND {len(vulnerable)} VULNERABLE PACKAGE(S):")
        for dep in vulnerable:
            ids = ", ".join(v["id"] for v in dep["vulns"])
            print(f"  - {dep['name']} {dep['version']}: {ids}")
        sys.exit(1)
    print("No known vulnerabilities found.")
    return audit_data

if __name__ == "__main__":
    audit_dependencies()
Run pip-audit or safety check as a step in your CI/CD pipeline so that a build fails automatically if a known CVE is introduced. Pair this with Dependabot or Renovate Bot to receive automated pull requests when new secure versions are released.
Secure Logging Practices
Logs are frequently the first place attackers look when they gain access to a system, and they are also a common source of accidental data leakage. Passwords, tokens, credit card numbers, and personally identifiable information (PII) have all been found in application logs. Structured logging with explicit field filtering is the right approach.
import logging
import json
import re
from typing import Any

# Fields whose values should be redacted in all log output
SENSITIVE_FIELDS = {
    "password", "passwd", "secret", "token", "api_key",
    "authorization", "credit_card", "ssn", "cvv"
}
REDACTED = "[REDACTED]"

def sanitize_log_data(data: Any, depth: int = 0) -> Any:
    """
    Recursively redact sensitive fields from dicts before logging.
    Handles nested structures up to depth 10.
    """
    if depth > 10:
        return data
    if isinstance(data, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_FIELDS
            else sanitize_log_data(value, depth + 1)
            for key, value in data.items()
        }
    if isinstance(data, (list, tuple)):
        sanitized = [sanitize_log_data(item, depth + 1) for item in data]
        return type(data)(sanitized)
    # Redact anything that looks like a Bearer token in plain strings
    if isinstance(data, str):
        return re.sub(r"Bearer\s+\S+", "Bearer [REDACTED]", data)
    return data

class SecureJSONFormatter(logging.Formatter):
    """
    A logging formatter that outputs structured JSON and automatically
    redacts sensitive fields. Use this in production logging pipelines.
    """
    def format(self, record: logging.LogRecord) -> str:
        log_data = {
            "timestamp": self.formatTime(record, self.datefmt),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include extra structured fields if provided
        if hasattr(record, "extra"):
            log_data["extra"] = sanitize_log_data(record.extra)
        if record.exc_info:
            log_data["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_data)

def get_secure_logger(name: str) -> logging.Logger:
    """Configure and return a logger with the SecureJSONFormatter."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler()
    handler.setFormatter(SecureJSONFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger

# Usage
logger = get_secure_logger("myapp.auth")

# Safe — the formatter redacts sensitive fields automatically
user_event = {
    "user_id": "usr_abc123",
    "email": "[email protected]",
    "password": "supersecret123",  # This will be redacted
    "action": "login_attempt"
}
logger.info("User login attempt", extra={"extra": user_event})
Safe File Handling
File handling is a common source of path traversal vulnerabilities. If an attacker can control part of a file path — for example, in a user-controlled file upload or download endpoint — they may be able to read or overwrite files outside the intended directory. The fix is to always resolve and validate the final path before performing any file operation.
import re
import tempfile
from pathlib import Path

UPLOAD_DIR = Path("/var/app/uploads").resolve()


def safe_open_file(filename: str, base_dir: Path = UPLOAD_DIR) -> Path:
    """
    Safely resolve a user-supplied filename within a base directory.
    Raises ValueError if the resolved path escapes the base directory
    (path traversal attempt).
    Example attack: filename = "../../etc/passwd"
    This function detects and blocks it.
    """
    base_dir = base_dir.resolve()
    # Resolve the joined path — this normalizes ".." components and follows
    # symlinks, so the containment check below sees the real target
    resolved = (base_dir / filename).resolve()
    # Ensure the resolved path is still inside the base directory
    try:
        resolved.relative_to(base_dir)
    except ValueError:
        raise ValueError(
            f"Path traversal attempt detected: '{filename}' resolves "
            f"outside of allowed directory '{base_dir}'"
        )
    return resolved


def save_upload(filename: str, content: bytes) -> Path:
    """
    Save uploaded file content to the upload directory safely.
    Sanitizes the filename, validates the path, and writes the file.
    """
    # Strip any directory components from the filename
    safe_filename = Path(filename).name
    # Allow only alphanumeric characters, dashes, underscores, and dots
    if not re.match(r"^[a-zA-Z0-9_\-\.]+$", safe_filename):
        raise ValueError(f"Filename '{safe_filename}' contains invalid characters.")
    # Prevent hidden files and double-extension attacks
    if safe_filename.startswith("."):
        raise ValueError("Hidden filenames are not permitted.")
    destination = safe_open_file(safe_filename)
    # Write atomically using a temp file, then rename
    with tempfile.NamedTemporaryFile(
        dir=UPLOAD_DIR, delete=False, suffix=".tmp"
    ) as tmp:
        tmp.write(content)
        tmp_path = Path(tmp.name)
    tmp_path.rename(destination)
    return destination


# Examples
try:
    path = safe_open_file("report.pdf")
    print(f"Safe path: {path}")
except ValueError as e:
    print(f"Blocked: {e}")

try:
    path = safe_open_file("../../etc/passwd")
except ValueError as e:
    print(f"Blocked traversal attempt: {e}")
For user file uploads, also validate file content using the python-magic library to verify MIME types based on file headers rather than the filename extension. An attacker can rename a PHP script to image.jpg — checking the extension alone is not sufficient. Always validate the actual content type server-side.
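The same idea can be sketched with the standard library alone. The signature table below is a deliberately tiny, illustrative subset of what python-magic checks against; a production system should use the full library rather than this hand-rolled list:

```python
from typing import Optional

# A few well-known magic-byte signatures, for illustration only
MAGIC_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"%PDF-": "application/pdf",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def detect_mime(content: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's leading bytes, or None."""
    for signature, mime in MAGIC_SIGNATURES.items():
        if content.startswith(signature):
            return mime
    return None


def validate_upload_type(content: bytes, allowed: set) -> str:
    """Reject uploads whose actual content type is not in the allowlist."""
    mime = detect_mime(content)
    if mime is None or mime not in allowed:
        raise ValueError(f"Upload rejected: detected type {mime!r} not allowed.")
    return mime
```

A PHP script renamed to image.jpg fails this check because its bytes start with `<?php`, not a JPEG signature, so `validate_upload_type` raises ValueError regardless of the filename.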
Static Analysis and Bandit
Every other section in this guide describes what to do correctly at the point of writing code. Static analysis flips that around: it scans your existing code automatically and flags patterns that are known to be dangerous. bandit is the standard static security analysis tool for Python, developed and maintained by the PyCQA (Python Code Quality Authority). It ships dozens of checks for known-insecure patterns, including hard-coded passwords, use of the insecure random module for cryptographic purposes, calls to subprocess with shell=True, SQL string formatting, and much more.
# Install bandit
# pip install bandit
# Scan a single file
# bandit myapp.py
# Scan an entire project directory recursively
# bandit -r ./src
# Scan with a severity threshold (only show HIGH severity)
# bandit -r ./src -l -ll
# Output a JSON report for CI integration
# bandit -r ./src -f json -o bandit-report.json
# Example: the following pattern will be flagged by bandit
import subprocess


# DANGEROUS: shell=True with user-controlled input enables command injection
def run_command_unsafe(user_input: str) -> str:
    result = subprocess.run(
        f"ls {user_input}",  # bandit flags this: B602 (subprocess with shell=True)
        shell=True,
        capture_output=True,
        text=True
    )
    return result.stdout


# SAFE: pass arguments as a list, never as a shell string
def run_command_safe(directory: str) -> str:
    # Validate directory is an allowed path before calling
    allowed_dirs = ["/var/app/data", "/var/app/reports"]
    if directory not in allowed_dirs:
        raise ValueError(f"Directory not permitted: {directory}")
    result = subprocess.run(
        ["ls", directory],  # List form — no shell expansion, no injection risk
        shell=False,
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout
Add bandit as a pre-commit hook and as a CI step alongside pip-audit. bandit catches insecure code patterns; pip-audit catches vulnerable dependencies. Together they cover two completely different failure modes. Neither replaces the other.
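As a sketch of what that CI step might look like, the following hypothetical gate script runs both tools and reports which ones failed. The `./src` path and the bandit flags are assumptions; adjust them to your project layout:

```python
import subprocess
import sys

# Commands for the CI gate. Assumes bandit and pip-audit are on PATH and
# that your source tree lives under ./src — adjust for your project.
SECURITY_CHECKS = [
    ["bandit", "-r", "./src", "-q"],
    ["pip-audit"],
]


def run_security_gate(commands: list) -> list:
    """Run each command; return the first token of every command that failed."""
    failures = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            print(f"[FAIL] {' '.join(cmd)}\n{proc.stdout}{proc.stderr}",
                  file=sys.stderr)
            failures.append(cmd[0])
    return failures


# In CI, fail the build when any check reports a problem:
#     raise SystemExit(1 if run_security_gate(SECURITY_CHECKS) else 0)
```

Because each tool already exits non-zero on findings, the gate needs no output parsing; it simply aggregates return codes.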
Secure Exception Handling
The OWASP Top 10 2025 (announced November 2025) introduced a new category at A10: Mishandling of Exceptional Conditions. This is the first time OWASP has explicitly called out exception handling as a class of security failure — and it reflects a real pattern seen in production systems. When applications break unsafely, they often leak internal details: stack traces containing class names, file paths, and database schema; keys or credentials in error messages; or fail-open logic that grants access when the expected check throws an exception instead of denying cleanly.
The rule is: always fail closed. Log the full technical detail internally where only your team can see it, and return a generic, information-free message to the caller.
import logging
import traceback
from typing import Any

logger = logging.getLogger("myapp.api")


class AppError(Exception):
    """Base application exception with a safe public message."""

    def __init__(self, public_message: str, internal_detail: str = ""):
        self.public_message = public_message
        self.internal_detail = internal_detail or public_message
        super().__init__(internal_detail)


class AuthError(AppError):
    pass


class NotFoundError(AppError):
    pass


def handle_request(user_id: str, action: str) -> dict[str, Any]:
    """
    Demonstrates fail-closed exception handling.
    All unexpected errors return a generic 500 message to the caller;
    internal details are logged server-side only.
    """
    try:
        result = process_action(user_id, action)
        return {"status": "ok", "data": result}
    except AuthError as e:
        # Known, expected error — safe to surface a controlled message
        logger.warning("Auth failure for user=%s action=%s: %s",
                       user_id, action, e.internal_detail)
        return {"status": "error", "message": e.public_message}
    except NotFoundError as e:
        logger.info("Resource not found: %s", e.internal_detail)
        return {"status": "error", "message": e.public_message}
    except Exception:
        # Unknown error — log the full trace internally, expose nothing
        logger.error(
            "Unhandled exception for user=%s action=%s:\n%s",
            user_id, action, traceback.format_exc()
        )
        # CRITICAL: do not surface stack traces, internal paths, or DB details
        return {
            "status": "error",
            "message": "An internal error occurred. Please try again later."
        }


def process_action(user_id: str, action: str) -> str:
    # Placeholder — your actual business logic goes here
    if not user_id:
        raise AuthError(
            public_message="Authentication required.",
            internal_detail=f"Missing user_id for action '{action}'"
        )
    return f"Action '{action}' completed."
A bare except: pass block is a security risk, not a safety net. It silently swallows errors that might indicate an active attack, a corrupted state, or a failed authorization check. If you catch everything and do nothing, you eliminate your ability to detect and respond to incidents. Always log at minimum, and always fail closed when the outcome is uncertain.
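The fail-open versus fail-closed distinction can be made concrete with a small contrast. Here `has_permission` is a hypothetical placeholder that simulates a backend outage:

```python
import logging

logger = logging.getLogger("myapp.authz")


def has_permission(user_id: str, resource: str) -> bool:
    # Placeholder permission lookup; simulates a backend failure
    raise ConnectionError("permission service unreachable")


# DANGEROUS: fail-open — a swallowed error plus a default grants access
def can_access_unsafe(user_id: str, resource: str) -> bool:
    try:
        return has_permission(user_id, resource)
    except Exception:
        pass  # silently swallowed; nothing logged, nothing denied
    return True


# SAFE: fail-closed — an outage denies access, and the error is logged
def can_access_safe(user_id: str, resource: str) -> bool:
    try:
        return has_permission(user_id, resource)
    except Exception:
        logger.exception(
            "Permission check failed for user=%s resource=%s; denying",
            user_id, resource
        )
        return False
```

During the simulated outage, `can_access_unsafe` returns True and grants access to everyone, while `can_access_safe` denies the request and records the failure for investigation.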
AI Coding Assistants and Secret Sprawl
AI-assisted coding tools have become a standard part of many development workflows. They are genuinely useful — and they introduce a specific, underappreciated security risk that no other section in this guide covers: the tendency to generate and suggest hardcoded secrets.
GitGuardian's 2025 State of Secrets Sprawl report found that public repositories using GitHub Copilot had a 6.4% secret leakage rate — approximately 40% higher than repositories that did not use AI-assisted coding tools. This is not a flaw in AI coding tools specifically — it is a reflection of how they work. These tools learn from existing code, which contains millions of examples of hardcoded credentials, example API keys, placeholder tokens, and test configurations. When you ask an AI assistant to scaffold a database connection, generate a test file, or fill in a config template, it may produce exactly the kind of hardcoded secret pattern you are trying to avoid. It will look like working code. It will often run without errors. And if you commit it without review, you have just become a statistic in next year's report.
The mitigations are not complicated, but they require deliberate habit formation.
# What an AI assistant might generate (DANGEROUS)
# This is a realistic example of what code completion tools produce
DATABASE_URL = "postgresql://admin:[email protected]:5432/myapp"
SECRET_KEY = "hardcoded-django-secret-key-do-not-use"
STRIPE_API_KEY = "sk_live_abc123xyz789"

# What it should look like (SAFE)
import os
from dotenv import load_dotenv

load_dotenv()

# Fail loudly if any required secret is missing — check before reading,
# so the error names every missing variable instead of raising a bare KeyError
REQUIRED_KEYS = ("DATABASE_URL", "SECRET_KEY", "STRIPE_API_KEY")
missing = [key for key in REQUIRED_KEYS if not os.environ.get(key)]
if missing:
    raise EnvironmentError(
        f"Required environment variable(s) not set: {', '.join(missing)}. "
        "Check your .env file or deployment configuration."
    )

DATABASE_URL = os.environ["DATABASE_URL"]
SECRET_KEY = os.environ["SECRET_KEY"]
STRIPE_API_KEY = os.environ["STRIPE_API_KEY"]
Treat every AI-generated code snippet as untrusted input that requires security review before committing — the same standard you would apply to a code sample copied from a tutorial. Specific things to check: hardcoded credentials, shell=True in subprocess calls, string-formatted SQL queries, use of random for security values, and disabled SSL verification. Running bandit and pip-audit on every commit catches many of these automatically. GitGuardian's free CLI tool ggshield can also be configured as a pre-commit hook to block secret commits before they reach your remote.
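For teams not yet running ggshield, the underlying idea can be sketched in a few lines. The three detector patterns below are hypothetical and deliberately minimal; real scanners ship hundreds of tuned detectors with entropy checks and provider-specific validation:

```python
import re

# Illustrative detector patterns only — not a substitute for a real scanner
SECRET_PATTERNS = [
    ("Stripe live key", re.compile(r"sk_live_[0-9a-zA-Z]{10,}")),
    ("AWS access key ID", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("Generic credential assignment", re.compile(
        r"(?i)(password|secret|api_key)\s*=\s*['\"][^'\"]{8,}['\"]")),
]


def scan_text(text: str) -> list:
    """Return the names of all secret patterns that match the given text."""
    return [name for name, pattern in SECRET_PATTERNS if pattern.search(text)]


# Wired into a pre-commit hook, each staged file would be read and scanned,
# and the commit aborted whenever scan_text() returns any hits.
```

Even this crude version would have caught the hardcoded Stripe key from the AI-generated example above before it reached the remote.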
Key Takeaways
- Secrets belong outside your code. Use environment variables for local development and a proper secrets manager (AWS Secrets Manager, HashiCorp Vault, or equivalent) in production. Add .env to .gitignore from the start of every project. According to GitGuardian's 2025 data, 70% of secrets leaked in 2022 are still active — rotation after exposure is non-negotiable.
- Hash passwords with Argon2id or bcrypt. Never store plaintext passwords or use general-purpose hash functions like MD5 or SHA-1 for credential storage. NIST SP 800-63B-4 (finalized July 2025) identifies Argon2id as the preferred algorithm. Do not impose composition rules (forced uppercase, digits, symbols) — NIST now explicitly prohibits them. Instead, enforce a 15-character minimum for single-factor accounts, screen against breached-password lists, and implement rehash-on-login to migrate parameters without forcing resets.
- Validate all external input with a schema library. Pydantic is the current standard. Validate type, format, length, and range. Reject or sanitize before the data reaches any downstream system.
- Parameterize every database query. String-formatted SQL is never acceptable. Use the ? or :param placeholder syntax of your database driver, or use an ORM. SQL injection is A05 in the OWASP 2025 Top 10 with over 14,000 CVEs catalogued — it is not a theoretical risk.
- Never deserialize untrusted data with pickle. Python's pickle, shelve, marshal, and yaml.load() without SafeLoader can all execute arbitrary code. Use JSON, MessagePack, or Protocol Buffers for data interchange. For ML model files, use safetensors instead of pickle-based formats. If pickle is unavoidable internally, sign the data with HMAC and verify before loading.
- Use the cryptography library for encryption and the secrets module for token generation. Never use random for security-sensitive values. Use MultiFernet for key rotation at rest. Never build your own cryptographic primitives.
- Pin and audit your dependencies. Use pip-audit in CI to catch known CVEs before they reach production. Generate hash-pinned lockfiles with pip-compile --generate-hashes to prevent supply chain tampering. Consider generating a Software Bill of Materials (SBOM) with pip-audit --format cyclonedx for regulated environments.
- Redact sensitive data from logs before it is written. Structured JSON logging with field-level filtering prevents accidental PII exposure and reduces the value of log access to an attacker.
- Resolve and validate file paths before any file operation. Use Path.resolve() and Path.relative_to() to detect and block path traversal attempts when handling user-supplied filenames.
- Run bandit as a CI gate. Static analysis catches dangerous code patterns — shell=True, string-formatted SQL, insecure use of random — before they reach production. Pair it with pip-audit for dependency coverage. They address entirely different failure modes and are not interchangeable.
- Always fail closed. The OWASP 2025 Top 10 added Mishandling of Exceptional Conditions (A10) because fail-open logic and information-leaking error messages are a genuine, exploitable class of vulnerability. Log full technical detail internally; return generic messages externally.
- Review AI-generated code for security issues before committing. AI coding assistants can and do generate hardcoded credentials, shell-injection-prone subprocess calls, and unparameterized SQL. Treat every AI-generated snippet as untrusted until reviewed with the same rigor you would apply to any third-party code.
Security is not a feature you add at the end of a project. It is a collection of habits and patterns that compound over time. The code examples in this guide cover the highest-impact areas, but the full picture also includes transport security (TLS everywhere, enforced HSTS), authentication and authorization design (least privilege at the database and API layer), rate limiting on sensitive endpoints, content security policies for web applications, SBOM generation for regulated environments, and keeping your runtime and operating system patched. Start with the fundamentals covered here, and build from there.
For further reading, the OWASP Cheat Sheet Series is among the more comprehensive freely available references for application security. The Python secrets module documentation, the PyCA cryptography library docs, and the Bandit documentation are essential reading for anyone handling sensitive data in Python. The GitGuardian 2025 State of Secrets Sprawl report and NIST SP 800-63B-4 are the two primary reference documents behind the guidance in the secrets and password sections of this guide. The OWASP Top 10 2025 provides the authoritative risk rankings referenced throughout.