The Model Context Protocol has gone from a November 2024 Anthropic proposal to an industry-wide standard in roughly 14 months. If you write Python and work with AI agents, understanding how MCP servers actually function — and where they break — is no longer optional.
Before MCP arrived, connecting an LLM to an external system — a database, a file system, a Git repository, a third-party API — meant writing a custom integration every single time. There was no shared interface, no common vocabulary between the AI layer and the tool layer. Anthropic described this as an "N×M" data integration problem: every new AI model multiplied by every new external tool produced a fresh pile of bespoke connector code that was time-consuming to write and often fragile to maintain.
MCP addresses this directly. It is an open standard that gives AI systems a single, consistent way to declare what they need and gives external services a single, consistent way to respond. The spec draws on the design patterns of the Language Server Protocol (LSP) and uses JSON-RPC 2.0 as its message format. In December 2025, Anthropic donated governance of MCP to the Agentic AI Foundation (AAIF), a directed fund under the Linux Foundation co-founded by Anthropic, Block, and OpenAI — a move that signals the protocol is no longer proprietary infrastructure but a community-owned standard.
What MCP Is and Why It Exists
The Model Context Protocol defines how AI applications provide context to large language models in a standardized way, separating the concern of what context to provide from the concern of how the LLM uses it. In concrete architectural terms, it introduces three distinct roles.
The host is the application the end user actually interacts with: Claude for Desktop, an IDE like Cursor or Windsurf, a chatbot, or a custom agentic pipeline. The client lives inside the host and is responsible for managing connections to one or more MCP servers, discovering their capabilities, forwarding requests, and handling responses. The server is the process that wraps an external system — a filesystem, a database, a Slack workspace, a Git repository — and exposes it through the MCP specification.
"MCP servers are the bridge between the MCP world and the specific functionality of an external system. They are essentially wrappers that expose external capabilities according to the MCP specification." — Philipp Schmid, philschmid.de, April 2025
The MCP client does not directly call the LLM on behalf of the server in the default interaction pattern. Instead, the client passes the user's request along with information gathered from the servers to the LLM, which responds with the tool it wants to invoke and the parameters it wants to use. The server executes that tool call and returns structured results; the client hands those results back to the LLM, which generates the final response for the user. The LLM itself never directly communicates with the MCP server — the client is always the intermediary.
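The loop described above can be sketched in plain Python. `StubServer` and `StubLLM` are invented stand-ins for illustration, not SDK classes; the point is the sequencing, with the client as the only party that talks to both sides:

```python
class StubServer:
    """Stands in for an MCP server exposing a single 'add' tool."""
    def list_tools(self):
        return [{"name": "add", "description": "Add two integers."}]

    def call_tool(self, name, arguments):
        if name == "add":
            return {"result": arguments["a"] + arguments["b"]}
        raise ValueError(f"unknown tool: {name}")


class StubLLM:
    """Stands in for the model: first requests a tool call, then answers."""
    def complete(self, user_request, tools, tool_result=None):
        if tool_result is None:
            return {"tool_call": {"name": "add", "arguments": {"a": 2, "b": 3}}}
        return {"text": f"The answer is {tool_result['result']}"}


def handle_request(user_request, server, llm):
    tools = server.list_tools()                    # 1. client discovers tools
    decision = llm.complete(user_request, tools)   # 2. LLM picks tool and params
    call = decision["tool_call"]
    result = server.call_tool(call["name"], call["arguments"])  # 3. client executes
    final = llm.complete(user_request, tools, tool_result=result)  # 4. LLM answers
    return final["text"]


print(handle_request("What is 2 + 3?", StubServer(), StubLLM()))
# prints: The answer is 5
```

Note that the server never sees the model and the model never sees the server: every arrow in the sequence passes through `handle_request`, which is exactly the client's role in a real host application.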
Adoption has been fast. OpenAI officially adopted MCP in March 2025, integrating the standard across its products including the ChatGPT desktop app. Google DeepMind followed. IDEs and coding platforms — Cursor, Windsurf, Replit, Sourcegraph, Zed — all added MCP support. By mid-2025, over 13,000 MCP servers had been published on GitHub alone.
MCP has been compared to OpenAPI, which describes REST APIs in a machine-readable way. The analogy is fair: both are specification layers that let tooling autodiscover capabilities. The difference is that MCP is purpose-built for the agentic context window — it carries not just schema descriptions but executable primitives the LLM can invoke at runtime.
The Three Core Primitives: Tools, Resources, and Prompts
Every capability an MCP server exposes falls into one of three categories. Understanding these primitives precisely is the prerequisite for building, using, and auditing any MCP integration.
Tools
Tools are the workhorses of MCP. They are functions that the LLM can call to execute code or produce a side effect in the external world: sending an email, querying a database, writing a file, calling an API. The official documentation frames them as analogous to POST endpoints in a REST API — they take parameters and cause something to happen. Every tool definition carries a name, a natural-language description the LLM uses to decide when to invoke it, and a JSON Schema describing its input parameters.
When an MCP client initializes a session, it calls list_tools() on each connected server. The returned array of tool definitions is surfaced to the LLM as part of its available context. When the LLM decides to invoke a tool, the client calls call_tool(name, arguments) and passes the result back. The 2025-06-18 spec revision introduced structured output schemas, allowing servers to declare the JSON shape of a tool's return value and have it validated automatically — a significant improvement for type safety in agentic pipelines.
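Each entry returned by list_tools() is a definition of roughly the following shape (the field names mirror the MCP tool schema; the validator below is a deliberately minimal sketch, not a real JSON Schema implementation):

```python
# A hypothetical tool definition of the shape list_tools() surfaces to the LLM.
tool_definition = {
    "name": "check_stock",
    "description": "Check current stock levels for a product.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "product_id": {"type": "string"},
            "warehouse": {"type": "string", "default": "default"},
        },
        "required": ["product_id"],
    },
}


def validate_arguments(schema, arguments):
    """Minimal required-field and string-type check; real clients use a
    full JSON Schema validator rather than hand-rolled logic like this."""
    for field in schema.get("required", []):
        if field not in arguments:
            return False
    for key, value in arguments.items():
        expected = schema["properties"].get(key, {}).get("type")
        if expected == "string" and not isinstance(value, str):
            return False
    return True


print(validate_arguments(tool_definition["inputSchema"], {"product_id": "WIDGET-001"}))  # True
print(validate_arguments(tool_definition["inputSchema"], {}))  # False
```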
Resources
Resources are file-like data that clients can read and load into the LLM's context. The official documentation frames them as analogous to GET endpoints — they surface information without triggering side effects. A resource might be a file on disk, an API response, a database record, or any structured blob of text or binary content. Clients discover available resources with list_resources() and retrieve their content with read_resource(uri).
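Resource URIs commonly use templates such as catalog://products/{category}. How a concrete URI maps back to template parameters can be illustrated with a small matcher (a sketch; real SDKs perform this routing for you):

```python
import re


def match_uri(template: str, uri: str):
    """Return template parameters as a dict if `uri` matches, else None.

    Turns "{name}" placeholders into named capture groups and escapes
    everything else, so "catalog://products/{category}" matches
    "catalog://products/tools" with {"category": "tools"}.
    """
    parts = re.split(r"(\{\w+\})", template)
    pattern = ""
    for part in parts:
        if part.startswith("{") and part.endswith("}"):
            pattern += f"(?P<{part[1:-1]}>[^/]+)"
        else:
            pattern += re.escape(part)
    m = re.fullmatch(pattern, uri)
    return m.groupdict() if m else None


print(match_uri("catalog://products/{category}", "catalog://products/tools"))
# prints: {'category': 'tools'}
```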
Prompts
Prompts are reusable templates for LLM interactions. A server that exposes prompts is essentially providing pre-written instruction sets that help users accomplish specific tasks consistently. Clients enumerate available prompt templates with list_prompts() and fetch a rendered prompt — optionally parameterized — with get_prompt(name, arguments). OpenAI's Agents SDK, for example, uses this feature to let MCP servers dynamically generate agent instructions at runtime.
When building MCP servers, treat tool descriptions as first-class documentation. The LLM reads them to decide which tool to invoke. A vague or ambiguous description leads to incorrect tool selection — which means incorrect tool execution. Write descriptions the way you would write a clear function docstring intended for a reader who cannot see the implementation.
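To make that advice concrete, here is an invented pair of descriptions for the same tool; only the wording differs, but tool selection hinges on it:

```python
# Two candidate descriptions for the same inventory tool (both invented
# examples). The LLM sees only this text when deciding which tool to call.
vague = "Handles stock."

precise = (
    "Check current stock levels for a product in a given warehouse. "
    "Read-only: returns the on-hand quantity and never modifies inventory. "
    "Use when the user asks how many units of a product are available."
)

# A useful description names the action, the side-effect profile, and the
# situations that call for the tool - the three facts the model needs.
print(precise)
```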
Transport Mechanisms: How Clients and Servers Talk
MCP is transport-agnostic at the specification level, but in practice the Python SDK and the wider ecosystem support three main options, each suited to a different deployment context.
stdio (Standard Input/Output)
The stdio transport is used when the client and server run on the same machine. The MCP client spawns the server as a child process and communicates with it through its standard input and standard output streams. This is the simplest option and the default for local integrations such as filesystem access or running a local script. It requires no network configuration and is straightforward to set up in Claude for Desktop's claude_desktop_config.json.
One critical implementation detail: for stdio-based servers, never write to stdout. Writing to stdout corrupts the JSON-RPC message stream and breaks the server. Use sys.stderr for any logging output. The official MCP build documentation is explicit on this point, and it is a common source of confusing failures for developers first building local servers.
# stdio logging: always write to stderr, never stdout
import sys
import logging

# Bad - corrupts the JSON-RPC stream
print("Processing request")

# Good - writes to stderr, leaving stdout clean for the protocol
print("Processing request", file=sys.stderr)

# logging handlers default to stderr; set a level so INFO records are emitted
logging.basicConfig(level=logging.INFO)
logging.info("Processing request")
HTTP with Server-Sent Events (SSE)
The SSE transport is used when the client and server run on different machines, or when the server needs to be reachable over a network. The client connects to the server over HTTP. After an initial handshake, the server can push events to the client over a persistent connection using the SSE standard. This is the transport of choice for hosted, production MCP servers accessed by remote clients.
Streamable HTTP
The MCP specification introduced Streamable HTTP as a newer transport mechanism that uses a single HTTP endpoint for bidirectional messaging. It supports both standard request-response patterns and optional SSE-style streaming within the same endpoint, making the communication model simpler than maintaining separate SSE channels. The Python SDK's FastMCP class supports this transport via mcp.run(transport="streamable-http"). The spec positions Streamable HTTP as the eventual successor to plain SSE for network-accessible servers, though both are fully supported today.
The stdio transport is not suitable for production use cases that require multiple client connections, network accessibility, or authentication. If you are exposing an MCP server to more than one local client, or to any remote client, use SSE or Streamable HTTP instead.
Building an MCP Server in Python with FastMCP
The official Python SDK for MCP is maintained at github.com/modelcontextprotocol/python-sdk. The current stable release is the v1.x series. The SDK ships a high-level class called FastMCP that uses Python type hints and docstrings to automatically generate tool definitions, eliminating the need to manually write JSON schemas for every function. This is the recommended entry point for almost all server development.
First, install the SDK. The official documentation recommends using uv for project management, but pip works equally well:
# With uv (recommended)
uv add mcp
# With pip
pip install "mcp[cli]"
A minimal MCP server that exposes one tool, one resource, and one prompt looks like this:
from mcp.server.fastmcp import FastMCP

# Name your server clearly - clients use this for identification
mcp = FastMCP("inventory-server")


# --- TOOL ---
# FastMCP reads the type hints and docstring to build the JSON schema automatically
@mcp.tool()
def check_stock(product_id: str, warehouse: str = "default") -> dict:
    """Check current stock levels for a product in a given warehouse.

    Args:
        product_id: The unique identifier for the product.
        warehouse: The warehouse to query. Defaults to 'default'.
    """
    # In a real server, this would query your actual inventory system
    return {"product_id": product_id, "warehouse": warehouse, "quantity": 142}


# --- RESOURCE ---
# Resources are read-only data sources the client can load into context
@mcp.resource("catalog://products/{category}")
def get_product_catalog(category: str) -> str:
    """Return a plain-text product catalog for a given category."""
    return f"Products in {category}: Widget A, Widget B, Widget C"


# --- PROMPT ---
# Prompts are reusable instruction templates
@mcp.prompt()
def reorder_analysis(product_id: str) -> str:
    """Generate a reorder recommendation prompt for a specific product."""
    return (
        f"Analyze the stock levels and sales velocity for product {product_id}. "
        "Recommend whether to reorder, and if so, suggest an order quantity "
        "based on a 30-day lead time and 2-week safety stock."
    )


if __name__ == "__main__":
    # Run locally over stdio for Claude for Desktop integration
    mcp.run(transport="stdio")
To connect this server to Claude for Desktop, add an entry to claude_desktop_config.json (located at ~/Library/Application Support/Claude/ on macOS or %APPDATA%\Claude\ on Windows):
{
  "mcpServers": {
    "inventory": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/your/project",
        "run",
        "server.py"
      ]
    }
  }
}
Use the absolute path to your project directory. On macOS or Linux, pwd gives you the correct value. On Windows, use double backslashes or forward slashes in the JSON. After saving, restart Claude for Desktop and a hammer icon will appear, indicating MCP tools are available.
For network-accessible servers using the Streamable HTTP transport, switch the run call:
if __name__ == "__main__":
    # Expose over HTTP for remote clients
    # Default port is 8000; endpoint is /mcp
    mcp.run(transport="streamable-http")
Testing Without a Full Client
The MCP project provides the MCP Inspector, a developer tool you can run against your server before connecting a real client. Start your server, then in a separate terminal:
npx @modelcontextprotocol/inspector
In the inspector UI, connect to http://localhost:8000/mcp. You can list tools, call them individually with test arguments, inspect resources, and verify that your schemas look correct before wiring in an actual LLM. This feedback loop is significantly faster than iterating through a full Claude for Desktop session.
Writing Unit Tests for MCP Servers
The Python SDK ships a client you can drive from a pytest test: spawn your server as a subprocess over stdio and exercise its tools through the real protocol. This catches schema and wiring mistakes that unit tests on the bare Python functions would miss:
# Requires pytest and pytest-asyncio
import pytest
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

SERVER_COMMAND = "python"
SERVER_ARGS = ["server.py"]


@pytest.mark.asyncio
async def test_check_stock_tool():
    server_params = StdioServerParameters(
        command=SERVER_COMMAND,
        args=SERVER_ARGS,
    )
    async with AsyncExitStack() as stack:
        read, write = await stack.enter_async_context(stdio_client(server_params))
        session = await stack.enter_async_context(ClientSession(read, write))
        await session.initialize()
        result = await session.call_tool(
            "check_stock",
            arguments={"product_id": "WIDGET-001", "warehouse": "east"},
        )
        assert result is not None
The Python SDK requires Python 3.10 or higher and MCP SDK version 1.2.0 or higher for full compatibility with the current protocol. If you see unexpected behavior with older installations, version pinning is worth checking before debugging your server logic.
The Security Reality: Real CVEs and Real Lessons
The rapid growth of the MCP ecosystem has produced a security surface that deserves direct, unvarnished attention. By mid-2025, researchers had catalogued multiple vulnerability classes specific to MCP, and by January 2026, real CVEs had been assigned to flaws in Anthropic's own reference server implementations. These are not theoretical edge cases. They are production issues that affected default installations.
Prompt Injection Is the Primary Attack Vector
Prompt injection attacks coerce the LLM into generating or executing unintended tool calls by embedding malicious instructions in content the model reads. In the MCP context, this attack surface is particularly wide because servers routinely load external content — README files, issue descriptions, web pages, database records — directly into the LLM's context. An attacker who can influence any of that content can potentially control what the LLM does next, without having any direct access to the target system.
"These flaws can be exploited through prompt injection, meaning an attacker who can influence what an AI assistant reads — a malicious README, a poisoned issue description, a compromised webpage — can weaponize these vulnerabilities without any direct access to the victim's system." — Yarden Porat, Cyata, January 2026 (via The Hacker News)
CVE-2025-68143, CVE-2025-68144, CVE-2025-68145: The Git MCP Server Flaws
In January 2026, Cyata disclosed three vulnerabilities in mcp-server-git, the official Git MCP server maintained by Anthropic. All three were exploitable via prompt injection. The flaws affected all versions released before December 8, 2025, and worked against default installations — no misconfiguration required.
The root cause was a failure to validate repository paths and sanitize arguments passed to Git commands. The server would operate on any directory on the system, not only the repository defined in its configuration. In a documented attack chain, the three CVEs could be chained together: use git_init to create a repository in any writable directory, use git_log or git_diff to read the contents of that directory into the LLM's context (leaking sensitive files to the AI), and then use unsanitized arguments to the git_diff command to overwrite arbitrary files. When combined with the Filesystem MCP server, this chain escalated to remote code execution by writing a malicious .git/config and triggering a git_init through prompt injection.
Anthropic accepted the reports in September 2025 and released fixes in December 2025. The git_init tool was removed from the package entirely, and additional path traversal validation was added. Users should update to version 2025.12.18 or later.
If you are running mcp-server-git, update to version 2025.12.18 or later immediately. Review your environment for any configuration that enables both Git and Filesystem MCP servers simultaneously, as this combination was the specific attack surface for the RCE chain.
CVE-2025-49596: RCE in the MCP Inspector (CVSS 9.4)
Oligo Security Research discovered a critical remote code execution vulnerability in the MCP Inspector itself — the developer tool used to test and debug MCP servers. The issue stemmed from the inspector's proxy architecture: before version 0.14.1, the inspector ran with the user's full privileges, listened on localhost or 0.0.0.0, and had no authentication mechanism. A user who visited a malicious web page while the inspector was running could have arbitrary commands executed on their machine through DNS rebinding or by exploiting the 0.0.0.0 bind address. CVSS scored this at 9.4 — Critical.
Anthropic fixed the issue in version 0.14.1 by adding a session token requirement (analogous to Jupyter's approach) and enforcing allowed-origins verification to block browser-based cross-origin attacks. If you installed the MCP Inspector globally, update it globally: check your current version and upgrade to 0.14.1 or later.
Tool Poisoning and Supply Chain Attacks
Beyond the CVEs in official packages, researchers have documented a broader attack class called tool poisoning. A malicious MCP server presents itself as legitimate — its tool name and surface behavior match what a user expects — while performing covert operations using the protocol's sampling feature. In one documented proof-of-concept, a malicious "code summarizer" server provided genuine summarization functionality while simultaneously using MCP sampling to craft hidden prompts that exfiltrated data through the LLM's completion path. Because the malicious instructions lived inside the server's sampling requests rather than in visible user input, standard prompt-injection defenses did not catch them.
A separate incident involved a package masquerading as a legitimate "Postmark MCP Server" that was injecting BCC copies of all email traffic processed by the server to an attacker's address. The MCP spec does not enforce sandboxing, audit logging, or verification of server identity. Each third-party server you connect is a potential supply chain risk.
Practical Mitigations
The security picture for MCP is not fundamentally different from securing any privileged server process — but the LLM intermediary adds an attack surface that does not exist in traditional API security. A few concrete practices reduce risk substantially:
- Run local MCP servers in a sandbox. Restrict what directories they can access and what system commands they can execute. The principle of least privilege applies at the process level.
- Sanitize all data before passing it to shell commands. Any MCP server that calls subprocess with user-controlled or LLM-controlled arguments is a command injection risk. Validate and sanitize inputs explicitly.
- Treat tool descriptions as a security boundary. What the LLM reads in a tool description shapes what it does. Keep descriptions precise and avoid patterns that could be hijacked by injected instructions.
- Audit tool combinations. The Git MCP CVE chain required both the Git server and the Filesystem server to be active simultaneously. Review which servers you have enabled together and whether those combinations create unsafe cross-server capabilities.
- Keep official servers updated. The OWASP Top 10 for LLMs covers many of the same issues — prompt injection, data leakage, excessive agency — that manifest in MCP. Anthropic's reference implementations do receive security patches; stay current.
- Require human approval for sensitive tool calls. The OpenAI Agents SDK exposes a require_approval configuration that can mandate human review before certain tool calls execute. Similar patterns are achievable in any MCP-connected agent framework by wrapping call_tool with an approval step.
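Several of these mitigations come down to validating paths and arguments before acting on them. A minimal containment check, in the spirit of the path validation the Git server lacked, looks like this (a sketch only; production code must also consider symlinks created after the check, permissions, and per-tool allowlists):

```python
import os
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve `requested` relative to `root`, refusing anything that
    escapes the allowed root via `..` segments or absolute paths."""
    root_path = Path(root).resolve()
    target = (root_path / requested).resolve()
    if target != root_path and root_path not in target.parents:
        raise PermissionError(f"path escapes allowed root: {requested}")
    return target


# A path inside the root resolves normally
safe = resolve_inside(os.getcwd(), "repo/README.md")
print(safe.name)  # README.md

# A traversal attempt is rejected before any filesystem operation runs
try:
    resolve_inside(os.getcwd(), "../../etc/passwd")
except PermissionError as exc:
    print("blocked:", exc)
```

The key detail is resolving before comparing: checking the raw string for ".." is easy to bypass, whereas comparing fully resolved paths closes the traversal class that the Git CVE chain exploited.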
"If security boundaries break down even in the reference implementation, it's a signal that the entire MCP ecosystem needs deeper scrutiny. These are not edge cases or exotic configurations — they work out of the box." — Shahar Tal, CEO, Cyata, January 2026
Key Takeaways
- MCP solves a structural problem: Before MCP, every AI-to-tool integration required custom connector code. The protocol provides a universal interface built on JSON-RPC 2.0 that any compliant client and server can use without bespoke glue code.
- Three primitives cover everything: Tools execute actions with side effects, resources expose read-only data, and prompts provide reusable instruction templates. Understanding which primitive to use for a given capability is the foundational design decision in any MCP server.
- Transport choice is a deployment decision: Use stdio for local, single-client integrations. Use SSE or Streamable HTTP for network-accessible servers. Never use stdio in production scenarios requiring multiple connections or authentication.
- FastMCP dramatically reduces boilerplate: Python type hints and docstrings generate JSON schemas automatically. Give your server a clear name, write precise tool descriptions, and let the SDK handle the protocol overhead.
- Security is not optional: Real CVEs with CVSS scores up to 9.4 have been found in Anthropic's own reference implementations. Prompt injection, tool poisoning, path traversal, and supply chain attacks are active risks in the MCP ecosystem. Apply least-privilege principles, sanitize inputs, audit server combinations, and keep packages updated.
MCP has established itself as the connective tissue between AI agents and the real world with unusual speed. The Python SDK makes building a compliant server straightforward; the security picture makes building a safe server a discipline that requires deliberate attention. Both things are true at once, and both are worth taking seriously.