Agentic Coding with LangGraph: Build Stateful AI Agents That Actually Work

LangGraph is the framework that finally makes agentic development in Python feel like engineering rather than guesswork. This guide walks through every core concept—StateGraph, nodes, edges, checkpointing, human-in-the-loop interrupts, and multi-agent orchestration—with working code you can run today.

Most introductions to agentic AI start at the wrong end. They show you a one-liner that calls a tool and treat it like an agent. Real agents do something harder: they maintain state, make branching decisions, recover from failure, loop until a condition is met, and hand off control to humans at critical moments. For a long time, LangChain handled simple linear chains well, but anything involving loops or cycles required ugly workarounds.

LangGraph was built to close that gap. Released in early 2024 as a separate library on top of LangChain, it introduces graph-based agentic workflows that bring the theory of finite state machines into practical LLM development. The result is a framework trusted in production by companies including Klarna, Replit, Elastic, Uber, and LinkedIn—all running real agentic workloads at scale.

Why LangGraph Exists

Before LangGraph, developers building complex LLM applications faced a recurring problem: the moment you needed a loop—an agent that retries a failed tool call, refines its own output, or waits for human feedback before continuing—you were writing custom orchestration from scratch. LangChain's chain primitives were not designed for cyclic execution.

"LangGraph sets the foundation for how we can build and scale AI workloads—from conversational agents, complex task automation, to custom LLM-backed experiences that 'just work'." — LangChain, langchain.com/langgraph

The core insight behind LangGraph is borrowed from computer science: model your agent as a state machine. Each node in the graph represents a unit of work—an LLM call, a tool execution, a decision function. Edges represent the transitions between those units. A shared state object flows through the graph and accumulates the results of every step. Because you define explicit transitions, the framework can validate the graph before execution, detect cycles, and optimize execution paths—things that were invisible in prompt-chain approaches.

Note

LangGraph is MIT-licensed and free to use. Costs come from the LLM APIs, vector stores, and optional infrastructure you wire into it—not from the framework itself. Source: langchain.com/langgraph

The Four Core Primitives

Everything in LangGraph is built from four concepts. Understand these and you understand the entire framework.

1. State

State is the shared memory object that every node reads from and writes to. It is defined as a Python TypedDict, which gives you type safety and a clear contract for what your agent is tracking. The state is not just a message buffer—it can hold tool outputs, metadata, task counters, retrieved documents, decision history, and anything else relevant to your workflow.

When a node updates state, LangGraph does not do a simple overwrite. It applies a merge according to rules defined by the schema. For a list field wrapped in typing.Annotated with LangGraph's add_messages reducer, new messages are appended to the existing list rather than replacing it. This deterministic, schema-driven merging is what makes state predictable across complex multi-step workflows.

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

# Define the shape of your agent's memory
class AgentState(TypedDict):
    # add_messages reducer appends rather than overwrites
    messages: Annotated[list, add_messages]
    task_status: str
    retry_count: int

2. Nodes

A node is any Python callable that accepts the current state and returns a partial update. That's the entire contract. The function can call an LLM, invoke an external API, run a database query, apply business logic, or do all of these in sequence. LangGraph does not care what happens inside a node—it only cares about the input type (state) and the output type (a partial state update or a Command object).

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

def call_llm(state: AgentState) -> dict:
    """Node that calls the LLM with the current message history."""
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

3. Edges

Edges connect nodes and determine what runs next. There are two kinds. A normal edge is unconditional: after node A, always go to node B. A conditional edge is a function that inspects the current state and returns the name of the next node to execute. Conditional edges are how you implement branching, routing, and loop termination.

from langgraph.graph import StateGraph, START, END

def should_continue(state: AgentState) -> str:
    """Route to tools if the LLM requested one, otherwise end."""
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return END

4. The Graph and Compilation

The StateGraph is where you assemble nodes and edges into a runnable structure. Once assembled, you call .compile(). Compilation validates node connections, identifies unreachable nodes, checks for missing edges, and produces an immutable executable. After compilation the graph's structure cannot be modified at runtime, which guarantees consistent behavior across all invocations.

"Once compiled, the graph becomes immutable, ensuring consistent behavior across all executions and preventing runtime modifications that could disrupt workflow stability." — Latenode, LangGraph Architecture Guide 2025

Building Your First Agent

The canonical starting point for LangGraph is a ReAct agent: a loop where an LLM decides whether to call a tool, executes the tool if needed, feeds the result back to the LLM, and repeats until no more tool calls are requested. Here is a complete, runnable implementation.

First, install the required packages:

pip install langgraph langchain-openai langchain-core

Then build the graph:

import os
from typing import TypedDict, Annotated
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode

# --- State ---
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

# --- Tools ---
@tool
def search_web(query: str) -> str:
    """Simulate a web search. Replace with a real implementation."""
    return f"Search results for '{query}': [placeholder result]"

@tool
def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression."""
    try:
        # Stripping builtins restricts eval, but this is still not safe for
        # untrusted input — use a real expression parser in production.
        result = eval(expression, {"__builtins__": {}})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

tools = [search_web, calculate]

# --- LLM with tools bound ---
llm = ChatOpenAI(model="gpt-4o", temperature=0)
llm_with_tools = llm.bind_tools(tools)

# --- Nodes ---
def agent_node(state: AgentState) -> dict:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}

tool_node = ToolNode(tools)

# --- Routing logic ---
def should_continue(state: AgentState) -> str:
    last = state["messages"][-1]
    if hasattr(last, "tool_calls") and last.tool_calls:
        return "tools"
    return END

# --- Build the graph ---
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)

graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")  # loop back after tool execution

app = graph.compile()

# --- Run it ---
from langchain_core.messages import HumanMessage
result = app.invoke({"messages": [HumanMessage(content="What is 144 * 12?")]})
print(result["messages"][-1].content)

What makes this different from a simple chain is the edge from "tools" back to "agent". This creates a cycle. The agent loops until no tool calls remain, then exits via the END node. LangGraph handles this cycle natively; a standard LangChain chain would require custom recursion logic to do the same thing.

Pro Tip

LangGraph ships a create_react_agent prebuilt function that generates a ReAct graph in one line. Use it for prototyping. For production agents with custom logic, build the graph manually so you can control every edge and add validation nodes, retry logic, and guardrails as first-class citizens.

Checkpointing and Persistence

Checkpointing is where LangGraph stops being a scripting tool and starts being an infrastructure primitive. A checkpointer saves the complete graph state after every node execution. If your agent fails mid-run—due to a network timeout, a provider outage, or a crash—you can resume execution from exactly the last successful node rather than starting over.

"In LangGraph, memory is checkpointing/persistence. Checkpointing saves the state of the agent's execution at each node in the graph, which is crucial for resuming execution, debugging and inspection, and asynchronous operations." — Google Cloud, Generative AI GitHub repository

Every run is associated with a thread_id. Reusing the same thread_id resumes the existing checkpoint; using a new value starts a fresh thread. For development, InMemorySaver is sufficient. For production, swap it for SqliteSaver or PostgresSaver—the graph logic does not change at all, only the one line that instantiates the checkpointer.

from langgraph.checkpoint.memory import InMemorySaver

# Development: in-memory checkpointer
checkpointer = InMemorySaver()
app = graph.compile(checkpointer=checkpointer)

# All invocations share state via thread_id
config = {"configurable": {"thread_id": "user-session-42"}}

# First turn
result = app.invoke(
    {"messages": [HumanMessage(content="What is the capital of France?")]},
    config=config
)

# Second turn — the agent remembers the first turn
result = app.invoke(
    {"messages": [HumanMessage(content="And what language do they speak there?")]},
    config=config
)

Beyond simple resumption, checkpointing enables time-travel debugging. You can inspect the state at any prior node, replay execution from a specific checkpoint without modification, or branch from a past state to explore an alternative execution path. These capabilities are surfaced through LangSmith, LangChain's observability platform, which provides visual traces of state transitions and execution paths in real time.

LangGraph also distinguishes between two memory scopes. Short-term memory is the state that flows through a single thread; it persists conversation history, tool outputs, and intermediate values for the duration of that thread. Long-term memory persists across sessions and threads, typically stored in a database or vector store. Long-term memory is accessed through stores that nodes can query and update, enabling agents that recall user preferences across separate conversations.

Note

Switching checkpointers in production requires only one line change. The LangGraph documentation explicitly states this is by design—your graph logic and all node implementations remain untouched. This makes it straightforward to move from a SQLite file to a Postgres instance as your workload scales. Source: Towards Data Science, LangGraph 201

Human-in-the-Loop with Interrupts

Full automation is not always the goal. Many production workflows—content approval pipelines, financial risk systems, medical record processing, deployment gates—require a human to inspect and approve agent actions before they execute. LangGraph handles this through its interrupt mechanism, which is one of the most carefully designed parts of the entire framework.

How Interrupts Work

Calling interrupt() inside a node does two things: it saves the complete graph state to the checkpointer, and it surfaces the interrupt payload to the caller under the __interrupt__ key. The graph waits indefinitely at that point. When the human is ready to continue, the caller invokes the graph again with a Command(resume=...) object. That resume value becomes the return value of the interrupt() call inside the node, and execution continues from exactly where it stopped.

"The interrupt function pauses graph execution and returns a value to the caller. When you call interrupt within a node, LangGraph saves the current graph state and waits for you to resume execution with input." — LangChain Docs, Interrupts

This is different from static breakpoints defined at compile time (interrupt_before and interrupt_after), which always pause at specific node boundaries. The interrupt() function is dynamic—it can be called conditionally anywhere inside a node's logic, making it far more flexible for real-world approval flows.

# Requires the SQLite checkpointer package: pip install langgraph-checkpoint-sqlite
import sqlite3
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt, Command
from langgraph.checkpoint.sqlite import SqliteSaver

class WorkflowState(TypedDict):
    task: str
    generated_output: str
    decision: str
    final_status: str

def generate_content(state: WorkflowState) -> dict:
    """Simulate content generation (replace with real LLM call)."""
    return {"generated_output": f"Draft: {state['task']} — [AI-generated content here]"}

def human_review(state: WorkflowState) -> dict:
    """Pause and present the draft to a human reviewer."""
    decision = interrupt({
        "instruction": "Review the generated content below and respond 'approve' or 'reject'.",
        "content": state["generated_output"]
    })
    return {"decision": decision}

def route_decision(state: WorkflowState) -> str:
    if state["decision"] == "approve":
        return "publish"
    return "discard"

def publish(state: WorkflowState) -> dict:
    return {"final_status": "Published successfully."}

def discard(state: WorkflowState) -> dict:
    return {"final_status": "Content discarded by reviewer."}

# Build graph
builder = StateGraph(WorkflowState)
builder.add_node("generate", generate_content)
builder.add_node("review", human_review)
builder.add_node("publish", publish)
builder.add_node("discard", discard)

builder.add_edge(START, "generate")
builder.add_edge("generate", "review")
builder.add_conditional_edges("review", route_decision, {
    "publish": "publish",
    "discard": "discard"
})
builder.add_edge("publish", END)
builder.add_edge("discard", END)

# Use SqliteSaver for durable persistence
conn = sqlite3.connect("workflow.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
app = builder.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "content-job-001"}}

# --- Run until the interrupt ---
result = app.invoke(
    {"task": "Write a summary of Q1 earnings", "generated_output": "", "decision": "", "final_status": ""},
    config=config
)
print(result["__interrupt__"])
# -> [Interrupt(value={'instruction': 'Review...', 'content': 'Draft: ...'}, ...)]

# --- Human reviews and approves ---
final = app.invoke(Command(resume="approve"), config=config)
print(final["final_status"])
# -> Published successfully.

Static vs. Dynamic Interrupts

When every invocation of a particular node requires human approval—such as a node that deletes database records or pushes to a production environment—compile-time interrupts are cleaner. They do not require modifying node code:

app = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["delete_records"],   # always pause before this node
    interrupt_after=["generate_response"]  # always pause after this node
)

Dynamic interrupt() calls, by contrast, let you trigger pauses conditionally. An agent processing a batch of documents might interrupt only when the confidence score falls below a threshold, passing routine documents through automatically and routing uncertain ones to a human queue.

Pro Tip

The payload passed to interrupt() can be any JSON-serializable dictionary. Design it as a UI contract: include the kind of decision being requested, the data the human needs to see, and instructions for valid responses. Your frontend reads result["__interrupt__"] and renders the appropriate review interface.

Multi-Agent Patterns

Single-agent graphs handle many problems well. For tasks that benefit from specialization—where different agents hold different knowledge, have access to different tools, or operate at different levels of abstraction—LangGraph supports multi-agent orchestration through three patterns: supervisor, swarm, and subgraphs.

Supervisor Pattern

In the supervisor pattern, one agent orchestrates a team of specialized worker agents. The supervisor receives the user's request, decides which worker to route to, receives the worker's output, and either routes to another worker or returns a final answer. LangGraph provides a high-level langgraph-supervisor library for this pattern, though you can also build it manually for tighter control.

from langgraph.graph import StateGraph, START, END
from langgraph.types import Command
from typing import TypedDict, Literal

class SupervisorState(TypedDict):
    task: str
    result: str
    next_agent: str

def supervisor(state: SupervisorState) -> Command[Literal["code_agent", "research_agent", "__end__"]]:
    """Route to the appropriate specialist or end the workflow.

    The Literal return annotation declares the possible destinations so the
    compiler can validate and render these dynamic edges."""
    task = state["task"].lower()
    if "code" in task or "python" in task:
        return Command(goto="code_agent", update={"next_agent": "code_agent"})
    elif "research" in task:
        return Command(goto="research_agent", update={"next_agent": "research_agent"})
    else:
        return Command(goto=END, update={"result": "No suitable agent found."})

def code_agent(state: SupervisorState) -> dict:
    # Real implementation calls an LLM with coding tools
    return {"result": f"Code agent handled: {state['task']}"}

def research_agent(state: SupervisorState) -> dict:
    # Real implementation calls an LLM with search tools
    return {"result": f"Research agent handled: {state['task']}"}

graph = StateGraph(SupervisorState)
graph.add_node("supervisor", supervisor)
graph.add_node("code_agent", code_agent)
graph.add_node("research_agent", research_agent)
graph.add_edge(START, "supervisor")
graph.add_edge("code_agent", END)
graph.add_edge("research_agent", END)

app = graph.compile()
result = app.invoke({"task": "Write a Python function to sort a list", "result": "", "next_agent": ""})
print(result["result"])

LinkedIn's AI Hiring Agent is a real-world example of the supervisor pattern. Their LangGraph implementation coordinates agents that handle conversational candidate search, requirement extraction, and profile matching—each a specialist operating under a supervisor that routes based on recruiter intent. Uber's developer platform team built a similar architecture for automated unit test generation and coding standards enforcement across internal repositories.

Swarm Pattern

The swarm pattern is a peer-to-peer alternative to the supervisor. Multiple agents can directly hand off tasks to one another without a central coordinator. LangGraph's langgraph-swarm library implements this pattern. Swarms work well when the handoff logic is local—each agent knows which peer is best suited for the next step—and when you want to avoid a single orchestration bottleneck.

Subgraphs for Modularity

Complex workflows benefit from nesting. A subgraph is a compiled LangGraph that can be used as a single node inside a parent graph. This lets you encapsulate related logic—a document parsing pipeline, a retrieval-augmented generation module, a validation loop—into a reusable component with its own internal state. Subgraphs expose only their input and output interface to the parent graph.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

# --- Define a reusable validation subgraph ---
class ValidationState(TypedDict):
    content: str
    is_valid: bool
    error_message: str

def validate_format(state: ValidationState) -> dict:
    if len(state["content"]) < 10:
        return {"is_valid": False, "error_message": "Content too short"}
    return {"is_valid": True, "error_message": ""}

def validate_content(state: ValidationState) -> dict:
    if "spam" in state["content"].lower():
        return {"is_valid": False, "error_message": "Content flagged as spam"}
    return {}

sub = StateGraph(ValidationState)
sub.add_node("check_format", validate_format)
sub.add_node("check_content", validate_content)
sub.add_edge(START, "check_format")
sub.add_edge("check_format", "check_content")
sub.add_edge("check_content", END)
validation_subgraph = sub.compile()

# The compiled subgraph can now be added as a node in any parent graph

Production Considerations

LangGraph is explicitly designed for production. The framework's documentation describes it as infrastructure for "long-running, stateful workflows" and lists durable execution, human-in-the-loop, comprehensive memory, and production-ready deployment as central benefits—not optional add-ons.

Streaming

LangGraph supports streaming at several granularities, which is critical for responsive user interfaces. Instead of calling .invoke() and waiting for the entire graph to complete, use .stream() to receive intermediate results as they occur: stream_mode="values" emits the full state after each node, stream_mode="updates" emits only each node's delta, and stream_mode="messages" yields token-by-token LLM output.

for event in app.stream(
    {"messages": [HumanMessage(content="Explain gradient descent")]},
    config=config,
    stream_mode="values"  # stream full state after each node
):
    last_message = event["messages"][-1]
    if hasattr(last_message, "content"):
        print(last_message.content, end="", flush=True)

Parallel Execution

When multiple nodes can execute without depending on each other's outputs, LangGraph runs them in parallel. The two canonical patterns are scatter-gather (distributing the same input to multiple agents and merging results at a downstream node) and pipeline parallelism (different agents handle sequential stages of a process concurrently). Parallelism is declared implicitly through the graph's structure: when several nodes receive edges from the same upstream node and have no dependencies on one another, the framework executes them concurrently in the same step.

Error Handling and Retries

Because every node is a plain Python function, you can add retry logic, fallbacks, and validation gates directly in your node implementation or as dedicated nodes. A routing function that checks retry_count in state and loops back to the failing node until a retry budget is exhausted is a first-class pattern in LangGraph rather than something bolted on. This makes complex workflows more predictable than ad-hoc ReAct prompts by giving you explicit, testable control over every transition.

Warning

Debugging distributed multi-agent graphs requires more effort than debugging linear code. When workflows fail inside complex graph structures, pinpointing the root cause can be significantly harder than tracing a traditional stack. LangSmith's visual execution trace tool is the recommended approach for diagnosing issues in production LangGraph deployments. Plan for this in your team's tooling budget before go-live.

LangGraph Platform

For teams that do not want to manage their own agent runtime infrastructure, LangChain offers LangGraph Platform (formerly LangGraph Cloud)—a deployment layer that handles hosting, state persistence, scaling, and the queue management that long-running agents require. LangGraph Studio, the visual prototyping environment, integrates directly with it for rapid iteration before production deployment.

Key Takeaways

  1. State is the foundation. Everything in LangGraph revolves around a typed, schema-driven state object. Invest time in designing your state schema upfront—adding fields later is easy, but a poorly designed state creates confusing reducer conflicts across nodes.
  2. Cycles are the feature, not a workaround. The ability to loop back from a tool node to an agent node is precisely what makes LangGraph more capable than linear chains. Build agents that retry, reflect, and refine as loops with explicit termination conditions rather than trying to prompt a single LLM call to do everything.
  3. Checkpointing enables everything else. Without a checkpointer, you cannot use interrupts, thread-scoped memory, or time-travel debugging. Add a checkpointer from the start, even if it is just InMemorySaver during development. Switching to a production database later is a one-line change.
  4. Use interrupt() for human oversight, not polling. The interrupt mechanism is LangGraph's solution to the problem of integrating human judgment into automated pipelines. It is far more robust than asking an LLM to pause and wait—execution is genuinely suspended and the state is persisted until the human responds.
  5. Match the pattern to the problem. A single ReAct agent handles many tasks. Use supervisor patterns when you need specialized agents with different tool sets. Use subgraphs to keep complex logic modular and testable. Adding multi-agent complexity before you need it increases debugging surface area without proportional benefit.

LangGraph's core strength is that it treats agentic workflows as engineering problems, not prompting problems. By giving you explicit state, typed transitions, persistent checkpoints, and first-class human oversight, it makes complex agent behavior something you can reason about, test, and operate in production with confidence. The framework is not a shortcut—it demands that you model your agent's behavior carefully. But that modeling work is exactly what separates reliable production agents from demos that only work when nothing goes wrong.

Source references: LangGraph GitHub (langchain-ai/langgraph) · langchain.com/langgraph · LangChain Docs: Interrupts · DataCamp: How to Build LangGraph Agents · IBM Think: What is LangGraph?
