When Anthropic open-sourced the Model Context Protocol in November 2024, it started as a small experiment with a handful of reference servers. A few months later, it had reached 97 million monthly SDK downloads, with over 10,000 active servers in production and native support built into ChatGPT, Cursor, Gemini, Microsoft Copilot, and Visual Studio Code.
Enterprise AI deployment is accelerating alongside it. Gartner projects that 40% of enterprise applications will embed agentic capabilities by end of 2026, up from 12% the previous year. Teams building at that scale need two things: a standardized way to connect agents to tools, and an orchestration layer that handles the complexity of multi-step, stateful workflows. MCP handles the first. LangGraph handles the second.
This blog walks through building a production-ready enterprise agent using LangGraph MCP integration, covering MCP server setup, LangChain MCP adapters, a ReAct graph with human-in-the-loop checkpointing, and the production patterns that keep the stack reliable at scale.
Key Takeaways
- MCP standardizes how agents connect to tools via JSON-RPC 2.0, eliminating custom integration code for every external system.
- LangGraph’s StateGraph gives agents typed, persistent state across tool calls, which is what separates a working prototype from a production-grade system.
- The MultiServerMCPClient lets one agent connect to multiple MCP servers simultaneously, across stdio, SSE, and Streamable HTTP transports.
- Decoupling reasoning nodes from execution nodes in the StateGraph makes failure handling predictable and dramatically easier to debug.
- Auth must be scoped per MCP server and per user session. A single shared token across servers is the most common security gap in enterprise agent deployments.
MCP Architecture and How It Works
MCP is an open standard from Anthropic that defines how AI agents communicate with external tools, APIs, and data systems. Instead of writing custom integration code for every tool your agent needs, you expose each tool through an MCP server. The agent connects to it via an MCP client and discovers what’s available at runtime.
The architecture has three components working together:
- MCP Server: hosts tools, resources, and prompts. Any MCP-compatible client can call it without custom integration per tool.
- MCP Client: sits inside your agent. Sends structured requests to servers, gets structured responses back.
- AI Agent: uses the client to call tools, then reasons over the results.
1. The Protocol Layer: JSON-RPC 2.0, Tool Schemas, and Server Structure
All MCP communication runs over JSON-RPC 2.0. Every tool call, response, and context packet follows the same structure: typed, schema-validated messages with request IDs that let the client match each response to its originating call.
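The wire format can be sketched as plain Python dicts. The method name tools/call and the shape of the result's content list follow the MCP specification; the tool name and arguments here are illustrative:

```python
import json

# Shape of an MCP tools/call exchange over JSON-RPC 2.0. The shared id is
# what ties the response back to the request.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_sales_data",
        "arguments": {"region": "US", "period": "Q3"},
    },
}
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "{\"revenue\": 142000}"}]
    },
}

# The client correlates by id, then parses the tool's text payload
assert request["id"] == response["id"]
payload = json.loads(response["result"]["content"][0]["text"])
```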
Each tool must be registered with a JSON schema that defines its input parameters, validation rules, and expected output format. The agent uses this schema to validate inputs before sending any request. An incomplete schema is one of the most common sources of runtime errors in production deployments.
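To make the schema requirement concrete, here is a minimal sketch of a tool schema and an input validator using plain dicts. The field names mirror the MCP inputSchema convention, and the validator is a simplified stand-in for full JSON Schema validation (FastMCP generates an equivalent schema automatically from Python type hints):

```python
# Hypothetical schema for a sales-query tool; a real server framework
# would generate this from the tool's signature and docstring.
QUERY_SALES_SCHEMA = {
    "name": "query_sales_data",
    "description": "Fetch sales data for a given region and period.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "region": {"type": "string"},
            "period": {"type": "string"},
        },
        "required": ["region", "period"],
    },
}

def validate_input(schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the call is well-formed."""
    props = schema["inputSchema"]["properties"]
    errors = [f"missing required field: {f}"
              for f in schema["inputSchema"].get("required", []) if f not in args]
    errors += [f"unknown field: {k}" for k in args if k not in props]
    return errors
```

Running the validator before dispatch is exactly the check an incomplete schema silently skips.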
2. Transport Options: stdio, SSE, and Streamable HTTP
MCP supports three transport types. Choosing the right one affects latency, scalability, and deployment complexity.
| Transport | Best For | Limitation |
|---|---|---|
| stdio | Local dev, single-machine tools | Not suitable for web servers or multi-user setups |
| SSE | Real-time streaming, remote servers | One-directional push; session management complexity |
| Streamable HTTP | Production, multi-user, cloud deployments | Slightly more setup overhead |
A note from the official langchain-mcp-adapters repo: stdio was designed for applications running on a user’s machine. Before using it in a web server context, evaluate whether a simpler @tool decorator solves the problem without the overhead.
3. Dynamic Tool Discovery at Runtime
When an MCP client connects to a server, it calls session.initialize() and then load_mcp_tools(session). The agent receives a list of available tools with their schemas, with no hardcoded tool definitions in the agent itself.
This matters for enterprise setups where tool availability changes. A new MCP server can be registered, and the agent discovers it on the next request without a code change or redeployment.
LangGraph in the Agent Stack
1. LangGraph vs LangChain: Different Layers, Different Responsibilities
LangChain handles chaining LLM calls, prompt templates, and basic tool use. LangGraph sits on top and handles orchestration, specifically stateful, multi-step workflows where the agent needs to make decisions, loop, branch, and recover from failures.
If you’re building an agent that calls one tool and returns an answer, LangChain is enough. If the agent needs to route between tools, remember state across calls, wait for human approval, or handle tool failures gracefully, LangGraph is the right layer.
2. StateGraph as an Orchestration Layer
In LangGraph, the agent is a graph. Each node is a function. Edges define which node runs next, and conditional edges let the agent branch based on current state.
The StateGraph holds everything the agent knows at any point in the workflow: messages, tool results, memory, and any custom fields you define. This persists across tool calls within a session, which is what lets agents recover from mid-task failures.
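A rough stdlib-only sketch of what that state can look like. The AgentState fields and the merge helper are hypothetical; in real code you would subclass MessagesState and rely on LangGraph's add_messages reducer, which accumulates messages the same append-style way:

```python
from typing import TypedDict

# Hypothetical custom state mirroring what a StateGraph carries between nodes
class AgentState(TypedDict):
    messages: list[dict]   # running conversation, appended to by every node
    tool_results: dict     # raw tool outputs keyed by tool name
    approved: bool         # set by a human-in-the-loop checkpoint

def merge_messages(state: AgentState, update: dict) -> AgentState:
    """Append-style reducer: new messages extend the list rather than
    replacing it, so no node can wipe the conversation history."""
    merged = dict(state)
    merged["messages"] = state["messages"] + update.get("messages", [])
    for key, value in update.items():
        if key != "messages":
            merged[key] = value
    return merged
```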
3. LangGraph vs CrewAI vs AutoGen: A Comparison for Enterprise Teams
| Capability | LangGraph | CrewAI | AutoGen |
|---|---|---|---|
| State control | Explicit, typed StateGraph | Role-based, less granular | Conversation-based |
| MCP support | Native via langchain-mcp-adapters | Limited, adapter needed | Limited, no native support |
| Observability | LangSmith integration built in | Basic logging | AutoGen Studio UI only |
| Human-in-the-loop | First-class, checkpoint-based | Manual implementation | Supported via user proxy |
| Enterprise maturity | LangGraph 1.0 GA (Oct 2025) | Pre-1.0 at time of writing | Forked: AG2 vs Microsoft AutoGen 0.4 |
For enterprise teams that need auditability, multi-user auth, and production observability, LangGraph is the most complete option.
Environment Setup and Project Structure
1. Python Version, Packages, and API Keys
MCP adapters require Python 3.11 or newer. Earlier versions will fail silently or throw adapter-level errors.
```bash
python --version  # Must be 3.11+
pip install langgraph langchain-mcp-adapters langchain langchain-openai
```

Set your API keys before running anything:

```bash
export OPENAI_API_KEY=your_key_here
export LANGSMITH_API_KEY=your_key_here  # Required for tracing
export LANGSMITH_TRACING=true
```

2. Project Folder Structure
Keep MCP servers and the agent code separate from the start. Mixing them into one file creates context manager scope errors in production.
```
/project
  /servers
    internal_data_server.py
    web_search_server.py
    crm_server.py
  agent.py
  client_config.py
```

Building MCP Servers
1. Internal Data MCP Server
This server exposes a database query tool. The agent calls it to fetch internal records without direct DB access from the agent runtime.
```python
# servers/internal_data_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-data")

@mcp.tool()
def query_sales_data(region: str, period: str) -> dict:
    """Fetch sales data for a given region and period."""
    # Replace with actual DB query
    return {"region": region, "period": period, "revenue": 142000}

if __name__ == "__main__":
    mcp.run(transport="stdio")
```

2. Web Search MCP Server
```python
# servers/web_search_server.py
from mcp.server.fastmcp import FastMCP
import httpx

mcp = FastMCP("web-search")

@mcp.tool()
def search_web(query: str) -> str:
    """Search the web and return a summary of top results."""
    # Plug in your preferred search API here; params handles URL encoding
    response = httpx.get("https://api.search.example.com", params={"q": query})
    return response.text

if __name__ == "__main__":
    mcp.run(transport="stdio")
```

3. CRM Write MCP Server
```python
# servers/crm_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("crm-write")

@mcp.tool()
def update_crm_record(contact_id: str, notes: str) -> str:
    """Update a CRM contact record with new notes."""
    # Replace with actual CRM API call
    return f"Record {contact_id} updated."

if __name__ == "__main__":
    mcp.run(transport="stdio")
```

Connecting LangGraph Agents to MCP Servers
1. MultiServerMCPClient Configuration
MultiServerMCPClient lets a single agent connect to multiple MCP servers simultaneously, across different transport types.
```python
# client_config.py
client_config = {
    "internal-data": {
        "command": "python",
        "args": ["servers/internal_data_server.py"],
        "transport": "stdio",
    },
    "web-search": {
        "command": "python",
        "args": ["servers/web_search_server.py"],
        "transport": "stdio",
    },
    "crm-write": {
        "url": "https://your-crm-mcp-server.example.com/mcp",
        "transport": "streamable_http",
        "headers": {
            "Authorization": "Bearer YOUR_TOKEN"
        },
    },
}
```

2. Dynamic Tool Discovery and Binding
Once the client config is defined, tool discovery happens automatically on connection. The agent receives all available tools from all servers as a flat list.
```python
# agent.py
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain.chat_models import init_chat_model
from client_config import client_config

async def run_agent(user_input: str):
    async with MultiServerMCPClient(client_config) as client:
        tools = await client.get_tools()
        model = init_chat_model("openai:gpt-4.1")
        agent = create_react_agent(model, tools)
        response = await agent.ainvoke({"messages": user_input})
        return response
```

The async with block is not optional. Closing the client context manager before the agent finishes its tool calls is the most common production error with MCP.
Building a Production-Ready Agent Architecture with LangGraph and MCP
1. Separating Thinking from Doing: the Decoupled Architecture Model
In a production LangGraph agent, the reasoning and the execution should live in separate nodes. The reasoning node decides what to do. The execution node does it. This separation makes failure handling cleaner and makes the graph easier to debug.
A flat agent that reasons and executes in the same node becomes difficult to trace when something goes wrong at scale.
2. Designing Your StateGraph: Router, Executor, Summarizer, and Fallback Nodes
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.prebuilt import ToolNode, tools_condition
from langchain.chat_models import init_chat_model

def build_graph(tools):
    model = init_chat_model("openai:gpt-4.1").bind_tools(tools)

    def router_node(state):
        return {"messages": [model.invoke(state["messages"])]}

    tool_node = ToolNode(tools)

    graph = StateGraph(MessagesState)
    graph.add_node("router", router_node)
    graph.add_node("executor", tool_node)
    graph.add_edge(START, "router")
    # tools_condition routes to a node named "tools" by default,
    # so map its outcomes to our "executor" node explicitly
    graph.add_conditional_edges("router", tools_condition, {"tools": "executor", END: END})
    graph.add_edge("executor", "router")
    return graph.compile()
```

| Node | Role | Triggers When |
|---|---|---|
| Router | LLM decides next tool or ends | Every step |
| Executor | Runs the selected tool via MCP | Router selects a tool |
| Summarizer | Condenses long tool outputs | Output exceeds context threshold |
| Fallback | Handles tool errors gracefully | Tool call raises an exception |
3. State Persistence and Memory Across Sessions
LangGraph checkpointing lets the agent save and resume state across sessions. Use SqliteSaver for local dev and PostgresSaver for production; with async entry points such as ainvoke, use the async variants (AsyncSqliteSaver, AsyncPostgresSaver).

```python
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

# The sync SqliteSaver cannot back ainvoke; use the async saver instead
async with AsyncSqliteSaver.from_conn_string("agent_state.db") as checkpointer:
    compiled_graph = graph.compile(checkpointer=checkpointer)

    # Pass a thread_id to persist state per user session
    config = {"configurable": {"thread_id": "user-session-001"}}
    response = await compiled_graph.ainvoke({"messages": user_input}, config)
```

Each thread_id maintains an independent conversation state. Without this, the agent starts fresh on every invocation.
4. Human-in-the-Loop Checkpoints: Placement and Trigger Conditions
Add an interrupt before any node that writes data: CRM updates, database writes, emails, financial transactions.
```python
graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["executor"]  # Pause before any tool execution
)
```

After the interrupt, the human reviews the planned action and either approves by calling ainvoke again with the same thread_id, or cancels by ending the thread.
5. Authentication at Scale: Scoped Access for Multi-User Deployments
A single shared MCP server with one auth token breaks down at enterprise scale. Different users need different access levels, and tool calls must carry user-scoped credentials.

```python
from langchain_mcp_adapters.client import MultiServerMCPClient

def get_user_scoped_client(user_token: str):
    return MultiServerMCPClient({
        "crm-write": {
            "url": "https://your-crm-mcp-server.example.com/mcp",
            "transport": "streamable_http",
            "headers": {
                "Authorization": f"Bearer {user_token}"
            },
        }
    })
```

Auth headers are only supported on the streamable_http and sse transports. If you’re using stdio in production with shared tool access, that’s the gap to close first.
Running, Testing, and Exposing the Agent
1. Local Test Run
```python
import asyncio
from agent import run_agent

async def main():
    result = await run_agent("What were sales in the US last quarter?")
    for message in result["messages"]:
        print(message.content)

asyncio.run(main())
```

2. Reading the Tool Trace Output
Most agent bugs don’t look like crashes. The agent runs, returns something, and the output looks plausible. The issue only surfaces when you check what actually got sent to the MCP server.
LangSmith gives you a full trace per run. Each node in the StateGraph appears as a step, and every tool call inside that step shows the tool name, the exact input payload, and the raw response. You can see token usage and latency broken down per node, not just per run.
The pattern to watch for is correct tool, wrong parameters. The agent picks the right server but constructs the input incorrectly, either missing a required field or passing a value in the wrong format. The MCP server returns an error, the agent receives it as a tool message, and depending on how error handling is set up, it either retries with a hallucinated fix or reports a false success.
To catch this before it reaches production, run LangSmith evals against a fixed set of test inputs. Each eval checks that the correct tool was called, the input matched the expected schema, and the final output was accurate. This is the minimum bar before scaling any agent to multi-user traffic.
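The eval logic itself is simple to sketch without any framework. The case fields and helper below are hypothetical stand-ins for what a LangSmith evaluator would assert against a traced run:

```python
# One eval case pins the tool the agent should pick for a query and
# the input fields that call must carry.
EVAL_CASES = [
    {
        "query": "US sales last quarter",
        "expected_tool": "query_sales_data",
        "required_fields": {"region", "period"},
    },
]

def check_run(case: dict, actual_tool: str, actual_args: dict) -> list[str]:
    """Return failures for a single traced run: wrong tool selected,
    or required input fields missing from the payload."""
    failures = []
    if actual_tool != case["expected_tool"]:
        failures.append(f"wrong tool: {actual_tool}")
    missing = case["required_fields"] - actual_args.keys()
    if missing:
        failures.append(f"missing fields: {sorted(missing)}")
    return failures
```

An empty failure list across all cases is the minimum bar the section describes before scaling to multi-user traffic.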
3. Exposing the Agent as an MCP Server
Once your LangGraph agent is deployed on LangGraph Platform, it automatically gets an /mcp endpoint. Other agents or MCP-compatible clients can call it as a tool.
```python
# Any external MCP client can now load your agent as a tool
from langchain_mcp_adapters.tools import load_mcp_tools
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def load_agent_tools():
    async with streamablehttp_client("https://your-agent.langgraph.app/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await load_mcp_tools(session)
```

Common Errors and Fixes
Most LangGraph MCP failures come from five specific mistakes.
1. MCP Client Context Manager Scope
Closing the MultiServerMCPClient context before the agent finishes kills all active tool connections mid-execution. Keep the entire agent invocation inside the async with block.
```python
# Wrong
client = MultiServerMCPClient(config)
tools = await client.get_tools()  # Context not open

# Correct
async with MultiServerMCPClient(config) as client:
    tools = await client.get_tools()
    # Run agent here, inside the context
```

2. stdio Server Subprocess Startup Timing
stdio servers start as subprocesses. If the agent tries to call a tool before the subprocess is ready, the call fails silently. Add a small startup delay or health check before the first tool call in local dev.
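A minimal health-check helper for local dev might look like this; the ping callable is a placeholder for any cheap server round-trip, such as listing tools:

```python
import time

def wait_until_ready(ping, attempts: int = 5, delay: float = 0.2) -> bool:
    """Poll a health-check callable until it succeeds or attempts run out.
    `ping` is any zero-arg callable that raises (or returns False) while
    the stdio subprocess is still starting up."""
    for _ in range(attempts):
        try:
            if ping():
                return True
        except Exception:
            pass  # server not ready yet; back off and retry
        time.sleep(delay)
    return False
```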
3. Tool Call Error Handling
Tool errors that aren’t caught propagate into the agent’s message state and often cause the LLM to hallucinate a successful result. Wrap tool nodes with explicit error handling.
```python
from langchain_core.messages import ToolMessage

def safe_tool_node(state):
    try:
        return tool_node(state)
    except Exception as e:
        # Report the failure back to the LLM as a proper tool message;
        # ToolMessage must carry the id of the tool call it answers
        tool_call = state["messages"][-1].tool_calls[0]
        return {"messages": [ToolMessage(content=f"Error: {e}", tool_call_id=tool_call["id"])]}
```

4. State Persistence Across Sessions
If the agent loses context between turns, the thread_id is either missing or changing between requests. Confirm the same thread_id is passed on every call for the same session.
5. Tool Name Conflicts Across Servers
When two MCP servers expose a tool with the same name, the adapter overwrites one with the other. Prefix tool names per server to avoid this.
| Error | Root Cause | Fix |
|---|---|---|
| Context lost mid-task | Client closed before agent finished | Keep agent inside async with block |
| Tool call fails silently | stdio subprocess not ready | Add startup health check |
| Agent ignores tool error | Unhandled exception in tool node | Wrap tool node with try/except |
| Session state lost | Missing or inconsistent thread_id | Pass same thread_id per session |
| Wrong tool called | Name conflict across servers | Prefix tool names by server |
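For the name-conflict row, a prefixing pass over the discovered tools is enough. This sketch assumes tool objects that expose a mutable name attribute, as LangChain tool objects do:

```python
# Hypothetical prefixing pass run after tool discovery, before binding
def prefix_tool_names(tools_by_server: dict[str, list]) -> list:
    """Return a flat tool list with names namespaced per server, so a
    'search' tool from two servers no longer collides."""
    flat = []
    for server, tools in tools_by_server.items():
        for tool in tools:
            tool.name = f"{server}__{tool.name}"
            flat.append(tool)
    return flat
```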
Production Best Practices
1. Scoped Auth per MCP Server
Each MCP server should have its own authentication scope. Never share a single token across servers with different access levels. Use streamable_http transport with per-request authorization headers.
2. Observability with LangSmith
Enable tracing before anything goes to production. The minimum set of metrics to track:
| Metric | What to Track | Tool |
|---|---|---|
| Tool call success rate | Failed vs successful tool invocations | LangSmith |
| Node latency | Time per node in the StateGraph | LangSmith |
| Token usage | Input/output tokens per run | LangSmith |
| Context loss events | Sessions where state was dropped | Custom logging |
| Tool selection accuracy | Correct tool chosen for the query | LangSmith evals |
3. Protocol Version Pinning
MCP is still evolving. Pin your MCP and adapter versions in requirements.txt to avoid breaking changes on upgrade.
```
langchain-mcp-adapters==0.1.8
langgraph==1.x.x
mcp==1.x.x
```

4. Multi-Agent Orchestration
In multi-agent setups, each agent should be a separate LangGraph graph exposed as an MCP server. The orchestrator agent discovers sub-agents via MCP tool discovery, same as any other tool. This keeps the architecture consistent and avoids custom inter-agent protocols.
5. Compliance Gates for Regulated Industries
For finance, healthcare, or legal use cases, add a compliance node before any write operation. The node checks the planned action against a policy ruleset and blocks or flags anything outside approved parameters. This runs automatically and logs every decision, separate from human-in-the-loop checkpoints.
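A compliance gate can be as simple as a pure function over the planned tool call. The POLICY ruleset and field names below are hypothetical; in the real graph this function would run as a node registered before the executor, with its decision logged on every pass:

```python
# Hypothetical policy ruleset keyed by tool name
POLICY = {
    "update_crm_record": {"max_notes_length": 2000},
    "transfer_funds": {"blocked": True},
}

def compliance_gate(planned_call: dict) -> dict:
    """Check a planned tool call against the ruleset and return a
    decision to log before the executor node is allowed to run."""
    rules = POLICY.get(planned_call["tool"], {})
    if rules.get("blocked"):
        return {"allowed": False, "reason": "tool blocked by policy"}
    limit = rules.get("max_notes_length")
    if limit and len(planned_call.get("args", {}).get("notes", "")) > limit:
        return {"allowed": False, "reason": "notes exceed policy limit"}
    return {"allowed": True, "reason": "within policy"}
```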
Deployment Options
1. Remote MCP Servers vs Local stdio
| Factor | Remote (streamable_http/SSE) | Local stdio |
|---|---|---|
| Scalability | Multi-user, horizontally scalable | Single process |
| Latency | Network overhead | Near-zero |
| Auth | Header-based, per-request | Process-level only |
| Best for | Production, cloud, enterprise | Local dev and testing |
2. Docker Compose for Multi-Service Agent Stacks
Run each MCP server as its own container. The agent container connects to them over HTTP.
```yaml
version: "3.9"
services:
  agent:
    build: .
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - internal-data-server
      - crm-server
  internal-data-server:
    build: ./servers/internal_data
    ports:
      - "8001:8001"
  crm-server:
    build: ./servers/crm
    ports:
      - "8002:8002"
```

3. LangGraph Platform vs AWS Bedrock AgentCore vs Azure
LangGraph Platform gives you one-click deployment with built-in memory APIs, human-in-the-loop, and the /mcp endpoint out of the box. AWS Bedrock AgentCore and Azure offer more infrastructure control but require more setup to replicate the same observability and state management features.
For teams already on Azure, deploying the agent on Azure Container Apps with LangSmith for tracing is a practical middle ground.
4. MCP, A2A, and agents.json: Interoperability Ahead
MCP handles tool access. A2A (Agent-to-Agent protocol) handles agent-to-agent communication. agents.json is an emerging convention for declaring what an agent can do, similar to robots.txt for crawlers.
Together, these three form the foundation of interoperable agentic systems where agents built on different frameworks can discover and call each other. LangGraph’s /mcp endpoint positions any deployed LangGraph agent as a participant in this ecosystem without additional configuration.
How Kanerika Builds Enterprise AI Agents with LangGraph and MCP
Kanerika’s agentic AI practice covers production agent deployments across finance, retail, logistics, and compliance-heavy industries. Two products from Kanerika’s portfolio map directly to roles in a LangGraph MCP architecture.
Karl-AI Data Insights Agent: Karl is a native Microsoft Fabric workload that lets users query structured enterprise data in plain language. It translates intent into validated SQL, enforces role-based access per user session, and returns results with visual explanations rather than raw data. In a multi-agent MCP setup, Karl fits as a scoped data MCP server with typed, auditable responses. Available on the Azure Marketplace with a free 30-day trial, connecting to Fabric Lakehouse, Excel, CSV, and PostgreSQL out of the box.
DokGPT-Document Intelligence Agent: DokGPT handles the retrieval layer that a document-based MCP server would expose in a production stack. It queries large document repositories using natural language, operating within defined knowledge boundaries. In a verified investment bank deployment, it delivered:
- 43% faster information retrieval
- 35% reduction in manual review hours
- 100% role-based compliance maintained
Kanerika holds Microsoft Solutions Partner status for Data and AI and has worked with 100+ enterprise clients over 10+ years, maintaining a 98% client retention rate. For teams moving from prototype to production on LangGraph and MCP, Kanerika’s Agentic AI practice covers the full build.
FAQs
What are LangChain MCP adapters?
LangChain MCP adapters (langchain-mcp-adapters) is an official package that converts MCP tool schemas into LangChain-compatible tool objects. This allows LangGraph agents to bind and call MCP tools using the same interface as native LangChain tools, without writing custom integration code.
What is a LangGraph MCP server?
A LangGraph MCP server is an MCP-compatible server that exposes tools, resources, or prompts that a LangGraph agent can call. It can run locally via stdio transport or remotely via SSE or streamable HTTP transport. In LangGraph’s Agent Server, the agent itself can also be exposed as an MCP server via the /mcp endpoint.
How do you add a human-in-the-loop checkpoint in LangGraph?
Use LangGraph’s interrupt inside a graph node. When the agent reaches that node, execution pauses and the interrupt payload is surfaced to the reviewer. The graph resumes only when a human decision is provided via the thread’s state. This is the recommended pattern for any agent action that writes to external systems.
How does Kanerika use LangGraph and MCP in enterprise deployments?
Kanerika builds production agentic AI systems that apply MCP for tool standardization and LangGraph for orchestration. KARL (AI Data Insights Agent), Alan (legal document summarization), DokGPT (document intelligence), Mike (quantitative proofreading), and Susan (PII redaction) are each purpose-built agents that function as independent MCP-compatible capabilities. In a multi-agent architecture, a LangGraph orchestrator routes tasks to the right specialist via the MCP protocol, with FLIP providing the governed data pipeline layer that agents query.
Can a LangGraph agent itself be exposed as an MCP server?
Yes. When deployed via LangGraph’s Agent Server, the agent automatically exposes a /mcp endpoint using streamable HTTP transport. Any MCP-compatible client, including another LangGraph agent, Claude Desktop, or an IDE, can call the deployed agent as a tool. This is how multi-agent orchestration is built without custom message-passing infrastructure.



