Threat Modeling MCP in the Real World
People like to describe MCP as “USB-C for AI.”
It’s a good line. It explains why people care.
USB-C made hardware interoperability easier. MCP makes tool interoperability easier. Build once, connect everywhere, move faster.
The problem with good metaphors is that they are usually true in one way and dangerously false in another.
USB-C looks like a cable problem. MCP looks like a protocol problem.
But the hard part isn’t the connector. The hard part is delegation.
When an AI client connects to tools through MCP, it is not just moving data. It is moving authority: who can read what, who can trigger what, and under which identity.
That shift is what many threat models miss.
They evaluate MCP like an integration layer, when they should evaluate it like an authorization fabric.
Why this matters now
Standards compress engineering cost. They also compress attacker learning curves.
Before MCP, every integration had custom quirks. That was messy for developers and inconvenient for attackers. With standardization, we gain velocity and lose diversity. A weakness in common implementation patterns becomes reusable across many environments.
This doesn’t mean MCP is unsafe. It means MCP is now important enough to threat model as first-class infrastructure.
The teams that do this early will avoid the coming cycle: rapid adoption, soft defaults, then expensive retrofitting under incident pressure.
The core modeling error
Most AppSec teams start with the wrong question:
“Could the model call a bad tool?”
That question is too narrow.
The better question is:
What trust boundaries are crossed when this model asks for a capability, and what prevents a bad actor from crossing them too?
In practice, MCP systems involve at least five principals:
- the end user,
- the MCP client,
- the MCP server,
- one or more authorization servers,
- and downstream APIs/tools.
Every one of these can fail independently. Most serious incidents are interaction failures between them.
Threat modeling MCP as a system, not a feature
A useful MCP threat model starts by marking concrete boundaries.
Boundary 1: Untrusted input to capability request
Prompts, files, links, and third-party tool responses can all influence model behavior. If untrusted content can shape tool-call intent without policy mediation, the model becomes an amplifier for attacker instructions.
This is the classic “prompt injection” story, but in MCP environments the real risk is not bad text. It is bad text that can cause authorized side effects.
The tool-invocation boundary must enforce strict schema validation and robust type-checking on every incoming request. The model can hallucinate parameters, format malicious payloads (SQL injection in a query parameter, path traversal in a file path), or inject unexpected fields that bypass downstream validation. The MCP server must validate every tool-call request against the tool’s declared input schema before execution: model output is untrusted input to the server. This is the same principle as never trusting client-side validation in web applications.
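As a sketch of that principle, here is a minimal server-side validator. The schema shape and field names are illustrative, not the MCP SDK’s actual API; a production server would use a full JSON Schema library.

```python
# Minimal sketch of server-side tool-call validation.
# Schema shape and field names are illustrative, not the MCP SDK's API.

ALLOWED_TYPES = {"string": str, "integer": int, "boolean": bool}

def validate_tool_call(schema: dict, arguments: dict) -> list[str]:
    """Return a list of validation errors; empty means the call may proceed."""
    errors = []
    props = schema.get("properties", {})
    # Reject unexpected fields: unknown keys may bypass downstream validation.
    for key in arguments:
        if key not in props:
            errors.append(f"unexpected field: {key}")
    # Check required fields before execution.
    for key in schema.get("required", []):
        if key not in arguments:
            errors.append(f"missing required field: {key}")
    # Check declared types: model output is untrusted input to the server.
    for key, spec in props.items():
        if key in arguments:
            expected = ALLOWED_TYPES.get(spec.get("type"))
            if expected and not isinstance(arguments[key], expected):
                errors.append(f"wrong type for {key}: expected {spec['type']}")
    return errors
```

The deny-by-default stance on unknown fields is the point: anything the schema did not declare is treated as hostile, not ignored.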
Boundary 2: Client ↔ MCP server transport
If transport and request semantics are treated as trusted once connected, attackers only need a valid foothold to start abusing protocol-level assumptions. Session handling, request binding, and state transitions matter more than teams expect.
Boundary 3: MCP server ↔ OAuth discovery chain
MCP authorization relies on metadata discovery and endpoint traversal. That creates fetch behavior that can be abused when URL validation is weak. Security guidance now explicitly calls out SSRF risks during OAuth metadata discovery, including internal IP targeting and cloud metadata endpoints.
Boundary 4: MCP proxy ↔ third-party authorization server
This is where “confused deputy” attacks appear. In proxy architectures, static client IDs, dynamic registration, and consent cookies can combine into a dangerous path where user consent is bypassed in practice, even while each component appears standards-compliant in isolation.
Boundary 5: MCP server ↔ downstream APIs
Token usage patterns define blast radius. If an MCP server accepts and forwards tokens not properly audience-bound to itself, you get token passthrough behavior: weak accountability, weak policy enforcement, and easier lateral abuse.
Boundary 6: Local host ↔ local MCP server process
Local server installation flows are not harmless convenience features. They are code execution pathways. If “one-click” setup can run opaque startup commands with broad host privileges, your threat model has already failed.
The high-value attack paths to model first
Teams often over-index on speculative model failure modes and under-index on boring, reliable infrastructure attacks. Start with these.
1) Confused deputy in OAuth proxy patterns
Security best-practice guidance for MCP now documents a practical confused-deputy chain: static client ID at the third-party auth server, dynamic client registration at the MCP layer, prior consent cookie in browser context, and weak per-client consent enforcement at the proxy.
Result: authorization code theft without fresh user intent.
This is important because it is subtle. No single component looks obviously broken. The architecture is.
2) Token passthrough anti-pattern
If an MCP server functions as a blind token relay, it loses security semantics it was supposed to enforce: audience checks, scoped authorization, request accountability, and actionable audit trails.
Token passthrough looks expedient during early integration and becomes expensive debt later.
3) SSRF through authorization metadata discovery
OAuth discovery requires fetching metadata from URLs that may be influenced by remote parties. Without strict URL and network controls, clients can be induced to request internal services, cloud metadata endpoints, or rebinding-controlled hosts.
This is not theoretical. It is exactly how discovery systems fail when convenience outruns egress control.
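A pre-fetch check for discovery URLs can be sketched like this. It is deliberately incomplete: a real deployment must also pin the resolved IP for the actual connection (to defeat DNS rebinding) and route through an egress proxy; the function only shows the shape of the decision.

```python
import ipaddress
from urllib.parse import urlparse

# Sketch of pre-fetch validation for OAuth metadata URLs. Real deployments
# must also pin the resolved IP for the connection itself (DNS rebinding)
# and limit redirect following; this shows only the basic gate.

def is_safe_metadata_url(url: str, resolved_ip: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False  # plaintext discovery invites tampering
    ip = ipaddress.ip_address(resolved_ip)
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
        return False  # blocks internal services and cloud metadata endpoints
    return True
```

Passing the resolved IP in explicitly, rather than resolving inside the check, is what lets the caller reuse the same address for the connection instead of resolving twice.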
4) Session hijack and event injection in stateful deployments
When session identifiers become de facto authentication or are weakly bound in multi-node/evented deployments, attackers can inject payloads or impersonate clients. Session IDs are correlation artifacts, not trust anchors.
5) Local server compromise via installation flow
MCP security guidance explicitly flags this: local server setup can embed dangerous startup commands, obfuscated execution chains, or high-privilege operations. If users cannot clearly inspect and consent to exact commands, you are one social-engineering step away from host compromise.
Properties and Controls
A defensible MCP deployment requires concrete security properties, each enforced by a specific control layer. Property without implementation is policy theater. Implementation without a clear property to enforce is security theater. You need both.
Layer 1: Capability governance
Property: Authorization is explicit and contextual — the model can request capability, but only policy grants it.
- Maintain a server/tool allowlist per environment.
- Classify tools by impact tier.
- Require stronger approvals for destructive and externalized actions.
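A minimal policy gate implementing these bullets might look like this. The tool names and tier labels are hypothetical; the point is the control flow: the model may request any tool, but only the allowlist plus impact tier decides whether it runs.

```python
# Sketch of explicit capability governance. Tool names and tier labels
# are illustrative; the structure is what matters: deny by default,
# escalate approval with impact.

ALLOWLIST = {
    "search_docs": {"tier": "read"},
    "send_email": {"tier": "externalized"},
    "delete_record": {"tier": "destructive"},
}

TIERS_REQUIRING_APPROVAL = {"destructive", "externalized"}

def authorize(tool: str, user_approved: bool) -> bool:
    entry = ALLOWLIST.get(tool)
    if entry is None:
        return False  # not on this environment's allowlist: deny by default
    if entry["tier"] in TIERS_REQUIRING_APPROVAL:
        return user_approved  # high-impact actions need fresh, explicit approval
    return True
```

Note that the approval flag gates only the high-impact tiers; read-tier tools flow through without friction, which is what keeps the policy from being routinely bypassed.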
Layer 2: OAuth hardening
Property: Consent is per client, not per vague prior session — consent state must bind to concrete client identity, redirect URI, scope set, and anti-CSRF state.
- Enforce exact redirect URI matching.
- Require PKCE and one-time, short-lived state.
- Store consent and state only after explicit approval.
- Bind consent to client identity, scope, and redirect target.
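The PKCE and state material behind these bullets can be sketched with the standard library. Parameter construction follows RFC 7636; the storage and consent-binding steps are elided.

```python
import base64
import hashlib
import secrets

# Sketch of PKCE and anti-CSRF state generation per RFC 7636.
# Consent storage and binding are elided; only the crypto material is shown.

def make_pkce_pair() -> tuple[str, str]:
    verifier = secrets.token_urlsafe(32)
    challenge = base64.urlsafe_b64encode(
        hashlib.sha256(verifier.encode()).digest()
    ).rstrip(b"=").decode()
    return verifier, challenge  # send challenge now, verifier at token exchange

def fresh_state() -> str:
    return secrets.token_urlsafe(32)  # one-time, unguessable anti-CSRF value

def redirect_uri_matches(registered: str, presented: str) -> bool:
    return registered == presented  # exact string match, no prefix/suffix logic
```

Exact string matching on the redirect URI is intentional: any prefix, substring, or pattern matching reopens the open-redirect paths that confused-deputy chains rely on.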
Layer 3: Token discipline
Property: Tokens are audience-bound and least-privilege — no broad standing tokens, no ambiguous audience, no passthrough shortcuts.
- Reject tokens not issued for the MCP server audience.
- Issue short-lived tokens with narrow scopes.
- Separate read-only and high-impact capabilities by token class.
Layer 4: Network egress controls
Property: Discovery fetches are network-constrained — private ranges, link-local, localhost, and suspicious redirects are blocked unless explicitly allowed for development.
- Enforce HTTPS for metadata fetches in production.
- Block private, link-local, and loopback ranges by default.
- Validate redirect hops and limit automatic following.
- Route through an egress proxy where feasible.
Layer 5: Runtime containment
Property: Sessions are not authentication substitutes — session identifiers must be unpredictable, scoped, and bound to verified user context.
- Never treat session IDs as authentication.
- Bind runtime events to authenticated user context.
- Sign and audit high-impact action requests.
- Support an immediate kill switch and token revocation.
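Signing high-impact action requests, as the bullets above require, can be sketched with an HMAC. Key management and the exact field layout are assumptions here; the point is that the request is bound to an authenticated principal, not to a session identifier.

```python
import hashlib
import hmac

# Sketch of signing and verifying high-impact action requests so runtime
# events bind to an authenticated user, not a session ID. Key management
# is elided; the field layout is illustrative.

def sign_action(key: bytes, user_id: str, action: str, nonce: str) -> str:
    msg = f"{user_id}|{action}|{nonce}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_action(key: bytes, user_id: str, action: str,
                  nonce: str, sig: str) -> bool:
    expected = sign_action(key, user_id, action, nonce)
    return hmac.compare_digest(expected, sig)  # constant-time comparison
```

The nonce gives each high-impact request a one-time identity, which is also what makes the signature a usable audit artifact after the fact.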
Layer 6: Local server safety
Property: Local execution is sandboxed and inspectable — any command path must be visible, consented, and privilege-constrained.
- Show the exact startup command before execution.
- Flag dangerous patterns such as sudo, destructive filesystem operations, and opaque shell chains.
- Sandbox file, network, and process permissions.
- Require explicit privilege elevation, never implicit.
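A flagging pass over the startup command might look like this. The pattern list is deliberately incomplete and illustrative; this is a prompt for human review before consent, not a sandbox.

```python
import re

# Sketch of flagging risky patterns in a local MCP server's startup command
# before showing it to the user for consent. Patterns are illustrative and
# deliberately incomplete; this prompts review, it does not sandbox.

RISKY_PATTERNS = [
    (r"\bsudo\b", "privilege elevation"),
    (r"\brm\s+-rf\b", "destructive filesystem operation"),
    (r"curl[^|]*\|\s*(ba)?sh", "opaque pipe-to-shell chain"),
    (r"base64\s+(-d|--decode)", "obfuscated payload decoding"),
]

def flag_command(cmd: str) -> list[str]:
    return [label for pattern, label in RISKY_PATTERNS if re.search(pattern, cmd)]
```

A clean result does not mean the command is safe, only that none of the known-bad shapes matched; the exact command must still be displayed verbatim before execution.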
This looks like overhead until the first incident. After that it looks cheap.
What to measure (or you are flying blind)
Most teams still track only adoption metrics: number of MCP servers connected, number of tool calls, median response latency.
Those are product metrics, not risk metrics.
Track these instead:
- percent of tool calls authorized by explicit policy rule,
- percent of tokens with minimal scope profile,
- blocked metadata fetch attempts to restricted networks,
- mean time to revoke compromised MCP credential paths,
- count and age of policy exceptions for high-impact tools.
If your dashboard can’t show these, your threat model is prose, not operations.
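Turning tool-call logs into those numbers is mechanical. The record fields here (`authorized_by`, `tier`, `opened_day`) are hypothetical log-schema assumptions, not part of MCP; the sketch just shows that these metrics are a few lines of aggregation, not a platform project.

```python
# Sketch of computing two of the risk metrics above from event logs.
# Record field names are hypothetical log-schema assumptions.

def policy_coverage(calls: list[dict]) -> float:
    """Percent of tool calls authorized by an explicit policy rule."""
    if not calls:
        return 0.0
    explicit = sum(1 for c in calls if c.get("authorized_by") == "policy_rule")
    return 100.0 * explicit / len(calls)

def exception_ages(exceptions: list[dict], now_day: int) -> list[int]:
    """Age in days of each open policy exception for high-impact tools."""
    return [now_day - e["opened_day"] for e in exceptions if e.get("tier") == "high"]
```

If coverage trends down or exception ages trend up, the policy layer is eroding in practice even while it looks intact on paper.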
The strategic takeaway
MCP is a useful standard, and it will likely become foundational for agent ecosystems.
That is exactly why security teams should stop treating it as a connector and start treating it as delegated authorization infrastructure.
The phrase “USB-C for AI” is good marketing.
But security work starts where the metaphor ends.
Because in production, MCP does not just connect systems. It connects trust domains.
And when trust domains are connected, design mistakes compound faster than model errors.
The organizations that understand this early will not merely avoid incidents. They will build agent capabilities they can safely scale.
That is the real competitive advantage.
References
- Model Context Protocol: Introduction — https://modelcontextprotocol.io/docs/getting-started/intro.md
- MCP Security Best Practices — https://modelcontextprotocol.io/docs/tutorials/security/security_best_practices.md
- MCP Authorization Specification — https://modelcontextprotocol.io/specification/latest/basic/authorization
- OAuth 2.0 Security Best Current Practice (RFC 9700) — https://datatracker.ietf.org/doc/html/rfc9700