Threat research & playbooks

The Context Guard blog

Field notes from defending production LLM applications: prompt injection, context poisoning, OWASP LLM Top 10 coverage, and the engineering behind the proxy.

LLM Tool Abuse Attacks: Shell Injection, SSRF, Credential Theft, and 252 Other Ways Your Agent Can Be Turned Against You
Featured · Threat research
4 July 2026 16 min read

AI agents call tools on your behalf. When an attacker controls the arguments, the agent becomes a weapon aimed at your infrastructure. Tool abuse is the largest attack category in production LLM deployments, with 252 detection rules covering shell injection, SQL injection, path traversal, SSRF, credential harvesting, sandbox escapes, MCP exploitation, deserialization RCE, and mass assignment. Here are the nine attack families, the real payloads, and the four-layer defense architecture that stops tool-call attacks before they execute.

All posts

LLM Authentication Attacks: OAuth Token Theft, Session Hijacking, and Identity Bypass in AI Platforms
Threat research

OAuth token replay, CSRF bypass, scope escalation, IDOR in agent workspaces, and cross-user identity hijacking are the authentication attack classes that compromise AI platforms at the identity layer. The model is the entry point; the identity system is the prize. Backed by disclosed vulnerabilities in MCP OAuth flows, Langflow IDOR, and Open WebUI authorization bypasses, here is the full threat map and the defense architecture that closes the gaps.

1 July 2026 14 min

Email and Communication Channel Injection: How Attackers Hijack AI Assistants Through Slack, Email, and Shared Docs
Threat research

CVE-2026-33654, CVE-2025-32711 (EchoLeak), CVE-2025-46059, and GitHub comment hijacking demonstrate that email bodies, hidden HTML content, AGENTS.md files, and Slack messages are all live injection channels. The attacker never touches the user prompt. Here are the six attack vectors, the CVEs behind them, and the defense architecture that secures every channel.

28 June 2026 13 min

LLM Sandbox Escapes: How AI Agents Break Out of Containment
Threat research

From unsandboxed Python execution disguised as isolation, to Docker socket privilege escalation, to managed identity token theft from cloud MCP servers, sandbox escapes in LLM agents are well documented and growing. Here are the six attack families, the CVEs that prove them real, and the defense architecture that stops them.

25 June 2026 16 min

LLM Platform Vulnerabilities: IDOR, BOLA, GPU Leaks, and the Seven Attack Classes That Bypass Prompt Security
Threat research

IDOR, BOLA, GPU memory leaks, OAuth bypass, postMessage confirmation bypass, decompression bombs, and metadata manipulation are seven platform-level vulnerability classes that no prompt injection filter will catch. Backed by 30+ real security advisories from Open WebUI, vLLM, and Langflow, here is the full threat map and the defense architecture that closes the gaps.

22 June 2026 15 min

Conditional Trigger Attacks: How Delayed-Action Injections Bypass Every Filter
Threat research

Conditional trigger attacks plant dormant instructions in an LLM's context that only activate when a future condition is met. The attack is invisible to single-request inspection, and the request that finally triggers the breach looks clean. Here are the five attack patterns, the two detection rules that catch them, and the defense architecture that stops time-bomb injections before they fire.

19 June 2026 14 min

MCP Supply Chain Attacks: 30 CVEs, Rug Pulls, and the Trust Model That Broke
Threat research

Thirty CVEs in 60 days, the first malicious MCP server hitting 300 organizations, and a design-level RCE baked into Anthropic's SDK. The MCP supply chain is under active attack. Here is the full incident map and the defense architecture that stops it.

16 June 2026 15 min

Agentic Web Attacks: How Attackers Exploit AI Browsers That Browse the Internet
Threat research

AI agents that browse the web are under active attack. Hidden instructions in web pages, browser manipulation, UI deception, credential harvesting, data exfiltration through forms, and MCP tool hijacking are six attack classes that exploit the trust agents place in web content. Backed by the WAAA research and production attack patterns, here is the full threat map and the five-layer defense architecture.

13 June 2026 13 min

LLM Template Injection: How Template Engines Become Prompt Injection Vectors
Threat research

Jinja2, Django templates, and Python format strings are the plumbing of every LLM pipeline. When attackers inject template syntax into that plumbing, they bypass every prompt filter and achieve data exfiltration from the application server. CVE-2025-65106 proved it in LangChain. Here are the five attack vectors and the defense architecture that stops them.

10 June 2026 12 min

LLM Code Execution Attacks: How Sandbox Escapes Turn AI Assistants Into Attack Platforms
Threat research

Sandbox escapes, pickle deserialization RCE, trust_remote_code execution, MCP server command injection, and self-propagating agent worms are the five code execution attack classes we see in production. Backed by CVEs, GitHub advisories, and published research, here is the full threat map and the defense architecture that stops your AI assistant from becoming an attack platform.

7 June 2026 13 min

Agent Memory Poisoning: How Attackers Plant Persistent Backdoors in LLM Memory
Threat research

When an attacker poisons an agent's persistent memory, the compromise survives restarts, persists across sessions, and spreads to child agents through inheritance. Here are the five memory poisoning attack classes we detect in production and the defense architecture that stops poisoned memories from becoming persistent backdoors.

4 June 2026 14 min

LLM Supply Chain Attacks: How Compromised Models, Plugins, and Dependencies Subvert Your AI Stack
Threat research

Compromised model weights, malicious MCP servers, template injection, sandbox escapes, SSRF, and framework vulnerabilities give attackers a path into your LLM stack that no prompt filter can close. Here are the six supply chain attack classes we see in production, the CVEs and advisories behind them, and the defense architecture that stops them.

3 June 2026 14 min

LLM Denial of Service: How Resource Exhaustion Attacks Drain Your AI Budget
Threat research

LoopTrap termination poisoning, ThinkTrap infinite reasoning, RECUR recursive reflection abuse, and tool-chain cost amplification are four distinct attack classes that exploit the fact that LLMs keep working if nobody tells them to stop. Here is how each one works, why token limits do not help, and the five-layer defense that caps costs before they spiral.

28 May 2026 13 min

Invisible Prompt Injection: How Hidden Unicode Characters Bypass LLM Security
Threat research

Zero-width characters, Unicode tag sequences, bidirectional overrides, and homoglyphs let attackers smuggle malicious instructions past every keyword filter and human reviewer. The text you see is not the text the model sees. Here is how each invisible injection technique works and the normalize-decode-detect pipeline that stops them.

25 May 2026 13 min

System Prompt Leakage: Why Your AI's Hidden Instructions Are Not Hidden
Threat research

Every LLM application has a system prompt. Most teams treat it as a secret. It is not. System prompt leakage (OWASP LLM07) is one of the most exploited vulnerability classes in production LLM applications, and the extraction techniques range from trivially simple to sophisticated multi-turn probing campaigns. Here is the full threat map, the seven detection rules that catch every extraction method, and why treating your system prompt as a security boundary is a losing strategy.

22 May 2026 12 min

Multilingual Prompt Injection: How Non-English Attacks Bypass Your Defenses
Threat research

Most LLM security filters are built for English. But models speak dozens of languages, and attackers use German, Spanish, Korean, and Russian to walk right past English-only defenses. Here is how multilingual injection works and how to build a defense that does not stop at the language border.

19 May 2026 11 min

LLM Output Exfiltration: How Attackers Steal Data Through Your Model's Response
Threat research

Markdown images, base64 coercion, cipher output, emoji substitution, and tool-call exfiltration are among the seven output-side attack techniques that bypass traditional DLP. Here is how each one works and the multi-layer defense that stops them.

16 May 2026 12 min

AI Governance Crisis: Why Most Companies Are Deploying AI Without Authority
Governance

A deep dive into the governance vacuum behind enterprise AI adoption, from shadow AI and procurement failures to PII leakage, regulatory exposure, and the technical controls companies need before they scale.

15 May 2026 14 min

CI/CD Pipeline Injection: When Your Build Bot Has an LLM Inside
Threat research

LLM-powered CI/CD workflows are a new attack surface that traditional pipeline security cannot defend. The Heimdallr research, CVE-2025-65106, and real-world attack patterns show how PR descriptions, commit messages, and template injection can compromise your build pipeline from the inside.

13 May 2026 13 min

RAG Data Exfiltration: How Attackers Steal Your Knowledge Base
Threat research

RAG systems give LLMs access to proprietary data. Attackers have figured out how to pull it all out through the model itself. Here is how the LeakDojo attack works, how enumeration probes map your knowledge base, and how to lock it down.

13 May 2026 12 min

Securing Autonomous AI Agents: Attack Surfaces, Threats, and Defense Patterns
Threat research

Autonomous AI agents can browse the web, call APIs, and send emails on your behalf. Here are the seven attack classes we see in production and the six-layer defense architecture that stops them.

12 May 2026 14 min

Why We Built a Hybrid Detection Engine
Engineering

Per-dataset benchmark results for the Context Guard hybrid pipeline (rules plus ML judge), where each layer wins, the AdvBench ceiling, and why we run both.

11 May 2026 8 min

MCP Security Attacks: How Attackers Hijack AI Tool Calls in 2026
Threat research

Three CVEs, multiple GitHub advisories, and growing academic research expose MCP tool hijacking, SSE injection, LoopTrap, and agentic browser attacks. Here is the full threat map and how to defend against it.

10 May 2026 14 min

AI Security Best Practices for Production LLM Applications
Guide

An end-to-end practical guide to shipping production LLM applications safely: input validation, output filtering, agent controls, monitoring, and compliance.

8 May 2026 13 min

OWASP LLM Top 10 2025: Every Risk Explained with Mitigations
Reference

Walk through every item in the OWASP LLM Top 10 with practical mitigations and a coverage map for runtime defense layers.

4 May 2026 11 min

10 Real Prompt Injection Attacks & How to Stop Them
Tutorial

A practical tour of ten prompt injection techniques observed in production traffic, with payloads and the detection logic that stops each one.

30 April 2026 12 min

What Is Context Poisoning? The Complete Guide for 2026
Threat research

Context poisoning is the next-generation cousin of prompt injection. Learn what it is, how it differs, real-world attack scenarios, and how to defend against it.

22 April 2026 10 min