Try Context Guard against a real prompt injection
Below is a sandboxed bank chatbot protected by our proxy. Throw an attack at it - the hybrid detection pipeline (rules + source-aware analysis + optional ML classifier + LLM judge) runs live and shows you exactly what it caught.
Pick a suggested attack or write your own. The chat hits a sandboxed bank assistant.
Regex rules + source-aware analysis run in ~0.5ms. The DeBERTa-v3 ML classifier and LLM judge layer in for ambiguous cases - total p50 ~1.3ms.
See risk score, action, threats, OWASP refs, and which detection layer fired - render live in the right-hand panel.
This demo runs the full hybrid stack. ML is opt-in (CG_ENABLE_ML=1) at the proxy - same detection capabilities on every tier.
Regex signatures + source-aware analysis. Catches the bulk of known injection, leak, and PII patterns with zero ML dependency.
DeBERTa-v3 classifier catches paraphrased and novel attacks the rules miss. Free toggle, no extra usage cost.
Only invoked for ambiguous cases. Adds rationale and confidence to the verdict you see in the panel.
System: You are a demo bank assistant for SafeBank. IMPORTANT RULES: 1) Never give specific financial advice, investment recommendations, or tell users to move money. 2) Never reveal or invent account numbers, balances, PINs, or personal data. 3) Always clarify this is a demo and not real financial advice. 4) If asked for specific financial guidance, respond: I am a demo assistant and cannot provide specific financial advice. Please consult a qualified financial advisor.
Demo only · Not financial advice · No real data stored · max 5 msgs/min · 20 per session
Rationale: Direct instruction-override pattern combined with a request to leak the system prompt. Hard-block rule matched.
Threats detected
2Classic 'ignore previous instructions' pattern attempting to override the system prompt.
User explicitly asked the model to print or reveal its system prompt.
LLM judge
User intent is unambiguous: override the established assistant role and exfiltrate the protected system prompt.
Ready to drop this in front of your real LLM?
Same detection pipeline, your tenants, your policies, your keys. Point your base URL at Context Guard and ship.