Context Guard Cloud

Integration Guide

Context Guard is a hosted proxy that sits in front of your LLM calls. You do not need to clone a repo, run Docker, or self-host anything for the free trial. Just create an API key in Settings, point your client at https://api.ctx-guard.com, and add your key header.

01 - Overview

How it works

Your app sends prompts to Context Guard first. We run a hybrid detection pipeline (regex rules, source-aware analysis, and an optional ML classifier) before forwarding upstream.

Flow: Your App → Context Guard Proxy → OpenAI / Anthropic → Response back to your app
  • Create an API key in Settings
  • Change your LLM client base URL to https://api.ctx-guard.com
  • Add X-API-Key: cg_live_... to every request
  • Keep using your normal OpenAI / Anthropic SDK

The hosted proxy ships with rules + source-aware detection enabled by default (p50 ~1.3ms). The ML classifier is opt-in for workloads that need higher recall on adversarial attacks. See Detection Architecture for the full pipeline.

02 - Architecture

Detection architecture

Context Guard layers three detection stages. Each stage runs only when earlier stages don't already have a confident decision, so latency stays low on the hot path.

Hybrid pipeline
  1. Regex rule engine (~0.5ms p50)

    Curated pattern set covering known prompt-injection phrasings, role-override attempts, exfiltration markers, and tool-misuse signatures. Hot-reloadable without restarts.

  2. Source-aware analysis (included in ~1.3ms p50)

    Distinguishes user input from system / tool / retrieved content and weights risk accordingly. Catches indirect injection embedded in retrieved documents or tool output.

  3. ML classifier (optional, ~84ms when it fires)

    DeBERTa-v3 fine-tuned for prompt-injection detection. Disabled by default. Only runs when the rule layer is silent, and only elevates risk. It never overrides a rule decision.
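The staged behaviour described above can be illustrated with a small sketch. This is not Context Guard's implementation: the two regex patterns, the source weights, the `Verdict` type, and the `inspect_prompt` function are all hypothetical, and the ML stage is stubbed as a callable. It only shows the ordering: rules first, source weighting on a rule hit, and ML consulted only when the rules are silent.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical toy rules (pattern, base risk); not the real rule set.
RULES = [
    (re.compile(r"ignore (all )?previous instructions", re.I), 0.9),
    (re.compile(r"reveal (the )?system prompt", re.I), 0.8),
]

# Hypothetical weights: content that did not come from the user is
# treated as riskier, since instructions found there are indirect.
SOURCE_WEIGHT = {"user": 1.0, "tool": 1.2, "retrieved": 1.2}


@dataclass
class Verdict:
    risk: float
    decided_by: str  # "rules", "ml", or "none"


def inspect_prompt(
    text: str,
    source: str = "user",
    ml_score: Optional[Callable[[str], float]] = None,
) -> Verdict:
    # Stage 1: regex rules. Take the highest-risk matching rule.
    rule_risk = max((risk for pat, risk in RULES if pat.search(text)), default=0.0)
    if rule_risk > 0.0:
        # Stage 2: weight the rule hit by where the text came from, capped at 1.0.
        weighted = min(1.0, rule_risk * SOURCE_WEIGHT.get(source, 1.0))
        # Rules are authoritative: the ML stage is never consulted here.
        return Verdict(weighted, "rules")

    # Stage 3: optional ML, only when the rules were silent.
    # It can raise the risk above zero but has no rule decision to override.
    if ml_score is not None:
        ml_risk = ml_score(text)
        if ml_risk > 0.0:
            return Verdict(ml_risk, "ml")

    # Silent rules + silent (or disabled) ML: the request passes through.
    return Verdict(0.0, "none")
```

Passing `ml_score=None` models the default hosted configuration, where only the rule and source-aware stages run.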

Benchmark        Recall   Precision   FPR
BIPIA            100%     100%        0%
TensorTrust      100%     100%        0%
CyberSecEval     97.4%    100%        0%
JailbreakBench   77.1%    93.1%       13.3%
AdvBench         22.5%    100%        0%

Numbers reflect the hybrid pipeline (rules + source-aware + ML). The rule layer is production-ready standalone. ML adds recall on adversarial categories at the cost of per-request latency when it fires.

03 - ML Classifier

ML classifier (optional)

An opt-in DeBERTa-v3 model that runs after the rule layer to catch adversarial prompts the regex rules don't have signatures for.

What it is
  • Model: protectai/deberta-v3-base-prompt-injection-v2
  • ~184M parameters, CPU-only inference (no GPU required)
  • Fine-tuned for prompt-injection classification
  • Loaded lazily on first request after enablement
How it works
  • Runs only when the rule layer is silent (no rule match)
  • Can elevate the risk score when it detects injection
  • Never overrides a rule decision: rules are authoritative
  • Silent rule + silent ML = the request passes through
Latency expectations
  • Rules-only p50: ~0.5ms
  • Rules + source-aware p50: ~1.3ms
  • When ML fires: ~84ms (added to the request)

Because ML only runs on rule-silent traffic, average added latency depends on your traffic mix. Clean traffic in production typically sees the rule-layer latency on most requests.
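As a rough way to reason about the traffic-mix effect, the figures above can be combined into a back-of-the-envelope estimate. This is a sketch under stated assumptions: it treats the quoted p50 values as if they were averages (which they are not, strictly), and `ml_fraction`, the share of requests on which the classifier actually runs, is something you would have to measure on your own traffic.

```python
RULE_LAYER_MS = 1.3  # rules + source-aware p50, from the figures above
ML_MS = 84.0         # latency added when the ML classifier runs


def estimated_mean_latency_ms(ml_fraction: float) -> float:
    """Rough mean added latency if ML runs on `ml_fraction` of requests.

    Assumption: every request pays the rule-layer cost, and the ML cost
    is paid only on the fraction of requests where the classifier runs.
    """
    if not 0.0 <= ml_fraction <= 1.0:
        raise ValueError("ml_fraction must be between 0 and 1")
    return RULE_LAYER_MS + ml_fraction * ML_MS


for fraction in (0.0, 0.05, 0.5):
    print(f"ML on {fraction:.0%} of requests: ~{estimated_mean_latency_ms(fraction):.1f} ms")
```

With ML running on 5% of requests the estimate is about 5.5 ms; at 50% it is about 43.3 ms, which is why the traffic mix matters more than the rule-layer figure once ML is enabled.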

Enable it

The ML classifier is opt-in. For self-hosted / enterprise deployments, set the environment variable on the proxy:

bash
# Enable the optional ML classifier
export CG_ENABLE_ML=1

On the hosted cloud proxy, ML is off by default. Contact us if you want it enabled on your tenant.

When to use it: production systems that need higher recall on adversarial or novel-phrasing attacks (e.g. JailbreakBench-style content) and can tolerate ~84ms on rule-silent requests. If you need a consistently low p50 (~1.3ms with rules + source-aware detection), keep ML off. The rule layer is production-ready on its own.

04 - Quick Start

Fastest possible setup

If you already have an API key, this is the minimum change required.

bash
curl -X POST https://api.ctx-guard.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: cg_live_your_key_here" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

That's it. Same shape as OpenAI - just send the request to Context Guard instead.
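For scripts that don't use an SDK, the same request can be built with the Python standard library. This is a minimal sketch mirroring the curl call above. The `build_chat_request` and `send` helpers are illustrative names, and the optional `provider_key` parameter is an assumption: it sends the provider key as a standard `Authorization: Bearer` header, which the curl example does not include.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.ctx-guard.com"


def build_chat_request(
    cg_key: str,
    prompt: str,
    model: str = "gpt-4o-mini",
    provider_key: Optional[str] = None,  # assumption: only if your setup needs it
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request to the proxy."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json", "X-API-Key": cg_key}
    if provider_key:
        headers["Authorization"] = f"Bearer {provider_key}"
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions", data=body, headers=headers, method="POST"
    )


def send(req: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Usage (requires a valid key and network access):
#   reply = send(build_chat_request("cg_live_your_key_here", "Hello"))
#   print(reply["choices"][0]["message"]["content"])
```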

05 - OpenAI

OpenAI SDK integration

Keep using the official SDK. Just change the base URL and add your Context Guard key.

python
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-key",
    base_url="https://api.ctx-guard.com/v1",
    default_headers={"X-API-Key": "cg_live_your_key_here"},
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

print(response.choices[0].message.content)

javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-openai-key",
  baseURL: "https://api.ctx-guard.com/v1",
  defaultHeaders: { "X-API-Key": "cg_live_your_key_here" },
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});

console.log(response.choices[0].message.content);

06 - Anthropic

Anthropic integration

Same idea - point the client at Context Guard and include your key header.

python
import anthropic

client = anthropic.Anthropic(
    api_key="your-anthropic-key",
    base_url="https://api.ctx-guard.com",
    default_headers={"X-API-Key": "cg_live_your_key_here"},
)

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
)

# message.content is a list of content blocks; print the text of the first one
print(message.content[0].text)

07 - Alerts

Webhooks

Send threat events to Slack, a SIEM, or your internal incident pipeline.

Configure webhook endpoints in Settings. You can subscribe to block, redact, log, and allow events.

json
{
  "event": "block",
  "request_id": "req_123",
  "risk_score": 0.97,
  "threat_type": "prompt_injection",
  "severity": "critical",
  "timestamp": "2026-05-07T13:00:00Z"
}
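On the receiving side, a handler typically parses the payload and turns it into a short alert line. The sketch below assumes every event carries the fields shown in the sample payload above, which may not hold for all event types. The `summarize_event` function and the `PAGE_ON` set are hypothetical, and the returned `{"text": ...}` shape simply matches what a Slack incoming webhook accepts.

```python
import json

# Hypothetical policy: which severities should page someone.
# "critical" is the only severity value shown in the sample payload.
PAGE_ON = {"critical"}


def summarize_event(raw_body: bytes) -> dict:
    """Turn a webhook payload into a one-line alert plus a paging flag."""
    event = json.loads(raw_body)
    text = (
        f"[{event['severity'].upper()}] {event['event']} "
        f"{event['threat_type']} (risk {event['risk_score']:.2f}, "
        f"request {event['request_id']}) at {event['timestamp']}"
    )
    return {"text": text, "page": event["severity"] in PAGE_ON}


sample = b"""{
  "event": "block",
  "request_id": "req_123",
  "risk_score": 0.97,
  "threat_type": "prompt_injection",
  "severity": "critical",
  "timestamp": "2026-05-07T13:00:00Z"
}"""
print(summarize_event(sample)["text"])
```

Keeping the parsing in a pure function like this makes it easy to unit-test independently of whichever web framework receives the POST.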
08 - API

API reference

Main endpoints you'll actually use on the hosted service.

Method  Endpoint                                        Purpose
POST    https://api.ctx-guard.com/v1/chat/completions   OpenAI-compatible proxy
POST    https://api.ctx-guard.com/v1/messages           Anthropic-compatible proxy
POST    https://api.ctx-guard.com/api/v1/inspect        Direct prompt inspection
GET     https://api.ctx-guard.com/api/v1/threats        Threat log
GET     https://api.ctx-guard.com/api/v1/stats          Dashboard stats
GET     https://api.ctx-guard.com/api/v1/settings       Read settings
PUT     https://api.ctx-guard.com/api/v1/settings       Update settings

Use X-API-Key on your requests. Your LLM provider key stays in the normal SDK auth field.

09 - Troubleshooting

Common errors

The main ones trial users are likely to hit.

401 Invalid API key

Your Context Guard key is missing, revoked, or malformed.

403 API key expired

Your trial ended or the key expiry date passed.

429 Rate limited

You hit the per-key request cap. Slow down or upgrade.

503 Upstream error

The underlying model provider returned an error or timed out.
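One way to act on these status codes in client code is to retry only the transient ones. The policy below is a sketch, not an official recommendation: treating 429 and 503 as retryable and 401/403 as fatal is an assumption based on the descriptions above, and the helper names and default delays are illustrative.

```python
import random
from typing import Iterator

# Assumption: rate limits and upstream errors are transient;
# 401 and 403 need a new or renewed key, so retrying will not help.
RETRYABLE_STATUSES = {429, 503}


def should_retry(status: int) -> bool:
    """Return True if a request with this HTTP status is worth retrying."""
    return status in RETRYABLE_STATUSES


def backoff_delays(
    max_retries: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
    jitter: bool = True,
) -> Iterator[float]:
    """Yield exponential backoff delays in seconds, capped at `cap`.

    With jitter enabled, each delay is drawn uniformly from [0, delay]
    so that many clients do not retry in lockstep.
    """
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        yield random.uniform(0, delay) if jitter else delay
```

A caller would loop over `backoff_delays()`, sleeping for each delay and retrying while `should_retry(status)` is true, and surface 401 or 403 to the operator immediately.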

10 - Plans

Trial & upgrade

Free trial users use the hosted cloud proxy. Self-hosting is not part of the free-trial path.

Need to host locally?

If you need a private or self-hosted deployment, contact us to discuss an enterprise setup.