Skip to content

AInvirion/aiproxyguard

AIProxyGuard

License GHCR Docs

LLM Security Proxy with Prompt Injection Detection.

What It Does

AIProxyGuard sits between your application and LLM providers to detect and block malicious inputs before they reach the model. Point your OpenAI/Anthropic SDK at the proxy instead of directly at the provider.

AIProxyGuard Flow

Quick Start

# Run the proxy
docker run -d -p 8080:8080 ghcr.io/ainvirion/aiproxyguard:latest

# Verify it's running
curl http://localhost:8080/healthz

Point your LLM client to the proxy:

from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:8080/openai/v1"
)

# Normal requests work as expected
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Malicious requests are blocked
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Ignore all previous instructions..."}]
)
# Raises: BadRequestError - content_blocked

Detection-Only Mode

Use the /check endpoint to scan text without forwarding to an LLM:

curl -X POST http://localhost:8080/check \
  -H "Content-Type: application/json" \
  -d '{"text": "Ignore all previous instructions"}'

# Response:
# {"action": "block", "category": "prompt-injection", "signature_name": "Ignore instructions directive", "confidence": 0.9}

Features

  • Multi-Provider Routing - OpenAI, Anthropic, OpenRouter, Ollama
  • Detection-Only Mode - /check endpoint for pre-validation
  • Request & Response Scanning - Regex + heuristics detection
  • Policy Engine - Per-category actions (block/warn/log)
  • Rate Limiting - iptables-based DDoS protection
  • Prometheus Metrics - Full observability at /metrics
  • Control Plane - Fleet management, automatic signature sync

Detection Categories

Category Description
prompt-injection Instruction override attempts
jailbreak DAN mode, persona exploits
encoding-bypass Base64/hex/ROT13 obfuscation
delimiter-injection JSON/XML structure attacks
indirect-injection Tool abuse, plugin exploits
unicode-evasion Homoglyphs, fullwidth chars
role-manipulation Named character roleplay

Documentation

Full documentation at ainvirion.github.io/aiproxyguard

Control Plane

Connect to aiproxyguard.com for fleet management and automatic signature updates:

docker run -d -p 8080:8080 \
  -e AIPROXYGUARD_CONTROL_PLANE_ENABLED=true \
  -e AIPROXYGUARD_CONTROL_PLANE_API_KEY=your-api-key \
  ghcr.io/ainvirion/aiproxyguard:latest

Development

python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

Apache-2.0 - Copyright (c) 2025-2026 AInvirion LLC

About

AIProxyGuard is a security proxy that sits between your application and LLM providers (OpenAI, Anthropic, etc.) to detect and block prompt injection attacks, jailbreak attempts, and other malicious inputs.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors