# OpenAI Drop-In Proxy

Change one URL to get enterprise PII protection with zero code changes.

## The Killer Use Case

Change one URL. Get enterprise PII protection. Zero code changes.

### Before (unprotected)
```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Customer Julia Schneider (julia@firma.de) needs help"}]
)
# Julia's name and email sent directly to OpenAI
```

### After (protected by OGuardAI proxy)
The proxy runs on port 8081 by default (separate from the OGuardAI server on port 3000 or 8080). Point your SDK's base_url at the proxy:
```python
client = OpenAI(
    api_key="sk-...",
    base_url="http://localhost:8081/v1"  # Only change: point to OGuardAI proxy (port 8081)
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Customer Julia Schneider (julia@firma.de) needs help"}]
)
# OpenAI sees: "Customer {{person:p_001}} ({{email:e_001}}) needs help"
# Response automatically restored: "Dear Frau Julia Schneider..."
```

## How It Works
- Your app sends a request to the OGuardAI proxy (port 8081)
- The proxy masks PII in all user/assistant messages
- The proxy forwards the masked request to the real OpenAI API
- OpenAI responds with tokens (e.g., "Dear {{person:p_001}}")
- The proxy restores the tokens to real values
- Your app receives the response with real names
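The mask-forward-restore cycle above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a single email-detection regex and a dict-backed session; none of these names come from the actual OGuardAI implementation:

```python
import re

# Hypothetical sketch of the proxy's core cycle: detect PII, swap in a
# stable token, and map the token back in the LLM's response.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask(text: str, session: dict) -> str:
    """Replace each email with a {{email:e_NNN}} token, recording the mapping."""
    def repl(m):
        token = f"{{{{email:e_{len(session) + 1:03d}}}}}"
        session[token] = m.group(0)
        return token
    return EMAIL_RE.sub(repl, text)

def restore(text: str, session: dict) -> str:
    """Substitute recorded tokens back with their original values."""
    for token, value in session.items():
        text = text.replace(token, value)
    return text

session = {}
masked = mask("Contact julia@firma.de", session)   # "Contact {{email:e_001}}"
reply = restore("Dear {{email:e_001}} owner", session)
```

The key design point the sketch shows is that the token-to-value mapping lives in per-request session state on the proxy side, so the upstream API only ever sees the tokens.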
Setup
# Start OGuardAI proxy
oguardai-proxy --target https://api.openai.com --policy default --port 8081
# Or with Docker
docker run -p 8081:8081 ghcr.io/oronts/oronts-guardai/oguardai-proxy:latest \
--target https://api.openai.com \
--policy defaultStreaming Support
Streaming works transparently:
```python
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    stream=True  # Streaming works through the proxy
)
for chunk in stream:
    print(chunk.choices[0].delta.content, end="")
```

## What Gets Protected
| Message Role | Scanned? |
|---|---|
| system | No (preserves your prompts) |
| user | Yes |
| assistant | Yes |
| tool calls | Yes (function arguments) |
| tool results | Yes |
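The role-based scanning rule in the table can be expressed as a small predicate. This is an illustrative sketch; the function and set names are assumptions, not the real OGuardAI API:

```python
# Roles that get PII-scanned, per the table above (system is passed
# through untouched by design, to preserve your prompts).
SCANNED_ROLES = {"user", "assistant", "tool"}

def should_scan(message: dict) -> bool:
    """Return True if this message's content would be PII-scanned."""
    return message.get("role") in SCANNED_ROLES

messages = [
    {"role": "system", "content": "You are a support agent."},
    {"role": "user", "content": "Customer Julia Schneider needs help"},
]
print([should_scan(m) for m in messages])  # [False, True]
```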
## Anthropic Support
Same pattern for Anthropic:
```python
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-ant-...",
    base_url="http://localhost:8081"
)
```

## Configuration
The proxy respects all OGuardAI policy settings:
```bash
# Use a specific policy
oguardai-proxy --target https://api.openai.com --policy strict-pii --port 8081

# With German formal restore
oguardai-proxy --target https://api.openai.com --policy german-support --port 8081
```

## How Token Restoration Works
When the LLM generates output containing tokens like `{{person:p_001}}`, the proxy automatically restores them using the session state created during the request transformation. The restore mode (full, partial, masked, formatted, abstract) is controlled by the policy configuration.
For example, with the german-support policy:

- Input: "Customer Julia Schneider (julia@firma.de) needs help"
- To OpenAI: "Customer {{person:p_001}} ({{email:e_001}}) needs help"
- From OpenAI: "Dear {{person:p_001}}, we have received your request..."
- To your app: "Dear Frau Julia Schneider, we have received your request..."
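The mode-dependent restoration described above can be sketched like this. The modes come from the policy list in the docs, but this dispatch function and its exact behavior per mode are illustrative assumptions, not the real policy engine:

```python
# Hypothetical sketch: how a restore mode changes what replaces a token.
def restore_token(token_value: str, mode: str) -> str:
    if mode == "full":
        return token_value                       # e.g. "Julia Schneider"
    if mode == "masked":
        return "█" * len(token_value)            # same length, no content
    if mode == "partial":
        first, _, last = token_value.partition(" ")
        return f"{first} {last[:1]}." if last else token_value
    return token_value  # other modes (formatted, abstract) are policy-defined

print(restore_token("Julia Schneider", "partial"))  # "Julia S."
```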
## Architecture

```
Your App
  |
  | (standard OpenAI SDK calls)
  v
OGuardAI Proxy (port 8081)
  |
  | 1. Transform: mask PII in request
  | 2. Forward: send safe request to OpenAI
  | 3. Rehydrate: restore tokens in response
  |
  v
OpenAI API (never sees real PII)
```

## Limitations
- The proxy adds a small latency overhead for PII detection and token restoration
- System messages are not scanned (by design, to preserve your prompts)
- The proxy requires network access to both your app and the target API
- Session state is per-request; multi-turn conversations need session continuity (the proxy handles this automatically via sealed session blobs)
## Troubleshooting

**Tokens not being restored:** check that the LLM is faithfully reproducing the token format `{{type:id}}`. OGuardAI includes a 3-stage token repair pipeline (strict, repair, fuzzy) that handles common LLM token mangling.
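The strict-then-repair cascade mentioned above can be sketched with two regexes. Only the first two stages are shown, and the patterns here are an illustrative assumption, not the real OGuardAI pipeline:

```python
import re

# Stage 1: exact {{type:id}} format.
STRICT = re.compile(r"\{\{(\w+):(\w+)\}\}")
# Stage 2: tolerate common LLM mangling (stray spaces, a dropped brace).
LOOSE = re.compile(r"\{\{?\s*(\w+)\s*:\s*(\w+)\s*\}?\}")

def find_tokens(text: str):
    """Return (type, id) pairs, falling back to looser matching if needed.
    A third, fuzzy stage would follow in the real pipeline."""
    hits = STRICT.findall(text)
    if hits:
        return hits
    return LOOSE.findall(text)

print(find_tokens("Dear {{person:p_001}}"))      # [('person', 'p_001')]
print(find_tokens("Dear {{ person : p_001 }"))   # [('person', 'p_001')]
```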
**Performance:** the proxy typically adds under 10 ms of overhead for PII detection on short inputs. For large inputs, consider using the chunking API directly.