Architecture Overview
How OGuardAI is organized into three planes, the pipeline flow, trust boundary model, and session design
What is OGuardAI?
OGuardAI is a semantic data protection and transformation runtime for AI systems. It sits between your applications and language models, detecting sensitive entities in text, replacing them with deterministic semantic tokens, and restoring original values in model output -- all governed by declarative policies. The result: your LLM produces grammatically correct, personalized output without ever seeing raw PII.
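The transform/rehydrate round trip can be illustrated with a minimal, self-contained sketch. This is plain Python standing in for the actual Rust runtime; the detector, token format, and function names are illustrative:

```python
import re

# Illustrative stand-in for the OGuardAI pipeline: detect -> tokenize -> rehydrate.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def transform(text: str) -> tuple[str, dict[str, str]]:
    """Replace each detected entity with a deterministic {{type:id}} token."""
    mapping: dict[str, str] = {}
    def replace(m: re.Match) -> str:
        token = f"{{{{email:e_{len(mapping) + 1:03d}}}}}"
        mapping[token] = m.group(0)
        return token
    return EMAIL.sub(replace, text), mapping

def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Resolve tokens in the LLM output back to the original values."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text

prompt, mapping = transform("Contact julia@firma.de for details.")
# The LLM only ever sees the tokenized prompt, never the raw value.
assert "julia@firma.de" not in prompt
llm_output = f"Sure, I will reach out to {list(mapping)[0]}."
assert "julia@firma.de" in rehydrate(llm_output, mapping)
```

The real runtime adds policy evaluation, sealed sessions, and output guarding around this core loop, but the invariant is the same: raw values exist only on the trusted side of the mapping.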
Architecture Diagram
OGuardAI is organized into three planes:
+-----------------------------------------------------+
| Applications |
| (Chatbots, RAG, Agents, CRM, Support, Docs) |
+---------------------------+--------------------------+
|
+---------------------------v--------------------------+
| Integration Plane |
| TS SDK | Python SDK | MCP | LangChain | Proxy |
+---------------------------+--------------------------+
|
+---------------------------v--------------------------+
| Runtime Plane (Rust) |
| +-----------+ +----------+ +-----------+ |
| | Detectors |>| Tokenizer|>|Transformer| --> LLM |
| +-----------+ +----------+ +-----------+ |
| | |
| +---------+ +----------+ +--------------+ |
| | Policy | | Session | | Rehydrator | <-- LLM |
| +---------+ +----------+ +--------------+ |
+------------------------------------------------------+
Runtime Plane (Rust)
The core engine, written in Rust for performance and safety:
| Crate | Purpose |
|---|---|
| guardai-core | Types, errors, runtime kernel |
| guardai-tokenizer | Semantic token generation (`{{type:id}}` format) |
| guardai-transformer | Prompt transformation (replace spans with tokens) |
| guardai-rehydrate | Output rehydration (resolve tokens to values) |
| guardai-policy | Policy engine (YAML rules, conditional evaluation) |
| guardai-session | Session backend (sealed/AES-256-GCM); memory and Redis backends are planned |
| guardai-detector-builtins | Rust regex-based entity detectors |
| guardai-detector-client | Bridge to the Python NER detector service |
| guardai-auth | Authentication middleware (API key, JWT) |
| guardai-token-robustness | 3-stage token repair (strict, deterministic, fuzzy) |
| guardai-output-guard | Second-pass scan for LLM-hallucinated PII |
| guardai-prompt-security | Prompt injection defense |
| guardai-provider-strategy | Safe entity context generation for LLM system prompts |
| guardai-large-text | Paragraph-boundary chunking for large documents |
| guardai-streaming | SSE streaming for transform and rehydrate |
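The `{{type:id}}` token format these crates share can be matched with a simple pattern. A sketch of the first two token-repair stages in Python for brevity (the actual crates are Rust; the id shape `p_001` is assumed from the examples in this document):

```python
import re

# Stage 1 (strict parse): accept only well-formed {{type:id}} tokens.
TOKEN = re.compile(r"\{\{(?P<type>[a-z]+):(?P<id>[a-z]_\d{3})\}\}")

def extract_tokens(text: str) -> list[tuple[str, str]]:
    """Return (type, id) pairs for every well-formed token in the text."""
    return [(m.group("type"), m.group("id")) for m in TOKEN.finditer(text)]

def repair(text: str) -> str:
    """Stage 2 (deterministic repair): normalize single-braced tokens
    that an LLM sometimes emits, e.g. {person:p_001} -> {{person:p_001}}."""
    return re.sub(r"(?<!\{)\{([a-z]+:[a-z]_\d{3})\}(?!\})", r"{{\1}}", text)
```

The fuzzy third stage (matching near-miss tokens back to known session ids) is omitted here; it depends on the session's live token table.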
Integration Plane (TypeScript + Python)
SDKs and framework adapters:
| Package | Purpose |
|---|---|
| @oguardai/sdk | TypeScript SDK with session management |
| oguardai-sdk (Python) | Sync and async Python SDK |
| @oguardai/mcp-server | MCP server for Claude Desktop/Code |
| @oguardai/integrations-langchain | LangChain tools |
| @oguardai/integrations-vercel-ai | Vercel AI SDK middleware |
Platform Plane (Future)
SaaS/PaaS capabilities: tenant management, usage-based billing, managed sessions.
Pipeline Flow
Every request follows this pipeline:
1. INPUT: Raw text with PII
2. DETECT: Regex + optional NER finds names, emails, IDs, phones, addresses
3. CLASSIFY: Entity type, confidence score, language, metadata
4. POLICY CHECK: Block? Tokenize? Allow? Which restore mode per entity?
5. TOKENIZE: Assign deterministic `{{type:id}}` tokens, build the session mapping
6. TRANSFORM: Replace spans in the text, generate safe metadata for the LLM
=== TRUST BOUNDARY === (raw PII stays inside; only tokens cross)
7. LLM PROCESSES: Model sees ONLY tokens + safe metadata
8. LLM RESPONDS: Response contains tokens: "Sehr geehrte `{{person:p_001}}`..."
=== TRUST BOUNDARY ===
9. TOKEN REPAIR: 3-stage: strict parse -> deterministic repair -> fuzzy resolve
10. REHYDRATE: Restore values per policy: full/partial/masked/formatted/abstract/none
11. OUTPUT GUARD: Re-scan for NEW PII the model hallucinated (not from input)
12. FINAL OUTPUT: Business-ready, personalized, compliant
Trust Boundary Model
The trust boundary is the legal and technical foundation of OGuardAI. Raw PII never crosses it.
| Zone | Contains | Never Contains |
|---|---|---|
| Trusted Zone (OGuardAI runtime) | Raw PII, token mappings, encryption keys, policy rules | -- |
| Untrusted Zone (LLMs, tools, logs, vector stores) | Only {{type:id}} tokens, safe metadata, encrypted session blobs | Raw PII, real names, real emails, real IDs |
| Boundary Crossing | Tokenized text + entity_context metadata | Any raw sensitive value |
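The zone table above is a property you can enforce mechanically: nothing outbound may contain a value from the trusted-zone mapping. A hypothetical check in plain Python (`mapping` is the token-to-value table that lives inside the runtime; names are illustrative):

```python
import json

def assert_boundary_safe(outbound_text: str,
                         entity_context: list[dict],
                         mapping: dict[str, str]) -> None:
    """Fail if any raw value from the trusted-zone mapping would leave it."""
    payload = outbound_text + json.dumps(entity_context)
    for raw_value in mapping.values():
        if raw_value in payload:
            raise ValueError("trust boundary violation: raw value in outbound payload")

mapping = {"{{person:p_001}}": "Julia Schneider",
           "{{email:e_001}}": "julia@firma.de"}
safe_text = "Contact {{person:p_001}} at {{email:e_001}}"
ctx = [{"token": "{{person:p_001}}", "type": "person", "gender": "female"}]
assert_boundary_safe(safe_text, ctx, mapping)  # passes: only tokens cross
```

A check of this shape is cheap enough to run on every boundary crossing, not just in tests.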
What crosses the boundary
Outbound (to LLM):
- Tokenized text: "Contact `{{person:p_001}}` at `{{email:e_001}}`"
- Entity context (safe metadata only): `[{token: "{{person:p_001}}", type: "person", gender: "female"}]`
Inbound (from LLM):
- LLM output with tokens:
"Dear `{{person:p_001}}`, thank you for..."
Never crosses:
- Raw values: "Julia Schneider", "julia@firma.de"
- Token-to-value mappings
- Encryption keys
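Rehydration happens entirely inside the trusted zone, and it need not restore full values: the policy selects a restore mode per entity (step 10 of the pipeline). A sketch of what three of the modes might look like for an email; this is illustrative, not the runtime's exact masking rules:

```python
def restore(value: str, mode: str) -> str:
    """Apply a restore mode to a raw value before output leaves the trusted zone."""
    if mode == "full":
        return value
    if mode == "masked":
        return "*" * len(value)
    if mode == "partial":
        # Keep the first character and the domain of an email, mask the rest.
        local, _, domain = value.partition("@")
        return local[0] + "***@" + domain if domain else value[0] + "***"
    if mode == "none":
        return ""  # token left unresolved upstream; nothing restored
    raise ValueError(f"unknown restore mode: {mode}")

print(restore("julia@firma.de", "partial"))  # j***@firma.de
```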
Crate Dependency Graph
oguardai-cli / oguardai-server
|
+-- guardai-core
| +-- guardai-api-types
|
+-- guardai-transformer
| +-- guardai-tokenizer
| +-- guardai-detector-builtins
| +-- guardai-detector-client
| +-- guardai-policy
| +-- guardai-provider-strategy
|
+-- guardai-rehydrate
| +-- guardai-token-robustness
| +-- guardai-output-guard
| +-- guardai-session
|
+-- guardai-auth
+-- guardai-prompt-security
+-- guardai-streaming
+-- guardai-large-text
Session Model
Sessions bind token-to-value mappings for the duration of a transform-rehydrate cycle.
Default: Sealed sessions -- AES-256-GCM encrypted blobs returned to the client. No server-side state required. The client is the sole custodian of the encrypted session blob.
Other backends: in-memory (dev/test), Redis (multi-instance), managed/persistent (SQLite/Postgres).
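The sealed-session design can be sketched as follows. The real runtime seals with AES-256-GCM (confidentiality plus integrity); Python's standard library has no AES-GCM, so this sketch demonstrates only the shape of the design — an opaque, tamper-evident blob handed to the client with no server-side state — using HMAC-SHA256 as a stand-in for integrity. It does not provide confidentiality:

```python
import base64
import hashlib
import hmac
import json
import secrets

# Stand-in for AES-256-GCM sealing: HMAC-signed blob, integrity only.
KEY = secrets.token_bytes(32)  # held only inside the trusted zone

def seal(mapping: dict[str, str]) -> str:
    """Serialize the token mapping into an opaque blob for the client."""
    body = json.dumps(mapping, sort_keys=True).encode()
    tag = hmac.new(KEY, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + body).decode()

def unseal(blob: str) -> dict[str, str]:
    """Verify and reopen a blob the client sent back for rehydration."""
    raw = base64.urlsafe_b64decode(blob)
    tag, body = raw[:32], raw[32:]
    if not hmac.compare_digest(tag, hmac.new(KEY, body, hashlib.sha256).digest()):
        raise ValueError("session blob was tampered with")
    return json.loads(body)

blob = seal({"{{person:p_001}}": "Julia Schneider"})
assert unseal(blob) == {"{{person:p_001}}": "Julia Schneider"}
```

Because the server only needs the key, not the blob, any runtime instance holding the key can rehydrate — which is what makes the memory/Redis backends optional rather than required.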
Performance Targets
| Operation | Built-in Detectors (regex) | With Python NER |
|---|---|---|
| Transform (short text) | 1-5ms | 15-50ms |
| Transform (large doc, chunked) | 10-50ms | 50-200ms |
| Rehydrate | 1-3ms | 1-3ms |
| Sealed session encrypt/decrypt | <1ms | <1ms |
| Token repair (3-stage) | <1ms | <1ms |
OGuardAI adds minimal overhead compared to the LLM call itself (typically 500-5000ms).
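For context on the chunked-transform row above: guardai-large-text splits at paragraph boundaries so each chunk can be detected and transformed independently. A greedy packing sketch; the chunk-size limit and packing policy here are assumptions, not the crate's actual parameters:

```python
def chunk_paragraphs(text: str, max_len: int = 2000) -> list[str]:
    """Split a document at paragraph boundaries, packing paragraphs
    greedily up to roughly max_len characters per chunk."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        if current and size + len(para) > max_len:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += len(para) + 2  # account for the paragraph separator
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries rather than fixed byte offsets keeps entity spans intact, so no detector ever sees half a name.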