Data Retention & Privacy
Complete data lifecycle, deletion procedures, and retention policies for OGuardAI deployments
Data Residency
OGuardAI is self-hosted. All data processing happens within your infrastructure. No data is sent to external services (except the configured LLM provider, which receives only tokenized text).
Session Data Lifecycle
| Backend | Data Location | Retention | Deletion |
|---|---|---|---|
| Sealed (default) | Client-held encrypted blob | Expires after TTL (default: 1 hour) | Automatic -- blob becomes invalid after TTL |
| Memory | Server process memory | Until restart or TTL | Automatic on read (TTL check) |
| Redis | Redis server | TTL-based expiry (SET EX) | Automatic by Redis |
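The "Automatic on read (TTL check)" behaviour of the memory backend can be pictured with a minimal sketch. This is an illustration under stated assumptions, not OGuardAI's implementation: the class name `MemorySessionStore`, its methods, and the injectable clock are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class MemorySessionStore:
    """Hypothetical in-memory session store with lazy TTL eviction on read."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, session_id: str, mapping: Any) -> None:
        # Store the mapping together with its absolute expiry time.
        self._entries[session_id] = (self._clock() + self._ttl, mapping)

    def get(self, session_id: str) -> Optional[Any]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        expires_at, mapping = entry
        if self._clock() >= expires_at:
            # Expired: evict on read so the mapping is no longer held in memory.
            del self._entries[session_id]
            return None
        return mapping

    def remove(self, session_id: str) -> bool:
        # Explicit deletion; returns True if an entry was removed.
        return self._entries.pop(session_id, None) is not None
```

With eviction on read only, an expired entry that is never read again stays in process memory until restart, which is consistent with the "Until restart or TTL" retention listed for the memory backend.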
Complete Data Lifecycle
Where PII Exists at Each Stage
| Stage | PII Location | Controlled By | Deletion Mechanism |
|---|---|---|---|
| HTTP request in flight | Server memory (ephemeral) | OGuardAI runtime | Freed after response returns |
| Transform processing | Server memory (ephemeral) | OGuardAI runtime | Freed after response returns |
| Sealed session blob | Client application | Client | Client discards blob; TTL auto-expires |
| Memory session | Server RAM | Server process | Auto-evicted on TTL check or process restart |
| Redis session | Redis server | Redis TTL | Automatic TTL expiry or explicit DEL command |
| RAG tokenized chunks | External vector store | Application | Application deletes from vector DB |
| Audit events | Log sink (stdout/file/SIEM) | Log infrastructure | Log rotation/retention policy |
| OCR temp files | /tmp directory | OS + RAII cleanup | Deleted on drop (even on error) |
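The "deleted on drop (even on error)" guarantee for OCR temp files is a scope-bound cleanup pattern. A rough Python analogue, assuming a hypothetical helper `with_ocr_temp_file` and a caller-supplied `process` function, uses a context manager in place of RAII:

```python
import os
import tempfile
from typing import Callable


def with_ocr_temp_file(image_bytes: bytes, process: Callable[[str], str]) -> str:
    """Write bytes to a temp file, run `process` on its path, always clean up.

    Hypothetical helper: the file name and the `process` callback are
    illustrative and not part of OGuardAI.
    """
    # TemporaryDirectory removes the directory and its contents when the
    # with-block exits, whether normally or through an exception.
    with tempfile.TemporaryDirectory(prefix="ocr-") as workdir:
        path = os.path.join(workdir, "page.bin")
        with open(path, "wb") as handle:
            handle.write(image_bytes)
        return process(path)
```

If `process` raises, the exception still propagates to the caller, but the temp file is gone by the time it does.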
"Forget User X" -- Complete Procedure
Sealed Sessions (Default -- Stateless)
The server retains ZERO session data after the HTTP response.
- If TTL has passed: Data is already gone. Sealed blobs become cryptographically invalid after TTL.
- If TTL has NOT passed: Rotate the session encryption key in oguardai.yaml. ALL existing sealed session blobs become undecryptable instantly.
- Per-tenant purge: Rotate only that tenant's key:

      tenants:
        target_tenant:
          session_secret: "new-rotated-secret-for-this-tenant"
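The two invalidation paths for sealed blobs (TTL expiry and key rotation) can be illustrated with a small standard-library sketch. It is a stand-in, not the real format: it authenticates the blob with HMAC-SHA-256 instead of encrypting it with AES-256-GCM, so the payload is readable and must never carry real PII. The function names `seal` and `unseal` and the blob layout are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def seal(payload: dict, secret: bytes, ttl_seconds: float,
         now: Optional[float] = None) -> str:
    """Return a client-held blob: 32-byte HMAC tag followed by a JSON body."""
    issued_at = time.time() if now is None else now
    body = json.dumps({"exp": issued_at + ttl_seconds, "data": payload}).encode("utf-8")
    tag = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + body).decode("ascii")


def unseal(blob: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload, or None if the key does not match or the TTL passed."""
    raw = base64.urlsafe_b64decode(blob.encode("ascii"))
    tag, body = raw[:32], raw[32:]
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # rotated key or tampered blob
    record = json.loads(body)
    current = time.time() if now is None else now
    if current >= record["exp"]:
        return None  # TTL has passed
    return record["data"]
```

A blob sealed under the old secret stops verifying as soon as the server holds only the new one, so no per-blob bookkeeping or data migration is involved.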
Redis Sessions
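- Delete by session ID:

      DEL guardai:session:<session_id>

- Delete all for tenant:

      redis-cli KEYS "guardai:session:*" | xargs redis-cli DEL

  KEYS blocks the Redis server while it walks the whole keyspace, so use SCAN in production. As written, the pattern matches every session key, not only one tenant's.
- Wait for TTL: sessions auto-expire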
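A SCAN-based variant of the bulk delete might look like the sketch below. The function name `delete_sessions` is hypothetical, and `client` is assumed to behave like a redis-py client, meaning it offers `scan_iter(match=..., count=...)` and `delete(*keys)`. The key pattern is the one used in the commands above.

```python
from typing import Any, List


def delete_sessions(client: Any, pattern: str = "guardai:session:*",
                    batch_size: int = 500) -> int:
    """Delete keys matching `pattern` incrementally; return the number deleted.

    Assumes a redis-py style client with scan_iter() and delete().
    """
    deleted = 0
    batch: List[Any] = []
    # scan_iter walks the keyspace with SCAN, so the server is never blocked
    # for one long pass the way it is with KEYS.
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    if batch:
        deleted += client.delete(*batch)  # flush the final partial batch
    return deleted
```

With redis-py installed, this would be called as `delete_sessions(redis.Redis(...))`; connection details depend on your deployment.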
RAG Data
- Call POST /v1/rag/delete with the ingest session_state
- OGuardAI invalidates the session -- tokens become unrehydratable
- Application MUST delete corresponding chunks from its vector store
- OGuardAI's design ensures that tokenized text in the vector store contains no raw PII
Emergency Purge
Rotate session.secret in configuration. Deploy. ALL existing sealed session blobs become invalid instantly. No data migration needed -- the old blobs simply fail to decrypt.
Compliance Certification
OGuardAI's sealed session architecture provides data minimization by design:
- Server retains NO session data after HTTP response (sealed mode)
- PII is encrypted at rest within client-held session blobs (AES-256-GCM)
- Sessions auto-expire after configurable TTL (default: 1 hour)
- Key rotation enables instant bulk invalidation
- Tokenized text in external stores (vector DBs, logs) contains zero raw PII
- Audit events contain entity types and counts only -- never raw values
Right to Be Forgotten (GDPR Art. 17)
OGuardAI's sealed session model inherently supports data minimization:
- Server retains NO session data after response (sealed mode)
- Sessions auto-expire after configurable TTL
- No persistent storage of raw PII values
- To "forget" an entity: simply let the session expire
For Redis backend: run DEL on the session key, or wait for TTL expiry.
Entity Revocation (Restore-Time Suppression)
OGuardAI provides an entity revocation mechanism that prevents future rehydration of specific values.
What it IS:
- A restore-time suppression list
- Prevents revoked values from appearing in any future rehydrate response
- Stores only HMAC-SHA-256 digests (no raw PII in the revocation table)
- Type-scoped: revocation is per (entity_type, value) pair
- Case-insensitive via normalization
What it is NOT:
- Not a universal deletion of all historical data
- Not cryptographic erasure of sealed session blobs
- Not deletion from external systems (vector stores, logs)
- Existing sealed session blobs are unaffected (they just won't restore revoked values)
Use cases:
- GDPR Art. 17 "right to be forgotten" -- prevent future exposure
- Post-incident containment -- suppress leaked values immediately
- User deletion requests -- revoked values resolve to [DELETED]
- Emergency do-not-restore rules
How it works:
- Call revocation_table.revoke("email", "julia@firma.de")
- HMAC-SHA-256 digest stored (32 bytes, no raw PII)
- During rehydrate_with_revocation(), the token's value is checked against the table
- If revoked: resolves to [DELETED] instead of original value
- If not revoked: normal restore behavior
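The steps above can be sketched as follows. The names `revoke` and `rehydrate_with_revocation` come from this document, but their signatures here are guesses. The `RevocationTable` class, the strip-and-casefold normalization, the `<EMAIL_1>` token format, and the mapping layout are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Dict, Set, Tuple


class RevocationTable:
    """Hypothetical suppression list that stores only HMAC-SHA-256 digests."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._digests: Set[bytes] = set()

    def _digest(self, entity_type: str, value: str) -> bytes:
        # Normalize so lookups are case-insensitive, and include the entity
        # type so revocation is scoped per (entity_type, value) pair.
        normalized = value.strip().casefold()
        message = entity_type.encode("utf-8") + b"\x00" + normalized.encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).digest()  # 32 bytes

    def revoke(self, entity_type: str, value: str) -> None:
        self._digests.add(self._digest(entity_type, value))

    def is_revoked(self, entity_type: str, value: str) -> bool:
        return self._digest(entity_type, value) in self._digests


def rehydrate_with_revocation(text: str,
                              mapping: Dict[str, Tuple[str, str]],
                              table: RevocationTable) -> str:
    """Replace each token with its value, or [DELETED] if the value is revoked."""
    for token, (entity_type, value) in mapping.items():
        replacement = "[DELETED]" if table.is_revoked(entity_type, value) else value
        text = text.replace(token, replacement)
    return text
```

Because the table holds keyed digests rather than values, it cannot be read back into the revoked email addresses or names without the HMAC key and a candidate value to test.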
Complete "Forget User" Procedure
What OGuardAI Controls
| Data Location | OGuardAI Controls? | Deletion Mechanism |
|---|---|---|
| In-flight request memory | Yes | Freed after response |
| Sealed session blob | No (client-held) | TTL expiry or key rotation |
| Memory session backend | Yes | backend.remove() or restart |
| Redis session backend | Yes | TTL or explicit DEL |
| Revocation table | Yes | POST /v1/revoke -- prevents future restore |
| RAG session | Yes | POST /v1/rag/delete |
| Prometheus metrics | No (counters only) | No PII in metrics |
| Audit log events | No (tracing sink) | Log rotation policy |
| Vector store chunks | No (application-owned) | Application deletes from vector DB |
Step-by-Step: Forget User "julia@firma.de"
- Revoke the entity (prevents future restoration):

      curl -X POST http://localhost:3000/v1/revoke \
        -H "Content-Type: application/json" \
        -d '{"entity_type":"email","value":"julia@firma.de"}'

- Revoke related entities (name, phone, etc.):

      curl -X POST http://localhost:3000/v1/revoke/bulk \
        -H "Content-Type: application/json" \
        -d '{"entities":[
              {"entity_type":"email","value":"julia@firma.de"},
              {"entity_type":"person","value":"Julia Schneider"},
              {"entity_type":"phone","value":"+49 30 12345678"}
            ]}'

- Delete known RAG sessions (if applicable):

      curl -X POST http://localhost:3000/v1/rag/delete \
        -H "Content-Type: application/json" \
        -d '{"session_state":"<ingest_session_blob>"}'

- Application: delete vector store chunks containing this user's data.
- For immediate invalidation of ALL sealed sessions: rotate the session key:

      session:
        secret: "new-rotated-key-after-user-deletion"
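For applications that script this procedure, the HTTP calls can be prepared programmatically. The sketch below only builds the requests and does not send them. The endpoint paths and JSON bodies mirror the curl commands above, while the function name `build_forget_requests` and its parameters are hypothetical.

```python
import json
import urllib.request
from typing import Dict, List, Optional


def build_forget_requests(base_url: str,
                          entities: List[Dict[str, str]],
                          rag_session_states: Optional[List[str]] = None
                          ) -> List[urllib.request.Request]:
    """Build the revoke and RAG-delete requests for one user, in order."""

    def post(path: str, payload: dict) -> urllib.request.Request:
        return urllib.request.Request(
            url=base_url.rstrip("/") + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    # One bulk revocation covers the email, name, phone, and so on.
    requests = [post("/v1/revoke/bulk", {"entities": entities})]
    # One delete call per known RAG ingest session blob.
    for state in rag_session_states or []:
        requests.append(post("/v1/rag/delete", {"session_state": state}))
    return requests
```

Each request can then be sent with `urllib.request.urlopen(request)`. Response formats are not described in this document, so the sketch does not parse them. Vector store cleanup and key rotation remain separate steps.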
What This Guarantees
- Future rehydration: Revoked values return [DELETED]
- Past sealed blobs: Invalid after key rotation
- Redis/memory sessions: Deleted or expired
- Vector store: Application responsibility (guided)
- Logs: Subject to log retention policy
What This Does NOT Guarantee
- External systems that received tokenized text still have the tokens
- Sealed blobs held by clients remain valid until TTL or key rotation
- Vector store cleanup depends on the application, not OGuardAI
What OGuardAI Stores
During a Request (Transient)
- Raw PII values exist only in server memory during active processing
- After the response is sent, raw values are discarded (sealed mode) or held in the configured session backend
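- OGuardAI does not persist raw PII to disk in any mode. The exceptions to "never on disk" are short-lived OCR temp files in /tmp, which are deleted on drop, and Redis sessions, whose on-disk durability (RDB/AOF) depends on your Redis configuration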
In Session State (Sealed Mode)
- An AES-256-GCM encrypted blob containing the token-to-value mapping
- The blob is returned to the client; the server retains nothing
- The blob is tamper-proof and expires after the configured TTL
- Without the server's encryption key, the blob is unreadable
In Logs (Tracing)
- Entity types and counts (e.g., "detected 3 email, 1 phone")
- Policy names applied
- Operation durations
- Tenant IDs and session IDs
Logs NEVER contain:
- Raw PII values
- Email addresses, names, phone numbers
- Entity text content
- Session encryption keys
Audit Events
Audit events contain:
- Timestamps
- Tenant IDs
- Entity types and counts
- Policy names
- Operation durations
Audit events NEVER contain:
- Raw PII values
- Entity text
- Email addresses, names, phone numbers
- Session encryption keys
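One way to keep raw values out of audit events is to reduce detections to per-type counts before anything reaches the log sink. The following is a minimal sketch with hypothetical names: `build_audit_event` and its field names are not OGuardAI's actual schema.

```python
import time
from collections import Counter
from typing import Iterable, Optional, Tuple


def build_audit_event(tenant_id: str,
                      policy: str,
                      detections: Iterable[Tuple[str, str]],
                      duration_ms: float,
                      timestamp: Optional[float] = None) -> dict:
    """Build an audit record from (entity_type, raw_value) detections.

    Only the entity types are counted; the raw values are never copied
    into the returned event.
    """
    counts = Counter(entity_type for entity_type, _raw_value in detections)
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "tenant_id": tenant_id,
        "policy": policy,
        "entity_counts": dict(counts),
        "duration_ms": duration_ms,
    }
```

Building the event from counts alone means the log sink never sees entity text, whatever format or destination the tracing subscriber uses.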
Log Retention
Configure your tracing subscriber to:
- Rotate logs (e.g., daily)
- Purge logs after retention period
- Ship to SIEM for long-term storage
Recommended Configuration
    # Configure audit logging via environment variable
    RUST_LOG=guardai::audit=info

Data Flow Summary
Compliance Recommendations
- Use sealed sessions in production (default). This ensures zero server-side PII retention.
- Configure TTL appropriately: shorter TTLs reduce exposure window.
- Rotate encryption keys periodically (the session.secret config value).
- Ship audit logs to a dedicated SIEM system for long-term retention and analysis.
- Enable TLS for all network communication (server, Redis, LLM provider).