Data Retention & Privacy

Complete data lifecycle, deletion procedures, and retention policies for OGuardAI deployments

Data Residency

OGuardAI is self-hosted. All data processing happens within your infrastructure. No data is sent to external services (except the configured LLM provider, which receives only tokenized text).

Session Data Lifecycle

| Backend | Data Location | Retention | Deletion |
| --- | --- | --- | --- |
| Sealed (default) | Client-held encrypted blob | Expires after TTL (default: 1 hour) | Automatic -- blob becomes invalid after TTL |
| Memory | Server process memory | Until restart or TTL | Automatic on read (TTL check) |
| Redis | Redis server | TTL-based expiry (SET EX) | Automatic by Redis |
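
The "automatic on read (TTL check)" behavior of the memory backend can be illustrated with a minimal Python sketch. The class name `MemorySessionStore` and its methods are hypothetical and chosen for illustration only; they are not OGuardAI's actual API. The sketch assumes lazy eviction: an expired entry is dropped the next time it is read.

```python
import time
from typing import Dict, Optional, Tuple


class MemorySessionStore:
    """Hypothetical in-memory session store with lazy TTL eviction."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, session_id: str, mapping: dict) -> None:
        # Store the absolute expiry time next to the token-to-value mapping.
        self._entries[session_id] = (self._clock() + self._ttl, mapping)

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        expires_at, mapping = entry
        if self._clock() >= expires_at:
            # TTL check on read: the expired entry is evicted here.
            del self._entries[session_id]
            return None
        return mapping

    def remove(self, session_id: str) -> bool:
        # Explicit deletion; returns True if an entry was removed.
        return self._entries.pop(session_id, None) is not None
```

Because eviction in this sketch only happens on read, an entry that is never read again stays in memory until the process restarts, which matches the "until restart or TTL" retention listed for the memory backend.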

Complete Data Lifecycle

Where PII Exists at Each Stage

| Stage | PII Location | Controlled By | Deletion Mechanism |
| --- | --- | --- | --- |
| HTTP request in flight | Server memory (ephemeral) | OGuardAI runtime | Freed after response returns |
| Transform processing | Server memory (ephemeral) | OGuardAI runtime | Freed after response returns |
| Sealed session blob | Client application | Client | Client discards blob; TTL auto-expires |
| Memory session | Server RAM | Server process | Auto-evicted on TTL check or process restart |
| Redis session | Redis server | Redis TTL | Automatic TTL expiry or explicit DEL command |
| RAG tokenized chunks | External vector store | Application | Application deletes from vector DB |
| Audit events | Log sink (stdout/file/SIEM) | Log infrastructure | Log rotation/retention policy |
| OCR temp files | /tmp directory | OS + RAII cleanup | Deleted on drop (even on error) |
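
The "deleted on drop (even on error)" guarantee for OCR temp files has a close analogue in Python's context managers. The sketch below is not OGuardAI's implementation; `process_scan` and the `ocr` callback are hypothetical names used only to show scope-bound cleanup that also runs when processing fails.

```python
import os
import tempfile


def process_scan(image_bytes: bytes, ocr) -> str:
    """Write the image to a temp file, run `ocr` on it, and always clean up."""
    with tempfile.TemporaryDirectory(prefix="ocr-") as workdir:
        path = os.path.join(workdir, "page.png")
        with open(path, "wb") as handle:
            handle.write(image_bytes)
        # If ocr() raises, the context manager still removes workdir
        # and everything inside it before the exception propagates.
        return ocr(path)
```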

"Forget User X" -- Complete Procedure

Sealed Sessions (Default -- Stateless)

The server retains ZERO session data after the HTTP response.

  1. If TTL has passed: Data is already gone. Sealed blobs become cryptographically invalid after TTL.
  2. If TTL has NOT passed: Rotate the session encryption key in oguardai.yaml. ALL existing sealed session blobs become undecryptable instantly.
  3. Per-tenant purge: Rotate only that tenant's key:
    tenants:
      target_tenant:
        session_secret: "new-rotated-secret-for-this-tenant"
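
Why TTL expiry and key rotation both invalidate a sealed blob can be shown with a small Python sketch. This is a simplified stand-in, not OGuardAI's format: OGuardAI seals blobs with AES-256-GCM, while the sketch uses only HMAC-SHA-256 authentication from the standard library and does not encrypt the payload. The function names `seal` and `unseal` and the blob layout are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def seal(mapping: dict, secret: bytes, ttl_seconds: int,
         now: Optional[float] = None) -> str:
    """Return an authenticated, expiring blob (NOT encrypted in this sketch)."""
    issued = time.time() if now is None else now
    payload = json.dumps({"exp": issued + ttl_seconds, "map": mapping}).encode()
    tag = hmac.new(secret, payload, hashlib.sha256).digest()  # 32-byte tag
    return base64.urlsafe_b64encode(tag + payload).decode()


def unseal(blob: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the mapping, or None if the blob is modified, expired, or from an old key."""
    raw = base64.urlsafe_b64decode(blob.encode())
    tag, payload = raw[:32], raw[32:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # wrong key (for example after rotation) or modified blob
    data = json.loads(payload)
    current = time.time() if now is None else now
    if current >= data["exp"]:
        return None  # TTL has passed
    return data["map"]
```

A blob sealed under the old secret fails verification as soon as the server switches to a new secret, so no per-blob bookkeeping is needed to invalidate it.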

Redis Sessions

  1. Delete by session ID: DEL guardai:session:<session_id>
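  2. Delete all for tenant: redis-cli KEYS "guardai:session:*" | xargs redis-cli DEL. Note that this pattern matches every key under guardai:session:, so narrow it if several tenants share one Redis instance. In production use SCAN instead, for example redis-cli --scan --pattern "guardai:session:*" | xargs redis-cli DEL, because KEYS blocks the server while it walks the whole keyspace.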
  3. Wait for TTL: sessions auto-expire
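
A SCAN-based bulk delete could look like the following Python sketch. It assumes a client object that behaves like redis-py's `Redis` class (its `scan_iter` and `delete` methods); the function name `delete_sessions` and the batch size are hypothetical. The default pattern is the key prefix shown above.

```python
from typing import List


def delete_sessions(client, pattern: str = "guardai:session:*",
                    batch_size: int = 500) -> int:
    """Delete keys matching `pattern` using SCAN instead of KEYS.

    `client` is expected to behave like a redis-py `Redis` object.
    Returns the number of keys the server reported as deleted.
    """
    deleted = 0
    batch: List = []
    # scan_iter walks the keyspace incrementally, so the server is not
    # blocked for the duration of the whole traversal.
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch = []
    if batch:
        deleted += client.delete(*batch)
    return deleted
```

With redis-py installed, this would be called as `delete_sessions(redis.Redis(host="localhost"))`; the connection parameters are deployment-specific.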

RAG Data

  1. Call POST /v1/rag/delete with the ingest session_state
  2. OGuardAI invalidates the session -- tokens become unrehydratable
  3. Application MUST delete corresponding chunks from its vector store
  4. OGuardAI's design ensures that tokenized text in the vector store contains no raw PII

Emergency Purge

Rotate session.secret in configuration. Deploy. ALL existing sealed session blobs become invalid instantly. No data migration needed -- the old blobs simply fail to decrypt.

Compliance Certification

OGuardAI's sealed session architecture provides data minimization by design:

  • Server retains NO session data after HTTP response (sealed mode)
  • PII is encrypted at rest within client-held session blobs (AES-256-GCM)
  • Sessions auto-expire after configurable TTL (default: 1 hour)
  • Key rotation enables instant bulk invalidation
  • Tokenized text in external stores (vector DBs, logs) contains zero raw PII
  • Audit events contain entity types and counts only -- never raw values

Right to Be Forgotten (GDPR Art. 17)

OGuardAI's sealed session model inherently supports data minimization:

  • Server retains NO session data after response (sealed mode)
  • Sessions auto-expire after configurable TTL
  • No persistent storage of raw PII values
  • To "forget" an entity: simply let the session expire

For the Redis backend: run DEL on the session key, or wait for TTL expiry.

Entity Revocation (Restore-Time Suppression)

OGuardAI provides an entity revocation mechanism that prevents future rehydration of specific values.

What it IS:

  • A restore-time suppression list
  • Prevents revoked values from appearing in any future rehydrate response
  • Stores only HMAC-SHA-256 digests (no raw PII in the revocation table)
  • Type-scoped: revocation is per (entity_type, value) pair
  • Case-insensitive via normalization

What it is NOT:

  • Not a universal deletion of all historical data
  • Not cryptographic erasure of sealed session blobs
  • Not deletion from external systems (vector stores, logs)
  • Existing sealed session blobs are unaffected (they just won't restore revoked values)

Use cases:

  • GDPR Art. 17 "right to be forgotten" -- prevent future exposure
  • Post-incident containment -- suppress leaked values immediately
  • User deletion requests -- revoked values resolve to [DELETED]
  • Emergency do-not-restore rules

How it works:

  1. Call revocation_table.revoke("email", "julia@firma.de")
  2. HMAC-SHA-256 digest stored (32 bytes, no raw PII)
  3. During rehydrate_with_revocation(), the token's value is checked against the table
  4. If revoked: resolves to [DELETED] instead of original value
  5. If not revoked: normal restore behavior
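
The steps above can be sketched in Python. This is an illustration under stated assumptions, not OGuardAI's code: the class layout, the normalization rule (trim plus case-fold), and the way the entity type is mixed into the digest are all hypothetical. Only the use of HMAC-SHA-256, the type scoping, the case-insensitivity, and the [DELETED] marker come from the description above.

```python
import hashlib
import hmac
from typing import Set

DELETED_MARKER = "[DELETED]"


class RevocationTable:
    """Restore-time suppression list that stores only keyed digests."""

    def __init__(self, key: bytes):
        self._key = key
        self._digests: Set[bytes] = set()

    def _digest(self, entity_type: str, value: str) -> bytes:
        # Assumed normalization: trim whitespace and case-fold the value.
        normalized = value.strip().casefold()
        # A NUL separator keeps ("a", "bc") distinct from ("ab", "c").
        message = f"{entity_type}\x00{normalized}".encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def revoke(self, entity_type: str, value: str) -> None:
        # Only the 32-byte digest is kept; the raw value is not stored.
        self._digests.add(self._digest(entity_type, value))

    def is_revoked(self, entity_type: str, value: str) -> bool:
        return self._digest(entity_type, value) in self._digests

    def resolve(self, entity_type: str, value: str) -> str:
        # Used at rehydrate time: revoked values are replaced by the marker.
        return DELETED_MARKER if self.is_revoked(entity_type, value) else value
```

Using a keyed HMAC rather than a plain hash means that someone who obtains the table cannot confirm a guessed email address without also holding the key.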

Complete "Forget User" Procedure

What OGuardAI Controls

| Data Location | OGuardAI Controls? | Deletion Mechanism |
| --- | --- | --- |
| In-flight request memory | Yes | Freed after response |
| Sealed session blob | No (client-held) | TTL expiry or key rotation |
| Memory session backend | Yes | backend.remove() or restart |
| Redis session backend | Yes | TTL or explicit DELETE |
| Revocation table | Yes | POST /v1/revoke -- prevents future restore |
| RAG session | Yes | POST /v1/rag/delete |
| Prometheus metrics | No (counters only) | No PII in metrics |
| Audit log events | No (tracing sink) | Log rotation policy |
| Vector store chunks | No (application-owned) | Application deletes from vector DB |

Step-by-Step: Forget User "julia@firma.de"

  1. Revoke the entity (prevents future restoration):

    curl -X POST http://localhost:3000/v1/revoke \
      -H "Content-Type: application/json" \
      -d '{"entity_type":"email","value":"julia@firma.de"}'
  2. Revoke related entities (name, phone, etc.):

    curl -X POST http://localhost:3000/v1/revoke/bulk \
      -H "Content-Type: application/json" \
      -d '{"entities":[
        {"entity_type":"email","value":"julia@firma.de"},
        {"entity_type":"person","value":"Julia Schneider"},
        {"entity_type":"phone","value":"+49 30 12345678"}
      ]}'
  3. Delete known RAG sessions (if applicable):

    curl -X POST http://localhost:3000/v1/rag/delete \
      -H "Content-Type: application/json" \
      -d '{"session_state":"<ingest_session_blob>"}'
  4. Application: delete vector store chunks containing this user's data

  5. For immediate invalidation of ALL sealed sessions: Rotate the session key

    session:
      secret: "new-rotated-key-after-user-deletion"
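
Steps 1 to 3 can also be scripted. The Python sketch below builds the same requests as the curl commands, using the endpoints and JSON bodies shown above. The helper names `json_post` and `forget_user_requests` are hypothetical, and the response format is not described here, so the sketch does not parse responses.

```python
import json
import urllib.request
from typing import Dict, List

BASE_URL = "http://localhost:3000"  # same host and port as the curl examples


def json_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def forget_user_requests(entities: List[Dict[str, str]],
                         rag_blobs: List[str]) -> List[urllib.request.Request]:
    """Return the bulk-revoke and RAG-delete requests for one user, in order."""
    pending = [json_post("/v1/revoke/bulk", {"entities": entities})]
    for blob in rag_blobs:
        pending.append(json_post("/v1/rag/delete", {"session_state": blob}))
    return pending
```

Each returned request can be sent with `urllib.request.urlopen(req)`. Vector store cleanup and key rotation (steps 4 and 5) remain separate actions owned by the application and the operator.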

What This Guarantees

  • Future rehydration: Revoked values return [DELETED]
  • Past sealed blobs: Invalid after key rotation
  • Redis/memory sessions: Deleted or expired
  • Vector store: Application responsibility (guided)
  • Logs: Subject to log retention policy

What This Does NOT Guarantee

  • External systems that received tokenized text still have the tokens
  • Sealed blobs held by clients remain valid until TTL or key rotation
  • Vector store cleanup depends on the application, not OGuardAI

What OGuardAI Stores

During a Request (Transient)

  • Raw PII values exist only in server memory during active processing
  • After the response is sent, raw values are discarded (sealed mode) or held in the configured session backend
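  • OGuardAI does not persist raw PII to disk in any mode. The only on-disk exposure listed in the lifecycle table is OCR temp files in /tmp, which are deleted on drop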

In Session State (Sealed Mode)

  • An AES-256-GCM encrypted blob containing the token-to-value mapping
  • The blob is returned to the client; the server retains nothing
  • The blob is tamper-evident (AES-GCM authenticates the ciphertext, so a modified blob fails to decrypt) and expires after the configured TTL
  • Without the server's encryption key, the blob is unreadable

In Logs (Tracing)

  • Entity types and counts (e.g., "detected 3 email, 1 phone")
  • Policy names applied
  • Operation durations
  • Tenant IDs and session IDs

Logs NEVER contain:

  • Raw PII values
  • Email addresses, names, phone numbers
  • Entity text content
  • Session encryption keys

Audit Events

Audit events contain:

  • Timestamps
  • Tenant IDs
  • Entity types and counts
  • Policy names
  • Operation durations

Audit events NEVER contain:

  • Raw PII values
  • Entity text
  • Email addresses, names, phone numbers
  • Session encryption keys
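
An audit event that carries entity types and counts but no raw values could be assembled as in the Python sketch below. The function name and the field names are hypothetical; OGuardAI's actual event schema is not shown in this document.

```python
import time
from collections import Counter
from typing import Dict, Iterable, Tuple


def build_audit_event(tenant_id: str, policy: str,
                      detections: Iterable[Tuple[str, str]],
                      duration_ms: float) -> Dict[str, object]:
    """Summarize detections as per-type counts; raw values are never copied."""
    # Each detection is (entity_type, raw_value); only the type is read.
    counts = Counter(entity_type for entity_type, _value in detections)
    return {
        "timestamp": time.time(),
        "tenant_id": tenant_id,
        "policy": policy,
        "entity_counts": dict(counts),
        "duration_ms": duration_ms,
    }
```

Reducing detections to counts before the event is constructed keeps raw values out of the logging path entirely, instead of relying on redaction afterwards.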

Log Retention

Configure your tracing subscriber to:

  • Rotate logs (e.g., daily)
  • Purge logs after retention period
  • Ship to SIEM for long-term storage

    # Configure audit logging via environment variable
    RUST_LOG=guardai::audit=info

Compliance Recommendations

  1. Use sealed sessions in production (default). This ensures zero server-side PII retention.
  2. Configure TTL appropriately: shorter TTLs reduce exposure window.
  3. Rotate encryption keys periodically (the session.secret config value).
  4. Ship audit logs to a dedicated SIEM system for long-term retention and analysis.
  5. Enable TLS for all network communication (server, Redis, LLM provider).