Service Level Objectives

Availability, latency, throughput, and security SLOs for OGuardAI deployments

Availability

| Metric | Target | Measurement |
| --- | --- | --- |
| API availability | 99.9% uptime | Health endpoint returns 200 |
| Transform endpoint | 99.9% success rate | Non-5xx responses / total requests |
| Rehydrate endpoint | 99.95% success rate | Higher -- critical path |
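
The success-rate targets above reduce to a ratio of non-5xx responses to total requests. A minimal sketch of that check, assuming the request and 5xx counts come from your own metrics pipeline; the function names and the handling of zero traffic are illustrative assumptions, not part of OGuardAI:

```python
# Targets taken from the availability table.
TRANSFORM_TARGET = 0.999
REHYDRATE_TARGET = 0.9995


def success_rate(total_requests: int, server_errors: int) -> float:
    """Fraction of requests that did not end in a 5xx response."""
    if total_requests == 0:
        # Assumption: a window with no traffic counts as meeting the target.
        return 1.0
    return (total_requests - server_errors) / total_requests


def meets_slo(total_requests: int, server_errors: int, target: float) -> bool:
    """True when the observed success rate is at or above the target."""
    return success_rate(total_requests, server_errors) >= target
```

With 100,000 requests and 60 server errors, the observed rate is 99.94%, which satisfies the transform target but not the stricter rehydrate target.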

Latency

Builtin-Only Mode (detector.mode: builtin)

| Endpoint | p50 | p95 | p99 | Max |
| --- | --- | --- | --- | --- |
| POST /v1/transform | Under 1ms | Under 3ms | Under 5ms | Under 50ms |
| POST /v1/rehydrate | Under 0.1ms | Under 0.5ms | Under 1ms | Under 10ms |
| POST /v1/detect | Under 1ms | Under 3ms | Under 5ms | Under 50ms |
| GET /v1/health | Under 1ms | Under 1ms | Under 2ms | Under 5ms |

Builtin + NER Mode (detector.mode: both)

| Endpoint | p50 | p95 | p99 | Max |
| --- | --- | --- | --- | --- |
| POST /v1/transform | Under 200ms | Under 500ms | Under 1s | Under 5s (NER timeout) |
| POST /v1/rehydrate | Under 0.1ms | Under 0.5ms | Under 1ms | Under 10ms |
| POST /v1/detect | Under 200ms | Under 500ms | Under 1s | Under 5s |
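
Checking a set of measured latencies against these targets can be sketched as follows. This is an illustration only: it uses the nearest-rank percentile definition (an assumption, since the document does not say how percentiles are computed), and the helper names are hypothetical.

```python
import math

# p50/p95/p99/max targets in milliseconds for POST /v1/transform in
# builtin-only mode. Key 100 stands for the "Max" column.
BUILTIN_TRANSFORM_MS = {50: 1.0, 95: 3.0, 99: 5.0, 100: 50.0}


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def violations(samples_ms, targets):
    """Return {percentile: observed_ms} for every target that is not met."""
    out = {}
    for p, limit in targets.items():
        observed = percentile(samples_ms, p)
        if observed >= limit:  # targets read "under X", so equality fails
            out[p] = observed
    return out
```

For example, 98 samples at 0.5 ms plus one at 4 ms and one at 60 ms pass the p50, p95, and p99 targets but breach the 50 ms maximum.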

Throughput

| Mode | Single Instance | Horizontal Scaling |
| --- | --- | --- |
| Builtin-only | >1,000 req/s | Linear with instances |
| Builtin + NER | 20-50 req/s | Limited by NER sidecar |

Payload Limits

| Limit | Default | Configurable |
| --- | --- | --- |
| Max request body | 50 MB | file_upload.max_size_bytes |
| Max batch items | 100 | Hardcoded |
| Max session TTL | 3600s (1 hour) | session.ttl_seconds |
| Max concurrent streams | Unlimited | OS/runtime limits |
| SSE heartbeat interval | 15s | Server-side |

Error Budget

| Error Type | Budget (per 1000 requests) |
| --- | --- |
| 5xx errors | Under 1 (0.1%) |
| Detection false negatives | Under 50 (5%) for regex, under 100 (10%) for NER |
| Detection false positives | Under 30 (3%) for regex, under 150 (15%) for NER |
| Token repair failures | Under 10 (1%) |
| Session expiry (expected) | N/A -- by design |
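
Because the budgets are stated per 1000 requests, the allowance for any window scales linearly with traffic. A rough sketch of that arithmetic, using the regex-detector figures from the table; the dictionary keys and function names are made up for illustration:

```python
# Budgets per 1000 requests (regex detector figures for detection errors).
BUDGET_PER_1000 = {
    "5xx_errors": 1,
    "false_negatives_regex": 50,
    "false_positives_regex": 30,
    "token_repair_failures": 10,
}


def budget_remaining(total_requests, observed, budget_per_1000):
    """Errors still allowed in this window (zero or negative means spent)."""
    allowed = total_requests * budget_per_1000 / 1000
    return allowed - observed


def exhausted(total_requests, observed_by_type):
    """Names of error types whose budget is fully used.

    Budgets read "under N", so hitting the allowance exactly counts as spent.
    """
    return sorted(
        name
        for name, count in observed_by_type.items()
        if budget_remaining(total_requests, count, BUDGET_PER_1000[name]) <= 0
    )
```

Over 50,000 requests the 5xx allowance is 50 errors, so 30 observed errors leave 20 in the budget.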

Degraded Mode SLO

When the NER sidecar is unavailable (detector.mode: both):

| Metric | Guarantee |
| --- | --- |
| Availability | 100% (graceful fallback to builtin) |
| Latency impact | +5s per request (NER timeout) then normal |
| Entity coverage | 15/18 types (person/company/location unavailable) |
| Data safety | Unaffected -- PII protection maintained |
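
The fallback behaviour described above can be pictured as running the builtin detectors unconditionally and treating the NER call as optional. The sketch below is not OGuardAI's implementation: `builtin_detect` and `ner_detect` are hypothetical callables standing in for the two detectors, and only the 5-second timeout comes from the table.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_with_fallback(text, builtin_detect, ner_detect, ner_timeout_s=5.0):
    """Return (entities, degraded).

    Builtin results are always included. NER results are merged in only if
    the NER call finishes within the timeout; otherwise degraded is True.
    """
    entities = list(builtin_detect(text))
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ner_detect, text)
    try:
        entities.extend(future.result(timeout=ner_timeout_s))
        degraded = False
    except Exception:
        # Covers both a timeout and a failing sidecar (e.g. connection error).
        degraded = True
    finally:
        # Do not block the request on a hung NER call.
        pool.shutdown(wait=False)
    return entities, degraded
```

A caller can surface the `degraded` flag in logs or metrics so that reduced entity coverage stays visible while requests continue to succeed.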

Session Security

| Property | Guarantee |
| --- | --- |
| Encryption | AES-256-GCM (AEAD) |
| Key strength | 256-bit minimum |
| Nonce uniqueness | Random 12-byte per seal (cryptographically random) |
| Tamper detection | Authentication tag verified on every unseal |
| Replay protection | Request counter + TTL |
| Cross-tenant isolation | Tenant ID validated during unseal |

Revocation

| Property | Guarantee |
| --- | --- |
| Revocation latency | Immediate (in-memory + file/Redis) |
| Revocation persistence | Survives server restart (file or Redis) |
| Revocation consistency | Eventual (file) or strong (Redis) |
| Future restore suppression | 100% -- revoked values always return [DELETED] |
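
To illustrate restore suppression, the sketch below swaps tokens back to their original values unless the value has been revoked, in which case it emits `[DELETED]`. The token format (`[PERSON_1]`), the in-memory `token_map`, and the `revoked_values` set are assumptions made for the example; the real session and revocation stores are not described here.

```python
import re

DELETED = "[DELETED]"


def rehydrate(text, token_map, revoked_values):
    """Replace each known token with its original value, or [DELETED] if revoked."""
    if not token_map:
        return text
    # Longest tokens first so one token never matches inside a longer one.
    tokens = sorted(token_map, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in tokens))

    def restore(match):
        original = token_map[match.group(0)]
        return DELETED if original in revoked_values else original

    # Single pass: restored text is never re-scanned for tokens.
    return pattern.sub(restore, text)
```

Checking revocation at restore time, rather than when the token is issued, is what lets a value revoked mid-session disappear from every later response.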

Monitoring

Required monitoring for SLO compliance:

# Prometheus alerts (see deploy/prometheus/alerting-rules.yml)
- guardai_transforms_total                # request rate
- guardai_errors_total                    # error rate
- guardai_transform_duration_seconds      # latency
- guardai_rate_limit_rejections_total     # capacity
- guardai_prompt_security_triggers_total  # security