Architecture
Detector Runtime Contract
Detection modes, guarantees, fallback behavior, quality expectations, and how to verify the runtime configuration
Detection Modes
OGuardAI operates in one of three detection modes, configured in oguardai.yaml:
Mode: builtin (Deterministic)
detector:
  mode: builtin
Guarantees:
- Low detection latency: typically sub-millisecond, with p99 < 5ms
- 15 entity types detected via 30+ regex patterns
- Zero external dependencies
- Deterministic -- same input always produces same output
- Horizontally scalable -- no shared state
Entity types available include: email, phone, ssn, iban, credit_card, ip, url, order, customer_id, passport, health_id, date_of_birth, address
NOT available (requires NER): person, company, location
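To illustrate why builtin mode is deterministic and stateless, here is a minimal sketch of regex-based detection. The patterns and the `detect_builtin` function are hypothetical and chosen for illustration; they are not OGuardAI's actual patterns.

```python
import re

# Hypothetical patterns for illustration only; not OGuardAI's real pattern set.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def detect_builtin(text):
    """Return (entity_type, start, end, value) tuples sorted by position.

    The function keeps no state between calls, so the same input
    always yields the same output.
    """
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((entity_type, match.start(), match.end(), match.group()))
    return sorted(found, key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    for hit in detect_builtin("Mail jane@example.com, SSN 123-45-6789"):
        print(hit)
```

Because nothing is shared between calls, any number of instances can run this kind of detection in parallel, which is what makes horizontal scaling straightforward.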
Mode: both (Full Coverage)
detector:
  mode: both
  advanced_url: http://localhost:9090
Guarantees:
- 18 entity types (15 regex + 3 NER-only: person, company, location)
- Person names, company names, locations detected via GLiNER/spaCy
- Graceful fallback: if NER sidecar is unreachable, falls back to builtin-only
- Fallback timeout: 5 seconds (configurable)
Additional entity types: person, company, location
Latency:
- Builtin entities: <5ms
- NER entities: 50-500ms depending on text length and model
Fallback behavior when NER is unavailable:
- Server logs: ner_service_unavailable_falling_back_to_builtin
- Detection continues with builtin regex only
- person/company/location will NOT be detected
- Health endpoint shows: builtin_and_ner (full entity detection), but actual capability degrades
- No crash, no hang -- maximum 5s timeout per request
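The fallback behavior above can be sketched as follows. This is an assumption-laden illustration, not the server's implementation: `detect_with_fallback`, `builtin_detect`, and `ner_detect` are hypothetical names, and the NER client is assumed to raise an `OSError` subclass (connection refused, timeout) when the sidecar is unavailable.

```python
import logging

logger = logging.getLogger("oguardai.sketch")  # hypothetical logger name


def detect_with_fallback(text, builtin_detect, ner_detect, timeout_s=5.0):
    """Run builtin detection, then try NER; degrade to builtin-only on failure.

    Returns (entities, degraded). `ner_detect` is assumed to enforce the
    timeout itself and raise an OSError subclass when the sidecar is
    unreachable or too slow.
    """
    entities = list(builtin_detect(text))
    try:
        entities.extend(ner_detect(text, timeout=timeout_s))
        degraded = False
    except OSError as exc:  # covers ConnectionError and TimeoutError
        # Mirrors the log line named in this document.
        logger.warning("ner_service_unavailable_falling_back_to_builtin: %s", exc)
        degraded = True
    return entities, degraded
```

Returning the `degraded` flag alongside the entities is one way to make the silent loss of person/company/location visible to callers, since the health endpoint may still report full detection.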
Mode: advanced (NER Only)
detector:
  mode: advanced
  advanced_url: http://localhost:9090
All detection via Python sidecar. Regex patterns not used.
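As a sanity check on the three configurations shown so far, a small validator might look like the sketch below. It assumes that `advanced_url` is required whenever the mode involves the sidecar (`both` or `advanced`); the document's examples suggest this but do not state it, and `validate_detector_config` is a hypothetical helper.

```python
VALID_MODES = {"builtin", "both", "advanced"}
# Assumption: these modes need a sidecar URL, as in the examples above.
MODES_NEEDING_SIDECAR = {"both", "advanced"}


def validate_detector_config(detector):
    """Return a list of problems; an empty list means the config looks consistent."""
    problems = []
    mode = detector.get("mode")
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    elif mode in MODES_NEEDING_SIDECAR and not detector.get("advanced_url"):
        problems.append(f"mode {mode!r} needs advanced_url")
    return problems


if __name__ == "__main__":
    print(validate_detector_config({"mode": "both"}))
```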
Detection Mode Decision Tree
How to Verify Runtime Mode
# Check current mode
curl http://localhost:3000/v1/diagnostics | jq .detector_mode
# Check available entity types
curl http://localhost:3000/v1/capabilities | jq '.entity_types[].name'
# Check per-request mode (in transform response)
curl -X POST http://localhost:3000/v1/transform \
  -H "Content-Type: application/json" \
  -d '{"input":"test","input_type":"text"}' | jq .detector_mode
Quality Expectations by Mode
| Entity Type | Builtin | + GLiNER | Notes |
|---|---|---|---|
| Email | Precision >99% | Same | Regex + Unicode support |
| Phone | Precision >95% | Same | International formats |
| SSN | Precision >99% | Same | Strict format matching |
| IBAN | Precision >99% | Same | Country code + structure |
| Credit Card | Precision >99% | Same | Luhn validation |
| Person Name | N/A | Precision ~85-95% | Language-dependent |
| Company Name | N/A | Precision ~80-90% | Context-dependent |
| Location | N/A | Precision ~80-90% | May overlap with Address |
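The table credits credit card precision to Luhn validation. For readers unfamiliar with it, here is a minimal sketch of the standard Luhn checksum; `luhn_valid` is a hypothetical helper, and how OGuardAI combines the check with its regex patterns is not shown in this document.

```python
def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum.

    Spaces and dashes are ignored; any other non-digit makes the input invalid.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111 1111 1111 1111"))
```

A checksum like this rejects most random 16-digit strings that happen to match a card-shaped regex, which is why it raises precision without adding any external dependency.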
Recommendation
- For maximum speed + determinism: Use builtin mode
- For maximum coverage: Use both mode with healthy NER sidecar
- For production safety: Configure health monitoring for NER sidecar availability
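Since the health endpoint may keep reporting full detection while the sidecar is down, monitoring can probe the sidecar address directly. The sketch below only checks TCP reachability of the configured `advanced_url`; `sidecar_reachable` is a hypothetical helper, and it does not assume any particular health route on the sidecar because none is documented here.

```python
import socket
from urllib.parse import urlsplit


def sidecar_reachable(advanced_url, timeout_s=5.0):
    """Return True if a TCP connection to the sidecar host and port succeeds."""
    parts = urlsplit(advanced_url)
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout_s):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    print(sidecar_reachable("http://localhost:9090", timeout_s=1.0))
```

A successful connection shows only that something is listening on the port, not that the NER model is loaded, so a probe like this is a lower bound on health rather than a full check.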