Detector Capabilities

Entity detection capabilities by runtime mode, language support matrix, performance characteristics, and deployment modes

Entity Detection by Runtime Mode

| Entity Type | Builtin (Regex) | + GLiNER | + spaCy |
|---|---|---|---|
| Email | High | High | High |
| Phone | High (intl + DE) | High | High |
| SSN | High | High | High |
| IBAN | High | High | High |
| Credit Card | High (Luhn) | High | High |
| IP Address | High (v4 + v6) | High | High |
| URL | High | High | High |
| Customer ID | High (pattern) | High | High |
| Order Number | High (pattern) | High | High |
| Passport | Medium (pattern) | High | Medium |
| Health ID | Medium (pattern) | High | Medium |
| Date of Birth | High (with context) | High | High |
| German Tax ID | High | High | High |
| German Social Security | High | High | High |
| Person Name | None | Good | Good |
| Company Name | None | Good | Moderate |
| Location | None | Good | Good |
| Address (street) | Medium (DE/US) | Better | Better |
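
The Credit Card row lists a Luhn check alongside the regex match. The builtin detector itself is described as Rust; the Python sketch below only illustrates the general technique of matching a digit pattern and then discarding candidates that fail the Luhn checksum. The pattern, the 13 to 19 digit range, and the function names are assumptions for illustration, not the project's actual rules.

```python
import re

# Hypothetical candidate pattern: 13-19 digits, optionally separated
# by single spaces or hyphens. The real builtin pattern may differ.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the ASCII digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit,
    # subtracting 9 whenever the doubled value exceeds 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return regex candidates in `text` that also pass the Luhn check."""
    return [m.group(0) for m in CARD_CANDIDATE.finditer(text)
            if luhn_valid(m.group(0))]
```

Filtering on the checksum is what lets a pure pattern matcher reject most random digit runs (order numbers, timestamps) that happen to have the right length.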

Language Support

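| Language | Builtin | + GLiNER | + spaCy |
|---|---|---|---|
| English | Full | Full | Full |
| German | Full | Full | Full |
| French | Partial | Full | Full (with model) |
| Spanish | Partial | Full | Full (with model) |
| Italian | Partial | Good | Good |
| Arabic | Basic | Good | Limited |
| Japanese | Basic | Good | Limited |
| Chinese | Basic | Good | Limited |
| Korean | Basic | Good | Limited |
| Russian | Basic | Good | Good |
| Other (20+) | Basic | Good | Varies |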

Performance Characteristics

| Metric | Builtin Only | + GLiNER | + spaCy |
|---|---|---|---|
| Cold start | <1 ms | 5-30 s (model load) | 3-15 s (model load) |
| Idle memory | ~10 MB | ~500 MB (CPU) / ~2 GB (GPU) | ~200 MB |
| Latency p50 | <1 ms | 10-50 ms | 5-20 ms |
| Deterministic | Yes | Mostly | Mostly |
| GPU required | No | Optional (faster) | No |

Deployment Modes

| Mode | Config | What Runs | Best For |
|---|---|---|---|
| builtin | detector.mode: builtin | Rust regex only | Low-latency, structured data |
| advanced | detector.mode: advanced + detector.advanced_url | Python NER only | Maximum accuracy |
| both | detector.mode: both + URL | Rust regex + Python NER merged | Best coverage |
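
As a rough illustration of the table above, the mapping from mode to detectors can be written as a small lookup with validation. This is a sketch, not the server's actual configuration loader: the labels `rust_regex` and `python_ner` and the function name are hypothetical, and it assumes the "URL" required by `both` is the same `detector.advanced_url` setting used by `advanced`.

```python
from typing import Optional, Tuple

# Detectors that run in each mode, following the Deployment Modes table.
# The detector labels are illustrative names, not real identifiers.
DETECTORS_BY_MODE = {
    "builtin": ("rust_regex",),
    "advanced": ("python_ner",),
    "both": ("rust_regex", "python_ner"),
}


def detectors_for(mode: str, advanced_url: Optional[str] = None) -> Tuple[str, ...]:
    """Return the detectors a mode would run, validating the config first."""
    if mode not in DETECTORS_BY_MODE:
        raise ValueError(f"unknown detector.mode: {mode!r}")
    detectors = DETECTORS_BY_MODE[mode]
    # Assumption: any mode that uses the Python NER sidecar needs its URL.
    if "python_ner" in detectors and not advanced_url:
        raise ValueError(f"detector.mode {mode!r} requires detector.advanced_url")
    return detectors
```

For example, `detectors_for("builtin")` needs no URL, while `detectors_for("both")` without one raises a `ValueError` under this sketch's assumption.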

Degraded Mode Behavior

When the Python NER sidecar is unavailable:

  1. Server logs: ner_service_unavailable_falling_back_to_builtin
  2. Detection continues with builtin regex only
  3. Health endpoint reports: detector_status: "degraded" (if configured as both) or "builtin_only"
  4. No crash, no hang, no data loss
  5. Entity types requiring NER (Person, Company, Location) will not be detected
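
The fallback steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: `run_builtin` and `run_ner` are hypothetical callables, the `"ok"` status for the healthy case is invented for the example, and reporting `"builtin_only"` when the configured mode is `advanced` is one reading of the health endpoint description. The NER callable is assumed to enforce its own timeout so that an unreachable sidecar raises instead of hanging.

```python
import logging
from typing import Callable, List, Tuple

log = logging.getLogger("detector")

Entity = Tuple[str, str]  # (entity type, matched text); illustrative shape
Detector = Callable[[str], List[Entity]]


def detect_with_fallback(text: str, mode: str,
                         run_builtin: Detector,
                         run_ner: Detector) -> Tuple[List[Entity], str]:
    """Run detection for `mode`, falling back to builtin if NER is down."""
    entities: List[Entity] = []
    if mode in ("builtin", "both"):
        entities.extend(run_builtin(text))
    if mode == "builtin":
        return entities, "ok"  # hypothetical healthy status
    try:
        # Plain concatenation; any real merge/dedup logic is not modeled.
        entities.extend(run_ner(text))
        return entities, "ok"
    except (ConnectionError, TimeoutError):
        log.warning("ner_service_unavailable_falling_back_to_builtin")
        if mode == "advanced":
            # Builtin has not run yet in this mode, so run it now.
            entities = list(run_builtin(text))
        status = "degraded" if mode == "both" else "builtin_only"
        return entities, status
```

Because the exception is caught and builtin results are always returned, the caller sees reduced coverage (no Person, Company, or Location entities) rather than an error.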