Write declarative YAML policies that control how OGuardAI handles each entity type

Each policy is a YAML document made up of per-entity-type rules, shared defaults, and optional metadata settings. Together these control detection, transformation, and restoration behavior.

Policy YAML Structure

name: "my-policy"
version: "1.0.0"
description: "What this policy does and when to use it"

rules:
  - entity_type: "person"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "email"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "ssn"
    protection_level: 1
    action: "block"
    conditions: []

defaults:
  protection_level: 2
  action: "tokenize"
  restore_mode: "full"

metadata_policy:
  expose_gender: true
  expose_formality: true
  expose_language: true
  expose_role: true

Top-Level Fields

Field	Required	Description
`name`	Yes	Unique policy identifier (referenced in API requests)
`version`	Yes	Semantic version for change tracking
`description`	No	Human-readable purpose description
`rules`	Yes	List of entity-type rules
`defaults`	Yes	Fallback values for entity types without explicit rules
`metadata_policy`	No	Controls which metadata fields are exposed to the LLM
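
The required/optional split in the table can be checked before a policy is used. The following is a minimal sketch, assuming the YAML has already been parsed into a Python dict; `missing_required_fields` is a hypothetical helper and not part of OGuardAI.

```python
REQUIRED_FIELDS = ("name", "version", "rules", "defaults")


def missing_required_fields(policy: dict) -> list[str]:
    """Return the required top-level fields that are absent from the policy."""
    return [field for field in REQUIRED_FIELDS if field not in policy]


# A policy without `defaults` is incomplete; `description` is optional.
incomplete = {"name": "my-policy", "version": "1.0.0", "rules": []}
print(missing_required_fields(incomplete))  # ['defaults']
```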

Entity Rules

Each rule defines how one entity type is handled:

rules:
  - entity_type: "person"        # Which entity type this rule applies to
    protection_level: 2           # 1 = hard mask/block, 2 = reversible tokenization, 3 = semantic abstraction
    action: "tokenize"            # What action to take (block | hard_mask | abstract | tokenize | allow)
    conditions: []                # Optional conditional overrides
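
To illustrate how an explicit rule and the `defaults` block relate, here is a rough sketch of a lookup that falls back to `defaults` when no rule names the entity type. It assumes the policy is a parsed dict and that the first rule matching the entity type wins; `resolve_rule` is a hypothetical name, and the real lookup may differ.

```python
def resolve_rule(policy: dict, entity_type: str) -> dict:
    """Return the explicit rule for entity_type, or one built from defaults."""
    for rule in policy.get("rules", []):
        if rule["entity_type"] == entity_type:
            return rule
    # No explicit rule: fall back to the policy-wide defaults.
    defaults = policy["defaults"]
    return {
        "entity_type": entity_type,
        "protection_level": defaults["protection_level"],
        "action": defaults["action"],
        "conditions": [],
    }


policy = {
    "rules": [
        {"entity_type": "ssn", "protection_level": 1,
         "action": "block", "conditions": []},
    ],
    "defaults": {"protection_level": 2, "action": "tokenize",
                 "restore_mode": "full"},
}

print(resolve_rule(policy, "ssn")["action"])  # block
print(resolve_rule(policy, "url")["action"])  # tokenize
```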

Protection Levels

Level	Applies To	Description
1	`passport`, `iban`, `health_id`, `ssn`	Hard masking or blocking. Raw values never stored in reversible form.
2	`person`, `email`, `phone`, `company`, `customer_id`, `order`, `address`, `ip`, `url`, `custom`	Reversible tokenization. Values can be fully restored during rehydration.
3	Metadata-only attributes	Semantic abstraction. No raw value stored.

Policy Actions

The `action` field in each rule controls what happens to the entity during transformation. Valid values come from the `PolicyAction` enum:

Action	Behavior
`block`	Rejects the request if the entity is detected
`hard_mask`	Replaces the entity with a fixed mask (e.g., `********`). Unlike `block`, the request is not rejected.
`abstract`	Replaces the entity with a category label (e.g., `[IBAN]`)
`tokenize`	Replaces the entity with a semantic token `{{type:id}}` (reversible via rehydration)
`allow`	Passes the entity through unchanged
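
The behaviors in the table can be sketched as a single dispatch function. This is an illustration only: `apply_action`, `RequestBlocked`, and the shape of `token_id` are hypothetical, and the mask and label formats simply follow the examples in the table.

```python
class RequestBlocked(Exception):
    """Raised when an entity with action 'block' is detected (hypothetical name)."""


def apply_action(action: str, entity_type: str, value: str, token_id: str) -> str:
    """Return the replacement text for one detected entity."""
    if action == "block":
        raise RequestBlocked(f"{entity_type} detected; request rejected")
    if action == "hard_mask":
        return "********"  # fixed mask, independent of the value's length
    if action == "abstract":
        return f"[{entity_type.upper()}]"  # category label
    if action == "tokenize":
        return "{{" + f"{entity_type}:{token_id}" + "}}"  # semantic token
    if action == "allow":
        return value
    raise ValueError(f"unknown action: {action}")


print(apply_action("tokenize", "person", "Julia Schneider", "p1"))  # {{person:p1}}
print(apply_action("abstract", "iban", "raw-iban-value", "i1"))     # [IBAN]
```

Raising an exception for `block` keeps the "request rejected" case separate from the four actions that return replacement text.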

Restore Modes

The `restore_mode` field (set in `defaults` or in channel overrides) controls how tokenized entities are restored during rehydration:

Restore Mode	Behavior
`full`	Complete original value restored
`partial`	Deterministic subset (e.g., J. Schneider)
`masked`	Character masking preserving length
`formatted`	Original + contextual formatting (e.g., "Frau Julia Schneider")
`abstract`	Semantic description (e.g., "(female customer)")
`none`	Value removed; `[REDACTED]` is shown in its place
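
As a rough illustration of the six modes, the sketch below restores a single value. It is not OGuardAI's implementation: the `restore` function, the `salutation` and `description` parameters, and the "abbreviate all but the last word" rule for `partial` are assumptions chosen to reproduce the examples in the table.

```python
def restore(mode: str, value: str, salutation: str = "", description: str = "") -> str:
    """Return the restored text for one tokenized value under the given mode."""
    if mode == "full":
        return value
    if mode == "partial":
        # Abbreviate every word except the last (assumed rule for names).
        words = value.split()
        if len(words) < 2:
            return value
        return " ".join([f"{w[0]}." for w in words[:-1]] + [words[-1]])
    if mode == "masked":
        # Replace each non-space character so the overall length is preserved.
        return "".join(ch if ch.isspace() else "*" for ch in value)
    if mode == "formatted":
        return f"{salutation} {value}" if salutation else value
    if mode == "abstract":
        return f"({description})"
    if mode == "none":
        return "[REDACTED]"
    raise ValueError(f"unknown restore mode: {mode}")


print(restore("partial", "Julia Schneider"))                      # J. Schneider
print(restore("formatted", "Julia Schneider", salutation="Frau")) # Frau Julia Schneider
print(restore("masked", "948221"))                                # ******
```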

Conditional Rules

Rules support conditions for context-dependent behavior:

rules:
  - entity_type: "person"
    protection_level: 2
    action: "tokenize"
    conditions:
      - field: "caller_role"
        operator: "eq"
        value: "support_agent"
        override_action: "partial"

      - field: "caller_role"
        operator: "in"
        value: ["admin", "supervisor"]
        override_action: "full"

Condition Operators

Operator	Description	Example
`eq`	Equals	`field: "caller_role", value: "admin"`
`neq`	Not equals	`field: "caller_role", value: "guest"`
`in`	In list	`field: "caller_role", value: ["admin", "support"]`
`not_in`	Not in list	`field: "caller_role", value: ["guest", "public"]`
`exists`	Field is present	`field: "vip_flag"`

Conditions are evaluated against the `context` object in the API request.
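
The operators above can be sketched as a small evaluator over the request context. This is a minimal sketch under stated assumptions: the context is a flat dict, a missing field compares as `None`, and the first matching condition supplies the override. `condition_matches` and `effective_action` are hypothetical names.

```python
def condition_matches(condition: dict, context: dict) -> bool:
    """Evaluate one condition against the request context."""
    field = condition["field"]
    operator = condition["operator"]
    if operator == "exists":
        return field in context
    actual = context.get(field)  # None when the field is absent (assumption)
    expected = condition["value"]
    if operator == "eq":
        return actual == expected
    if operator == "neq":
        return actual != expected
    if operator == "in":
        return actual in expected
    if operator == "not_in":
        return actual not in expected
    raise ValueError(f"unknown operator: {operator}")


def effective_action(rule: dict, context: dict) -> str:
    """Return the first matching override_action, else the rule's own action."""
    for condition in rule.get("conditions", []):
        if condition_matches(condition, context):
            return condition["override_action"]
    return rule["action"]


rule = {
    "entity_type": "person",
    "protection_level": 2,
    "action": "tokenize",
    "conditions": [
        {"field": "caller_role", "operator": "eq",
         "value": "support_agent", "override_action": "partial"},
        {"field": "caller_role", "operator": "in",
         "value": ["admin", "supervisor"], "override_action": "full"},
    ],
}

print(effective_action(rule, {"caller_role": "admin"}))  # full
print(effective_action(rule, {}))                        # tokenize
```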

Channel Overrides

Channel rules map each entity type to a restore mode per output channel, so the same entity can be restored differently for different audiences:

channel_rules:
  customer_email:
    person: "formatted"      # "Frau Julia Schneider"
    email: "none"            # [REDACTED]
    customer_id: "full"      # 948221
    order: "full"            # ORD-2026-4892

  internal_summary:
    person: "full"           # Julia Schneider
    email: "full"            # julia@firma.de
    customer_id: "full"      # 948221
    order: "full"            # ORD-2026-4892

  export:
    person: "abstract"       # (female customer)
    email: "none"            # [REDACTED]
    customer_id: "abstract"  # (customer ID)
    order: "abstract"        # (order reference)

  log_safe:
    person: "masked"         # ******* *********
    email: "masked"          # j*******************e
    customer_id: "masked"    # ******
    order: "partial"         # ORD-****-****
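
One way to read the channel rules is as a two-level lookup: channel first, then entity type. The sketch below assumes that a channel or entity type missing from `channel_rules` falls back to `defaults.restore_mode`; that fallback and the name `restore_mode_for` are assumptions, not documented behavior.

```python
def restore_mode_for(policy: dict, channel: str, entity_type: str) -> str:
    """Return the restore mode for an entity type on a given output channel."""
    channel_rules = policy.get("channel_rules", {})
    mode = channel_rules.get(channel, {}).get(entity_type)
    if mode is not None:
        return mode
    # Assumed fallback when the channel or entity type is not listed.
    return policy["defaults"]["restore_mode"]


policy = {
    "defaults": {"protection_level": 2, "action": "tokenize",
                 "restore_mode": "full"},
    "channel_rules": {
        "customer_email": {"person": "formatted", "email": "none"},
        "log_safe": {"person": "masked", "order": "partial"},
    },
}

print(restore_mode_for(policy, "customer_email", "person"))  # formatted
print(restore_mode_for(policy, "log_safe", "phone"))         # full
```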

Metadata Policy

The metadata policy controls which metadata fields are included in the `entity_context` sent to the LLM:

metadata_policy:
  expose_gender: true       # Needed for gendered salutations (Herr/Frau)
  expose_formality: true    # Needed for formal/informal register
  expose_language: true     # Needed for language-correct output
  expose_role: false        # Role context (recipient, sender, etc.)
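
The effect of these flags can be sketched as a filter over an entity's metadata. This assumes metadata keys named `gender`, `formality`, `language`, and `role` that pair with the matching `expose_<key>` flag, and that a field without a flag is withheld; `filter_entity_context` is a hypothetical helper.

```python
def filter_entity_context(metadata: dict, metadata_policy: dict) -> dict:
    """Keep only the metadata fields whose expose_<field> flag is true."""
    return {
        key: value
        for key, value in metadata.items()
        if metadata_policy.get(f"expose_{key}", False)
    }


metadata_policy = {
    "expose_gender": True,
    "expose_formality": True,
    "expose_language": True,
    "expose_role": False,
}
metadata = {"gender": "female", "formality": "formal",
            "language": "de", "role": "recipient"}

print(filter_entity_context(metadata, metadata_policy))
# {'gender': 'female', 'formality': 'formal', 'language': 'de'}
```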

Example: Healthcare Policy

name: "healthcare"
version: "1.0.0"
description: "HIPAA-compliant healthcare policy — blocks PHI identifiers, tokenizes names, abstracts addresses"

rules:
  - entity_type: "health_id"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "ssn"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "date_of_birth"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "iban"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "credit_card"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "passport"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "person"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "email"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "phone"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "address"
    protection_level: 3
    action: "abstract"
    conditions: []

  - entity_type: "location"
    protection_level: 2
    action: "abstract"
    conditions: []

  - entity_type: "company"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "customer_id"
    protection_level: 2
    action: "tokenize"
    conditions: []

defaults:
  protection_level: 2
  action: "tokenize"
  restore_mode: "partial"

metadata_policy:
  expose_gender: false
  expose_formality: false
  expose_language: true
  expose_role: false

Key decisions:

Health IDs, SSNs, dates of birth, IBANs, credit cards, and passports are all blocked to support HIPAA compliance, so a request containing any of them is rejected
Person names, emails, phones, companies, and customer IDs are tokenized (reversible during rehydration)
Addresses are abstracted at protection level 3 (no raw value stored); locations are also abstracted, at protection level 2
Default restore mode is `partial`, so only a deterministic subset of each value is restored

Example: Financial Policy

name: "financial"
version: "1.0.0"
description: "Banking and financial services policy — blocks all financial identifiers, tokenizes names, no raw values in logs"

rules:
  - entity_type: "iban"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "credit_card"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "ssn"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "passport"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "health_id"
    protection_level: 1
    action: "block"
    conditions: []

  - entity_type: "person"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "company"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "email"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "phone"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "address"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "location"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "customer_id"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "order"
    protection_level: 2
    action: "tokenize"
    conditions: []

  - entity_type: "date_of_birth"
    protection_level: 1
    action: "hard_mask"
    conditions: []

  - entity_type: "ip"
    protection_level: 2
    action: "tokenize"
    conditions: []

defaults:
  protection_level: 2
  action: "tokenize"
  restore_mode: "masked"

metadata_policy:
  expose_gender: false
  expose_formality: false
  expose_language: true
  expose_role: true