
Key Rotation Runbook

Procedures for rotating the AES-256-GCM session encryption key with simple and zero-downtime options

Rotate the AES-256-GCM session encryption key for OGuardAI's sealed session backend.

When to Rotate

  • Session secret was exposed (logs, source control, breach).
  • Scheduled rotation per compliance policy (e.g., every 90 days).
  • A team member with access to the secret has left the team.

Background

Sealed sessions are encrypted blobs sent to the client. On rehydrate, the server decrypts the blob with the configured key. If the key does not match, decryption fails and the server returns GUARDAI_SESSION_EXPIRED. Clients recover by calling /v1/transform again to get a fresh session.
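The recovery rule above can be sketched from the client's side. This is a minimal sketch, assuming the error code appears verbatim somewhere in the response body of a failed rehydrate call; the helper name needs_fresh_session and that response shape are hypothetical, not part of OGuardAI's documented API.

```shell
# Hypothetical helper: succeeds (exit 0) when a rehydrate response body
# carries the GUARDAI_SESSION_EXPIRED code, meaning the sealed session
# could not be decrypted and the client should call /v1/transform again.
# Assumption: the code appears verbatim in the response body.
needs_fresh_session() {
  case "$1" in
    *GUARDAI_SESSION_EXPIRED*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A client would pass the body of a failed rehydrate call to needs_fresh_session and, when it succeeds, repeat its /v1/transform request to obtain a new sealed session.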

Option A: Simple Rotation (Brief Disruption)

Use when a short spike of session expiration errors is acceptable.

1. Generate new secret:

openssl rand -base64 32

2. Update the secret:

# Kubernetes
kubectl create secret generic oguardai-session-secret \
  --from-literal=session-secret=<NEW_SECRET> --dry-run=client -o yaml | kubectl apply -f -

# systemd (| is the sed delimiter because base64 secrets can contain /)
sudo sed -i 's|^GUARDAI_SESSION_SECRET=.*|GUARDAI_SESSION_SECRET=<NEW_SECRET>|' /etc/guardai/oguardai.env

# Docker Compose
export GUARDAI_SESSION_SECRET=<NEW_SECRET>

3. Rolling restart:

kubectl rollout restart deployment/guardai && kubectl rollout status deployment/guardai --timeout=5m  # Kubernetes
sudo systemctl restart oguardai-server                           # systemd
docker compose -f deploy/docker/docker-compose.yml up -d --no-deps server  # Docker

4. Monitor -- expect a brief spike in SessionExpired errors:

kubectl logs -l app.kubernetes.io/name=guardai -c server --tail=100 | grep 'SessionExpired'

Spike should subside within minutes as clients retry with fresh transforms.
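For the systemd variant, steps 1 and 2 can be sketched as two small helpers. This is a sketch under stated assumptions: secret_byte_len and set_env_var are hypothetical names, the env file is assumed to hold plain KEY=value lines, and whether the server base64-decodes the secret or derives a key from it is not stated in this runbook. Unlike an in-place substitution, set_env_var also adds the variable when the line is missing.

```shell
# Hypothetical helper: print how many bytes a base64 string decodes to.
# Output of `openssl rand -base64 32` decodes to 32 bytes, the size of
# an AES-256 key.
secret_byte_len() {
  printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# Hypothetical helper: set KEY=value in an env file, replacing any existing
# assignment or appending one if absent. Assumes plain KEY=value lines.
set_env_var() {
  file=$1; key=$2; value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except a previous assignment of KEY.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Because sudo cannot call shell functions, these helpers would need to run in a root shell, for example set_env_var /etc/guardai/oguardai.env GUARDAI_SESSION_SECRET "$NEW_SECRET" after checking that secret_byte_len "$NEW_SECRET" prints 32.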

Option B: Zero-Downtime Rotation (KeyRing)

Use when no client-visible errors are acceptable. Requires KeyRing config support.

1. Generate new secret:

openssl rand -base64 32

2. Add new key to KeyRing config (last key becomes active for new sessions):

session:
  backend: sealed
  keyring:
    - kid: 0
      secret: <OLD_SECRET>
    - kid: 1
      secret: <NEW_SECRET>

3. Rolling restart -- both keys are now active:

kubectl rollout restart deployment/guardai && kubectl rollout status deployment/guardai --timeout=5m

New sessions use key 1. Old sessions still decrypt with key 0.

4. Wait one full session TTL (default 3600s) after the rollout completes, so that every session sealed with key 0 has expired:

grep ttl /etc/guardai/oguardai.yaml   # check configured TTL

5. Remove old key from config:

session:
  backend: sealed
  keyring:
    - kid: 1
      secret: <NEW_SECRET>

6. Final rolling restart:

kubectl rollout restart deployment/guardai
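The wait in step 4 can be made explicit with a little arithmetic. This is a minimal sketch, assuming you record the time the step 3 rollout finished as a Unix timestamp; ttl_wait_remaining is a hypothetical helper, not an OGuardAI command.

```shell
# Hypothetical helper: print the seconds left before the old key can be
# removed, given the Unix time the rollout finished, the session TTL in
# seconds, and the current Unix time. Prints 0 once the TTL has elapsed.
ttl_wait_remaining() {
  rollout_done=$1; ttl=$2; now=$3
  remaining=$(( rollout_done + ttl - now ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  printf '%s\n' "$remaining"
}
```

With the default TTL, ttl_wait_remaining "$ROLLOUT_DONE" 3600 "$(date +%s)" prints how long to keep key 0 in the config; step 5 is safe once it prints 0.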

Impact Summary

Method    Client Impact
Simple    In-flight rehydrate fails with SessionExpired; client retries via transform, then rehydrate
KeyRing   No client-visible errors; old and new sessions both work during transition

Post-Rotation Verification

curl -sf http://guardai.internal:3000/v1/health
curl -s -X POST http://guardai.internal:3000/v1/transform \
  -H "Content-Type: application/json" -d '{"input": "test@example.com"}' | jq .session_state

Confirm that the SessionExpired error rate returns to baseline within 10 minutes (simple) or stays flat (KeyRing).
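When no metrics dashboard is at hand, the error-rate check can be approximated from logs. This is a minimal sketch, assuming the server logs one line containing SessionExpired per failed rehydrate (the same string grepped for in Option A); count_session_expired is a hypothetical helper.

```shell
# Hypothetical helper: count log lines on stdin that mention SessionExpired.
# grep -c prints 0 but exits non-zero when nothing matches, so the exit
# status is normalized to keep the helper usable under `set -e`.
count_session_expired() {
  grep -c 'SessionExpired' || true
}

# Example (Kubernetes): count matches over the last 10 minutes.
# --tail=-1 lifts the 10-line default that applies when a selector is used.
#   kubectl logs -l app.kubernetes.io/name=guardai -c server \
#     --since=10m --tail=-1 | count_session_expired
```

Running the count before the rotation gives a baseline to compare against afterwards.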