
Agent Integration

Protect PII across multi-step agent workflows with OGuardAI

The Problem

Agent frameworks (LangGraph, CrewAI, AutoGen) run multi-step workflows:

  1. Read user input (contains PII)
  2. Call LLM (PII leaks to provider)
  3. Call tools (PII leaks to external services)
  4. Store in memory (PII persists unprotected)
  5. Generate response (PII in output)

Each step is a leak point, and every additional step in the loop multiplies the surfaces where PII can escape.

The Pattern: Wrap Every Boundary

User Input -> [OGuardAI Transform] -> Safe Input
                                       |
                                   Agent Loop:
                                     LLM Call (sees tokens only)
                                     Tool Call [OGuardAI Transform arguments]
                                     Tool Result [OGuardAI Transform result]
                                     Memory Store (tokens only)
                                       |
Safe Output -> [OGuardAI Rehydrate] -> Final Output

Example: LangGraph

from guardai_sdk import OGuardAIClient

guardai = OGuardAIClient("http://localhost:3000")

def protected_node(state):
    # Transform input
    result = guardai.transform(state["input"])
    safe_input = result.safe_text
    session = result.session_state

    # Call LLM with safe input (llm: any LangChain-compatible chat model)
    llm_output = llm.invoke(safe_input)

    # Rehydrate output
    restored = guardai.rehydrate(llm_output, session)
    return {"output": restored.restored_text}

Example: CrewAI

from crewai import Agent, Task, Crew
from guardai_sdk import OGuardAIClient

guardai = OGuardAIClient("http://localhost:3000")

class OGuardAIAgent(Agent):
    """Agent wrapper that auto-protects PII at every boundary."""

    def execute_task(self, task):
        # Transform the task input
        result = guardai.transform(task.description)
        safe_task = Task(
            description=result.safe_text,
            expected_output=task.expected_output
        )

        # Execute with safe input
        output = super().execute_task(safe_task)

        # Rehydrate the output
        restored = guardai.rehydrate(output, result.session_state)
        return restored.restored_text

Example: Multi-Step Tool Chain

# Step 1: User asks about a customer
user_input = "What is Julia Schneider's account balance?"
result = guardai.transform(user_input)
# result.safe_text: "What is `{{person:p_001}}`'s account balance?"

# Step 2: LLM decides to call a tool
llm_response = llm.invoke(result.safe_text)
# LLM output: "I'll look up `{{person:p_001}}`'s account."

# Step 3: Tool call - transform the arguments via JSON input
import json
tool_args = {"customer_name": "Julia Schneider"}
safe_args_result = guardai.transform(
    json.dumps(tool_args), session_state=result.session_state
)
# safe_args_result.safe_text contains the tokenized JSON string

# Step 4: Tool returns data - transform the result
tool_result = crm.lookup(customer_name="Julia Schneider")
# tool_result: {"balance": 5000, "email": "julia@firma.de"}
safe_result = guardai.transform(
    json.dumps(tool_result), session_state=result.session_state
)

# Step 5: LLM generates final answer with safe data
final = llm.invoke(f"The result is: {safe_result.safe_text}")
# LLM output: "`{{person:p_001}}`'s balance is 5000. Contact: `{{email:e_001}}`"

# Step 6: Rehydrate only at the final output
restored = guardai.rehydrate(final, result.session_state)
# "Julia Schneider's balance is 5000. Contact: julia@firma.de"

Key Rules

  1. Transform before every LLM call -- never send raw PII to a model
  2. Transform tool arguments -- tool calls may contain PII from earlier steps
  3. Transform tool results -- external tools return PII that re-enters the loop
  4. Store only tokenized text in memory -- agent memory should never hold raw PII
  5. Rehydrate only at the final output -- restore PII only when returning to the user
  6. Pass session state through the agent context -- all steps share the same session for entity identity
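
The rules above can be exercised end to end with a toy stand-in for the client. StubGuard is a hypothetical in-memory sketch (its tokenizer and the `{{person:p_001}}` format mirror the examples in this guide; the real SDK's transform/rehydrate are network calls whose exact behavior is assumed, not guaranteed):

```python
# StubGuard: hypothetical in-memory stand-in for OGuardAIClient so the
# key rules are runnable here without a live service.
class StubGuard:
    def transform(self, text, session_state=None):
        session = dict(session_state or {})          # rule 6: one shared session
        if "Julia Schneider" in text:
            session.setdefault("Julia Schneider", "{{person:p_001}}")
        safe = text
        for raw, token in session.items():
            safe = safe.replace(raw, token)          # rule 1: tokenize before the LLM
        return {"safe_text": safe, "session_state": session}

    def rehydrate(self, text, session_state):
        for raw, token in session_state.items():
            text = text.replace(token, raw)
        return text

guard = StubGuard()
memory, session = [], None

for step_input in ["Ask about Julia Schneider", "Julia Schneider called again"]:
    result = guard.transform(step_input, session_state=session)
    session = result["session_state"]                # rule 6: carry the session forward
    memory.append(result["safe_text"])               # rule 4: tokens only in memory

final = guard.rehydrate(memory[-1], session)         # rule 5: rehydrate only at the end
print(memory)  # tokenized history, no raw PII
print(final)   # "Julia Schneider called again"
```

Because rehydration happens once at the end, nothing in `memory` ever holds the raw name.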

Anti-Patterns

Do NOT: Cache raw PII in agent memory

# BAD: raw PII in memory
memory.save("Julia Schneider's email is julia@firma.de")

# GOOD: tokenized text in memory
result = guardai.transform("Julia Schneider's email is julia@firma.de")
memory.save(result.safe_text)
# Saves: "`{{person:p_001}}`'s email is `{{email:e_001}}`"

Do NOT: Rehydrate in the middle of an agent loop

# BAD: rehydrating mid-loop leaks PII back into the agent context
step1_output = llm.invoke(safe_input)
restored = guardai.rehydrate(step1_output, session)  # PII is now raw again
step2_output = llm.invoke(restored)  # PII sent to LLM again!

# GOOD: keep everything tokenized until the final output
step1_output = llm.invoke(safe_input)
step2_output = llm.invoke(step1_output)  # still tokenized
final = guardai.rehydrate(step2_output, session)  # restore only at the end

Do NOT: Create separate sessions per step

# BAD: separate sessions lose entity identity
result1 = guardai.transform("Julia Schneider")  # session A
result2 = guardai.transform("Julia Schneider")  # session B (different token!)

# GOOD: accumulate into one session
result1 = guardai.transform("Julia Schneider")
result2 = guardai.transform("More about Julia", session_state=result1.session_state)
# Same person gets same token across both calls
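
The difference is easy to demonstrate with a toy tokenizer (a hypothetical stand-in; the real service mints tokens server-side, so the global counter below is only there to make token divergence visible):

```python
import itertools

_ids = itertools.count(1)   # global counter: distinct sessions mint distinct tokens

def transform(text, session_state=None):
    # Toy version of guardai.transform: one entity, stable token per session.
    session = dict(session_state or {})
    if "Julia Schneider" in text and "Julia Schneider" not in session:
        session["Julia Schneider"] = f"{{{{person:p_{next(_ids):03d}}}}}"
    safe = text
    for raw, token in session.items():
        safe = safe.replace(raw, token)
    return safe, session

# BAD: fresh session per call -> the same person gets two different tokens
bad1, _ = transform("Julia Schneider")
bad2, _ = transform("Julia Schneider")

# GOOD: thread one session through -> the token stays stable across calls
good1, session = transform("Julia Schneider")
good2, _ = transform("More about Julia Schneider", session_state=session)
```

With separate sessions the LLM cannot tell that `bad1` and `bad2` refer to the same person; with one accumulated session it can.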

Session State Management

OGuardAI uses sealed session state (encrypted blobs) that travel with the request. This means:

  • No server-side state -- the session is in the blob
  • Portable -- pass the blob through any agent framework's state mechanism
  • Secure -- AES-256-GCM encrypted and authenticated, so tampering is detected
  • Accumulative -- each transform call can build on previous session state

# Session accumulation through agent steps
session = None

for step in agent_steps:
    result = guardai.transform(step.input, session_state=session)
    session = result.session_state  # carry forward
    step.safe_input = result.safe_text

# Final rehydration with accumulated session
final = guardai.rehydrate(agent_output, session)

Framework-Specific Notes

LangGraph

  • Store session state in the graph's state dict
  • Use protected_node pattern at each node boundary
  • The session blob is just a string -- fits naturally in LangGraph state
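
A minimal sketch of that state shape, with stubbed nodes in place of the real `guardai.transform` and LLM calls (field names and the blob value are illustrative, not the SDK's actual format):

```python
from typing import Optional, TypedDict

class AgentState(TypedDict):
    text: str
    guardai_session: Optional[str]   # sealed session blob: an opaque string

def transform_node(state: AgentState) -> AgentState:
    # Stub for guardai.transform: tokenizes and returns an updated blob.
    return {"text": state["text"].replace("Julia", "{{person:p_001}}"),
            "guardai_session": "sealed-blob-v2"}

def llm_node(state: AgentState) -> AgentState:
    # The blob rides along untouched; only the final node would rehydrate.
    return {"text": f"Answer for {state['text']}",
            "guardai_session": state["guardai_session"]}

state: AgentState = {"text": "Julia's balance?", "guardai_session": None}
state = llm_node(transform_node(state))
```

Because the blob is a plain string, it needs no special serialization support from the graph runtime.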

CrewAI

  • Wrap agents with an OGuardAI middleware
  • Transform task descriptions before execution
  • Rehydrate crew output after all agents complete

AutoGen

  • Use OGuardAI in the message preprocessing hook
  • Transform all messages before they reach the LLM
  • Store session state in the conversation context
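
As a sketch, the hook can look like this. `stub_transform` stands in for `guardai.transform`, and AutoGen's exact hook registration API varies by version, so treat the wiring as an assumption:

```python
def stub_transform(text, session_state=None):
    # Stand-in for guardai.transform: fixed entity map instead of a live call.
    session = session_state or {"Julia Schneider": "{{person:p_001}}"}
    safe = text
    for raw, token in session.items():
        safe = safe.replace(raw, token)
    return safe, session

class ProtectedConversation:
    """Tokenizes every message before it reaches the LLM."""
    def __init__(self):
        self.session = None   # session state lives in the conversation context

    def preprocess(self, messages):
        safe_messages = []
        for msg in messages:
            safe_text, self.session = stub_transform(msg["content"], self.session)
            safe_messages.append({**msg, "content": safe_text})
        return safe_messages

conv = ProtectedConversation()
out = conv.preprocess([{"role": "user", "content": "Email Julia Schneider"}])
```

Every message that reaches the model has already been tokenized, and the session persists across turns in the conversation object.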

OpenAI Agents SDK

  • Use the OGuardAI proxy as the base URL
  • The proxy handles transform/rehydrate transparently
  • No code changes needed in the agent definition
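
For example, pointing the standard OpenAI client at the proxy could look like this (the `/v1` path and the environment variable name are assumptions; check your proxy deployment):

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",   # OGuardAI proxy instead of api.openai.com
    api_key=os.environ["OPENAI_API_KEY"],  # forwarded upstream by the proxy
)
# Agents built on this client get transform/rehydrate applied transparently.
```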