Legal & Compliance
Protect attorney-client privilege, client identities, and confidential case data when using AI for contract review, legal research, and document summarization.
The Problem
Law firms and corporate legal departments are under pressure to adopt AI for contract review, due diligence, legal research, and document summarization. The productivity gains are substantial -- AI can reduce first-pass contract review time by 60-80%. But legal data carries unique sensitivity requirements that go beyond standard PII protection.
Attorney-client privilege demands that client identities, case details, and legal strategy remain confidential. Sending a contract containing party names, case references, and deal terms to a third-party LLM creates a privilege waiver risk. Even if the LLM provider's terms of service disclaim data use for training, the act of transmitting privileged information to a third party may be sufficient to challenge privilege in litigation.
Beyond privilege, legal documents contain a dense concentration of sensitive data: party names, counterparty identifiers, contract values, governing law clauses, and custom identifiers like matter numbers and court case IDs. Generic PII detection misses domain-specific patterns (matter numbers, case citations, deal codes), and naive redaction destroys the relational structure that makes legal analysis possible.
How OGuardAI Solves It
OGuardAI sits between your legal application and the AI model. Client-identifying information is tokenized with semantic metadata, preserving the document's logical structure while removing real identifiers. The AI model can analyze contract clauses, compare terms, and summarize obligations without knowing who the parties are.
Legal Application (DMS, CLM, eDiscovery)
        |
        v
OGuardAI Runtime (privileged data exists only here, transiently)
        |
        +---> Tokenized text (no client data) ---> LLM Provider
        +---> Encrypted session blob ------------> Your Application
        |
        v
Restored output (channel-specific)
        +---> Attorney review: full restore
        +---> Client report: formatted names
        +---> Court filing: redacted
        +---> Knowledge base: abstract identifiers
Detected Entity Types
| Entity Type | Examples | Default Action |
|---|---|---|
| person | Client names, counterparty names, attorneys | Tokenize with role metadata |
| email | Attorney and client email addresses | Tokenize |
| address | Office addresses, registered agent addresses | Tokenize |
| phone | Direct lines, mobile numbers | Tokenize |
| custom* | Matter numbers, case IDs, deal codes | Tokenize (configurable patterns) |
| company* | Company names, firm names | Tokenize with role metadata |
| date_of_birth | Client DOB in estate/family law matters | Tokenize |
| ssn | Client SSN in tax/estate matters | Block |
Entity types marked with * are custom types defined via policy rules, not built-in. See the Extending Entities guide for how to add custom types.
The custom entity type supports configurable regex patterns for domain-specific identifiers. Legal teams can define patterns for their matter numbering scheme (e.g., MTR-\d{4}-\d{6}), court case numbers (e.g., \d{2}-cv-\d{5}), or internal deal codes.
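Before adding patterns like these to a policy, it can help to try them against sample text. The sketch below uses Python's `re` module with the two example patterns above. The labels `matter_number` and `court_case`, the `find_custom_identifiers` helper, and the added `\b` word boundaries are assumptions made for illustration; they are not part of OGuardAI.

```python
import re

# Example patterns from the text, wrapped in \b so that longer digit runs
# are not partially matched (the boundaries are an assumption).
CUSTOM_PATTERNS = {
    "matter_number": re.compile(r"\bMTR-\d{4}-\d{6}\b"),
    "court_case": re.compile(r"\b\d{2}-cv-\d{5}\b"),
}


def find_custom_identifiers(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs in order of appearance."""
    hits = []
    for label, pattern in CUSTOM_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    # Sort by position so results follow the order of the source text.
    return [(label, value) for _, label, value in sorted(hits)]


sample = "Matter: MTR-2026-004817, related to case 24-cv-01234."
print(find_custom_identifiers(sample))
```

Checking patterns against real but non-privileged samples of a firm's numbering scheme is a cheap way to catch both missed identifiers and false positives before the policy goes live.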
Example Policy
name: legal-privilege
version: "1.0"
description: "Attorney-client privilege protection for legal AI workflows"
rules:
  - entity_type: "person"
    protection_level: 2
    action: "tokenize"
    conditions: []
  - entity_type: "company"
    protection_level: 2
    action: "tokenize"
    conditions: []
  - entity_type: "email"
    protection_level: 2
    action: "tokenize"
    conditions: []
  - entity_type: "address"
    protection_level: 2
    action: "tokenize"
    conditions: []
  - entity_type: "phone"
    protection_level: 2
    action: "tokenize"
    conditions: []
  - entity_type: "iban"
    protection_level: 1
    action: "block"
    conditions: []
  - entity_type: "credit_card"
    protection_level: 1
    action: "block"
    conditions: []
  - entity_type: "ssn"
    protection_level: 1
    action: "block"
    conditions: []
  - entity_type: "passport"
    protection_level: 1
    action: "block"
    conditions: []
defaults:
  protection_level: 2
  action: "tokenize"
  restore_mode: "full"
channel_rules:
  attorney_review:
    person: { restore_mode: full }
    email: { restore_mode: full }
    address: { restore_mode: full }
    phone: { restore_mode: full }
    company: { restore_mode: full }
    custom: { restore_mode: full }
  client_report:
    person: { restore_mode: formatted }
    email: { restore_mode: masked }
    address: { restore_mode: masked }
    phone: { restore_mode: masked }
    company: { restore_mode: full }
    custom: { restore_mode: masked }
  knowledge_base:
    person: { restore_mode: abstract }
    email: { restore_mode: none }
    address: { restore_mode: none }
    phone: { restore_mode: none }
    company: { restore_mode: abstract }
    custom: { restore_mode: none }
The default restore_mode: full means that unless a specific output channel overrides it, all entities are restored to their original values. The knowledge_base channel overrides this to abstract for person and company names, replacing them with semantic descriptions like (party A) or (opposing counsel) so that analyses can be reused across matters without exposing client identities.
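The override behavior described above can be sketched as a two-level lookup. This is a minimal illustration rather than OGuardAI's implementation: the `CHANNEL_RULES` dictionary mirrors two channels of the example policy, and the `resolve_restore_mode` helper is hypothetical. Falling back to the default for an unknown channel is also an assumption; a real deployment might reject it instead.

```python
# Hypothetical in-memory view of the example policy's restore settings.
DEFAULT_RESTORE_MODE = "full"

CHANNEL_RULES = {
    "client_report": {
        "person": "formatted",
        "email": "masked",
        "address": "masked",
        "phone": "masked",
        "company": "full",
        "custom": "masked",
    },
    "knowledge_base": {
        "person": "abstract",
        "email": "none",
        "address": "none",
        "phone": "none",
        "company": "abstract",
        "custom": "none",
    },
}


def resolve_restore_mode(channel: str, entity_type: str) -> str:
    """Use the channel-specific mode if one exists, else the policy default."""
    return CHANNEL_RULES.get(channel, {}).get(entity_type, DEFAULT_RESTORE_MODE)


print(resolve_restore_mode("knowledge_base", "person"))
print(resolve_restore_mode("client_report", "date_of_birth"))
```

Note that under this reading an entity type with no channel rule, such as date_of_birth in client_report, is restored in full, so it is worth listing every sensitive type explicitly in each restrictive channel.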
Example API Call
Transform a contract clause
curl -X POST http://localhost:3000/v1/transform \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key-here" \
  -d '{
    "input": "ASSET PURCHASE AGREEMENT\n\nThis Agreement is entered into as of January 15, 2026, by and between Meridian Technologies Inc., a Delaware corporation (\"Buyer\"), and Apex Digital Solutions LLC, an Oregon limited liability company (\"Seller\").\n\nMatter: MTR-2026-004817\n\nSection 4.2 Indemnification. Seller shall indemnify Buyer against all losses arising from breaches of representations in Section 3. The indemnification cap is USD 5,000,000. Claims must be submitted to Robert Langford (robert.langford@meridiantech.com) within 18 months of closing.\n\nGoverning Law: State of Delaware.",
    "policy": "legal-privilege"
  }'
Response (tokenized)
{
  "safe_text": "ASSET PURCHASE AGREEMENT\n\nThis Agreement is entered into as of January 15, 2026, by and between {{company:co_001}}, a Delaware corporation (\"Buyer\"), and {{company:co_002}}, an Oregon limited liability company (\"Seller\").\n\nMatter: {{custom:ct_001}}\n\nSection 4.2 Indemnification. Seller shall indemnify Buyer against all losses arising from breaches of representations in Section 3. The indemnification cap is USD 5,000,000. Claims must be submitted to {{person:p_001}} ({{email:e_001}}) within 18 months of closing.\n\nGoverning Law: State of Delaware.",
  "session_id": "01916c2a-5d3e-7000-8000-000000000003",
  "session_state": "eyJ2IjoxLCJzaWQiOi...",
  "entity_context": [
    { "token": "{{company:co_001}}", "type": "company", "role": "buyer" },
    { "token": "{{company:co_002}}", "type": "company", "role": "seller" },
    { "token": "{{person:p_001}}", "type": "person", "role": "contact" }
  ],
  "stats": { "entities_detected": 5, "entities_transformed": 5, "entities_blocked": 0 }
}
The contract structure, legal terms, indemnification cap, governing law, and temporal clauses pass through unchanged. The AI model can analyze the indemnification clause, compare it against market terms, and flag unusual provisions -- all without knowing who the parties are.
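A caller may want to confirm that the tokenized text is safe before forwarding it to an LLM provider. The sketch below assumes the `{{type:id}}` token format seen in the example response and runs a simple substring check for known sensitive terms. The `extract_tokens` and `leaked_terms` helpers are hypothetical and are not part of the OGuardAI API.

```python
import re

# Token format inferred from the example response: {{type:id}}
TOKEN_RE = re.compile(r"\{\{(\w+):(\w+)\}\}")


def extract_tokens(safe_text: str) -> list[tuple[str, str]]:
    """Return unique (entity_type, token_id) pairs in order of first appearance."""
    seen: list[tuple[str, str]] = []
    for match in TOKEN_RE.finditer(safe_text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


def leaked_terms(safe_text: str, sensitive_terms: list[str]) -> list[str]:
    """Return any known sensitive term that still appears (case-insensitive)."""
    lowered = safe_text.lower()
    return [term for term in sensitive_terms if term.lower() in lowered]


safe_text = (
    'by and between {{company:co_001}}, a Delaware corporation ("Buyer"), '
    'and {{company:co_002}}, an Oregon limited liability company ("Seller"). '
    "Matter: {{custom:ct_001}}. Claims must be submitted to "
    "{{person:p_001}} ({{email:e_001}}) within 18 months of closing."
)
known_sensitive = ["Meridian Technologies", "Apex Digital Solutions",
                   "MTR-2026-004817", "Robert Langford"]

print(extract_tokens(safe_text))
print(leaked_terms(safe_text, known_sensitive))
```

A check like this is only a backstop for terms you already know about; it does not replace detection, but it can catch a misconfigured policy before privileged text leaves the trust boundary.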
Rehydrate for the attorney
curl -X POST http://localhost:3000/v1/rehydrate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key-here" \
  -d '{
    "output": "<LLM-generated contract analysis with tokens>",
    "session_state": "eyJ2IjoxLCJzaWQiOi...",
    "output_channel": "attorney_review"
  }'
The attorney sees the full analysis with all party names, matter numbers, and contact details restored. The knowledge base version uses abstract identifiers so the analysis can be reused across matters without exposing client identities.
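The two curl calls can be chained from application code. The following is a minimal sketch using only Python's standard library. The endpoint paths, headers, and request fields are taken from the examples above; the helper names (`build_request`, `post_json`, `review_contract`) and the caller-supplied `analyze` callable that invokes the LLM are hypothetical. The shape of the rehydrate response is not shown in this guide, so the sketch returns the parsed JSON unchanged.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # from the curl examples
API_KEY = "your-key-here"           # placeholder, as in the curl examples


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request with the headers used in the curl examples."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": API_KEY},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=30) as resp:
        return json.load(resp)


def review_contract(text: str, analyze) -> dict:
    """Tokenize, run the caller's LLM step on tokenized text, then rehydrate."""
    transformed = post_json(
        "/v1/transform", {"input": text, "policy": "legal-privilege"}
    )
    # Only the tokenized text is handed to the LLM step.
    llm_output = analyze(transformed["safe_text"])
    return post_json(
        "/v1/rehydrate",
        {
            "output": llm_output,
            "session_state": transformed["session_state"],
            "output_channel": "attorney_review",
        },
    )
```

Passing the LLM step in as a callable keeps the original contract text out of that function's reach, which mirrors the trust boundary in the diagram above.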
Compliance and Privilege Notes
| Requirement | How OGuardAI Addresses It |
|---|---|
| Attorney-client privilege | Client identities and matter details never reach the LLM provider, which reduces the risk that privilege is waived by transmission to a non-privileged third party |
| ABA Model Rule 1.6 (Confidentiality) | Reasonable measures to prevent disclosure: tokenization, AES-256-GCM encryption, session expiry, trust boundary enforcement |
| GDPR (client PII in EU matters) | PII tokenization supports data minimization (Art. 5(1)(c)); no persistent storage of personal data |
| Litigation hold compatibility | Session blobs can be preserved for litigation hold; token mappings are deterministic and reproducible within the session |
| Cross-border data restrictions | Tokenized text can cross jurisdictional boundaries without triggering data transfer restrictions, since it contains no personal data |
| Conflicts check | Abstract restore mode in the knowledge base prevents inadvertent conflicts -- attorneys cannot identify parties in anonymized precedent analyses |
Ethical Considerations
Legal AI introduces specific ethical obligations. OGuardAI addresses the data protection dimension, but attorneys retain responsibility for:
- Reviewing and verifying all AI-generated legal analysis before relying on it
- Ensuring that AI use complies with applicable bar rules and court orders
- Maintaining competence in understanding the technology's capabilities and limitations (ABA Model Rule 1.1, Comment 8)
- Disclosing AI use to clients where required by jurisdiction or engagement terms
Related Resources
- Customer Support Case Study -- Walkthrough of a support workflow with formal language handling
- GDPR Compliance -- GDPR-specific documentation for legal teams handling EU client data
- Compliance Controls Mapping -- HIPAA, GDPR, SOC 2, PCI DSS control mapping
- Extending Entity Types -- How to add custom entity patterns for matter numbers and case IDs