Contract Review Agent Blueprint
AI agent that reviews contracts: extracts clauses, flags risks, compares versions, and generates summaries. Ready-to-run with PDF and Markdown contract input.
Contract Review Agent
An AI agent that reviews contracts like a paralegal doing first-pass analysis. It extracts key clauses, flags unfavorable terms, compares contract versions (redline analysis), and generates executive summaries. Input is any text-based PDF or Markdown contract.
Note:
Not legal advice. This agent identifies patterns and flags potential concerns. A qualified lawyer must review its output before any legal decisions. Use as a first-pass triage tool, not a replacement for legal counsel.
Agent File Structure
The files referenced in this blueprint:
agent.py: agent loop and command-line entry point
tools.py: tool implementations
config.json: API key, model, and iteration limit
risk_categories.json: risk patterns to flag
samples/vendor-agreement.pdf: sample contract used in the Verify step
Setup
Install Dependencies
Install the OpenAI client plus PDF support.
pip install openai pymupdf
Create config.json
Configure the agent. risk_categories_path points to the JSON file defining what to flag.
{
  "openai_api_key": "sk-...",
  "model": "gpt-4o",
  "max_iterations": 6,
  "risk_categories_path": "risk_categories.json"
}
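A minimal sketch of how these settings could be resolved before the agent starts. The helper name resolve_config, the OPENAI_API_KEY fallback, and the range check are assumptions for illustration; the blueprint itself reads config.json directly.

```python
import os

# Defaults mirror the values shown in config.json above.
DEFAULTS = {
    "model": "gpt-4o",
    "max_iterations": 6,
    "risk_categories_path": "risk_categories.json",
}


def resolve_config(raw, env=None):
    """Merge raw settings with defaults and resolve the API key (hypothetical helper)."""
    env = os.environ if env is None else env
    config = {**DEFAULTS, **raw}
    key = config.get("openai_api_key", "")
    # Assumption: treat the "sk-..." placeholder as "not set" and fall back to the environment.
    if not key or key == "sk-...":
        key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise ValueError("Set openai_api_key in config.json or the OPENAI_API_KEY variable")
    if not 1 <= int(config["max_iterations"]) <= 10:
        raise ValueError("max_iterations should be between 1 and 10")
    config["openai_api_key"] = key
    return config
```

Keeping the key in an environment variable lets config.json be committed without secrets.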
Verify
Run the agent on a sample contract to verify setup.
python agent.py --contract "./samples/vendor-agreement.pdf"
The agent should output extracted clauses, risk flags, and a summary.
System Prompt
You are a contract review specialist. Your role is first-pass analysis — identify
what's in the contract, flag potential concerns, and summarize. You are not a lawyer.
Always include this disclaimer in your output.
Protocol:
1. THOUGHT: What type of contract is this? What should I look for?
2. ACTION: Extract clauses by category (parties, term, payment, liability,
termination, IP, confidentiality, governing law, etc.)
3. For each clause: summarize in plain English, note any unusual or one-sided terms
4. Cross-reference against risk categories — flag matches with severity
5. If a second contract version is provided, perform redline comparison
6. FINAL_REVIEW: Executive summary + clause table + risk flags + disclaimer
Rules:
- Flag missing clauses as HIGH risk when they are standard for this contract type
- Flag one-sided terms with the party they favor
- Use PLAIN ENGLISH summaries — the recipient may not be a lawyer
- If you're uncertain about a clause's implication, say so rather than guessing
- Always end with: "This is an automated first-pass review. Consult a qualified
lawyer before making decisions based on this analysis."
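The prompt asks the model to always end with the disclaimer, but a model can omit it. A small post-processing guard, sketched here under the assumption that the final review is available as a plain string (ensure_disclaimer is a hypothetical helper, not part of the blueprint), could enforce the rule in code:

```python
DISCLAIMER = (
    "This is an automated first-pass review. Consult a qualified "
    "lawyer before making decisions based on this analysis."
)


def ensure_disclaimer(review: str) -> str:
    """Append the disclaimer unless the review already contains it."""
    # Collapse whitespace so a disclaimer wrapped across lines still counts.
    normalized = " ".join(review.split())
    if DISCLAIMER in normalized:
        return review
    return review.rstrip() + "\n\nDISCLAIMER: " + DISCLAIMER
```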
Risk Categories
{
  "risk_categories": [
    {
      "name": "Unlimited Liability",
      "severity": "critical",
      "patterns": [
        "indemnify.*without limitation",
        "unlimited liability",
        "liable for.*all.*damages",
        "waive.*all.*claims"
      ],
      "description": "Party assumes unlimited financial exposure; negotiate a cap."
    },
    {
      "name": "Automatic Renewal",
      "severity": "high",
      "patterns": [
        "automatically renew",
        "auto-renew",
        "shall renew.*unless.*notice"
      ],
      "description": "Contract renews without explicit action and may lock you in unexpectedly."
    },
    {
      "name": "One-Sided Termination",
      "severity": "high",
      "patterns": [
        "terminate.*at any time.*without cause",
        "sole discretion to terminate",
        "immediate termination.*without notice"
      ],
      "description": "Only one party can terminate; negotiate mutual or notice-period terms."
    },
    {
      "name": "IP Assignment",
      "severity": "critical",
      "patterns": [
        "assign.*all.*intellectual property",
        "work product.*becomes.*property of",
        "hereby assigns.*all.*right.*title",
        "irrevocably assign"
      ],
      "description": "You're giving away ownership of your work product; negotiate a license instead."
    },
    {
      "name": "Non-Compete Overreach",
      "severity": "medium",
      "patterns": [
        "non-compete",
        "shall not.*compete.*for.*years",
        "restricted from.*engaging.*similar business"
      ],
      "description": "May restrict future work; check geographic and time scope for reasonableness."
    },
    {
      "name": "Data Privacy Gap",
      "severity": "high",
      "patterns": [
        "no.*data.*processing.*agreement",
        "no.*privacy.*policy",
        "sell.*personal.*data",
        "share.*personal.*information.*third.*party"
      ],
      "description": "Missing or weak data protection terms, which GDPR/CCPA may require."
    },
    {
      "name": "Vague Scope",
      "severity": "medium",
      "patterns": [
        "as.*reasonably.*requested",
        "other.*services.*as.*needed",
        "additional.*work.*at.*client.*discretion"
      ],
      "description": "Scope is open-ended; you may be on the hook for undefined work."
    }
  ]
}
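Because this file drives what gets flagged, it can help to check it before a run. The following is a sketch only: validate_risk_categories is a hypothetical helper, and the allowed severity set (including "low") is an assumption, since the file above uses only critical, high, and medium.

```python
import re

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}  # "low" is an assumption
REQUIRED_KEYS = {"name", "severity", "patterns", "description"}


def validate_risk_categories(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks usable."""
    if "risk_categories" not in data:
        return ["top-level key 'risk_categories' is missing"]
    problems = []
    for i, cat in enumerate(data["risk_categories"]):
        label = cat.get("name", f"category #{i}")
        missing = REQUIRED_KEYS - cat.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
        if cat.get("severity") not in ALLOWED_SEVERITIES:
            problems.append(f"{label}: unknown severity {cat.get('severity')!r}")
        for pattern in cat.get("patterns", []):
            try:
                # Compile with the same flags the scanner uses.
                re.compile(pattern, re.IGNORECASE | re.DOTALL)
            except re.error as exc:
                problems.append(f"{label}: bad pattern {pattern!r} ({exc})")
    return problems
```

Running this once at startup surfaces a malformed pattern as a readable message instead of an exception in the middle of a review.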
Tool Definitions
Agent Tools
read_contract: path: string
extract_clauses: categories?: string[] (default: all)
flag_risks: contract_text: string, categories?: string[]
compare_versions: version_a_path: string, version_b_path: string
summarize_contract: clauses_json: object
Tool Implementation
# tools.py
import json
import os
import re


def read_contract(path):
    """Return the text of a PDF or Markdown contract, or an ERROR string."""
    full = path if os.path.isabs(path) else os.path.join(os.getcwd(), path)
    if not os.path.exists(full):
        return f"ERROR: File not found: {path}"
    if path.lower().endswith(".pdf"):
        import fitz  # pymupdf
        text = []
        with fitz.open(full) as doc:
            for i, page in enumerate(doc):
                # Prefix each page's text with a page marker
                text.append(f"--- Page {i+1} ---\n{page.get_text()}")
        return "\n".join(text)
    with open(full, "r", encoding="utf-8") as f:
        return f.read()


def extract_clauses(client, contract_text, model, categories=None):
    """Ask the model to extract clauses by category; returns a JSON string."""
    all_categories = categories or [
        "parties", "term", "payment", "liability", "termination",
        "intellectual_property", "confidentiality", "governing_law",
        "indemnification", "warranty", "limitation_of_liability",
        "force_majeure", "assignment", "dispute_resolution"
    ]
    # Only the first 15,000 characters are sent; longer contracts are truncated
    prompt = f"""Extract clauses from this contract by category.
Return a JSON object with category names as keys.
For each category, provide: the clause text (exact quote) and a plain-English summary.
If a category is not present in the contract, set its value to null.
Contract text:
{contract_text[:15000]}
Categories to extract: {', '.join(all_categories)}
Return ONLY valid JSON."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
        response_format={"type": "json_object"}
    )
    return response.choices[0].message.content


def flag_risks(contract_text, risk_categories_path="risk_categories.json"):
    """Scan the contract text against the regex patterns; returns a JSON string."""
    if not os.path.exists(risk_categories_path):
        return f"ERROR: Risk categories file not found: {risk_categories_path}"
    with open(risk_categories_path, encoding="utf-8") as f:
        categories = json.load(f)["risk_categories"]
    # Limit scan to first 20K chars to avoid regex performance issues
    scan_text = contract_text[:20000]
    findings = []
    for cat in categories:
        for pattern in cat["patterns"]:
            # Note: with DOTALL, a greedy ".*" can span many lines, so matched_text may be long
            matches = re.finditer(pattern, scan_text, re.IGNORECASE | re.DOTALL)
            for m in matches:
                # Keep 80 characters of context on each side of the match
                context_start = max(0, m.start() - 80)
                context_end = min(len(scan_text), m.end() + 80)
                findings.append({
                    "category": cat["name"],
                    "severity": cat["severity"],
                    "matched_text": m.group().strip(),
                    "context": scan_text[context_start:context_end].replace("\n", " "),
                    "description": cat["description"]
                })
    # De-duplicate findings that share the same matched text and category
    unique = {f["matched_text"] + f["category"]: f for f in findings}
    result = list(unique.values())
    if len(contract_text) > 20000:
        result.append({"warning": "Contract text exceeds 20K characters. Only first 20K scanned for risks."})
    return json.dumps(result, indent=2)


def compare_versions(client, model, path_a, path_b):
    """Ask the model for a redline comparison of two files; returns a JSON string."""
    text_a = read_contract(path_a)
    text_b = read_contract(path_b)
    # read_contract reports a missing file as an ERROR string; pass it through
    for text in (text_a, text_b):
        if text.startswith("ERROR:"):
            return text
    # Only the first 8,000 characters of each version are compared
    prompt = f"""Compare these two contract versions and identify changes.
Version A:
{text_a[:8000]}
Version B:
{text_b[:8000]}
Return a JSON object with:
- added: clauses present in B but not A
- removed: clauses present in A but not B
- modified: clauses that changed (show old vs new text)
- summary: one-sentence summary of the changes
Return ONLY valid JSON."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
        response_format={"type": "json_object"}
    )
    return response.choices[0].message.content


def summarize_contract(client, model, clauses_json):
    prompt = f"""Given these extracted contract clauses, write an executive summary
in plain English. Include: what this contract is, key obligations of each party,
critical risks, and recommended next steps. Keep it under 300 words.
Clauses:
{clauses_json}
Return ONLY the summary text, no JSON wrapper."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2
    )
    return response.choices[0].message.content
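compare_versions relies on the model and looks at only the first 8,000 characters of each version. As a deterministic cross-check, a line-level diff from the standard library can list what changed. This is a sketch of a complementary technique, not part of the blueprint; changed_lines is a hypothetical helper.

```python
import difflib


def changed_lines(text_a: str, text_b: str) -> list[str]:
    """Return lines removed from A ("- ") and lines added in B ("+ ")."""
    diff = difflib.ndiff(text_a.splitlines(), text_b.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); keep only real changes.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    old = "Term: 12 months\nNotice: 30 days"
    new = "Term: 12 months\nNotice: 5 days"
    for line in changed_lines(old, new):
        print(line)
```

The diff is purely textual, so it cannot say whether a change matters; it only guarantees that no edited line is silently skipped.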
Agent Initialization
# agent.py
import json
import os
import argparse

from openai import OpenAI

import tools as agent_tools

TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "read_contract",
            "description": "Read a PDF or Markdown contract file",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "extract_clauses",
            "description": "Extract clauses by category from contract text",
            "parameters": {
                "type": "object",
                "properties": {
                    "contract_text": {"type": "string"},
                    "categories": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["contract_text"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "flag_risks",
            "description": "Cross-reference contract text against risk categories",
            "parameters": {
                "type": "object",
                "properties": {"contract_text": {"type": "string"}},
                "required": ["contract_text"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "compare_versions",
            "description": "Compare two contract versions (redline analysis)",
            "parameters": {
                "type": "object",
                "properties": {
                    "version_a_path": {"type": "string"},
                    "version_b_path": {"type": "string"}
                },
                "required": ["version_a_path", "version_b_path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "summarize_contract",
            "description": "Generate a plain-English executive summary",
            "parameters": {
                "type": "object",
                "properties": {"clauses_json": {"type": "string"}},
                "required": ["clauses_json"]
            }
        }
    }
]

SYSTEM_PROMPT = """You are a contract review specialist. Your role is first-pass
analysis — identify what's in the contract, flag potential concerns, and summarize.
You are not a lawyer. Always include a disclaimer.
Protocol:
1. Read the contract
2. Extract clauses by category (parties, term, payment, liability, termination,
IP, confidentiality, governing law)
3. Flag risks against risk categories — report severity and description
4. Summarize in plain English
5. Begin your final answer with "FINAL_REVIEW:" followed by: executive summary, clause table, risk flags, disclaimer
Rules:
- Flag missing clauses as HIGH risk when standard for this contract type
- Flag one-sided terms with the party they favor
- Use PLAIN ENGLISH — the recipient may not be a lawyer
- If uncertain, say so rather than guessing
- End with disclaimer about consulting a qualified lawyer"""


def run_agent(contract_path, config, compare_path=None):
    # Use the key from config.json, falling back to the OPENAI_API_KEY environment variable
    api_key = config.get("openai_api_key") or os.environ.get("OPENAI_API_KEY")
    client = OpenAI(api_key=api_key)
    model = config.get("model", "gpt-4o")
    query = f"Review this contract: {contract_path}."
    if compare_path:
        query += f" Compare it against: {compare_path}."
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": query}
    ]
    for _ in range(config.get("max_iterations", 6)):
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=TOOL_SCHEMAS,
            temperature=0.2
        )
        msg = response.choices[0].message
        messages.append(msg)
        # The FINAL_REVIEW: marker signals that the agent is done
        if msg.content and "FINAL_REVIEW:" in msg.content:
            return msg.content.split("FINAL_REVIEW:", 1)[1].strip()
        if not msg.tool_calls:
            # No tool call and no final answer: nudge the model to continue
            messages.append({
                "role": "user",
                "content": "Continue the review. Extract clauses, flag risks, then provide FINAL_REVIEW."
            })
            continue
        for tool_call in msg.tool_calls:
            func_name = tool_call.function.name
            try:
                func_args = json.loads(tool_call.function.arguments)
            except json.JSONDecodeError as exc:
                # Report malformed arguments back to the model instead of crashing
                func_args = None
                result = f"ERROR: Could not parse tool arguments: {exc}"
            if func_args is None:
                pass
            elif func_name == "read_contract":
                result = agent_tools.read_contract(func_args.get("path", contract_path))
            elif func_name == "extract_clauses":
                result = agent_tools.extract_clauses(client,
                    func_args.get("contract_text", ""), model,
                    func_args.get("categories"))
            elif func_name == "flag_risks":
                text = func_args.get("contract_text", "")
                result = agent_tools.flag_risks(text,
                    config.get("risk_categories_path", "risk_categories.json"))
            elif func_name == "compare_versions":
                result = agent_tools.compare_versions(client, model,
                    func_args.get("version_a_path", contract_path),
                    func_args.get("version_b_path", compare_path or contract_path))
            elif func_name == "summarize_contract":
                result = agent_tools.summarize_contract(client, model,
                    func_args.get("clauses_json", ""))
            else:
                result = f"Unknown tool: {func_name}"
            # Every tool call needs a matching tool message before the next model turn
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
    return "Agent reached max iterations."


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--contract", required=True, help="Path to contract file (PDF or MD)")
    parser.add_argument("--compare", help="Optional: second version to compare against")
    parser.add_argument("--config", default="config.json")
    args = parser.parse_args()
    with open(args.config, encoding="utf-8") as f:
        config = json.load(f)
    result = run_agent(args.contract, config, args.compare)
    print(result)
Walkthrough
Reviewing a software vendor agreement with unfavorable terms.
Agent reads the contract
read_contract returns the text of a 12-page PDF with a marker before each page, and that text is loaded into context. The agent identifies it as a Master Services Agreement with a Statement of Work appendix.
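A 12-page contract can exceed the 15,000 characters that extract_clauses sends to the model. One way to avoid silent truncation is to split the text on the page markers that read_contract inserts and process it in chunks. This is a sketch under that assumption; chunk_by_pages is a hypothetical helper, and merging the per-chunk results is left out.

```python
import re

# Zero-width split point just before each "--- Page N ---" marker line.
PAGE_MARKER = re.compile(r"(?=^--- Page \d+ ---$)", re.MULTILINE)


def chunk_by_pages(text: str, max_chars: int = 15000) -> list[str]:
    """Group whole pages into chunks of at most max_chars characters."""
    pages = [p for p in PAGE_MARKER.split(text) if p.strip()]
    chunks, current = [], ""
    for page in pages:
        # Start a new chunk when adding this page would exceed the limit.
        if current and len(current) + len(page) > max_chars:
            chunks.append(current)
            current = ""
        current += page
    if current:
        chunks.append(current)
    return chunks
```

A single page longer than max_chars still becomes one oversized chunk, so a caller may want to split such pages further.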
Extracts all clauses
extract_clauses returns a JSON object with 12 clause categories. Key findings:
- Liability: "Vendor's total liability shall not exceed fees paid in the preceding 3 months." → Low cap, but mutual — standard.
- IP Assignment: "All work product shall become the sole property of Client." → Full IP transfer, no license option.
- Termination: "Client may terminate at any time without cause upon 5 days' notice." → One-sided. Vendor has no equivalent right.
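The extraction prompt sets absent categories to null, and the system prompt asks for missing standard clauses to be flagged as HIGH risk. That check can also be done in code. The sketch below assumes the JSON is keyed by category name, as the prompt requests; the model's exact output shape may vary, and missing_clauses is a hypothetical helper.

```python
import json


def missing_clauses(clauses_json: str, expected: list[str]) -> list[dict]:
    """Return a HIGH-severity flag for each expected category that is null or absent."""
    clauses = json.loads(clauses_json)
    return [
        {
            "category": name,
            "severity": "high",
            "description": f"No {name.replace('_', ' ')} clause found.",
        }
        for name in expected
        if clauses.get(name) is None
    ]
```

Which categories count as standard depends on the contract type, so the expected list is something to tune per use case.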
Flags risks
flag_risks scans the contract text (the first 20,000 characters) against the risk categories:
- [CRITICAL] IP Assignment — "hereby assigns all right, title, and interest"
- [HIGH] One-Sided Termination — "terminate at any time without cause"
- [MEDIUM] Vague Scope — "other services as reasonably requested"
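flag_risks returns findings in the order the categories appear in the file, not by severity. A listing like the one above could be produced with a small formatter, sketched here assuming the JSON shape that flag_risks emits (format_findings and the severity ranking are illustrative choices):

```python
import json

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def format_findings(findings_json: str) -> list[str]:
    """Sort findings by severity and render one line per finding."""
    # Skip the trailing {"warning": ...} entry, which has no category.
    findings = [f for f in json.loads(findings_json) if "category" in f]
    findings.sort(key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)))
    return [
        f'[{f["severity"].upper()}] {f["category"]}: "{f["matched_text"]}"'
        for f in findings
    ]
```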
Delivers FINAL_REVIEW
FINAL_REVIEW:
CONTRACT TYPE: Software Master Services Agreement
PARTIES: Client (Acme Corp) and Vendor (DevCo LLC)
KEY OBLIGATIONS:
- Vendor provides software development services per SOW
- Client pays $15,000/month, net-30 terms
- Term: 12 months
CRITICAL RISKS:
[CRITICAL] IP Assignment: All work product becomes Client's sole property.
Recommend: Negotiate a perpetual license instead of full assignment, or
increase fees to account for IP transfer value.
[HIGH] One-Sided Termination: Client can terminate without cause in 5 days.
Vendor has no equivalent right. Recommend: Add mutual termination clause
with 30-day notice for both parties.
[MEDIUM] Vague Scope: "other services as reasonably requested" creates
open-ended obligations. Recommend: Cap additional work at X hours/month
or require a separate SOW for scope changes.
DISCLAIMER: This is an automated first-pass review. Consult a qualified
lawyer before making decisions based on this analysis.
Customization
Risk Configuration
risk_categories_path: path to .json file
model: gpt-4o, gpt-4o-mini
max_iterations: 1-10 (default 6)
Note:
PDF quality matters. Scanned PDFs (images of text) will not produce usable output. The agent works with text-based PDFs and Markdown files. Use OCR preprocessing for scanned documents.
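A scanned PDF usually yields little or no text per page, so it can be detected before any model call is made. This is a rough heuristic sketch: looks_scanned is a hypothetical helper, both thresholds are assumptions to tune, and page_texts is expected to hold the per-page strings from page.get_text().

```python
def looks_scanned(page_texts: list[str], min_chars_per_page: int = 50) -> bool:
    """Return True when more than half of the pages carry almost no extractable text."""
    if not page_texts:
        return True
    sparse = sum(1 for text in page_texts if len(text.strip()) < min_chars_per_page)
    return sparse / len(page_texts) > 0.5
```

When the check returns True, the file is a candidate for OCR preprocessing rather than direct review.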
Key Takeaway
Contract review agents are best at surface-level pattern matching and clause extraction — the kind of work that consumes paralegal hours. They will not catch subtle legal implications or jurisdiction-specific nuances. The risk categories JSON is the most important file: it defines what the agent flags. Customize it for your industry's standard concerns before running on real contracts.
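When customizing patterns, note that flag_risks compiles them with re.DOTALL, so an unbounded ".*" can stretch across unrelated sentences. The sketch below contrasts a pattern from the file with a bounded variant; the rewritten pattern and the 60-character gap are illustrative assumptions, not the blueprint's own values.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

GREEDY = r"liable for.*all.*damages"  # as written in risk_categories.json
BOUNDED = r"liable for.{0,60}?\ball\b.{0,60}?damages"  # hypothetical rewrite

# "shall" contains "all", and "damages" appears only in a distant, unrelated sentence.
noisy = (
    "Vendor is liable for late fees. "
    + "Each party shall keep records. " * 20
    + "Neither party waives damages."
)
real = "Vendor shall be liable for any and all damages arising from breach."

greedy_hit = re.search(GREEDY, noisy, FLAGS)
bounded_hit = re.search(BOUNDED, noisy, FLAGS)
real_hit = re.search(BOUNDED, real, FLAGS)

print("greedy false positive:", greedy_hit is not None)
print("bounded on noisy text:", bounded_hit is not None)
print("bounded on real clause:", real_hit is not None)
```

Word boundaries and short lazy gaps keep a match inside one clause, which also keeps the matched_text and context fields readable.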