AI Document Review Agent v2.0
A full-stack document review system. Upload a PDF and MinerU parses it; a LangChain v1.1 + DeepSeek pipeline then flags grammar issues and over-definitive language paragraph by paragraph and streams each issue back onto the PDF at its exact location. Custom rules and human-in-the-loop review are supported.
The system runs end to end on FastAPI and React/FluentUI. Every flagged issue is streamed live onto the PDF at its exact bounding box, accepted or dismissed through a human-in-the-loop gate, and persisted to SQLite.
Overview
AI Document Review Agent v2.0 reviews uploaded PDF documents (the bundled sample is a Chinese labor-contract termination agreement) and flags two kinds of issues by default:
- Grammar & Spelling (语法与拼写): genuine typos, wrong characters, punctuation and grammar errors. Low risk.
- Definitive Language (绝对化表述): over-committal wording such as "必须 / 保证 / 一定 / 完全 / 绝对" (must / guarantee / certainly / completely / absolutely) used in a formal-promise context, which carries legal and compliance risk. High risk.
Beyond the two presets, reviewers can define custom rules (name + description + examples + risk level), which are injected into the prompt so the model learns new issue types on the fly. Every issue is pinned to its exact location in the PDF and rendered as a highlight the reviewer can accept or dismiss.
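As a rough illustration of how custom rules could be merged into the prompt, the sketch below renders the presets and any custom rules as one issue-type section. The `Rule` class, the `render_rules` helper, the risk-level strings, and the "Vague Deadline" rule are all hypothetical names chosen for this example; the project's actual prompt format in `lc_pipeline.py` may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One issue type. Fields mirror the README: name, description, examples, risk level."""
    name: str
    description: str
    risk_level: str  # assumed vocabulary: "low" | "medium" | "high"
    examples: list[str] = field(default_factory=list)


# The two presets, expressed in the same shape as a custom rule (an assumption).
PRESET_RULES = [
    Rule("Grammar & Spelling", "Genuine typos, wrong characters, punctuation and grammar errors.", "low"),
    Rule("Definitive Language", "Over-committal wording used in a formal-promise context.", "high",
         examples=["必须", "保证", "绝对"]),
]


def render_rules(custom_rules: list[Rule]) -> str:
    """Render presets plus custom rules as the issue-type section of a system prompt."""
    lines = ["Flag only the following issue types:"]
    for i, rule in enumerate(PRESET_RULES + custom_rules, start=1):
        lines.append(f"{i}. {rule.name} (risk: {rule.risk_level}): {rule.description}")
        if rule.examples:
            lines.append("   Examples: " + "; ".join(rule.examples))
    return "\n".join(lines)


# A hypothetical custom rule defined by a reviewer at runtime.
custom = [Rule("Vague Deadline", "Time limits with no concrete date.", "medium", ["as soon as possible"])]
prompt_section = render_rules(custom)
print(prompt_section)
```

Because the rules are plain data, adding a new issue type only changes the rendered prompt text; no model retraining or code change is involved.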
Architecture (real stack)
React + FluentUI (Vite)
├─ Files page (upload/list)
└─ Review page
   ├─ react-pdf viewer
   ├─ annotpdf highlights
   └─ @microsoft/fetch-event-source
FastAPI backend
├─ /api/v1/files (upload, list)
├─ /api/v1/review/{id}/issues (SSE stream)
├─ /api/v1/rules (custom-rule CRUD)
└─ /api/v1/.../hitl/{start,resume}
services/
├─ lc_pipeline.py (LangChain v1.1 + DeepSeek)
├─ mineru_client.py (MinerU v4 PDF parsing)
├─ hitl_agent.py (HumanInTheLoopMiddleware)
└─ bbox.py (coordinate mapping)
SQLite (app.db): issues · rules · feedback
Backend: fastapi, langchain==1.1.3, langchain-deepseek==1.0.1, pymupdf, aiosqlite, sse-starlette.
Frontend: react 18, @fluentui/react-components, react-pdf, annotpdf, vite.
LLM: DeepSeek Chat, via LangChain v1's init_chat_model(model_provider="deepseek"), falling back to an OpenAI-compatible client.
How a review runs
- Upload: the PDF is saved server-side (/api/v1/files).
- Parse (MinerU v4): mineru_client.py uploads the PDF to MinerU, polls for results, and downloads the parsed JSON (paragraph text, per-paragraph bounding boxes, and layout).
- Chunk: paragraphs are batched (32 per chunk) to fit the model context.
- Review (LangChain + DeepSeek): for each chunk, lc_pipeline.py builds a system prompt (preset issue types, any custom rules, and an extensive exclusion list so that list numbers, form placeholders, and underscores aren't flagged), calls llm.ainvoke, and parses the output with a PydanticOutputParser into structured issues. It deliberately avoids provider-specific response_format so that DeepSeek works cleanly.
- Locate (3-level bbox fallback): each issue's text is mapped to a PDF bounding box, trying the PDF text layer first (pymupdf search), then MinerU's layout.json, and finally the paragraph-level bbox.
- Stream (SSE): issues are pushed to the browser as event: issues over Server-Sent Events as each chunk finishes, so highlights appear progressively rather than after a long wait.
- Review loop (HITL): accept, dismiss, or leave feedback on each issue. Mutations are routed through a LangChain v1 HumanInTheLoopMiddleware tool call (hitl_agent.py), so updates are gated by an explicit human approval step. Everything persists to SQLite.
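The chunking and bbox-fallback steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in `bbox.py`: `chunk_paragraphs`, `locate`, and the stub locators are hypothetical names, and the real locators would wrap a pymupdf text search and a lookup in MinerU's layout.json.

```python
from typing import Callable, Iterator, Optional, Sequence

BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1); coordinate units are an assumption
Locator = Callable[[str], Optional[BBox]]

CHUNK_SIZE = 32  # paragraphs per chunk, as stated in the steps above


def chunk_paragraphs(paragraphs: Sequence[str], size: int = CHUNK_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive batches of at most `size` paragraphs."""
    for start in range(0, len(paragraphs), size):
        yield paragraphs[start:start + size]


def locate(issue_text: str, locators: Sequence[tuple[str, Locator]], paragraph_bbox: BBox) -> tuple[str, BBox]:
    """Try each locator in order; fall back to the paragraph bbox if none finds the text."""
    for level, find in locators:
        bbox = find(issue_text)
        if bbox is not None:
            return level, bbox
    return "paragraph", paragraph_bbox


# Stub locators standing in for the real text-layer search and layout lookup.
def text_layer(text: str) -> Optional[BBox]:
    return None  # pretend the PDF text layer is damaged


def layout(text: str) -> Optional[BBox]:
    return (72.0, 100.0, 200.0, 112.0) if text == "保证" else None


locators = [("text_layer", text_layer), ("layout", layout)]
para_box = (72.0, 90.0, 520.0, 140.0)

print([len(c) for c in chunk_paragraphs([f"p{i}" for i in range(70)])])  # [32, 32, 6]
print(locate("保证", locators, para_box))  # found at the second level
print(locate("绝对", locators, para_box))  # falls through to the paragraph bbox
```

Returning the level alongside the box lets the UI, if desired, distinguish a precise highlight from a paragraph-wide one.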
What makes it v2.0
The "v2.0" is a real re-architecture, visible in the code:
- PromptFlow → native LangChain. Version 1 of the project leaned on Azure PromptFlow for orchestration; v2.0 moves the core review logic into a native LangChain v1.1 pipeline (lc_pipeline.py), so it runs locally without Azure AI Foundry.
- Custom rule engine. Reviewers define their own issue types at runtime; rules are merged into the prompt and carry their own risk level.
- LangChain HITL middleware. Issue mutations go through a framework-level human-approval gate rather than hard-coded business logic.
- Finer bbox localization. The 3-level fallback keeps highlights accurate even when the PDF text layer is damaged or MinerU layout is incomplete.
- SSE instead of polling. Progressive, single-direction streaming of issues to the UI.
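To make the SSE point concrete, the sketch below shows the wire format of one frame per reviewed chunk. It uses only the standard library; the project itself lists sse-starlette as a dependency, which handles this encoding. The `sse_frame` and `stream_issues` helpers, the issue fields, and the final `done` event are assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator


def sse_frame(event: str, payload: object) -> str:
    """Encode one Server-Sent Events frame: an event name, one data line, and a blank line."""
    data = json.dumps(payload, ensure_ascii=False)  # JSON escapes newlines, so this stays on one line
    return f"event: {event}\ndata: {data}\n\n"


def stream_issues(chunk_results: Iterable[list[dict]]) -> Iterator[str]:
    """Yield one `issues` frame per reviewed chunk, then a final `done` frame."""
    for issues in chunk_results:
        yield sse_frame("issues", issues)
    yield sse_frame("done", {})  # `done` is a hypothetical end-of-stream marker


# Two chunks: the first produced one issue, the second produced none.
frames = list(stream_issues([[{"id": 1, "type": "Grammar & Spelling"}], []]))
print(frames[0], end="")
```

Because each chunk is sent as soon as it is reviewed, the browser can render highlights for early pages while later chunks are still in flight.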
Honest scope
This is a complete, runnable system, not a demo script: full FastAPI backend, full React/FluentUI frontend, SQLite persistence, working review loop. To run it you supply a MinerU API key (PDF parsing is a cloud call) and a DeepSeek API key. The default issue set is two types (grammar + definitive language) tuned for Chinese business documents; everything else is custom-rule territory. The HITL approval UI is wired at the API layer; the front-end surface for it is the lightest-finished part.
What this project signals
- Correct, current LangChain v1.1 usage: provider-based model init, PydanticOutputParser instead of provider-specific structured output, framework-level human-in-the-loop.
- Full-stack delivery: FastAPI + React/FluentUI + SQLite + SSE, front to back.
- Document-grounding discipline: every issue is pinned to a real PDF location with a robust fallback chain, so a reviewer can always verify against the source.
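The idea behind parsing on the client side, rather than relying on a provider-specific structured-output mode, is that the model returns plain text containing JSON and the application validates it. The sketch below shows that idea with the standard library only; it is a stand-in for PydanticOutputParser, not its implementation, and the `Issue` fields and `parse_issues` helper are hypothetical.

```python
import json
from dataclasses import dataclass

ALLOWED_RISK = {"low", "medium", "high"}  # assumed risk vocabulary


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue schema; the project's real fields may differ."""
    paragraph_index: int
    type: str
    text: str
    risk_level: str
    suggestion: str


def parse_issues(raw: str) -> list[Issue]:
    """Extract the JSON array from a model reply and validate each item."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model output")
    issues = []
    for item in json.loads(raw[start:end + 1]):
        issue = Issue(**item)  # raises TypeError on missing or unexpected keys
        if issue.risk_level not in ALLOWED_RISK:
            raise ValueError(f"unknown risk level: {issue.risk_level!r}")
        issues.append(issue)
    return issues


# A made-up model reply with some prose before the JSON payload.
reply = (
    'Found one issue:\n'
    '[{"paragraph_index": 3, "type": "Definitive Language", "text": "保证", '
    '"risk_level": "high", "suggestion": "Soften the commitment."}]'
)
issues = parse_issues(reply)
print(issues[0].type, issues[0].risk_level)
```

Validation failures surface as exceptions, so a malformed reply for one chunk can be retried or skipped without affecting the others.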
The live demo actually runs
This demo is real, not a replay. The Definitive Language (绝对化表述) detector runs the project's rule logic client-side on whatever text you paste — instant, no API key. The 'Deep review with DeepSeek' button calls a server-side route that runs DeepSeek live for grammar/spelling + deeper review (key stays server-side, input length-capped, rate-limited). Try it: paste a paragraph with 必须/保证/一定 and a typo.
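A keyword pass of the kind the instant detector performs might look like the sketch below. It is written in Python for consistency with the backend, whereas the demo runs client-side; `find_definitive_terms` and `Hit` are hypothetical names, and this version matches the listed terms only, without the formal-promise context check described in the overview (so a phrase like 一定程度, "to a certain extent", would be a false positive).

```python
import re
from typing import NamedTuple

# Terms listed in the overview as examples of over-committal wording.
DEFINITIVE_TERMS = ("必须", "保证", "一定", "完全", "绝对")
PATTERN = re.compile("|".join(map(re.escape, DEFINITIVE_TERMS)))


class Hit(NamedTuple):
    term: str
    start: int  # character offset where the term begins
    end: int    # character offset just past the term


def find_definitive_terms(text: str) -> list[Hit]:
    """Return every listed term with its character offsets, in reading order."""
    return [Hit(m.group(), m.start(), m.end()) for m in PATTERN.finditer(text)]


# "Party B guarantees on-time delivery; Party A must pay within three days."
hits = find_definitive_terms("乙方保证按时交付,甲方必须在三日内付款。")
for hit in hits:
    print(hit.term, hit.start, hit.end)
```

The character offsets are what a front end would need to highlight each match in the pasted text.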