AI Document Review Agent v2.0

A runnable full-stack system, FastAPI backend to React/FluentUI frontend, built on LangChain v1.1 and DeepSeek. Every flagged issue is streamed to the browser and highlighted at its bounding box in the PDF, accepted or dismissed through a human-in-the-loop gate, and persisted to SQLite.

Overview

AI Document Review Agent v2.0 reviews uploaded PDF documents (the bundled sample is a Chinese labor-contract termination agreement) and flags two kinds of issues by default:

Grammar & Spelling (语法与拼写) — genuine typos, wrong characters, punctuation and grammar errors. Low risk.
Definitive Language (绝对化表述) — over-committal wording such as 必须 (must), 保证 (guarantee), 一定 (certainly), 完全 (completely), or 绝对 (absolutely) used in a formal-promise context, which carries legal/compliance risk. High risk.

Beyond the two presets, reviewers can define custom rules (name + description + examples + risk level), which are injected into the prompt so the model learns new issue types on the fly. Every issue is pinned to its exact location in the PDF and rendered as a highlight the reviewer can accept or dismiss.
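As a rough illustration of how custom rules could be merged into the prompt, here is a minimal sketch. The names IssueRule, PRESET_RULES, render_rule, and build_system_prompt are hypothetical, and the prompt wording is an assumption; the project's actual prompt in lc_pipeline.py may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class IssueRule:
    """One issue type to look for (preset or reviewer-defined). Hypothetical shape."""
    name: str
    description: str
    risk_level: str  # e.g. "low", "medium", "high"
    examples: list[str] = field(default_factory=list)


# The two presets described above; the exact wording here is illustrative.
PRESET_RULES = [
    IssueRule("Grammar & Spelling",
              "Genuine typos, wrong characters, punctuation and grammar errors.",
              "low"),
    IssueRule("Definitive Language",
              "Over-committal wording used in a formal-promise context.",
              "high",
              ["必须", "保证", "一定", "完全", "绝对"]),
]


def render_rule(rule: IssueRule) -> str:
    """Render one rule as a bullet, with its examples on a second line if any."""
    lines = [f"- {rule.name} (risk: {rule.risk_level}): {rule.description}"]
    if rule.examples:
        lines.append("  Examples: " + ", ".join(rule.examples))
    return "\n".join(lines)


def build_system_prompt(custom_rules: list[IssueRule]) -> str:
    """Combine presets and custom rules into one system prompt (assumed wording)."""
    header = "You review documents. Flag only these issue types:"
    rules = PRESET_RULES + list(custom_rules)
    return "\n".join([header, *(render_rule(r) for r in rules)])


if __name__ == "__main__":
    vague = IssueRule("Vague Deadline", "Deadlines without a concrete date.",
                      "medium", ["尽快"])
    print(build_system_prompt([vague]))
```

Because rules are plain data rendered into text, adding an issue type at runtime needs no code change, only a new row in the rules table.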

Architecture (real stack)

React + FluentUI (Vite)            FastAPI backend
  ├─ Files page (upload/list)        ├─ /api/v1/files            (upload, list)
  ├─ Review page                     ├─ /api/v1/review/{id}/issues  (SSE stream)
  │   ├─ react-pdf viewer            ├─ /api/v1/rules            (custom-rule CRUD)
  │   ├─ annotpdf highlights         └─ /api/v1/.../hitl/{start,resume}
  │   └─ @microsoft/fetch-event-source
                                     services/
                                       ├─ lc_pipeline.py   (LangChain v1.1 + DeepSeek)
                                       ├─ mineru_client.py (MinerU v4 PDF parsing)
                                       ├─ hitl_agent.py    (HumanInTheLoopMiddleware)
                                       └─ bbox.py          (coordinate mapping)
                                     SQLite (app.db): issues · rules · feedback

Backend: fastapi, langchain==1.1.3, langchain-deepseek==1.0.1, pymupdf, aiosqlite, sse-starlette. Frontend: react 18, @fluentui/react-components, react-pdf, annotpdf, vite. LLM: DeepSeek Chat, initialized through LangChain v1's init_chat_model with model_provider="deepseek", falling back to an OpenAI-compatible client.

How a review runs

Upload — the PDF is saved server-side (/api/v1/files).
Parse (MinerU v4) — mineru_client.py uploads the PDF to MinerU, polls for results, and downloads the parsed JSON: paragraph text + per-paragraph bounding boxes + layout.
Chunk — paragraphs are batched (32 per chunk) to fit the model context.
Review (LangChain + DeepSeek) — for each chunk, lc_pipeline.py builds a system prompt (preset issue types + any custom rules + an extensive exclusion list so list numbers, form placeholders, and underscores aren't flagged), calls llm.ainvoke, and parses the output with a PydanticOutputParser into structured issues — deliberately avoiding provider-specific response_format so DeepSeek works cleanly.
Locate (3-level bbox fallback) — each issue's text is mapped to a PDF bounding box: first the PDF text layer (pymupdf search), then MinerU's layout.json, finally the paragraph-level bbox.
Stream (SSE) — issues are pushed to the browser as event: issues over Server-Sent Events as each chunk finishes, so highlights appear progressively rather than after a long wait.
Review loop (HITL) — accept, dismiss, or leave feedback on each issue. Mutations are routed through a LangChain v1 HumanInTheLoopMiddleware tool call (hitl_agent.py) so updates are gated by an explicit human approval step. Everything persists to SQLite.
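The chunking step above can be sketched in a few lines. This is a minimal sketch, not the project's code: the function name chunk_paragraphs is hypothetical, and returning each chunk's starting index (so issues can be mapped back to the original paragraph, and from there to its bounding box) is an assumption about how the pipeline keeps track of positions.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

CHUNK_SIZE = 32  # paragraphs per chunk, as stated in the steps above


def chunk_paragraphs(
    paragraphs: Sequence[T], size: int = CHUNK_SIZE
) -> Iterator[tuple[int, list[T]]]:
    """Yield (start_index, chunk) pairs covering all paragraphs in order.

    start_index is the position of the chunk's first paragraph in the full
    document, so a per-chunk paragraph offset can be turned back into a
    document-wide paragraph index.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(paragraphs), size):
        yield start, list(paragraphs[start:start + size])


if __name__ == "__main__":
    paragraphs = [f"paragraph {i}" for i in range(70)]
    for start, chunk in chunk_paragraphs(paragraphs):
        print(start, len(chunk))
```

Fixed-size batching by paragraph count is the simplest policy; a token-based budget would track the model context more closely, at the cost of needing a tokenizer.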

What makes it v2.0

The "v2.0" is a real re-architecture, visible in the code:

PromptFlow → native LangChain. Version 1 of this project leaned on Azure PromptFlow for orchestration; v2.0 moves the core review logic into a native LangChain v1.1 pipeline (lc_pipeline.py), so it runs locally without Azure AI Foundry.
Custom rule engine. Reviewers define their own issue types at runtime; rules are merged into the prompt and carry their own risk level.
LangChain HITL middleware. Issue mutations go through a framework-level human-approval gate rather than hard-coded business logic.
Finer bbox localization. The 3-level fallback keeps highlights accurate even when the PDF text layer is damaged or MinerU layout is incomplete.
SSE instead of polling. Issues stream progressively and one-way, from server to UI, as each chunk finishes.
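To make the streaming model concrete, here is a minimal sketch of the Server-Sent Events wire format for an issues event, using only the standard library. The project itself lists sse-starlette, which handles this framing; the helper names format_sse and stream_issues, the payload shape, and the closing done event are assumptions for illustration.

```python
import asyncio
import json
from typing import AsyncIterator, Iterable


def format_sse(event: str, payload: object) -> str:
    """Serialize one SSE frame: an event name, one data line, a blank line."""
    data = json.dumps(payload, ensure_ascii=False)  # compact JSON stays on one line
    return f"event: {event}\ndata: {data}\n\n"


async def stream_issues(chunk_results: Iterable[list[dict]]) -> AsyncIterator[str]:
    """Yield one frame per reviewed chunk, then a final 'done' frame (assumed)."""
    for issues in chunk_results:
        await asyncio.sleep(0)  # stand-in for awaiting the model on this chunk
        yield format_sse("issues", issues)
    yield format_sse("done", {})


async def _demo() -> None:
    results = [[{"id": 1, "text": "保证"}], [{"id": 2, "text": "必须"}]]
    async for frame in stream_issues(results):
        print(frame, end="")


if __name__ == "__main__":
    asyncio.run(_demo())
```

Each frame is complete on its own, so the browser can render a chunk's highlights as soon as that frame arrives.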

Honest scope

This is a complete, runnable system, not a demo script: full FastAPI backend, full React/FluentUI frontend, SQLite persistence, working review loop. To run it you supply a MinerU API key (PDF parsing is a cloud call) and a DeepSeek API key. The default issue set is two types (grammar + definitive language) tuned for Chinese business documents; everything else is custom-rule territory. The HITL approval flow is wired at the API layer; its front-end surface is the least-finished part.

What this project signals

Correct, current LangChain v1.1 usage — provider-based model init, PydanticOutputParser instead of provider-specific structured output, framework-level human-in-the-loop.
Full-stack delivery — FastAPI + React/FluentUI + SQLite + SSE, front to back.
Document-grounding discipline — every issue is pinned to a real PDF location with a robust fallback chain, so a reviewer can always verify against the source.
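The fallback chain behind that grounding can be sketched as an ordered list of locators tried in turn. This is a sketch under stated assumptions rather than the contents of bbox.py: Located, Locator, and locate_issue are hypothetical names, and the locators are passed in as plain callables so the example runs without pymupdf or MinerU output. In a real setup the first locator would presumably wrap a PDF text-layer search and the second a lookup in MinerU's layout data.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates
Locator = Callable[[str], Optional[BBox]]  # returns None when the text is not found


@dataclass
class Located:
    bbox: BBox
    source: str  # which level produced the box, useful for debugging highlights


def locate_issue(
    text: str,
    locators: Sequence[tuple[str, Locator]],
    paragraph_bbox: BBox,
) -> Located:
    """Try each named locator in order; fall back to the paragraph-level box."""
    for name, find in locators:
        bbox = find(text)
        if bbox is not None:
            return Located(bbox, name)
    # Last resort: highlight the whole paragraph so the reviewer still
    # lands in the right place, just less precisely.
    return Located(paragraph_bbox, "paragraph")


if __name__ == "__main__":
    # Stub locators standing in for the text-layer and layout lookups.
    def text_layer(t: str) -> Optional[BBox]:
        return None  # simulate a damaged text layer

    def layout(t: str) -> Optional[BBox]:
        return (10.0, 20.0, 110.0, 32.0) if t == "保证" else None

    chain = [("text_layer", text_layer), ("layout", layout)]
    print(locate_issue("保证", chain, (0.0, 0.0, 500.0, 80.0)))
    print(locate_issue("缺失", chain, (0.0, 0.0, 500.0, 80.0)))
```

Recording which level produced each box makes it easy to measure how often the coarser fallbacks are actually used.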

Demo strategy

The live demo actually runs

This demo is real, not a replay. The Definitive Language (绝对化表述) detector runs the project's rule logic client-side on whatever text you paste — instant, no API key. The 'Deep review with DeepSeek' button calls a server-side route that runs DeepSeek live for grammar/spelling + deeper review (key stays server-side, input length-capped, rate-limited). Try it: paste a paragraph with 必须/保证/一定 and a typo.
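For a sense of what a rule-based Definitive Language check can look like, here is a minimal keyword-matching sketch. It is written in Python to match the backend, whereas the demo's detector runs in the browser, and the names DEFINITIVE_TERMS, TermMatch, and find_definitive_terms are hypothetical. Plain keyword matching is an assumption and a simplification: it ignores the formal-promise context mentioned in the overview, so it would over-flag compared with a context-aware check.

```python
import re
from dataclasses import dataclass

# The five example terms named in the overview.
DEFINITIVE_TERMS = ("必须", "保证", "一定", "完全", "绝对")
_PATTERN = re.compile("|".join(map(re.escape, DEFINITIVE_TERMS)))


@dataclass
class TermMatch:
    term: str
    start: int  # character offset of the first character
    end: int    # character offset one past the last character


def find_definitive_terms(text: str) -> list[TermMatch]:
    """Return every occurrence of a definitive term, in document order."""
    return [TermMatch(m.group(), m.start(), m.end()) for m in _PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "乙方保证按时完成,甲方必须付款。"
    for match in find_definitive_terms(sample):
        print(match)
```

Returning character offsets rather than just the terms lets a UI highlight each hit in place.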

Public preview can be enabled later without redesigning the case-study layout.