Context Engineering Middleware
Fighting Context Rot: six context modules × five strategies (Write/Select/Compress/Isolate/Cache), all realized as stackable LangChain middleware, applied Cache-first (90% off) then Isolate.
Six modules fill the context window; stack the five strategies in the course's "Cache-first, Isolate-later" priority and watch both window tokens and relative cost drop.
Why this local version exists
The decision priority, the five strategies (Write/Select/Compress/Isolate/Cache), the Compress sub-techniques, and the 90%-off prompt cache come from the Part 9 courseware. The token/cost numbers are illustrative.
Five strategies trimming the context window
Six modules fill the window; stack the five strategies in the "Cache-first, Isolate-later" priority order — window and cost both drop.
Six context modules
Conversation history + tool context together are >50% of the window (the two biggest taps).
Window
180K
Relative cost
100%
① Cache
Static prefix cache hit, cache-read at 10% (90% cost off), day one
② Compress · tool-result clearing
Drop verbose raw tool output, keep "decision + why" — zero cost
③ Compress · observation masking + trim
Mask old observations + trim_messages hard-truncate history
④ Isolate · sub-agent
SubAgentMiddleware moves external-knowledge work into its own context
⑤ Write + Select
Offload task state to a scratchpad, JIT-retrieve memory
What to try
Stack the strategies and watch the window shrink module by module.
Note Cache goes first (90% off, day one) and the system-prompt prefix turns "cached".
See Isolate spin external-knowledge work into a sub-agent context only when needed.
What this demo proves
You understand Context Rot — bigger window ≠ better — and manage it deliberately.
You can name each module and pick the right strategy, realized as stackable LangChain middleware.
You order savings: zero-cost (Cache / tool-result clearing) first, complex (Isolate / Write) on demand.
Framework
Six modules × five strategies (Write/Select/Compress/Isolate/Cache)
As middleware
trim_messages · SummarizationMiddleware · SubAgentMiddleware (deepagents)
Cost lever
Prompt cache: read at 10% (90% off), prefix byte-stable