AI-Assisted Document Intake for a Boutique Litigation Firm

The situation

Every new matter started the same way: a paralegal received a production from opposing counsel — sometimes hundreds of pages, sometimes thousands — and had to triage it. What is in here? What is responsive? What is privileged? What is just noise? The work was high-skill, high-volume, and bottlenecked at the partner level for the questions paralegals could not answer.

The firm wanted AI assistance with the early triage stages without taking on a confidentiality problem. Sending privileged content to a third-party LLM API was off the table.

What we built

A document intake pipeline that runs entirely inside the firm's cloud tenancy. Productions are uploaded to a controlled S3 bucket; OCR and extraction run on AWS-native services; embeddings are generated by a model the firm controls; the LLM that drafts triage summaries and proposed responsiveness tags runs on Amazon Bedrock under the firm's BAA.

The output is a draft triage report — by document, by topic, by potential responsiveness — that a paralegal reviews and corrects before the matter file is finalized. Every claim in the report links back to the specific document and page that supports it.

Decisions that mattered

Tenancy isolation as a hard requirement. Each matter has its own vector index. Documents from one case cannot bleed into retrieval for another. The model's context window is rebuilt fresh for every query against only the relevant matter's index.
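The isolation rule can be illustrated with a minimal sketch. Everything here is hypothetical: the names `MatterIndex` and `IndexRegistry` are invented for illustration, and a toy keyword match stands in for real embedding search. The point is only the shape: retrieval takes a matter ID, and no code path searches across matters.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    doc_id: str
    page: int
    text: str


@dataclass
class MatterIndex:
    """Hypothetical stand-in for one matter's vector index."""
    matter_id: str
    chunks: list[Chunk] = field(default_factory=list)

    def search(self, query: str, k: int = 5) -> list[Chunk]:
        # Toy keyword-overlap scoring; a real index would compare embeddings.
        terms = set(query.lower().split())
        scored = [(len(terms & set(c.text.lower().split())), c) for c in self.chunks]
        hits = [(score, c) for score, c in scored if score > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [c for _, c in hits[:k]]


class IndexRegistry:
    """Maps each matter ID to exactly one index; there is no cross-matter search."""

    def __init__(self) -> None:
        self._indexes: dict[str, MatterIndex] = {}

    def add(self, matter_id: str, chunk: Chunk) -> None:
        index = self._indexes.setdefault(matter_id, MatterIndex(matter_id))
        index.chunks.append(chunk)

    def build_context(self, matter_id: str, query: str) -> list[Chunk]:
        # Context is rebuilt from scratch on every call, from one matter only.
        if matter_id not in self._indexes:
            raise KeyError(f"unknown matter: {matter_id}")
        return self._indexes[matter_id].search(query)
```

Because `build_context` requires a matter ID and fails on an unknown one, a query cannot fall back to a shared or default index by accident.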

Citation by construction. The drafting prompt requires every claim to cite a source chunk by document ID and page. Outputs without citations are rejected at the orchestration layer before reaching the paralegal. This is not a soft norm — it is enforced in code.
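A rough sketch of what such a gate might look like follows. The citation format `[DOC-12 p.3]`, the one-claim-per-line convention, and the function names are all assumptions made for illustration; the actual prompt and output format are not described here.

```python
import re

# Assumed citation format: [DOC-123 p.4]. The real format is not specified.
CITATION = re.compile(r"\[(?P<doc>[A-Z]+-\d+) p\.(?P<page>\d+)\]")


def uncited_claims(draft: str, retrieved: set[tuple[str, int]]) -> list[str]:
    """Return claims (assumed one per line) that lack a valid citation."""
    problems = []
    for line in draft.splitlines():
        claim = line.strip()
        if not claim:
            continue
        cites = [(m["doc"], int(m["page"])) for m in CITATION.finditer(claim)]
        # A claim fails if it has no citation or cites a chunk never retrieved.
        if not cites or any(c not in retrieved for c in cites):
            problems.append(claim)
    return problems


def accept_draft(draft: str, retrieved: set[tuple[str, int]]) -> bool:
    """Gate applied at the orchestration layer before the paralegal sees a draft."""
    return bool(draft.strip()) and not uncited_claims(draft, retrieved)
```

Checking citations against the set of chunks actually retrieved, rather than only checking that a citation is present, also catches a draft that cites a document the model was never shown.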

Privilege detection as a separate stage. Privilege flagging runs as its own model call with its own prompt and a much narrower scope. We did not blend it into the general triage prompt because privilege calls tolerate far fewer false negatives than topic categorization: a missed privileged document costs much more than a mislabeled topic. A document the privilege detector is unsure about is escalated, never silently classified.
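One way to express the escalation rule is with asymmetric thresholds, sketched below. The threshold values, the `PrivilegeCall` shape, and the idea that the detector reports a confidence score are assumptions for illustration, not details of the firm's system.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    PRIVILEGED = "privileged"
    NOT_PRIVILEGED = "not_privileged"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class PrivilegeCall:
    doc_id: str
    privileged: bool
    confidence: float  # 0.0 to 1.0, as reported by the hypothetical detector


# Assumed values. Clearing a document demands more confidence than flagging it,
# because a missed privileged document is the costlier error.
FLAG_THRESHOLD = 0.80
CLEAR_THRESHOLD = 0.95


def route(call: PrivilegeCall) -> Route:
    if call.privileged and call.confidence >= FLAG_THRESHOLD:
        return Route.PRIVILEGED
    if not call.privileged and call.confidence >= CLEAR_THRESHOLD:
        return Route.NOT_PRIVILEGED
    # Everything uncertain goes to a human; nothing is silently classified.
    return Route.ESCALATE
```

The default branch is the escalation, so any case the two explicit checks do not cover ends up in front of a person.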

Logging at the chunk level. Every retrieval that informs a model output is logged. If a partner later asks "why did the system characterize this document this way," we can show the exact chunks the model saw and the prompt it was given.
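As an illustration, a chunk-level audit record could be built along these lines. The field names and the JSON Lines encoding are assumptions; the sketch only shows that the prompt, the chunks, and the output are captured together in one record.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(matter_id, query, prompt, chunks, output):
    """Link one model output to the exact prompt and chunks behind it.

    `chunks` is a list of dicts with `doc_id`, `page`, and `text` keys
    (a hypothetical shape chosen for this sketch).
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "matter_id": matter_id,
        "query": query,
        "prompt": prompt,
        "chunks": [
            {
                "doc_id": c["doc_id"],
                "page": c["page"],
                "text": c["text"],
                # The hash lets a reviewer confirm the logged text is unaltered.
                "sha256": hashlib.sha256(c["text"].encode("utf-8")).hexdigest(),
            }
            for c in chunks
        ],
        "output": output,
    }


def to_jsonl(record):
    """Serialize a record as a single line for an append-only log."""
    return json.dumps(record, sort_keys=True)
```

Each line of the resulting log answers the partner's question on its own: which chunks were retrieved, what prompt wrapped them, and what the model produced.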

Outcome

Paralegals start a new matter with a structured triage report instead of a stack of PDFs. The report is wrong in places — it always is — but it is wrong in inspectable ways. The paralegal corrects, the system learns the corrections through prompt examples (not by training on client data), and the next matter starts with the lessons baked in.
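A minimal sketch of corrections carried forward as prompt examples appears below. It assumes, as an illustration only, that each correction is stored as a generalized description rather than as document text, so nothing matter-specific is reused; the `Correction` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    description: str    # generalized description, not document text (assumption)
    draft_tag: str      # what the system proposed
    corrected_tag: str  # what the paralegal changed it to
    reason: str


def render_examples(corrections: list[Correction], limit: int = 3) -> str:
    """Format the most recent corrections as few-shot examples."""
    recent = corrections[-limit:] if limit > 0 else []
    blocks = [
        f"Document: {c.description}\n"
        f"Initial tag: {c.draft_tag}\n"
        f"Correct tag: {c.corrected_tag}\n"
        f"Why: {c.reason}"
        for c in recent
    ]
    return "\n\n".join(blocks)


def build_prompt(instructions: str, corrections: list[Correction], document: str) -> str:
    """Assemble the triage prompt; no model weights are touched at any point."""
    parts = [instructions]
    examples = render_examples(corrections)
    if examples:
        parts.append("Past corrections:\n" + examples)
    parts.append("Document to triage:\n" + document)
    return "\n\n".join(parts)
```

Since the corrections live in the prompt, removing a bad example is a data change rather than a retraining job.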

Partner involvement shifted to where it belongs: the genuinely ambiguous calls. Partners stopped spending their time answering "is this responsive" thirty times a day.

Working with us

We started with an architecture review focused on the confidentiality boundary — what crosses, what does not, and how to prove that to a client whose general counsel asks. The build was incremental, with the first phase covering only OCR and indexing, the second adding triage drafting, and the third adding privilege detection.

If your firm is evaluating AI for matter intake or document review and your confidentiality posture is non-negotiable, we should talk.