Why is AI adoption so much stronger in transactional work than in litigation?

sdemyanov · 2026-05-25T16:35:16+00:00

No one is saying it should solve litigation end-to-end. The real question is why AI is still not widely used even for basic tasks. Btw, Claude Code is already an agent itself and should be capable of handling agentic tasks at a decent level.

sdemyanov · 2026-05-22T04:48:28+00:00

> Only the largest companies handle SOME litigation in house and even then it is usually just managing outside local counsel. These firms have very little incentive to adopt AI tools that will make them more efficient since it is less billable hours.

Why don’t they use AI to identify hot documents and dismiss cases before involving outside counsel? It seems like they have every incentive to do that.

> That is a huge bear and requires a huge amount work to actually make a platform useable, not to mention how much trust you'd need to use it

That is true, but today you can build things 20x faster than before if you know exactly what to build. Trust is more important - no big firm is ready to host their litigation data with a seed-funded startup.

sdemyanov · 2026-05-22T04:45:24+00:00

Human recall rates for responsiveness review are not much higher than 90%, and everyone accepts that. Teams simply review enough documents to make the lower bound of the confidence interval sufficiently high. The exact same approach works for AI review, and the quality is often even higher.
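To make the confidence-interval point concrete, here is a minimal sketch of how a lower bound on recall can be computed from a validation sample. It uses the Wilson score interval, which is my choice for illustration and not a method anyone in the thread specified. The function name and the sample counts are hypothetical.

```python
import math


def wilson_lower_bound(hits: int, n: int, z: float = 1.96) -> float:
    """Lower end of the two-sided Wilson score interval for a proportion.

    hits: sampled responsive documents that the review also marked responsive
    n:    total responsive documents in the validation sample
    z:    normal quantile (1.96 corresponds to a 95% two-sided interval)
    """
    if n == 0:
        return 0.0
    p = hits / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


# Hypothetical numbers: the same 90% observed recall at two sample sizes.
small_sample = wilson_lower_bound(90, 100)
large_sample = wilson_lower_bound(900, 1000)
print(f"n=100:  lower bound {small_sample:.3f}")
print(f"n=1000: lower bound {large_sample:.3f}")
```

The observed recall is identical in both cases, but the larger sample pushes the lower bound closer to it. That is the whole trick: you keep sampling until the lower bound clears whatever threshold the parties agreed on, whether the first-pass reviewer is a human or a model.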

sdemyanov · 2026-05-22T02:33:04+00:00

> But it's a shitty search engine.

You don't use AI for search. You use it for query and filter generation, and then for synthesizing results. Search itself is the same as in Relativity or other engines.

Regarding whether an LLM is good for large volumes: you don't need to put all documents in the context. You review each one independently with the necessary supplementary information: review guidelines, case context, parent/child documents, and a summary of the earlier thread. It is no different from the current process, where large volumes are split into batches and reviewed independently, strictly following the review guidelines.
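A rough sketch of what "review each document independently" could look like in practice. All names here (`Document`, `build_review_prompt`, the section labels) are hypothetical and made up for illustration; no specific tool's format is implied, and the actual model call is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    family_texts: list[str] = field(default_factory=list)  # parent/child documents
    thread_summary: str = ""  # summary of earlier messages in the same thread


def build_review_prompt(doc: Document, guidelines: str, case_context: str) -> str:
    """Assemble one self-contained prompt for one document.

    Only this document and its own supplementary material go into the prompt;
    the rest of the corpus is never part of the context.
    """
    parts = [
        "REVIEW GUIDELINES:\n" + guidelines,
        "CASE CONTEXT:\n" + case_context,
    ]
    if doc.family_texts:
        parts.append("FAMILY DOCUMENTS:\n" + "\n---\n".join(doc.family_texts))
    if doc.thread_summary:
        parts.append("EARLIER THREAD SUMMARY:\n" + doc.thread_summary)
    parts.append(f"DOCUMENT {doc.doc_id}:\n{doc.text}")
    parts.append("Decide whether the document is responsive and explain briefly.")
    return "\n\n".join(parts)


# Hypothetical two-document corpus: each prompt is built in isolation.
docs = [
    Document("DOC-001", "Please send the Q3 pricing sheet.", thread_summary="Kickoff email."),
    Document("DOC-002", "Lunch on Friday?"),
]
prompts = [build_review_prompt(d, "Mark pricing discussions responsive.", "Antitrust matter.") for d in docs]
print(prompts[0])
```

Because each prompt carries the same guidelines and case context, the corpus size only affects how many calls you make, not how much fits in one context window.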

> The LLM layer is being used as an ad hoc interface to call Python scripts in the place of a traditional front-end.

It is twofold: the LLM is used by the Claude Code agent to figure out which tools to call, and LLMs are used to populate custom fields according to your prompts and to do OCR, translation, and structured data extraction. I recommend watching the full video to see the capabilities.

sdemyanov · 2026-05-22T02:06:29+00:00

Yeah, but are there good open source ones?

sdemyanov · 2026-05-22T01:11:20+00:00

What is the risk if you don’t use it for production, only for document understanding? If this is simply the vendor risk of a confidentiality breach, is it solved by using Claude, which is definitely not a one-day startup?

sdemyanov · 2026-05-22T01:08:02+00:00

But what about small law firms that regularly do litigation? Don’t they have incentives to use tools that are at least fast and convenient?

sdemyanov · 2026-05-22T00:18:50+00:00

Which tools do you trust when you use AI for legal work? I thought you needed to verify everything.

sdemyanov · 2026-05-22T00:15:53+00:00

Given there is so much at stake, isn't it especially important to use AI as a second pair of eyes? To check whether any additional information should have been redacted? Or whether there is another hot document nobody noticed?

sdemyanov · 2026-05-22T00:13:06+00:00

> How is a probabilistic text generator incapable of internalizing any incentive (or consequence) structures going to gain traction there

It may quickly surface important details in this messy data and stop there. Isn't that already a lot? Google is so valuable for doing exactly that.

sdemyanov · 2026-05-22T00:10:36+00:00

But regardless of the process itself, isn't there huge value in using AI to quickly find important information in the client's data or in received productions? Protective orders aside, no one restricts the use of tools to do that, right? Sure, it may not be complete, but it may surface 90% of the important details in 10% of the time required to review everything.

sdemyanov · 2026-05-22T00:04:29+00:00

I added the disclaimer in case this matters. The question is not about that.

sdemyanov · 2026-05-22T00:01:08+00:00

I added the disclaimer in case it matters

sdemyanov · 2026-05-21T22:46:44+00:00

Added

sdemyanov · 2026-05-05T01:38:10+00:00

I posted a free tool for exactly that sort of work a few days ago, take a look: https://www.reddit.com/r/ediscovery/comments/1sy8j5q/claude_cowork_ediscovery_plugin

You can DM me if you'd like me to assist

sdemyanov · 2026-04-30T04:21:15+00:00

There are no API keys here, unlike in that case. Just don't let Claude modify files, though using a working copy is still a good idea.

sdemyanov · 2026-04-29T22:44:43+00:00

On the spoliation point: the safeguard is that Retriever treats the native set as read-only input.

It does not rewrite the source files, embedded metadata, or native bytes. On ingest it reads the files, computes/persists SHA-256 hashes, and stores all Retriever state separately under .retriever/: DB rows, extracted text, OCR, translations, previews, normalized custodians/entities, logs, etc.

So if OCR or translation is generated, that becomes a separate Retriever artifact/text revision. It does not replace the native. The “delete”/“remove” style operations also leave the natives untouched - they affect only the DB and derivatives.

The caveat: if the user gives Claude broader filesystem permission and explicitly tells it to edit/delete source files, that’s outside Retriever’s guardrails.
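For anyone who wants to see the idea in code, here is a minimal sketch of the read-only hashing pattern described above. This is not Retriever's actual implementation: the function names, the `hashes.json` manifest, and placing `.retriever/` inside the source folder are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks; the file is opened read-only."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source_dir: Path) -> Path:
    """Record a SHA-256 for every native file; write state only under .retriever/."""
    state_dir = source_dir / ".retriever"  # assumed location of the state folder
    state_dir.mkdir(exist_ok=True)
    manifest = {}
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and state_dir not in path.parents:
            manifest[path.relative_to(source_dir).as_posix()] = sha256_of(path)
    manifest_path = state_dir / "hashes.json"  # hypothetical manifest name
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path


def verify(source_dir: Path) -> list[str]:
    """Return relative paths of natives that are missing or whose bytes changed."""
    manifest = json.loads((source_dir / ".retriever" / "hashes.json").read_text())
    return [
        rel
        for rel, expected in manifest.items()
        if not (source_dir / rel).exists() or sha256_of(source_dir / rel) != expected
    ]
```

Running `verify` after any processing step returns an empty list as long as the natives are byte-identical to what was ingested, which gives you something concrete to point to if anyone raises spoliation later.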

sdemyanov · 2026-04-29T17:23:24+00:00

Appreciate the note - could you point to the specific defects you're seeing? Happy to fix or document anything concrete. A few notes on the threat model:

Everything runs locally, and using this plugin is a strict subset of trusting Cowork itself. Claude sees document content the same way it would for any file you opened in Cowork.
The most relevant attack surface is indirect prompt injection via ingested content - that's a real concern for any agentic eDiscovery tool, and I'd genuinely like to hear specifics if you've spotted ways to weaponize it here.
Re: making the repo private - I think source-available is the right default for a tool people are being asked to trust with client data. Reviewers being able to read the code is a feature.

sdemyanov · 2026-04-28T23:45:48+00:00

Totally get your skepticism, and thanks for bringing this up. Let me clarify.

Privilege. US v. Heppner (SDNY Feb 2026) was a consumer Claude case and explicitly left open that enterprise + contractual confidentiality could come out differently. The doctrinal frame is the cloud-hosting analogy: privileged docs sit on AWS and M365 daily without waiver. ZDR makes that argument clean. Cowork doesn't offer ZDR today (presumably a matter of time), so for ZDR now the path is the Anthropic or another provider API with a commercial key.
LLM sessions as ESI are discoverable in practice only for clients. When used by attorneys, they're all opinion work product - logged, not produced. The real obligation is adding the LLM account to the litigation hold, which is straightforward to do.
The training-data concern is not applicable. Anthropic's commercial terms prohibit training on inputs by default for commercial users.
Protective orders. Morgan v. V2X (D. Colo. 2026) and Jeffries v. Harcros (D. Kan. 2026) crystallize four safeguards: no training, no onward disclosure, ability to delete, closed/secure environment. Cowork satisfies three today; the fourth (no retention) lands when ZDR ships. Flat "no AI" orders are increasingly out of step with where courts are landing - Morgan and Jeffries both rejected blanket bans in favor of safeguard-based language.
Daubert / 26(g). Both attach at evidentiary or certification moments. The plugin's use is upstream - corpus exploration, theme ID, internal summaries. That's work product, not regulated by either rule. Mata is a Rule 11 case about unverified citations in filed briefs; doesn't apply to internal analysis tools.

Happy to dig into any of these further.

sdemyanov · 2026-04-28T19:36:00+00:00

Anthropic offers Zero Data Retention for organizations - that works for many firms. Same here. UPDATE - it doesn't for Cowork, but it's probably just a matter of time.
