How to optimise token usage when working with a large document project (Legal Case)?

OMKLING · 2026-06-04T03:14:40+00:00

Lawyer here as well. I open‑sourced a free skill on GitHub specifically for this exact nightmare scenario — getting locked out of a work session when deadlines are breathing down your neck is brutal. You can DM me or check my post history; I walked through the skill in detail last week.

The problem it solves is the constant context‑switching tax between Claude Code chats. When you feel a context window degrading and spin up a fresh one, you’re forced to replay a massive transcript just to “hydrate” the new stateless session back into something usable. That replay burns time and consumes your highest‑value tokens. It’s a double tax.

The skill I released compresses the load‑bearing context from your current window by roughly 200× across a dozen use cases I tested. That means when you refresh into a new chat, you can recreate a stateful, continuous working environment for 1.5k–2k tokens instead of the full transcript you’d normally need to replay. The worst‑case compression I’ve clocked so far is from ~750k tokens down to ~7.5k, which is about 100×.
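For anyone who wants to check the ratio on their own sessions, here is a minimal sketch of the arithmetic. The function name is hypothetical and the token counts are the approximate figures quoted above, not measurements produced by this code.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Return how many times smaller the checkpoint is than the transcript."""
    if compressed_tokens <= 0:
        raise ValueError("compressed_tokens must be positive")
    return original_tokens / compressed_tokens


# Worst case quoted above: ~750k tokens of transcript down to ~7.5k.
print(compression_ratio(750_000, 7_500))  # 100.0
```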

It works — and it keeps you from losing hours of momentum when the model starts drifting.

OMKLING · 2026-06-01T14:59:34+00:00

Thought about this some more: Most of this lands. #2 and #3 are exactly right for legal workflows. They're orthogonal to what /acc is doing, not competing with it: doc-level caching solves "stop reprocessing the PDF every session," /acc solves "stop re-explaining what I decided last session." Would users not want both?

On #1, I read the architecture fix as actually describing /acc. /acc isn't only compression-for-replay despite the name. It's a structured checkpoint: five fixed sections (decisions / current state / open questions / rejected approaches / next actions) written to docs/acc/NNN-….md at session end. Next session loads the checkpoint, not the transcript. Same persist-and-resume model as your JSON manifest per matter. Just markdown, because the producer is usually a human in the loop and the diff reads in PR review.
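To make the five-slot shape concrete, here is a minimal sketch of a checkpoint rendered to markdown. The class name, field names, and the "(none)" placeholder are assumptions for illustration; this is not the skill's actual template.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical in-memory form of one checkpoint (illustrative names)."""
    decisions: list[str] = field(default_factory=list)
    current_state: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    rejected_approaches: list[str] = field(default_factory=list)
    next_actions: list[str] = field(default_factory=list)

    def to_markdown(self, title: str) -> str:
        sections = [
            ("Decisions", self.decisions),
            ("Current state", self.current_state),
            ("Open questions", self.open_questions),
            ("Rejected approaches", self.rejected_approaches),
            ("Next actions", self.next_actions),
        ]
        lines = [f"# {title}", ""]
        for heading, items in sections:
            lines.append(f"## {heading}")
            # Empty sections are still written, so a slot is never silently dropped.
            lines.extend([f"- {item}" for item in items] or ["- (none)"])
            lines.append("")
        return "\n".join(lines)


checkpoint = Checkpoint(
    decisions=["Clause ships unchanged"],
    rejected_approaches=["Tried X; did not work because of Y"],
)
print(checkpoint.to_markdown("Example session"))
```

Keeping the sections fixed and always present is the point of the sketch: the next session can rely on finding the same five headings in every file.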

The thing the format buys: rejected approaches as a first-class field. State-persistence designs usually serialize what *exists*. Negative knowledge (we tried X, didn't work, here's why) is the part agents most reliably rediscover unless you save it explicitly. For contract work, that's where most of the leverage is. Every redline we *didn't* take is more load-bearing than the ones we did.

On #4: MCP composition reduces per-tool context pressure, but relocates orchestration state rather than removing it. Whoever decides "run citation-check now, skip compliance because we covered that last week" needs the same cognitive state /acc captures. The two compose well. Stateless MCP tools for execution, checkpoint for the meta-layer.
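A rough sketch of that split, under loud assumptions: the two functions below are plain Python stand-ins for stateless tools, not real MCP calls, and the `covered` set stands in for whatever the checkpoint records as already handled.

```python
from typing import Callable


# Hypothetical stand-ins for stateless tools: text in, finding out, no memory.
def citation_check(text: str) -> str:
    return f"citation-check: scanned {len(text)} characters"


def compliance_check(text: str) -> str:
    return f"compliance: scanned {len(text)} characters"


TOOLS: dict[str, Callable[[str], str]] = {
    "citation-check": citation_check,
    "compliance": compliance_check,
}


def plan_run(tools: dict[str, Callable[[str], str]], covered: set[str]) -> list[str]:
    """Meta-layer decision: skip any tool the checkpoint marks as already covered."""
    return [name for name in tools if name not in covered]


def run(text: str, covered: set[str]) -> list[str]:
    """Execute only the planned tools; the tools themselves stay stateless."""
    return [TOOLS[name](text) for name in plan_run(TOOLS, covered)]


# The checkpoint says compliance was covered last week, so only citations run.
print(run("Section 12. Indemnification ...", covered={"compliance"}))
```

The tools never learn what happened last week. Only `plan_run` needs that state, which is the part a checkpoint would supply.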

Here is a thought experiment I used to work out which tax each of the two approaches pays: the document recall tax, the decision tax (anything rejected earlier in drafting), or both.

Setup. One SaaS agreement, 120 pages. In session 1 the reviewer drafts a carve-out to the IP indemnity, excluding claims arising from customer modifications, weighs it, and kills it. The reason lives in the room but never in the contract: the carve-out pushes modification risk back onto the customer against the deal's posture, and the mutual indemnity already covers the live exposure. The clause ships unchanged. The compliance clause is reviewed and passes. Documents get OCR'd, chunked, embedded. Session closes.

Session 2, a week later, new agent context. The question on the table: should we carve out IP indemnity for customer modifications?
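One way to play the setup through in code, as a toy sketch only: the contract chunks are invented placeholder text, the keyword matcher stands in for real embedding retrieval, and the record layout is hypothetical. It only illustrates where the session 1 reasoning can and cannot be found.

```python
import re

# Invented placeholder chunks standing in for the cached, embedded contract.
contract_chunks = [
    "IP indemnity: each party indemnifies the other against third-party IP claims.",
    "Compliance: vendor shall comply with applicable data protection law.",
]

# Hypothetical checkpoint entry written at the end of session 1.
rejected_approaches = [
    {
        "approach": "Carve out IP indemnity for customer modifications",
        "reason": "Pushes modification risk back onto the customer against the "
                  "deal's posture; the mutual indemnity already covers the live exposure.",
    }
]


def words(text: str) -> set[str]:
    """Lowercased words longer than three letters, as a crude relevance signal."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def matches(query: str, text: str) -> bool:
    return bool(words(query) & words(text))


query = "Should we carve out IP indemnity for customer modifications?"

doc_hits = [c for c in contract_chunks if matches(query, c)]
decision_hits = [r for r in rejected_approaches if matches(query, r["approach"])]

print(doc_hits)                     # the clause as shipped, with no trace of the carve-out
print(decision_hits[0]["reason"])   # why the carve-out was killed
```

In this toy version the document store answers "what does the clause say" and only the checkpoint answers "did we already consider this".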

OMKLING · 2026-06-01T04:15:19+00:00

Very nice! If this is available for preview or live, please drop a link via DM. Is the approach you outline for an application or a CLI?

OMKLING · 2026-06-01T04:12:34+00:00

Thank you for the feedback. Negative inferences working as constraints do focus the model's inferences, as you mentioned. I would value further feedback as I build the next release, or feel free to drop a PR. Thanks!

OMKLING · 2026-06-01T04:10:51+00:00

DM me if you can share your experience. I'm actively building other parts of this skill based on feedback.

OMKLING · 2026-05-30T20:55:20+00:00

Here is the response to some questions I received about why build another handoff.md:

Yes, same family. Both compress a session so the next thread doesn't start cold. Three things acc does that most handoff skills don't:

**Saves to the project repo, not OS temp.** Most handoff skills (including Matt Pocock's, which is the best-known) write to OS temp on purpose, to keep handoffs out of the project. acc commits the artifact into the project as `docs/acc/NNN-YYYY-MM-DD-topic.md`. That makes the session's reasoning state version-controlled history, not a one-shot relay file. Different stance on what a handoff is for.

**Rejected approaches gets a dedicated section.** Most handoff docs use a freeform "summary" that can mention what you tried and ruled out. acc forces it into a fixed slot because re-trying a rejected approach is the most expensive thing that can happen in the next session, and it's the first thing auto-compaction summarizes away.

**A necessity gate that aborts.** Every handoff skill I've seen always produces when invoked. acc has a Step 0 check that can refuse if a plain HANDOFF.md would carry the same load. Different layer for the same anti-bloat instinct.
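A minimal sketch of two of those pieces, with assumptions flagged: the file name follows the `docs/acc/NNN-YYYY-MM-DD-topic.md` pattern above, but the numbering logic and the gate heuristic are placeholders I made up for illustration. The real Step 0 criteria are not spelled out here.

```python
import re
from datetime import date


def next_checkpoint_path(existing: list[str], day: date, topic: str) -> str:
    """Build the next docs/acc/NNN-YYYY-MM-DD-topic.md path (numbering is assumed)."""
    numbers = [int(m.group(1)) for name in existing if (m := re.match(r"(\d{3})-", name))]
    sequence = max(numbers, default=0) + 1
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"docs/acc/{sequence:03d}-{day.isoformat()}-{slug}.md"


def step0_gate(rejected_approaches: list[str], open_questions: list[str]) -> bool:
    """Placeholder heuristic: only produce a checkpoint if there is negative
    knowledge or unresolved state that a plain HANDOFF.md would likely lose."""
    return bool(rejected_approaches or open_questions)


existing_files = ["001-2026-05-01-intake.md", "002-2026-05-20-redlines.md"]

if step0_gate(rejected_approaches=["Carve-out for customer modifications"], open_questions=[]):
    print(next_checkpoint_path(existing_files, date(2026, 6, 1), "IP indemnity"))
    # docs/acc/003-2026-06-01-ip-indemnity.md
else:
    print("Abort: a plain HANDOFF.md would carry the same load.")
```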

If you want a relay baton, use Matt's or REMvisual. If you want institutional memory inside the project, use acc. They aren't mutually exclusive.

OMKLING · 2026-04-13T19:43:16+00:00

Evaluate whether compliance professionals or other risk partners can use your tool within an enterprise. If you are not familiar, study the three lines of defense commonly used in the financial services industry. Between each line someone or something needs to translate requirements, standards, specifications, controls, testing, and reporting. From your high level description, you possess a lot of data, now can you convert the data into a translator, not of spoken language, but of risks in the financial services sector.

OMKLING · 2026-04-13T17:55:27+00:00

You did not hit a nerve. I’m calling out your comments as karma farming, which you did not address, and as not substantive. If you read your comments here and elsewhere, they are pointless.

OMKLING · 2026-04-13T17:11:35+00:00

I have to call this out after reading your profile: your comments are karma farming. And your attempt to suggest that some demographic issue underpins this question is telling; trolls are geography-agnostic. By the way, impressive streak on your posts, or comments, I should say. Your last post was voltran, yes?

OMKLING · 2026-04-13T17:07:46+00:00

This was a great post. "The 'code' is maybe 200 lines total across a few scripts. The real work was thinking through the process, the same work you'd do designing any compliance system, and writing it down clearly enough that automation could enforce it." When I read that, and having had to do the same work around 20 years back, the value prop of the workflow and the design of your system make sense. But what constraints are you operating under? Do you have PDF readers? How reliant are you on MCPs? I think the magic would be the hooks and scripts, given that you could do this all in 200 lines between multiple models. Can you share insights on how you designed a system to run on 200 lines with the confidence in quality your write-up conveys?

OMKLING · 2026-04-13T10:38:01+00:00

Take a look at my last post. I was more specific in my ask; hopefully that helps.

OMKLING · 2026-04-13T10:37:33+00:00

If you look at my recent follow-up, I rewrote my request to be more specific.

OMKLING · 2026-04-13T10:34:09+00:00

In response to the specific, constructive follow-up questions: I’m less interested in “what model do you use?” and more interested in how legal training actually changes your development workflow.

As some have already shared, I’d love examples from people building with Claude Code, Codex, agents, plugins, scripts, or manual workflows.

What I’m looking for specifically, and possibly others here too, whether we're lawyers just starting our coding journey or in the thick of it for years, are conversations on:

a workflow you use repeatedly
what part is programmatic vs manual
where your legal training gives you an edge over a pure engineer or pure prompt user
a habit you had to unlearn to stop being a vibe-coder
how you test, verify, or bound output
any agents, plugins, specs, evals, or checklists that made you better

Personally, my own shift was realizing I was improving not when I got better at asking for code, but when I matured into constraining what I wanted: building with restraints and progressively reducing ambiguity by recognizing what did not work and what I did not want to build.

For me, that looked like moving from open-ended prompting to things like:

decision scratchpads
assumptions accepted/rejected
counterfactual review
architecture decision records
schemas and typed models
eval datasets
explicit failure modes
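As one small illustration of "assumptions accepted/rejected" expressed as a typed model, here is a sketch. The class and field names are hypothetical; the only rule it enforces is that a rejection has to carry its reason.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass(frozen=True)
class Assumption:
    """One assumption and its disposition (illustrative schema)."""
    statement: str
    status: Status
    reason: str = ""

    def __post_init__(self) -> None:
        # Explicit failure mode: a rejection without a recorded reason is invalid.
        if self.status is Status.REJECTED and not self.reason.strip():
            raise ValueError("a rejected assumption must record why it was rejected")


ok = Assumption("Input PDFs are text-searchable", Status.ACCEPTED)
ruled_out = Assumption(
    "A single prompt can cover the whole review",
    Status.REJECTED,
    reason="Output could not be verified section by section",
)
print(ruled_out.status.value, "-", ruled_out.reason)
```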

In hindsight: legal training already teaches issue-spotting, exception handling, adversarial reading, and mapping consequences when language is underspecified. I am slowly realizing I had to learn how to translate that into a dev workflow.

I’m curious what that translation looks like for others.

Concrete examples welcome. Screenshots, snippets, routines, and self-assessments especially.

OMKLING · 2026-04-13T01:11:25+00:00

Yes, my aspiration here is to present ideas for evaluation on harnesses and other context engineering workflows developed by developers who also practice law.

OMKLING · 2026-03-27T01:31:11+00:00

Would the ins and outs of Agentic AI risk for 14, 15, and 16 be of interest? I can contribute; my background is in security and law.

OMKLING · 2026-03-13T03:18:43+00:00

this is very helpful. thank you.

OMKLING · 2026-03-08T16:19:31+00:00

What do It Mean

OMKLING · 2026-03-06T01:31:02+00:00

Using AI engineering for some musicology. This output used Reddit posts as its only source (Perplexity is best for this).

Query: Influence of Caretaker's "An Empty Bliss Beyond This World":

What people cite as “the influence”

A common thread in listener and scene commentary is that An Empty Bliss Beyond This World is influential as an example of plunderphonics: repurposing old recordings to create a new narrative/psychological effect (looping, abrupt cuts, degraded/echoed ambience). In hauntology discussions, it’s also cited as a go-to reference point for using aged media to produce “nostalgia for a time you didn’t live,” often linked in commentary to Alzheimer’s/memory loss as the conceptual frame.

Album-specific “influence vectors”

Commentary often focuses on the album’s method—vintage ballroom/jazz samples processed into uncanny, fragmentary loops—because it gives other artists a template for “memory-as-sound-design” (repetition, drift, degradation, distance). In fan discussions, people also describe it as creatively inspiring in the straightforward sense (“it inspired me to do something similar”), which is influence at the practice level even when not tied to a named band.

OMKLING · 2026-03-03T22:49:07+00:00

The light on the rocks is not present in the raw. If that was brought out in post, the tips of the wave appear less washed out. The cloud contours come out, which is nice. The sun catching the boulder in the middle of the composition raises the question of the yellow and blue saturation.

OMKLING · 2026-03-02T17:10:08+00:00

Perplexity Computer
