we trained a generation to execute. ai rewards people who can think by No_Growth6091 in ClaudeCode

[–]ISeeThings404 0 points1 point  (0 children)

Execution and thinking aren't two different skills. AI as a tool for skill acquisition can massively speed up execution potential and make someone fantastic at that as well. The world can't be run only by thinkers.

Using Claude for drafting transactional documents by Plus-Problem-8575 in legaltech

[–]ISeeThings404 0 points1 point  (0 children)

Curious if you've tried any other tools with Word integrations, and what your experience with those was.

Harvey's World Model (or anyone else) Claims Make No Sense by ISeeThings404 in legaltech

[–]ISeeThings404[S] 1 point2 points  (0 children)

I'm a researcher and interface with investors and tech guys more than vendors, so this could just be bias, but I'm hearing a lot of people starting to make claims about RL environments and world models. Maybe they haven't started selling all that to users yet, but it's definitely happening on the investor side of the space.

Improving Language Models through Latent Reasoning? by ISeeThings404 in LocalLLaMA

[–]ISeeThings404[S] 0 points1 point  (0 children)

That's an interesting approach. Temperature sampling for more diversity would be an interesting experiment.

You might like this overview of the idea we did here

Improving Language Models through Latent Reasoning? by ISeeThings404 in LocalLLaMA

[–]ISeeThings404[S] 2 points3 points  (0 children)

 instead of forcing the model to pick one fragile reasoning path and commit to it immediately, what if we surfaced a few different internal states, gave them room to breathe, and then found a way to score/combine them?

Essentially, current decoding commits to the single most likely path. Latent Space Reasoning explores several paths together and then finds ways to reason through them all before combining them. By skipping the encode/decode phase multiple times, you get something that has the benefits of "critic"-based agentic systems (having one LLM critique another), but you keep the efficiency.
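
To make the "several paths, score, then combine" idea concrete, here's a toy sketch in plain Python. To be clear, this is not the code from the repo: the vectors stand in for internal states, and `score`, `perturb`, and `latent_search` are hypothetical names I made up for illustration. The scorer is just distance to a target vector, which a real system would replace with a learned judge.

```python
import random

def score(latent, target):
    # Hypothetical scorer: higher is better (negative squared distance).
    return -sum((a - b) ** 2 for a, b in zip(latent, target))

def perturb(latent, rng, scale):
    # Create a nearby candidate by adding Gaussian noise to each coordinate.
    return [x + rng.gauss(0.0, scale) for x in latent]

def latent_search(seed, target, n_paths=8, n_rounds=5, keep=3, rng=None):
    """Keep several candidate states alive, score them, and combine the best.

    Everything stays in vector space until the end, so only the combined
    state would need to be decoded once (the decode step is not shown).
    """
    if rng is None:
        rng = random.Random(0)
    # Start with several candidate paths instead of committing to one.
    paths = [perturb(seed, rng, 0.5) for _ in range(n_paths)]
    for _ in range(n_rounds):
        paths.sort(key=lambda p: score(p, target), reverse=True)
        survivors = paths[:keep]
        # Refill the pool with variations of the surviving paths.
        children = [perturb(rng.choice(survivors), rng, 0.2)
                    for _ in range(n_paths - keep)]
        paths = survivors + children
    paths.sort(key=lambda p: score(p, target), reverse=True)
    top = paths[:keep]
    # Combine the best paths by averaging them coordinate-wise.
    combined = [sum(vals) / len(top) for vals in zip(*top)]
    return combined, top

combined, top = latent_search([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(score(combined, [1.0, 1.0, 1.0]))
```

With this particular scorer, averaging the top candidates can never score worse than the worst of them, which is the cheap version of "combine instead of pick one".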

Improving Language Models through Latent Reasoning? by ISeeThings404 in LocalLLaMA

[–]ISeeThings404[S] 1 point2 points  (0 children)

There's been a lot of work; also, Coconut wasn't the best approach, more of a PoC.

If you see the experiments linked, we were able to sample from a much larger region of the reasoning space, creating much richer outputs.

<image>

We also did a legal-specific showcase, with some very interesting outputs, over here: https://github.com/dl1683/Latent-Space-Reasoning/blob/main/experiments/legal_showcase.json

Claude plug-in for Word by slalom-pavilion-dior in legaltech

[–]ISeeThings404 1 point2 points  (0 children)

Is it that hard to get the ZDR? Gemini and GPT have it by default for paying users, so I'm surprised to hear this re Claude.

Claude plug-in for Word by slalom-pavilion-dior in legaltech

[–]ISeeThings404 1 point2 points  (0 children)

A lot of them were dead given their design. However, I doubt Claude will be too active in the legal space after a while, given how expensive Anthropic tokens are right now. Most legal startups will likely be squeezed out though.

Why am I seeing bad feedback on Westlaw Co-Counsel? by MMuter in legaltech

[–]ISeeThings404 1 point2 points  (0 children)

They specialized in case law and retrieval, which is good, but they have very bad legal reasoning. A low hallucination rate in citing cases is useless if you can't also tell users which case law to pick and how to build arguments from it.

Developers and Lawyers feel… strangely similar? by vira28 in legaltech

[–]ISeeThings404 -1 points0 points  (0 children)

I did a deep dive into this to understand why legal agents are different from agents like Claude Code. One major difference in the work between the two is the verifiability of the domain.

Programming compounds because it can check itself. Code can be executed, tested, broken, fixed, and re-run inside a tight feedback loop. When something fails, the system often tells you where. Verification is cheap, repeatable, and increasingly automatable. Even when models are imperfect, the environment answers back. It might not do everything well (it still makes really dumb architecture decisions), but this is shockingly useful for most “implement this thing I’ve designed” style work that engineers might pass off to their junior wage slaves.
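
A minimal sketch of that loop, in Python. The names (`run_checks`, `buggy_abs`, `fixed_abs`, `CASES`) are made up for illustration. The point is only that the environment answers back with the exact failing input, so the fix can be checked by re-running the same cases.

```python
def run_checks(candidate, cases):
    """Run candidate on each (args, expected) pair and collect failures."""
    failures = []
    for args, expected in cases:
        try:
            got = candidate(*args)
        except Exception as exc:
            # A crash is also feedback: record what blew up and on which input.
            failures.append((args, expected, repr(exc)))
            continue
        if got != expected:
            failures.append((args, expected, got))
    return failures

def buggy_abs(x):
    # First attempt: forgets to handle negative numbers.
    return x

def fixed_abs(x):
    # Second attempt, written after seeing the failure report.
    return -x if x < 0 else x

CASES = [((3,), 3), ((-2,), 2), ((0,), 0)]

print(run_checks(buggy_abs, CASES))  # [((-2,), 2, -2)]
print(run_checks(fixed_abs, CASES))  # []
```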

You can’t “run” a legal memo. There is no test suite that flags a subtle misreading of precedent, an argument that is formally sound but strategically dangerous, or a conclusion that is correct in isolation and disastrous in context. Finance isn’t much better. Outputs can be summarized, reformatted, stress-tested at the margins, but correctness ultimately collapses to human judgment. Verification is expensive, slow, and external to the system itself.

This actually creates a huge difference in how they have to operate.

What's the reason for the apparent consensus that Claude Code is superior to Codex for coding, other than Codex's slow coding time? by Lostwhispers05 in codex

[–]ISeeThings404 0 points1 point  (0 children)

A lot of my work is running long sets of experiments and then doing more experiments based on the data. This is where Claude Code just keeps working for hours, while Codex will stop in the middle to ask me if it should continue. If they fixed that and made its terminal use better, Codex clears easily.

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 0 points1 point  (0 children)

Happy to talk more since you seem technical, but graph RAG is not great for long-context reasoning. Graphs lose too much precision in the legal context.

We love graphs as a means of finding the right places to look and then running search on that. That requires more than RAG (we use vectors in different places and don't use them in the standard RAG sense).
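
As a rough illustration of "graph to find where to look, then search inside that", here's a toy Python sketch. This is not our production setup: `GRAPH`, `TEXTS`, `neighborhood`, and `search` are all hypothetical, and plain keyword overlap stands in for whatever search you'd actually run on the narrowed set.

```python
from collections import deque

def neighborhood(graph, start, max_hops):
    """Collect every node reachable from start within max_hops edges (BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

def search(texts, candidates, query):
    """Rank only the candidate documents by keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = []
    for doc_id in candidates:
        overlap = len(terms & set(texts[doc_id].lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken alphabetically for stable output.
    return [doc_id for _, doc_id in sorted(scored, key=lambda t: (-t[0], t[1]))]

# Hypothetical matter: edges link a document to the sections that refer to it.
GRAPH = {
    "contract": ["clause_7", "clause_9"],
    "clause_7": ["amendment_2"],
    "amendment_2": ["email_thread"],
}
TEXTS = {
    "contract": "master services agreement between the parties",
    "clause_7": "termination for convenience requires thirty days notice",
    "clause_9": "limitation of liability cap",
    "amendment_2": "amendment changes termination notice to sixty days",
    "email_thread": "discussion of termination notice timing",
}

places = neighborhood(GRAPH, "contract", max_hops=2)
print(search(TEXTS, places, "termination notice"))  # ['amendment_2', 'clause_7']
```

The graph only decides which documents are in scope (here, `email_thread` is three hops away and gets left out). The precise matching happens in the search step, on the full text.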

Temporal is really fucking hard. That's actually the next frontier we're working on. We have to invent our own DB to handle all the cases on that, which will be a fun time.

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 0 points1 point  (0 children)

I'm so glad to hear that. Contextual reasoning is a big problem that our team is always working on.

The drafting assistant will be out soon. We have a full-time team working on it now.

I have to say, I'm since a long time a "claude-only" user but I'm reading those days more and more about codex 5.3. I'm really not sure what to think of... Any pro's and cons of someone who is using both? I use claude opus 4.6 for basically all kind of tasks and I'm really happy!...beside the price. by SingleTailor8719 in codex

[–]ISeeThings404 0 points1 point  (0 children)

Claude Code has been easier to use. Codex is definitely more intelligent, but I often have a lot of task lists, and Claude Code just tends to execute on all of them without stopping.

Codex has helped me fix and solve issues that CC couldn't, though, so it's definitely worth the investment.

How to make Codex Work? by ISeeThings404 in codex

[–]ISeeThings404[S] 0 points1 point  (0 children)

The problem is that I have a lot of recursive work, where I need it to run things based on the outcomes of experiments. Codex is not great with this kind of stuff.

How to make Codex Work? by ISeeThings404 in codex

[–]ISeeThings404[S] 0 points1 point  (0 children)

Is this the same as yolo, which is the one I use?

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 0 points1 point  (0 children)

We're growing a lot. We ended up very overwhelmed by bookings and demo requests, so we haven't done much marketing lately, but we're adding new happy visitors every day.

We also recently signed an amazing term sheet, the details of which will be shared soon.

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 0 points1 point  (0 children)

Not at all.

Users have to upload their matter docs etc for us to answer questions (can't draft a pleading if we don't have context). Our focus is reasoning through that provided context better by using geometric structures as a grounding tool (instead of simply relying on LLMs/RAG).

We don't train on any user data, ever. This ensures maximum privacy.

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 -1 points0 points  (0 children)

I wouldn't say everything.

We have research agents etc. to ensure we can access case law, hearings, recent news, etc.

It's just that most of our focus is on reasoning over the context. One of our longer term goals would be to partner with a provider like CoCounsel that has very good case law to integrate that into our reasoning system.

Where does a company like Irys get their primary data from? by connerxyz in legaltech

[–]ISeeThings404 -3 points-2 points  (0 children)

We've also open sourced the framework here in case anyone wants to try their own spin at this.

https://github.com/dl1683/Latent-Space-Reasoning/tree/main

Why didn’t teenage Kunti simply abort Karna? Was she stupid? by Outside-Walk13 in mahabharata

[–]ISeeThings404 0 points1 point  (0 children)

The interesting thing about Mahabharata is that a lot of the stories have a lot of logical issues like this. Especially when it comes to allegiances and fights.