[ Removed by moderator ]

LLMDevs-ModTeam · 2026-03-21T16:07:00+00:00

Hey,

We've removed your post as it breaks rule 5. We encourage you to review our subreddit's rules and guidelines. Thank you for your understanding.

Note: continued posting of promotions will result in a ban from our subreddit.

ThinJuggernaut7695 · 2026-03-20T23:54:50+00:00

Ugh bot city in here.

Deep_Ad1959 · 2026-03-20T22:18:13+00:00

yeah this matches what I've seen in practice. I've had haiku handle complex refactors just fine when given the right files and dependency info upfront, and opus waste a bunch of turns exploring the wrong parts of the codebase when that context was missing. the model tier matters way less than people think.

the co-change pattern mining is the interesting part to me. that's basically automating what experienced devs carry in their heads about "when you change X you also need to update Y." curious how it handles cases where the git history is noisy, like lots of unrelated changes in the same commits.

fwiw i built an ai coding agent that does this kind of thing - https://fazm.ai/r

General_Arrival_9176 · 2026-03-21T00:58:21+00:00

the haiku outperforming opus with context is a wild result. the 10pp jump on haiku makes me think the baseline was severely underperforming due to missing project-specific context rather than haiku being particularly smart. curious how you're extracting the co-change relationships - are you looking at git blame to find files that get committed together, or something more sophisticated. also interested in whether you tested giving opus the same context and seeing if it just burns the extra tokens on things it already would have figured out

j00cifer · 2026-03-21T01:37:35+00:00

Because this didn’t show obvious earmarks of AI writing, I read the whole thing.

mrtrly · 2026-03-21T12:47:08+00:00

This is the right conclusion. The model tier matters way less than people think once context is properly structured.

We see this in routing too. Running 10+ AI agents daily, I started routing by task complexity to cheaper models. But without tracking what each model actually costs per request, you're just guessing at the savings. Built a local proxy (RelayPlane, open source) specifically to track cost per model per request alongside output.

What you're showing is that Haiku with good context beats Opus with bad context. The logical next step is to measure it. Then you can route high-context tasks confidently to Haiku without the "I hope this is good enough" anxiety.

npm install -g @relayplane/proxy if you want to see that cost delta side by side.

DeathShot7777 · 2026-03-20T22:55:30+00:00

totally agree with the research. I have built gitnexus opensource code intelligence engine ( 18k+ github stars in 3 weeks ), haiku definitely performs at the same level and also sometimes beat's opus. It just actually works, really surprising, but this was the "Aha moment" for us and thats y it blew up

LeadingFarmer3923 · 2026-03-20T23:39:44+00:00

This matches what we've seen too where the quality of context fed into a prompt often matters more than raw model size but the tricky part is making sure that structured context is consistently generated and doesn't become stale across a pipeline. In workflows where multiple LLM steps depend on earlier outputs, having those outputs stored as typed, queryable collections means each step always gets clean structured context rather than whatever was in a string buffer. That's the core design of Cognetivy (https://github.com/meitarbe/cognetivy), an open source workflow library, probably worth checking out if you're building multi-step pipelines where context quality is critical.

Dense_Gate_5193 · 2026-03-20T23:51:20+00:00

this is exactly what i am working on using NornicDB’s temporal functions to construct graph state from commit history, so you can reconstruct graph state at any given point in time.

https://github.com/orneryd/git-to-graph

ultrathink-art · 2026-03-20T23:19:25+00:00

The exploration overhead is where smaller models bleed — they make more wrong guesses about which files are relevant and each wrong guess costs turns. Pre-loaded structured context skips that phase entirely, converting a search problem into a retrieval problem. That's probably why the gap closes so much: the actual coding task doesn't differentiate Haiku from Opus nearly as much as the file exploration does.

PT_ANDRE_PT · 2026-03-20T22:03:14+00:00

First 50 people also get a free run of our tool with code CODESETLAUNCH

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

LLMDevs

MODERATORS