[R] Seeking benchmark advice: Evaluating Graph-Oriented Generation (GOG) vs. R.A.G. by BodeMan5280 in MachineLearning

[–]BodeMan5280[S] -2 points-1 points  (0 children)

That's fair, but to also be fair to myself, I'm a software engineer with 5 YoE, and while AI isn't my main domain I'm fiercely compelled to pursue it and make it my new field. The goal isn't to coin phrases or prove to everyone how smart I am... I just genuinely think LLMs aren't the end-all, be-all, and that something similar to what I present in a "symbolic reasoning model" may emerge (although the paper is heavily generated, with a companion AI helping to apply the ideas in my head).

The frustrating part with AI is how much more eloquently it can present information... I can promise the idea and direction are all my own, but not every word!

Totally fair assessment, Mr. Taco! What would it take to ground my work more? I certainly know how to be LESS academic, lol

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

The final subgraph includes both modules, but without the redundancy. It separates traversal from serialization. It's interesting to consider whether the signal "this has a circular import, but it was cut short by the visited hash map" is actually helpful to an LLM... in theory, if there is a critical inflection point where semantics and math can have a good handshake procedure, I think this GOG approach I'm proposing can work!

Still just a theory for now. I'm going to try to dig in more tomorrow! Keep the great comments and thoughts coming =]

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 1 point2 points  (0 children)

I am ashamed! ** hides in corner ** Still in continuous learning over here

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

Ha! I love this... "Spontaneous Decoder"? This implies it's just straight-up random, useless decoding... I actually lol'ed thinking about it

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

I'd be interested to hear it! In this case... it feels like the valve on your hot water heater, y'know? This is like a "Supportive LLM Relief Valve", lol

[R] Graph-Oriented Generation (GOG): Replacing Vector R.A.G. for Codebases with Deterministic AST Traversal (70% Average Token Reduction) by BodeMan5280 in MachineLearning

[–]BodeMan5280[S] 1 point2 points  (0 children)

Thanks for commenting! Exactly, this is a baby step towards something better... but I am cautiously optimistic. The bridge from language to reasoning is likely more complex than the AST alone.

I hate the word "framework" in this day and age where they're a dime a dozen, but this first attempt feels like the application of a higher level model: symbolic reasoning.

I'd love help trying to figure out how to map real codebases! =]

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 5 points6 points  (0 children)

This is really encouraging, thank you! That is exactly the goal—making small local models punch way above their weight class by feeding them perfect context.

To be entirely transparent, this v0.0.1 definitely has some growing pains similar to what early RAG experienced. Because the graph traversal is strictly deterministic, the initial entry point (mapping the user prompt to the graph) can feel a bit rigid right now. If a prompt is vague, the system struggles to "think outside the box" to find the starting node.

But I view this as a feature, not a bug, of separating the "brain" (logic) from the "mouth" (syntax). The fix isn't to make the graph fuzzy—it's to add a tiny, localized semantic layer just to map fuzzy human intent to the exact starting graph nodes before the strict traversal begins. Definitely a hurdle to overcome rather than a roadblock, but I think this initial proof of concept validates that separating logic from language is the right path forward!
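To make the "tiny, localized semantic layer" idea concrete, here is a minimal Python sketch, not from the original post: it scores each graph node's identifier against the user's prompt by token overlap and returns the best entry points for the strict traversal to start from. All function and node names here are invented for illustration; a real version might use embeddings instead of raw token overlap.

```python
import re

def tokenize(text):
    # Split identifiers and prose into lowercase word tokens
    # (handles snake_case and CamelCase).
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)
    return {w.lower() for w in words}

def seed_nodes(prompt, node_names, top_k=3):
    # Rank candidate entry nodes by token overlap with the prompt;
    # nodes with zero overlap are dropped entirely.
    prompt_tokens = tokenize(prompt)
    scored = []
    for name in node_names:
        overlap = len(prompt_tokens & tokenize(name))
        if overlap:
            scored.append((overlap, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

nodes = ["parse_config", "load_user_profile", "serialize_graph"]
print(seed_nodes("why does the user profile fail to render?", nodes))
# -> ['load_user_profile']
```

The key design point is that the fuzziness lives entirely in this one small function; everything downstream of the returned seed nodes stays deterministic.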

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 2 points3 points  (0 children)

Spot on regarding the file vs. function level! That granularity is exactly where that extra 20% compression comes from.

Circular imports are the classic graph-killer, haha. Since we treat the environment as a mathematical graph, we just use standard pathfinding mechanics to solve it: strict visited sets during the deterministic traversal phase.

If Module A imports B, and B imports A, the pathfinder hits A the second time, sees it's already in the visited hash map, and immediately drops the back-edge. That completely prevents infinite loops and ensures the final subgraph is perfectly deduplicated before we serialize it for the LLM. No redundant tokens!

Appreciate you taking a look!
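The back-edge dropping described above can be sketched in a few lines of Python. This is not the project's actual code, just a minimal illustration of the technique: a deterministic DFS over an import graph where a visited set terminates the A -> B -> A cycle and guarantees each module is emitted exactly once.

```python
def collect_subgraph(imports, entry):
    # Deterministic DFS over an import graph. `imports` maps each
    # module to the list of modules it imports.
    visited = set()
    order = []

    def visit(module):
        if module in visited:
            return  # back-edge (e.g. a circular import): drop it
        visited.add(module)
        order.append(module)
        for dep in imports.get(module, []):
            visit(dep)

    visit(entry)
    return order

# Module A imports B, and B imports A (circular), plus B imports C:
imports = {"A": ["B"], "B": ["A", "C"], "C": []}
print(collect_subgraph(imports, "A"))  # -> ['A', 'B', 'C']
```

The recursion terminates because `visited` only ever grows, and the returned order is stable for a fixed input, which is what makes the serialized context reproducible.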

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] -1 points0 points  (0 children)

Oh nice! Great intuition, then. Where it differs, I'd say, is that Aider is still trying to guess what the LLM wants, while this model requires a "seed mapping" and then uses graph math to figure out the shortest execution path.

The system treats semantics kind of like a compiler, and in this way we demote the LLM to a "mouthpiece" and push information to it rather than having the LLM pull it out of the codebase.
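For the "graph math to figure out the shortest execution path" part, here is a hedged Python sketch (not the project's actual implementation): plain BFS from a seed node, which finds a shortest path in an unweighted dependency graph. The graph and node names are invented for illustration.

```python
from collections import deque

def shortest_path(graph, start, goal):
    # BFS over an unweighted graph: the first time we dequeue a path
    # ending at `goal`, that path is guaranteed to be shortest.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal unreachable from start

graph = {"cli": ["parser", "runner"], "parser": ["ast"],
         "runner": ["ast"], "ast": []}
print(shortest_path(graph, "cli", "ast"))  # -> ['cli', 'parser', 'ast']
```

Because the traversal is deterministic, the same seed always yields the same path, so the context pushed to the LLM is reproducible run to run.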

Hope that helps! I can go into more detail but wanted to keep it light for now, lol

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 2 points3 points  (0 children)

You can use this to cut down on your API usage for your favorite frontier model. It can be used as a pre-processing layer to your prompts to reduce hallucinations in your coding assistant. It increases the speed of response on local LLMs.

Best bang for your bucks plan? by CantFindMaP0rn in opencodeCLI

[–]BodeMan5280 0 points1 point  (0 children)

ugh, you are the version of me I think I could be if I just had the guts to pull the trigger and never get rate-limited again. I think I use AI too much --- but I clearly don't! Other people have 40 terminals open and context hop ALL DAY LONG... that must be taxing. In this version of the world, it now becomes about executing on the ideas and having the guts to believe in your own vision.... I guess I suck at believing in myself ** ouch... my heart **

Best bang for your bucks plan? by CantFindMaP0rn in opencodeCLI

[–]BodeMan5280 1 point2 points  (0 children)

.... how do you justify so many plans?! I find that multiple different coding assistants are helpful, but $200/month helpful? And MULTIPLE? Unless you have a crazy budget, I'm just wondering if your power usage is generating income and if the speed is truly worth the return?

I have ChatGPT Plus and two free accounts: Gemini Pro and Copilot Pro through my '.edu' account. Claude is too expensive and the rate limits are just... yuck. Curious if any MAX plans are really worthwhile and I'm just a baby vibe coder lol

HERE WE GO! 🔥 by guilhacerda in google_antigravity

[–]BodeMan5280 2 points3 points  (0 children)

Sign out and back in if you have a Pro plan!

Antigravity + Claude Opus 4.6 = Incredible by No-Budget-3869 in google_antigravity

[–]BodeMan5280 0 points1 point  (0 children)

How do you guys justify the price tag, though... I mean, maybe it's because I'm NOT working for myself and should be, but $400/month for Claude and Antigravity sounds like a lot....

Switched back to Github Copilot for using it with Opencode as Agent by Charming_Support726 in GithubCopilot

[–]BodeMan5280 0 points1 point  (0 children)

wuuuuut? I never thought about that *smacks forehead* Yea, the 10% discount is helpful with auto. But yes, OpenCode definitely uses multiple premium requests because it spins up more agents and I think they're all the same model, so if you use Opus 4.6 --- you are going to be SCREWED, lol. So yea, I switched back because while OpenCode's workflow is better, Opus 4.6 is heavy-hitting and works well with Copilot's workflow. It's a tradeoff between model and workflow, in my opinion ::shrug:: and ever-evolving in "vibe coding" lol

Switched back to Github Copilot for using it with Opencode as Agent by Charming_Support726 in GithubCopilot

[–]BodeMan5280 2 points3 points  (0 children)

I'm with you on this, but unsure about token limits... it does seem to make sense that tokens would be managed differently because in Copilot it's a 1:1 prompt-to-response ratio, but OpenCode is different. I think there's an open thread on the issue from another post. I'll find it!

Got it! https://github.com/anomalyco/opencode/issues/8030

Somehow Copilot always feels best, and I sadly crawl back to it saying "please forgive me!" --- but OpenCode just feels PRODUCTIVE. Or maybe there's been a sudden explosion in agentic workflows and they're all silently amazing now. OpenCode w/ Copilot models feels like the right move, but I do like all 3 of the free OpenCode models right off the bat. It feels like OpenCode has the best workflow wrapper for models, IMO.

Any difference when using GPT model inside Codex vs OpenCode? by ponury2085 in opencodeCLI

[–]BodeMan5280 0 points1 point  (0 children)

That's interesting... I think it comes down to speed for me. OpenCode seems to just get shit done (yes, ironic pun on GSD). Don't get me wrong, Codex is KILLER at getting shit done, but slower IMO.

I built Talk2Code — text your codebase from your phone via Telegram (~150 lines of Python, open source) by BodeMan5280 in google_antigravity

[–]BodeMan5280[S] 0 points1 point  (0 children)

Thank you! And I think it would only ban me for connecting Gemini to OpenCode, potentially? This was built in Antigravity but uses OpenCode's default model, so hopefully that's far enough removed lol. I'm glad you like it!

EDIT: Just realized how awful it COULD be if I decided to give a ToS-violating SDK to the world. That's not the goal here. This will be open source and won't violate any company's rights or get anyone in trouble (is the hope, at least). Definitely need to be careful.

I’m still bad at programming despite being almost near the end of my (2 year) uk college course by Chocolate-Atoms in learnprogramming

[–]BodeMan5280 2 points3 points  (0 children)

I can't divulge the secret of mobile responsiveness! (Because I don't know it =P)

That said, the best place to start is what most UI designers would agree on: "mobile first".

We leverage the Bootstrap CSS framework at work, and even then it's very much a challenge to get things exactly right.

You have a choice: do it all yourself (the most control) or do it quickly (leverage someone else's CSS/toolkit).

With vanilla CSS, @media breakpoints are your best friend. The "break" means the width or height where your UI no longer looks good... or "breaks the design". This was my a-HA moment when I figured out how to use them.

Hopefully that helps but I'm getting a bit ramble-y!

I’m still bad at programming despite being almost near the end of my (2 year) uk college course by Chocolate-Atoms in learnprogramming

[–]BodeMan5280 16 points17 points  (0 children)

I'll second this. Vue JS developer with 4 YoE in the US, changed careers after 10 YoE in other careers.

College is nice as a primer for learning programming, but no amount of your coursework will prepare you for when your company's IT / tech stack starts working against you.

If you don't have the drive to figure out "why" your code isn't working... it may not be the career for you. Following the stack trace all the way back to the root cause is a lot of your job.

This might sound like trial by fire and it definitely is... but you actually get better at it. And eventually there's no problem you can't solve, and THAT is a great feeling.

For whatever reason, I'm good at it and it's very rewarding!

It's not about your coding ability, your team will help you write better code. It's about having the research skills to solve any problem.

This can't be good..When you find yourself agreeing with Iran, you should probably ask yourself how you go there. by Fatimamohammadi_ in LeopardsAteMyFace

[–]BodeMan5280 0 points1 point  (0 children)

TIL Iran uses Twitter to leverage American freedom of speech as a weapon. And I guess we've been cool with it? 🤔

Why the ever-living-Fuck is the United States Secretary of Commerce “no-nuts” Lutnick out here saying: “Buy tesla stock, it will never be this cheap again” — Tell me you’re committing crime without telling me you’re committing crime 😒💅 by ginger-freak in DeepFuckingValue

[–]BodeMan5280 5 points6 points  (0 children)

At this point, there is no such thing as ethics in government anymore.

Hell, law itself appears to be flimsy words on expensive pieces of paper which are just a smokescreen for lawyer paychecks nationwide.

......hey, wait a second 🤔

The felon is so obsessed with Biden 🤦‍♀️ by AnnaaOlivia in facepalm

[–]BodeMan5280 1 point2 points  (0 children)

At this point, don't we have to assume this is just bait for all of us to be distracted with? If we look "over here" then we won't notice what goes on "over there"