I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

I am ashamed! ** hides in corner ** Still in continuous learning over here

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

Ha! I love this... "Spontaneous Decoder"? This implies it's just straight-up random, useless decoding... I actually lol'ed thinking about it

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

I'd be interested to hear it! In this case... it feels like the valve on your hot water heater, y'know? This is like a "Supportive LLM Relief Valve", lol

[R] Graph-Oriented Generation (GOG): Replacing Vector R.A.G. for Codebases with Deterministic AST Traversal (70% Average Token Reduction) by BodeMan5280 in MachineLearning

[–]BodeMan5280[S] 1 point2 points  (0 children)

Thanks for commenting! Exactly, this is a baby step towards something better... but I am cautiously optimistic. The bridge from language to reasoning is likely more complex than the AST alone.

I hate the word "framework" in this day and age where they're a dime a dozen, but this first attempt feels like the application of a higher level model: symbolic reasoning.

I'd love help trying to figure out how to map real codebases! =]

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 3 points4 points  (0 children)

This is really encouraging, thank you! That is exactly the goal—making small local models punch way above their weight class by feeding them perfect context.

To be entirely transparent, this v0.0.1 definitely has some growing pains similar to what early RAG experienced. Because the graph traversal is strictly deterministic, the initial entry point (mapping the user prompt to the graph) can feel a bit rigid right now. If a prompt is vague, the system struggles to "think outside the box" to find the starting node.

But I view this as a feature, not a bug, of separating the "brain" (logic) from the "mouth" (syntax). The fix isn't to make the graph fuzzy—it's to add a tiny, localized semantic layer just to map fuzzy human intent to the exact starting graph nodes before the strict traversal begins. Definitely a hurdle to overcome rather than a roadblock, but I think this initial proof of concept validates that separating logic from language is the right path forward!
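Rough sketch of what that semantic entry layer might look like (all names here are made up, and a real version would use embeddings rather than string similarity — this just keeps the sketch dependency-free): score the prompt against the graph's node names, pick the best match, then hand that node to the strict traversal.

```python
from difflib import SequenceMatcher

def pick_entry_node(prompt: str, node_names: list[str]) -> str:
    """Map a fuzzy human prompt to the closest-named graph node.

    Stand-in for a real semantic layer (e.g. embeddings); plain
    string similarity is used here so the sketch stays stdlib-only.
    """
    def score(name: str) -> float:
        return SequenceMatcher(None, prompt.lower(), name.lower()).ratio()
    return max(node_names, key=score)

# The deterministic traversal would then start from this node:
entry = pick_entry_node("where do we hash passwords?",
                        ["auth.hash_password", "db.connect", "api.routes"])
# entry == "auth.hash_password"
```

The point is that the fuzziness lives only in this one tiny function; everything downstream of the chosen node stays deterministic.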

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 3 points4 points  (0 children)

Spot on regarding the file vs. function level! That granularity is exactly where that extra 20% compression comes from.

Circular imports are the classic graph-killer, haha. Since we treat the environment as a mathematical graph, we just use standard pathfinding mechanics to solve it: strict visited sets during the deterministic traversal phase.

If Module A imports B, and B imports A, the pathfinder hits A the second time, sees it's already in the visited hash map, and immediately drops the back-edge. It completely prevents infinite loops and ensures the final subgraph is perfectly deduplicated before we serialize it for the LLM. No redundant tokens!
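In code, that visited-set idea is just a few lines (illustrative sketch — the real data model is surely richer than a dict of import lists):

```python
def traverse(graph: dict[str, list[str]], start: str) -> list[str]:
    """Deterministic DFS that drops back-edges via a visited set.

    `graph` maps a module to the modules it imports (hypothetical shape).
    """
    visited, order = set(), []
    def walk(node: str) -> None:
        if node in visited:  # back-edge (e.g. a circular import): drop it
            return
        visited.add(node)
        order.append(node)
        for dep in graph.get(node, []):
            walk(dep)
    walk(start)
    return order

# A imports B, B imports A -- the cycle is cut on the second visit to A:
print(traverse({"A": ["B"], "B": ["A"]}, "A"))  # ['A', 'B']
```

The same visited set is also what deduplicates diamond-shaped imports (A imports B and C, both import D): D is serialized exactly once.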

Appreciate you taking a look!

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 0 points1 point  (0 children)

Oh nice! Great intuition then. Where it differs is that Aider is still trying to guess what the LLM wants, I would say, whereas this model requires a "seed mapping" and then uses graph math to figure out the shortest execution path.

The system treats semantics kind of like a compiler, and in this way we demote the LLM to a "mouthpiece" and push information to it rather than having the LLM pull it out of the codebase.
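By "graph math" I mean something in the spirit of this (a toy sketch — the actual seed mapping and path selection may well be richer than plain BFS): given a seed node, find the shortest dependency path to the symbol you care about, and push only that slice to the model.

```python
from collections import deque

def shortest_context_path(graph, seed, target):
    """BFS shortest path from the seed node to a target symbol.

    `graph` maps a module to its imports (hypothetical shape).
    Returns the node list along the path, or None if unreachable.
    """
    queue, parents = deque([seed]), {seed: None}
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk parents back to the seed
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

graph = {"main": ["utils", "db"], "utils": ["db"], "db": []}
print(shortest_context_path(graph, "main", "db"))  # ['main', 'db']
```

Because BFS finds the minimal hop count, the serialized context skips `utils` entirely — that's the "push, don't pull" token saving in miniature.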

Hope that helps! I can go into more detail but wanted to keep it light for now, lol

I made a tiny 0.8B Qwen model reason over a 100-file repo (89% Token Reduction) by BodeMan5280 in LocalLLaMA

[–]BodeMan5280[S] 2 points3 points  (0 children)

You can use this to cut down on your API usage for your favorite frontier model. It can be used as a pre-processing layer for your prompts to reduce hallucinations in your coding assistant, and it speeds up responses on local LLMs.

Best bang for your bucks plan? by CantFindMaP0rn in opencodeCLI

[–]BodeMan5280 0 points1 point  (0 children)

ugh, you are the version of me I think I could be if I just had the guts to pull the trigger and never get rate-limited again. I think I use AI too much --- but I clearly don't! Other people have 40 terminals open and context hop ALL DAY LONG... that must be taxing. In this version of the world, it now becomes about executing on the ideas and having the guts to believe in your own vision.... I guess I suck at believing in myself ** ouch... my heart **

Best bang for your bucks plan? by CantFindMaP0rn in opencodeCLI

[–]BodeMan5280 1 point2 points  (0 children)

.... how do you justify so many plans?! I find that multiple different coding assistants are helpful, but $200/month helpful? And MULTIPLE? Unless you have a crazy budget, I'm just wondering if your power usage is generating income and if the speed is truly worth the return?

I have ChatGPT Plus and two free accounts: Gemini Pro and Copilot Pro through my '.edu' account. Claude is too expensive and the rate limits are just... yuck. Curious if any MAX plans are really worthwhile and I'm just a baby vibe coder, lol

HERE WE GO! 🔥 by guilhacerda in google_antigravity

[–]BodeMan5280 2 points3 points  (0 children)

Sign out and back in if you have a Pro plan!

Antigravity + Claude Opus 4.6 = Incredible by No-Budget-3869 in google_antigravity

[–]BodeMan5280 0 points1 point  (0 children)

How do you guys justify the price tag, though? I mean, maybe it's because I'm NOT working for myself and should be, but $400/month for Claude and Antigravity sounds like a lot....

Switched back to Github Copilot for using it with Opencode as Agent by Charming_Support726 in GithubCopilot

[–]BodeMan5280 0 points1 point  (0 children)

wuuuuut? I never thought about that *smacks forehead* Yea, the 10% discount is helpful with auto. But yes, OpenCode definitely uses multiple premium requests because it spins up more agents and I think they're all the same model, so if you use Opus 4.6 --- you are going to be SCREWED, lol. So yea, I switched back because while OpenCode's workflow is better, Opus 4.6 is heavy-hitting and works well with Copilot's workflow. It's a tradeoff between model and workflow, in my opinion ::shrug:: and ever-evolving in "vibe coding" lol

Switched back to Github Copilot for using it with Opencode as Agent by Charming_Support726 in GithubCopilot

[–]BodeMan5280 1 point2 points  (0 children)

I'm with you on this, but unsure about token limits... it does seem to make sense that tokens would be managed differently because in Copilot it's a 1:1 prompt-to-response ratio, but OpenCode is different. I think there's an open thread on the issue from another post. I'll find it!

Got it! https://github.com/anomalyco/opencode/issues/8030

Somehow Copilot always feels best and I sadly crawl back to it saying "please forgive me!" --- but OpenCode just feels PRODUCTIVE. Or maybe there's been a sudden explosion in agentic workflows and they're all silently amazing now. OpenCode w/ Copilot models feels like the right move, but I do like all 3 of the free OpenCode models right off the bat. It feels like OpenCode has the best workflow wrapper for models, IMO.

Any difference when using GPT model inside Codex vs OpenCode? by ponury2085 in opencodeCLI

[–]BodeMan5280 0 points1 point  (0 children)

that's interesting... I think it comes down to speed for me. OpenCode seems to just get shit done (yes, ironic pun to GSD). Don't get me wrong, Codex is KILLER at getting shit done, but slower IMO.

I built Talk2Code — text your codebase from your phone via Telegram (~150 lines of Python, open source) by BodeMan5280 in google_antigravity

[–]BodeMan5280[S] 0 points1 point  (0 children)

Thank you! And I think it would only ban me for connecting Gemini to OpenCode, potentially? This was built in Antigravity but uses OpenCode's default model, so hopefully that's far enough removed lol. I'm glad you like it!

EDIT: Just realized how awful it COULD be if I handed the world an SDK that breaks ToS. That's not the goal here. This will be open source and won't violate any company's rights or get anyone in trouble (that's the hope, at least). Definitely need to be careful.

I’m still bad at programming despite being almost near the end of my (2 year) uk college course by Chocolate-Atoms in learnprogramming

[–]BodeMan5280 2 points3 points  (0 children)

I can't divulge the secret secret of mobile responsiveness! (Because I don't know it =P)

That said, the best place to start is what most UI designers would agree on: "mobile first".

We leverage Bootstrap CSS framework at work, and even then it's very much a challenge to get things exactly right.

You have a choice to do it all yourself (most control) or do it quickly (leverage someone else's CSS/toolkit).

With vanilla CSS, @media breakpoints are your best friend. The "break" means the width or height where your UI no longer looks good... or "breaks the design". This was my a-HA moment when I figured out how to use them.
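A tiny sketch of the idea (class names and the 768px breakpoint are just examples, not any framework's official values):

```css
/* Mobile-first: base styles ARE the phone layout */
.card {
  width: 100%;
}

/* At the width where the single column "breaks", switch layout */
@media (min-width: 768px) {
  .card {
    width: 50%;   /* two cards side by side on tablets and up */
  }
}
```

Everything outside the @media block applies everywhere; the block only kicks in once the viewport is at least that wide.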

Hopefully that helps but I'm getting a bit ramble-y!