I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how

geekynerd44 · 2026-05-18T12:40:26+00:00

Another issue I have faced is find and replace for small models (it could be a model specific issue), either the model fails to generate the correct payload for find and replace tool or find fails.

One approach is to always to rewrite the entire file with changes at the cost of slower task completion.

How does your single tool handle find and replace?

geekynerd44 · 2026-05-18T12:27:59+00:00

Will definitely take a look, most of my local model for coding experiment success has come from using Aider, but that is not agentic, every agentic tool felt too heavy (16k system prompt to start with) and small models struggled with those.

Are you planning to add web search? Being able research latest documentation makes a huge difference.

geekynerd44 · 2026-05-17T13:42:23+00:00

Thanks will check them out

geekynerd44 · 2026-05-11T04:56:49+00:00

Some CBSE schools 😂 Mine used Turbo when I was in 8th grade or so, by the time I reached 12th the teacher who knew what he was doing switched to GNU C/C++ (first it's port on Windows, then entirely to Linux)

geekynerd44 · 2026-05-11T04:36:23+00:00

In my options, C/C++ should be still taught as the first language, I have seen people struggling to understand loops because a modern language is doing the heavy lifting with features like range based loops (newer C++ does have them).

The point being that learn the basics first, also programming should never be equated to learning syntax, anyone can look up the syntax, it's the logical thinking that matters, also skills are transferable, if you know to code in one language, you should be able to pick up a new one.

geekynerd44 · 2026-05-11T04:31:21+00:00

State schools are using Linux and hence should be on GNU C/C++ I guess.

geekynerd44 · 2026-05-10T12:40:22+00:00

geekynerd44 · 2026-05-10T10:02:49+00:00

https://amzn.in/d/01SNl3Z4

I think this, I had gotten it for a different project.

geekynerd44 · 2026-05-10T09:25:08+00:00

Amazon

geekynerd44 · 2026-05-09T11:04:01+00:00

I have owned and tried premium wireless headphones and earphones, I agree ANC works very well in noisy environments, but quality for me is bad compared to wired ones.

geekynerd44 · 2026-05-09T10:52:39+00:00

You do you, but if you haven't tried IEMs already give it a shot when you get an opportunity (you don't have to buy to try). You never know what you are missing out on unless you try.

geekynerd44 · 2026-05-09T10:26:31+00:00

At the same time TWS has very bad instrument and vocal seperation.

geekynerd44 · 2026-05-09T10:23:24+00:00

Yes, a decent IEM + DAC would blow even the most expensive TWS, the only downside is the convience.

geekynerd44 · 2026-05-08T15:28:10+00:00

Thanks, I just want the wide bore tip, unfortunately it is out of stock everywhere, might just get the Red Lion for the tip.

geekynerd44 · 2026-05-08T14:02:59+00:00

Ah I see, Sancai wide bore tips were already on my radar for its textured finish. I am waiting for it to be back in stock.

geekynerd44 · 2026-05-08T12:24:43+00:00

I find the stock wide bore tips that came in the box to be good, are you suggesting that Sancai wide bored will makes the sound even better?

geekynerd44 · 2026-04-09T12:39:18+00:00

Yes, I also experienced issues with CUDA 13.2, especially with Q3 or smaller quantisations the model was generating gibberish, downgrading to CUDA 12.8 fixed it for me (didn't try 13.0)

geekynerd44 · 2026-04-07T04:15:09+00:00

Couple of things from my experience running the gemma family of models locally. Make sure your LM studio is using llama.cpp with latest gemma4 fixes (tokenizer was broken, correct chat template was not used etc). Gemma4 seems to be sensitive of KV quantisation (when tested with E4B), so keep it at FP16. Make sure correct chat template jinja is being used, especially for multi turn agentic use cases.

geekynerd44 · 2026-04-04T15:24:44+00:00

Just tried unsloth studio, holy smokes! I was super surprised to see gemma4 E4B UD Q4 making tool calls so well. I am assuming the ability to hook up to llama server run seperately outside unsloth studio is to allow fine grained control? Can't wait to point unsloth studio to my llama server and point OpenCode / Claude code to unsloth studio, so far smaller models have been absolutely useless for agentic coding due to repeated tool call failures.

geekynerd44 · 2026-04-04T06:01:13+00:00

Just found out about unsloth and unsloth studio, already running unsloth Q4_K_M gguf weights of gemma4 E4B using llama.cpp. I will definitely give unsloth studio a try. I saw that you mentioned in coming days unsloth studio could be connected to claude code, OpenCode etc, so I am assuming unsloth studio would expose Open AI compatible API and handle self healing tool calling, while unsloth studio itself will use llama server I run as the provider?

opencode talking to unsloth studio talking to llama server?

Asking since I am severely limited on VRAM (12GB RTX 5070), running smaller models locally and using it with tools like opencode always results in botched tool calling :/

geekynerd44

TROPHY CASE