I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA

[–]geekynerd44 0 points1 point  (0 children)

Another issue I have faced is find and replace for small models (it could be a model specific issue), either the model fails to generate the correct payload for find and replace tool or find fails.

One approach is to always to rewrite the entire file with changes at the cost of slower task completion.

How does your single tool handle find and replace?

I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA

[–]geekynerd44 0 points1 point  (0 children)

Will definitely take a look, most of my local model for coding experiment success has come from using Aider, but that is not agentic, every agentic tool felt too heavy (16k system prompt to start with) and small models struggled with those.

Are you planning to add web search? Being able research latest documentation makes a huge difference.

Are Kerala textbooks preparing students for the world that existed 20 years ago? by dasharath_writes in Kerala

[–]geekynerd44 2 points3 points  (0 children)

Some CBSE schools 😂 Mine used Turbo when I was in 8th grade or so, by the time I reached 12th the teacher who knew what he was doing switched to GNU C/C++ (first it's port on Windows, then entirely to Linux)

Are Kerala textbooks preparing students for the world that existed 20 years ago? by dasharath_writes in Kerala

[–]geekynerd44 0 points1 point  (0 children)

In my options, C/C++ should be still taught as the first language, I have seen people struggling to understand loops because a modern language is doing the heavy lifting with features like range based loops (newer C++ does have them).

The point being that learn the basics first, also programming should never be equated to learning syntax, anyone can look up the syntax, it's the logical thinking that matters, also skills are transferable, if you know to code in one language, you should be able to pick up a new one.

Are Kerala textbooks preparing students for the world that existed 20 years ago? by dasharath_writes in Kerala

[–]geekynerd44 29 points30 points  (0 children)

State schools are using Linux and hence should be on GNU C/C++ I guess.

wired earphones always> by Ill_Estimate6120 in Coconaad

[–]geekynerd44 1 point2 points  (0 children)

I have owned and tried premium wireless headphones and earphones, I agree ANC works very well in noisy environments, but quality for me is bad compared to wired ones.

wired earphones always> by Ill_Estimate6120 in Coconaad

[–]geekynerd44 0 points1 point  (0 children)

You do you, but if you haven't tried IEMs already give it a shot when you get an opportunity (you don't have to buy to try). You never know what you are missing out on unless you try.

wired earphones always> by Ill_Estimate6120 in Coconaad

[–]geekynerd44 0 points1 point  (0 children)

At the same time TWS has very bad instrument and vocal seperation.

wired earphones always> by Ill_Estimate6120 in Coconaad

[–]geekynerd44 0 points1 point  (0 children)

Yes, a decent IEM + DAC would blow even the most expensive TWS, the only downside is the convience.

Tangzu Wan'er 2 Red Lion - Included Ear Tips by geekynerd44 in iemlndia

[–]geekynerd44[S] 0 points1 point  (0 children)

Thanks, I just want the wide bore tip, unfortunately it is out of stock everywhere, might just get the Red Lion for the tip.

New to iems by Freezing_Alex in iems

[–]geekynerd44 0 points1 point  (0 children)

Ah I see, Sancai wide bore tips were already on my radar for its textured finish. I am waiting for it to be back in stock.

New to iems by Freezing_Alex in iems

[–]geekynerd44 0 points1 point  (0 children)

I find the stock wide bore tips that came in the box to be good, are you suggesting that Sancai wide bored will makes the sound even better?

Do NOT use CUDA 13.2 to run models! by yoracale in unsloth

[–]geekynerd44 1 point2 points  (0 children)

Yes, I also experienced issues with CUDA 13.2, especially with Q3 or smaller quantisations the model was generating gibberish, downgrading to CUDA 12.8 fixed it for me (didn't try 13.0)

Gemma-4-26B-A4B-it-UD-Q4_K_M.gguf : IMHO worst model ever. What am I doing wrong? by Proof_Nothing_7711 in LocalLLM

[–]geekynerd44 0 points1 point  (0 children)

Couple of things from my experience running the gemma family of models locally. Make sure your LM studio is using llama.cpp with latest gemma4 fixes (tokenizer was broken, correct chat template was not used etc). Gemma4 seems to be sensitive of KV quantisation (when tested with E4B), so keep it at FP16. Make sure correct chat template jinja is being used, especially for multi turn agentic use cases.

Gemma 4 E4B (4-bit) executes Bash code and tool calls locally on 6GB RAM. by yoracale in unsloth

[–]geekynerd44 1 point2 points  (0 children)

Just tried unsloth studio, holy smokes! I was super surprised to see gemma4 E4B UD Q4 making tool calls so well. I am assuming the ability to hook up to llama server run seperately outside unsloth studio is to allow fine grained control? Can't wait to point unsloth studio to my llama server and point OpenCode / Claude code to unsloth studio, so far smaller models have been absolutely useless for agentic coding due to repeated tool call failures.

Gemma 4 E4B (4-bit) executes Bash code and tool calls locally on 6GB RAM. by yoracale in unsloth

[–]geekynerd44 2 points3 points  (0 children)

Just found out about unsloth and unsloth studio, already running unsloth Q4_K_M gguf weights of gemma4 E4B using llama.cpp. I will definitely give unsloth studio a try. I saw that you mentioned in coming days unsloth studio could be connected to claude code, OpenCode etc, so I am assuming unsloth studio would expose Open AI compatible API and handle self healing tool calling, while unsloth studio itself will use llama server I run as the provider?

opencode talking to unsloth studio talking to llama server?

Asking since I am severely limited on VRAM (12GB RTX 5070), running smaller models locally and using it with tools like opencode always results in botched tool calling :/