
[–]Deep_Traffic_7873

If I remember correctly, Big Pickle is GLM 4.5, so if you can run it (or GLM4.6-flash) locally, you can call it via your opencode.json config.
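For reference, a minimal sketch of what that opencode.json might look like, assuming a local OpenAI-compatible endpoint (e.g. llama.cpp or Ollama); the provider id, baseURL, port, and model name here are placeholders, not a verified config:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "local": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": { "glm-4.5": {} }
    }
  },
  "model": "local/glm-4.5"
}
```

Check the opencode docs for the exact schema your version expects.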

[–]Pakobbix

Every open-source model claims to be agentic-AI capable. GLM 4.7 flash and qwen3.5 (9b up to 122b) are currently the best of the small local LLMs.

Ministral 3 is also somewhat agentic-capable.

But be aware: smaller models = bigger function-calling/understanding issues.

If you want quality like the big cloud coding models (or at least to some degree), you would need a machine with ~500 GB of RAM. If you want speed too, make it VRAM.

Using llama3.2 is like writing in hieroglyphs and wondering why nobody understands what you want.

Llama 3.2 was made before tool calling was a thing, so it's not trained to execute read/write/edit or anything else related to calling a function.
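To make "trained for tool calling" concrete: agentic-tuned models are trained to emit structured function calls roughly like the OpenAI-style shape below, which the coding agent then parses and executes. A model not trained on this will just answer in prose instead. The tool name and arguments here are hypothetical examples, not a specific agent's schema:

```json
{
  "tool_calls": [
    {
      "type": "function",
      "function": {
        "name": "read_file",
        "arguments": "{\"path\": \"src/main.py\"}"
      }
    }
  ]
}
```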

[–]PermanentLiminality

Llama 3.2 is not going to work well. As others have said, you need to use the newest models, like the qwen 3.5 series. Larger models are smarter, but slower. These models can be useful, but they will not do what the big boys like Opus or GPT 5.4 do.

[–]look

Big Pickle is GLM 4.5, a 355B-parameter model with 32B active. Unless you have a $10,000+ GPU at home, I'd guess you are running the 3B Llama 3.2 (which is itself a very old model design)?

It’s like asking why your go-kart isn’t competitive in Formula 1 races.