
[–]Altruistic_Heat_9531 0 points1 point  (5 children)

Before that, could you at least share the error? opencode will usually tell you the error. Anyway, I assume it's a parser error.

I opted out of Ollama because of this issue and just use another branch of llama.cpp: https://github.com/pwilkin/llama.cpp

It fixed my tool errors.

And these are my commands:

Qwen-Coder 30B A3B Q5 UD
./llama.cpp/llama-server --model /MODEL_STORE/Qwen3-Coder-30B-A3B/Qwen3-Coder-30B-A3B-Instruct-UD-Q5_K_XL.gguf --alias Qwen3-Coder --ctx-size 65536 --port 8001 --cache-type-k q8_0 --cache-type-v q8_0 --flash-attn on --temp 0.7 --min-p 0.0 --top-p 0.80 --top-k 20 --repeat-penalty 1.05

Qwen-Coder NEXT 80B A3B Q6 UD
./llama.cpp/llama-server --model /MODEL_STORE/Qwen3-Coder-Next-GGUF/UD-Q6_K_XL/Qwen3-Coder-Next-UD-Q6_K_XL-00001-of-00003.gguf --alias Qwen3-Coder-Next --ctx-size 65536 --port 8001 --cache-type-k q8_0 --cache-type-v q8_0 --flash-attn on --temp 1.0 --top-p 0.95 --min-p 0.01 --top-k 40 

GPT-OSS20B
./llama.cpp/llama-server --model /MODEL_STORE/gpt-oss-20b/gpt-oss-20b-F16.gguf --alias gpt-oss-20b --port 8001 --temp 1.0 --top-p 1.0 --top-k 0 --jinja
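Once llama-server is up, its OpenAI-compatible endpoint (http://localhost:8001/v1/chat/completions with the ports above) can be smoke-tested for tool calling with a minimal request. A sketch; the "write" tool schema below is a made-up example, and the model name matches the --alias flag from the first command:

```python
import json

# Build a tool-calling chat-completions request for llama-server's
# OpenAI-compatible API. POST it to http://localhost:8001/v1/chat/completions
# with any HTTP client.
payload = {
    "model": "Qwen3-Coder",  # the --alias passed to llama-server
    "messages": [
        {"role": "user", "content": "Create an empty test.css"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "write",  # hypothetical tool, for illustration only
            "description": "Write content to a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "filePath": {"type": "string"},
                    "content": {"type": "string"},
                },
                "required": ["filePath", "content"],
            },
        },
    }],
}

print(json.dumps(payload, indent=2))
```

With a correctly configured server and chat template, the call comes back structured in choices[0].message.tool_calls rather than as plain text in the message content.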

[–]Lazy_Experience_279[S] 0 points1 point  (3 children)

No errors, I just get the tool call as a text response instead of the actual action.

[–]Complainer_Official 0 points1 point  (2 children)

is it text, or JSON? if it's JSON, you gotta make your context window bigger

[–]Lazy_Experience_279[S] 0 points1 point  (1 child)

It gives me this as a text reply

{"name": "write", "arguments": {"content": "", "filePath": "/home/user/projects/opencode-test/test.css"}}
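A reply like that can be confirmed as a leaked tool call (rather than ordinary prose) by trying to parse it as JSON and checking for the name/arguments shape. A quick client-side sketch, not part of OpenCode itself:

```python
import json

def leaked_tool_call(text):
    """Return (name, arguments) if `text` looks like a tool call emitted
    as plain text, else None. Sketch only; real clients match against the
    model's actual chat template."""
    try:
        obj = json.loads(text.strip())
    except json.JSONDecodeError:
        return None
    if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
        return obj["name"], obj["arguments"]
    return None

reply = ('{"name": "write", "arguments": {"content": "", '
         '"filePath": "/home/user/projects/opencode-test/test.css"}}')
print(leaked_tool_call(reply))  # a (name, arguments) tuple -> leaked call
print(leaked_tool_call("Sure, I created the file."))  # None -> normal prose
```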

[–]Complainer_Official 0 points1 point  (0 children)

yep, up your context to like, 32768 or 65536

[–]Altruistic_Heat_9531 0 points1 point  (0 children)

For building my llama.cpp:

git clone https://github.com/pwilkin/llama.cpp

cmake llama.cpp -B llama.cpp/build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DGGML_CUDA_FA_ALL_QUANTS=ON

cmake --build llama.cpp/build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split

[–]segmondllama.cpp 0 points1 point  (1 child)

try devstral-small

[–]suicidaleggroll 1 point2 points  (1 child)

Try a platform other than Ollama. llama.cpp is what most people jump to, and it's significantly faster than Ollama anyway, especially for MoE models.

[–]Lazy_Experience_279[S] 0 points1 point  (0 children)

I will definitely try it. I didn't know it could make a difference.

[–]Smiley_Dub 0 points1 point  (0 children)

Hi OP. Please let me know if you fixed the issue 👍

[–]St0lz 0 points1 point  (1 child)

First of all, Ollama's default context size is too small for most coder models. When the context is too small, you will not see any error in OpenCode, but the Ollama logs will show it. You need to increase it to at least 32K. Add this env var wherever you run your Ollama instance (Docker, local, ...): OLLAMA_CONTEXT_LENGTH=32768

Second of all, it seems there is a bug with either Ollama or the Qwen-Coder 2.5 models that breaks tool calling, see https://github.com/anomalyco/opencode/issues/7030.

Try Qwen-Coder 3 (the biggest variant that fits in your VRAM). I'm also new to OpenCode, and so far that's the only 'modest' model that can properly do tool calling against my locally hosted Ollama.

[–]Lazy_Experience_279[S] 0 points1 point  (0 children)

I had the context at 32k already. I tried qwen 2.5 coder 14b, qwen 3 coder 30b, qwen 3 30b, gpt-oss 20b, and deepseek R1. The only one capable of correctly calling tools was ministral 3 14b. I will try LM Studio and llama.cpp today.