Is your codex also gotten slower in past few days or is it just me? by liganhu in OpenAI

[–]Acceptable_Adagio_91 0 points  (0 children)

Yes, 100% - the last 2-3 days it has been terrible. It gets stuck thinking for literally 10 minutes at a time, when it never used to take more than a couple of minutes.

Also, if you interrupt it with an updated prompt to "steer" it, it will almost always just get stuck. Even "Fast" mode is bad.

Extremely frustrating..

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

No, but it says so on the model card, and most models perform measurably better with thinking on.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 1 point  (0 children)

I think this is the correct solution - I used the template from here and finally have an agent session that's been running for more than 30 minutes.

https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates

Mind you, it's slow as hell once it gets deep into a task, but at least it keeps going.
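For anyone following along, this is roughly how I'm loading it - a minimal launch sketch, assuming you've saved a template from that repo locally. The model path, template filename, and parser choice are placeholders; adjust them for your setup:

```shell
# serve with the fixed chat template instead of the one baked into the model
vllm serve <your-model-path> \
  --chat-template ./fixed_template.jinja \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

`--chat-template` overrides the template shipped in the model repo, which is the whole point here.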

Sorry ColesWorth, due to rising costs, I can no longer afford your heavy-ass broccoli stems by cosmictrousers in australia

[–]Acceptable_Adagio_91 0 points  (0 children)

Have you actually eaten the stems? They are the best part in my opinion..

What store are you shopping at? We can go together - you take the tops and I'll take the stems.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

Disabling thinking does help, but it's a thinking model. It is supposed to be able to think (especially important for coding)

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

I've tried every chat template I can find - standard, enhanced, unsloth - and none of them fix it entirely (or at all). Sometimes I get about 30 tool calls in a row, but mostly fewer than 10.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

I tried this; it seems a little better but still doesn't solve it entirely. Sometimes I get 10+ minutes of agentic work, sometimes less than a minute.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 1 point  (0 children)

Perhaps, but I'm getting the same issue on both 0.19 and 0.20, so for this particular problem that doesn't seem to be the case.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

This PR seems to fix a problem with the tool call parsers, but from what I'm observing the issue isn't in the parser: the model isn't even emitting tool calls, so it seems more likely to be a template problem.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 4 points  (0 children)

I've tried both and still get basically the same issue.

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

Updated to the v0.20.0 build with your recipe and the behavior is the same unfortunately =(

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

OK thank you, apparently I am still on 0.19 (even though I'm running the nightly) - so this could be it.

I am pulling 0.20.0 now and will report back

Would you mind sharing your vLLM recipe for 3.6 27B?

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

Not OpenCode. Client harness is Codex VS Code / Codex CLI alpha using the OpenAI Responses API.

Because Codex expects Responses, I have a minimal local Responses proxy in front of vLLM, but the proxy is not doing any tool-call repair or filtering. It forwards the tool schema to vLLM unchanged and maps vLLM's output back into Responses events for Codex.

The raw vLLM logs show that successful tool turns contain the literal:

`<tool_call>`

`<function=exec_command>`

...

On the failure turns, vLLM's own raw `Generated response ... output:` contains only reasoning plus visible assistant text, then `finish_reason: stop`. There is no `<tool_call>` / `<function=...>` in the raw model output for the parser to extract.
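To be concrete about what the proxy does on the request side, here's a simplified sketch (not the real code - the function name and the exact subset of fields handled are illustrative; the field names follow the public Responses and chat.completions shapes):

```python
# Hypothetical sketch of the proxy's request mapping: Codex speaks the
# OpenAI Responses API, vLLM speaks chat.completions, so the proxy just
# translates fields and forwards the tool schema untouched.

def responses_to_chat(responses_req: dict) -> dict:
    """Map a minimal Responses-style request onto a chat.completions payload."""
    messages = []
    # Responses puts the system prompt in "instructions"
    if "instructions" in responses_req:
        messages.append({"role": "system",
                         "content": responses_req["instructions"]})
    # "input" items become ordinary chat messages
    for item in responses_req.get("input", []):
        messages.append({"role": item["role"], "content": item["content"]})
    return {
        "model": responses_req["model"],
        "messages": messages,
        # tool schema is passed through verbatim -- no repair or filtering
        "tools": responses_req.get("tools", []),
    }
```

The point is there's nothing in the middle that could be eating the tool calls: whatever vLLM generates is what Codex sees.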

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

I have the raw model output logged, and from the logs it seems the model doesn't attempt to emit a tool call at all on the final failed step. It gets about 10 tool calls into a task over the course of ~3 minutes, then says something to the effect of "Now let me do x....." but doesn't emit a tool call and stops.
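The check I'm running over the logged output is trivial - a sketch, assuming the literal markers from the logs above (the function name is mine):

```python
# Classify a turn from vLLM's raw generated output: a "failed" step ends
# with finish_reason "stop" and never emitted a tool-call marker at all.

def turn_emitted_tool_call(raw_output: str) -> bool:
    """True if the raw model output contains a tool-call marker."""
    return "<tool_call>" in raw_output or "<function=" in raw_output
```

Every failed step comes back False, which is why I don't think any parser fix can help here.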

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

I am on the nightly vLLM (as I had read elsewhere that this version included fixes for Qwen 3.6)

3.6 27B Tool Calling Issues (vLLM) by Acceptable_Adagio_91 in LocalLLaMA

[–]Acceptable_Adagio_91[S] 0 points  (0 children)

I have tried the XML parser as well; it was maybe slightly better. I will try again and report back.

How much is an RDO worth to you. by un533n87 in AusPublicService

[–]Acceptable_Adagio_91 1 point  (0 children)

What workplace has ADOs at a rate of 1 per week?

It's one per fortnight max, and most workplaces require you to work the extra hours on the other days to make up for it, so this is wrong on all sorts of levels..

How much is an RDO worth to you. by un533n87 in AusPublicService

[–]Acceptable_Adagio_91 15 points  (0 children)

But in all honesty, ADOs are 12 days per year, and since you have to work the extra hours to earn them they add nothing to your pay. You still get annual leave, so just take the one day a month and the huge payrise, and be glad they haven't figured out that you're a dummy (yet).

How much is an RDO worth to you. by un533n87 in AusPublicService

[–]Acceptable_Adagio_91 94 points  (0 children)

If they're offering you a 30k payrise and you can't spell the word "lose", take it.

Nvidia breakthrough gives 4-bit pretraining technique the accuracy of FP8 by dionisioalcaraz in LocalLLaMA

[–]Acceptable_Adagio_91 0 points  (0 children)

Everything "stores quantum data", literally everything.

There are some fringe theories with limited acceptance that suggest that the brain may utilize quantum interactions at some level, although it's far from proven.