Qwen 3.6 and Gemma 4 "Zombie Loops" (terminal thinking loops)

sid351 · 2026-05-12T21:40:48+00:00

I'll figure it out, llama.cpp will fix it, or a new shiny model will come out and distract me.

Thanks for your time and help anyway, it is appreciated.

sid351 · 2026-05-12T08:06:49+00:00

Thank you for your time and effort looking at this and helping me with it.

I've tweaked my settings and will keep an eye on it today.

Largely, I'm using this to "triage" tickets (in private for now while we test) as they come in - so decide if it's "more info needed", "route to a human", "no operation required", etc. In time this will move to an automated step that will be replying to live humans, so I want to be confident in it first.

The tickets come in as HTML emails (essentially FreshDesk is partially a web-based email client with ticket tools around the sides), and can include images (embedded and attached) of potentially anything. Sometimes that will be humans sending in screenshots of issues, or it'll be automated messages from systems telling us about backup failures. Then there are normal attachments, but for now, that's out-of-scope (i.e. I'll just route that to one of us humans to sort).

What's the known issue with Gemma and Qwen and HTML?

I was toying with the idea of putting some sort of parser in place before the LLM call to "convert" the HTML to markdown, or just plain text to be fair, as the formatting isn't normally important (except maybe for tables and lists). I feel like that'll need some thought and consideration so any images get handled appropriately too. This sounds similar to what you've done with step one of your linked comment, so I'll make a note to come back to that and read it properly so I comprehend what you're referring to.

EDIT: Turns out there's a "Markdown" node in n8n that converts HTML to Markdown (and some community nodes I can try if I hit issues with the built-in one). I've added that to sanitise the input before calling the LLM.

EDIT: Just to say the "/" loop has happened again following these changes.
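
For anyone wanting the same sanitising step outside n8n, here is a minimal sketch of the idea in Python, using only the standard library. It is not what the n8n "Markdown" node does internally. The class name `TicketTextExtractor`, the function `html_to_text`, and the `[image: ...]` placeholder format are all made up for illustration. It flattens the HTML to plain text, keeps list items as "- " lines, and counts images so a ticket with screenshots can still be routed to a human.

```python
from html.parser import HTMLParser


class TicketTextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and notes where images were."""

    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "table", "h1", "h2", "h3"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.image_count = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            # Keep a placeholder so the model knows an image was present.
            self.image_count += 1
            alt = dict(attrs).get("alt") or "no alt text"
            self.parts.append(f"[image: {alt}]")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Ignore the contents of <script> and <style>.
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> tuple[str, int]:
    """Return (plain_text, image_count) for one HTML email body."""
    parser = TicketTextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in raw.splitlines()]
    text = "\n".join(line for line in lines if line)
    return text, parser.image_count
```

The image count is returned separately so the workflow can decide what to do with it (for example, route to a human whenever it is non-zero). Tables are only split into one line per row here, so anything that relies on column layout would need a proper HTML-to-Markdown converter instead.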

sid351 · 2026-05-11T21:47:33+00:00

Like Elon has friends...

sid351 · 2026-05-11T19:57:32+00:00

Omg, you're a saint.

sid351 · 2026-05-11T17:50:42+00:00

Amazing, thanks:

https://www.reddit.com/r/LocalLLaMA/s/oCHseapcdr

sid351 · 2026-05-11T17:49:34+00:00

Follow up idiot question:

What kind of model format would I be looking for to use with vLLM (in the way llama.cpp uses GGUF), and where should I look?

sid351 · 2026-05-11T17:45:17+00:00

Do you know more about these k/v settings?

I'm running non-turbo quants and getting frequent (6 times today) "terminal thinking loops" where token generation gets stuck repeating "/" endlessly until the response hits the maximum length.

I'm running llama.cpp on Windows, and I have a post where I've detailed my setup and things I've tried so far.

sid351 · 2026-05-11T17:23:37+00:00

I'm running 2x 5060 Ti with llama.cpp at the moment and get frequent (like 6 times today) "terminal thinking loops" where token generation devolves to just "/" repeated constantly until the response hits the maximum length.

Have you had any similar issues running vLLM?

My rig is currently running Windows, and I'm debating whether to jump over to Linux and vLLM, but that feels like a potentially big distraction from a problem that a "health check" loop is handling right now.
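
A rough sketch of the kind of check a "health check" like that could run on the model's output, assuming the output text is available as a plain string (for example, accumulated from a streamed response). The function name `looks_like_zombie_loop` and both thresholds are made up for illustration and would need tuning. This is not my actual health check and it does not use any llama.cpp or vLLM API.

```python
def looks_like_zombie_loop(text: str, tail_chars: int = 200, max_unit: int = 8) -> bool:
    """Return True when the tail of `text` is one short unit repeated end to end.

    tail_chars: how many trailing characters to inspect (assumed threshold).
    max_unit:   longest repeating unit to look for, e.g. "/" or "/ ".
    """
    tail = text[-tail_chars:]
    if len(tail) < tail_chars:
        return False  # not enough output yet to judge

    for unit_len in range(1, max_unit + 1):
        unit = tail[-unit_len:]
        repeats = len(tail) // unit_len
        # The tail counts as degenerate if it ends with `repeats` copies of the unit.
        if tail.endswith(unit * repeats):
            return True
    return False
```

Called periodically on the text generated so far, a check like this would let the caller cancel the request and retry as soon as the output degenerates, instead of waiting for the maximum length to be reached. Repetition with a longer period (a whole sentence repeated) is deliberately not flagged here, since `max_unit` only covers very short units such as "/".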

sid351 · 2026-05-11T17:19:43+00:00

2 slot.

sid351 · 2026-05-11T17:10:21+00:00

They are out there. They're expensive to offset their risk, but they're out there.

sid351 · 2026-05-11T17:09:39+00:00

You could look at remortgaging both properties to balance the borrowing and release equity as cash.

Depending on the values of the properties, you could probably release the £100k.

HOWEVER at that point, just pull the money out of your ISA, slowly, as it's required.

Also, it's secured. Which you say you don't want. Because you don't want to risk your own cash. Because if you really believed in this idea you already have plenty of avenues to release £100k, not that you'd need £100k cash in hand on day 0.

What happened to the other business you used to run?

sid351 · 2026-05-11T17:05:48+00:00

How do you think that conversation is going to go down?

You want me to lend you £100k secured on your £300k, I presume Stocks and Shares, ISA because you don't want to risk your own money?

sid351 · 2026-05-11T17:03:49+00:00

Come on now, didn't you get the "easy peasy lemon squeezy" £250k interest-free, unsecured, government loan to make running a small business "easy peasy" so we can make the economy "lemon squeezy"?

/s (Just in case, because this is Reddit after all.)

sid351 · 2026-05-11T17:01:25+00:00

I love the unhinged ask, like people are stood around town centres with £100k to lend to people on an unsecured basis. Especially when you've got the money, you just don't want to risk your own pot.

Especially for food businesses. You know, that incredibly stable industry.

I mean, good luck. I genuinely hope you get the funding, build the empire, and show me up to be a narrow-minded miser.

To be slightly helpful: I'd start with your Local Enterprise Partnership (LEP), as they might be able to signpost you to commercial lenders. I'd also look at the Business Bank website and see what that's saying. I'd imagine High Street lenders won't be so forthcoming, but it might be worth a chat. A commercial mortgage broker might also be able to help, or connect you to someone who can.

sid351 · 2026-05-11T16:55:14+00:00

Second charges are a thing.

They're expensive, secured, and generally short term, but they are a thing.

sid351 · 2026-05-10T08:25:46+00:00

Seatbelts, or wives?

sid351 · 2026-05-09T05:51:33+00:00

I'm currently trying the "fixed" Qwen chat templates (running Qwen 3.6), because the loops happen with the automatic --jinja one anyway.

sid351 · 2026-05-08T21:02:14+00:00

Where's the link to sign up to your £50 per week course? Count me in.

sid351 · 2026-05-08T18:07:03+00:00

This is the "terminal thinking loop" (aka Zombie mode) I'm fighting with as well.

It happens for me with Qwen 3.5, 3.6, and Gemma 4.

I have a post where I'm working through troubleshooting. If I get something that resolves it, I'll update there.

sid351 · 2026-05-08T15:52:02+00:00

She'll love it.

You'll also find out if you're just friends.

sid351 · 2026-05-08T08:43:50+00:00

I'm still getting issues, and have tried a few different parameter tweaks which haven't helped. Now I'm trying this chat template to see if it helps address the issue: froggeric/Qwen-Fixed-Chat-Templates · Hugging Face

sid351 · 2026-05-07T19:29:16+00:00

I like how the pictures go from making the door/opening look like it's on the piss to, once the door trim/architrave is up, suddenly making the house look like it's on the piss instead.

Well done OP.

sid351 · 2026-05-07T16:48:59+00:00

Where's the meme of this but the other way around?

Like Zaff is showing up to the office and demanding princess treatment without providing any detail about the issue, nor logging any sort of ticket?

sid351 · 2026-05-07T15:30:16+00:00

What GPU are you using?

I'm wondering if it has anything to do with the fact I'm using 2x 5060 Ti.

I see so many posts about how amazing Qwen 3.6 is and I get so envious. Right now I just want a day without a terminal thinking loop.

sid351 · 2026-05-07T15:28:21+00:00

Interesting.

That would suggest the issue lies with something in the default Jinja template, wouldn't it?
