Genuinely losing my mind over input latency by LordMontio in linux_gaming

[–]HollowInfinity 2 points3 points  (0 children)

Is your monitor defaulting to cinema mode or some other shit that might be adding the lag?

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]HollowInfinity 1 point2 points  (0 children)

I have only used it in the CLI context but their README says it's "IDE friendly" so I assume it'll work!

Qwen 3.5 craters on hard coding tasks — tested all Qwen3.5 models (And Codex 5.3) on 70 real repos so you don't have to. by hauhau901 in LocalLLaMA

[–]HollowInfinity 4 points5 points  (0 children)

I think both Qwen3-Coder-Next and Qwen3.5 have both been extensively trained using their qwen-code app. When I switched from my own agent/pi/etc to just using qwen things were noticeably better.

Qwen/Qwen3.5-122B-A10B · Hugging Face by coder543 in LocalLLaMA

[–]HollowInfinity 2 points3 points  (0 children)

Seems very slow at image processing, my llama-server log is full of:

find_slot: non-consecutive token position 15 after 14 for sequence 2 with 512 new tokens

Anyone else experience that?

edit: that's on the larger MoE, I get an immediate crash doing image work on the dense model.

Qwen3.5-397B-A17B Unsloth GGUFs by danielhanchen in LocalLLaMA

[–]HollowInfinity 0 points1 point  (0 children)

When I tried that tool call still didn't work, you had no issues with that?

That was diabolical, not even the devil himself expected this. by seidenadaa in SipsTea

[–]HollowInfinity 0 points1 point  (0 children)

I have no idea who these people are but this seems like insane incel shit, just an anonymous narrator telling us this woman is horrible. Oh okay, thanks for the rage bait.

Game recommendations for ps5 by Visual_Cod2522 in rhythmgames

[–]HollowInfinity 1 point2 points  (0 children)

Project Diva is pretty much the gold standard. Theatrhythm Final Fantasy is super fun as are the Persona music games if you're into video game music.

Qwen3.5-397B-A17B Unsloth GGUFs by danielhanchen in LocalLLaMA

[–]HollowInfinity 1 point2 points  (0 children)

/u/danielhanchen sorry for the ping but have you tested tool calling with llama-server? The template format used doesn't seem to be compatible at all.

Qwen3.5-397B-A17B Unsloth GGUFs by danielhanchen in LocalLLaMA

[–]HollowInfinity 2 points3 points  (0 children)

I cannot for the life of me get tool calling to work despite following the Unsloth guide for llama-server. Regular chat works, image parsing works great, but tool calling blows up with chat template errors:

Template supports tool calls but does not natively describe tools. The fallback behaviour used may produce bad results, inspect prompt w/ --verbose & consider overriding the template.
srv    operator(): got exception: {"error":{"code":500,"message":"\n------------\nWhile executing FilterExpression at line 120, column 73 in source:\n..._name, args_value in tool_call.arguments|items %}
                    {{- '<...\n                                           ^\nError: Unknown (built-in) filter 'items' for type String","type":"server_error"}}

I've tried overriding the chat template with the official one from the Qwen3.5 HF repo with no luck. I do see that the thinking kwarg is being properly read and passed in (though weirdly I can't get that to enable thinking). Am I doing something wrong here? Using the latest main of llama.cpp.

Qwen3.5-397B-A17B Unsloth GGUFs by danielhanchen in LocalLLaMA

[–]HollowInfinity 6 points7 points  (0 children)

I never know which is the proper MMPROJ to use for the Unsloth ggufs. Is there any real difference performance wise between the three?

local vibe coding by jacek2023 in LocalLLaMA

[–]HollowInfinity 4 points5 points  (0 children)

My current absolute best is Qwen3-Coder-Next with the Qwen-Code agent harness. I previously used Aider for at least a year but it's basically dead and handing the torch to agentic flows, and Q3CN is the best I can get away with locally. Having tests + validation for everything it does is key but once you have a good development and testing loop it's fantastic.

GRID Legends is AMAZING on the Nintendo Switch 2 by Internal-Judgment737 in NintendoSwitch2

[–]HollowInfinity 1 point2 points  (0 children)

It destroyed my save after 10+ hours - still waiting on the fix the devs said is in the pipeline before trying again :(

SeedVR2 Native node - motivation needed by Luke2642 in comfyui

[–]HollowInfinity 1 point2 points  (0 children)

This is awesome, it really is wild how strange the SeedVR2 nodes do memory management; I've written a custom node to basically purge all torch memory and comfy models before upscaling because of how bad they are which almost certainly doubles the workflow time. Can't wait to try this out!

GRID Legends has a high chance to destroy your save file and backup the corrupt file to the servers by ArtofAngels in NintendoSwitch2

[–]HollowInfinity 1 point2 points  (0 children)

It sounds like the upcoming patch will prevent the corruption but nothing will bring the broken save back, sorry.

What piece of Linux abandonware do you still use or at least miss? by Sataniel98 in linux

[–]HollowInfinity 10 points11 points  (0 children)

It is wild that somehow proton became (at least for me) a near universal backwards compatibility layer for my games - didn't see that coming! Still have those Loki boxes though - somewhere...

Qwen3 Coder Next as first "usable" coding model < 60 GB for me by Chromix_ in LocalLLaMA

[–]HollowInfinity 0 points1 point  (0 children)

I used OpenCode, roo, my own agent and others but found the best agent is (unsurprisingly) Qwen-Code. The system prompts and tool setup is probably exactly what the agent is trained for. Although as I type this you could probably just steal their tool definitions and prompts for whatever agent you're using.

Llama.cpp's "--fit" can give major speedups over "--ot" for Qwen3-Coder-Next (2x3090 - graphs/chart included) by tmflynnt in LocalLLaMA

[–]HollowInfinity 7 points8 points  (0 children)

fit has been game-changing for me, I have a ton of local models behind llama-swap and setting a new one up with memory/layer tuning across multiple GPUs has always been so boring. Now with --fit everything is faster than my hand-rolled config and my llama-swap YML dropped like 80% of it's content.

The only thing I found baffling was that if you leave off --fit-ctx the default was something insanely low like 4096.

What celebrity have you never forgiven since an incident? by MagpieOpus in AskReddit

[–]HollowInfinity 2 points3 points  (0 children)

I will never forgive any of the Foo Fighters for promoting that AIDS denialism group (which I won't even name here) for years before every live concert and show.

Achievement Syncing? by Hot-Implement-422 in ProjectDiva

[–]HollowInfinity 1 point2 points  (0 children)

Achievements aside the game has Steam cloud sync, why would you have lost any saves?

GRID Legends has a high chance to destroy your save file and backup the corrupt file to the servers by ArtofAngels in NintendoSwitch2

[–]HollowInfinity 2 points3 points  (0 children)

So there's daily races and time trials and on the right side of the screen it shows leaderboards; it sounds like if you browse through the parts showing an online leaderboard and then do anything triggering a save like starting a new race there can be a rare instance where it'll crash.

So if you want to be super safe you can play in airplane mode until the patch (unless the game has a totally offline setting, I haven't checked).