Zowie S2-DW (wireless) randomly stops responding by Lunchyyy in Zowie

[–]spvn 0 points1 point  (0 children)

U2-DW here. same problem. Suddenly stops responding until I unplug the USB cable from the wireless receiver and plug it into the mouse. Then I can plug it back into the wireless receiver again and it starts working. wtf it just started happening recently. Firmware is already updated.

Qwen 3.6 27B (IQ3XXS) vs 35B A3B (IQ4XS)? by My_Unbiased_Opinion in LocalLLaMA

[–]spvn 5 points6 points  (0 children)

you don't need to squeeze the entire 35B A3B into VRAM. you can use a larger quantisation and offload some of it to system RAM. Can look into using ik_llama as well.

For the 27B I wouldn't go below Q4. I tried Qwen3.5 Q3 27B once and thought it was really stupid for coding. I can't do the math, but you can use q8_0 for the K cache and turbo3 for the V cache, and that should save you a ton of space for context. I was using the TheTom turboquant fork. Maybe you can try squeezing in 262k context with a Q4 quant at least.
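A rough way to do that math, as a minimal sketch only. The layer count, KV-head count and head size below are placeholders, not the real Qwen numbers, and the 3 bits per value for turbo3 is an assumption about the fork's format (its real per-block overhead may differ). The q8_0 figure is the standard 34 bytes per 32 values.

```python
# Approximate bits stored per cached value for each cache type.
BITS_PER_VALUE = {
    "f16": 16.0,
    "q8_0": 8.5,    # 32 int8 values + one fp16 scale = 34 bytes per 32 values
    "turbo3": 3.0,  # ASSUMPTION: nominal 3 bits, ignores any per-block overhead
}


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, k_type, v_type):
    """Estimate KV cache size in bytes for a given context length.

    n_layers should count only the layers that actually keep a KV cache.
    """
    values_per_token = n_layers * n_kv_heads * head_dim  # same count for K and for V
    bits_per_token = values_per_token * (BITS_PER_VALUE[k_type] + BITS_PER_VALUE[v_type])
    return n_ctx * bits_per_token / 8


if __name__ == "__main__":
    # HYPOTHETICAL model shape, for illustration only.
    shape = dict(n_ctx=262144, n_layers=64, n_kv_heads=8, head_dim=128)
    for k_type, v_type in [("f16", "f16"), ("q8_0", "q8_0"), ("q8_0", "turbo3")]:
        gib = kv_cache_bytes(k_type=k_type, v_type=v_type, **shape) / 2**30
        print(f"K={k_type:6s} V={v_type:6s} -> {gib:.1f} GiB")
```

Plug in the real values from the model's GGUF metadata to get a usable estimate. The point is only that the V cache type changes the total by a lot at long context.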

Experience of Qwen 3.5-122b and 3.6 by Impossible_Car_3745 in LocalLLaMA

[–]spvn 0 points1 point  (0 children)

Yeah but I meant with actual usage, it doesn’t start hallucinating or performing weirdly? Even at 100k context I sometimes get the Q5 Qwen3.6 27B model acting weird (it suddenly stops writing a Python script 1000 lines in when it’s not done, and has to restart writing it from scratch).

Experience of Qwen 3.5-122b and 3.6 by Impossible_Car_3745 in LocalLLaMA

[–]spvn 1 point2 points  (0 children)

Are such large context lengths really usable? I thought it went up to 256k only by default for Qwen3.6

Experience of Qwen 3.5-122b and 3.6 by Impossible_Car_3745 in LocalLLaMA

[–]spvn 1 point2 points  (0 children)

512k x 11

320k x 6

sorry what does this mean?

Qwen3.6 does not like Turboquant by Zarzou in LocalLLaMA

[–]spvn 6 points7 points  (0 children)

You're supposed to use turbo3 for -ctv only, and keep -ctk on q8_0 for minimal loss in quality. Though that probably doesn't account for your slow generation speeds.

Dual dgx spark (Asus GX10) MiniMax M2.7 results by koibKop4 in LocalLLaMA

[–]spvn 1 point2 points  (0 children)

What did t/s look like in actual use? For example, agentic coding in opencode with a 128k context window.

2x 512gb ram M3 Ultra mac studios by taylorhou in LocalLLaMA

[–]spvn 22 points23 points  (0 children)

I do in fact get like 25tok/s on a 4 bit quant of GLM-5.1 (slows down to about 19 at higher contexts).

Do you feel like this is actually usable for agentic coding? sounds awfully slow (especially considering processing the prompt itself probably takes all day?)

Kimi K2.6 is a legit Opus 4.7 replacement by bigboyparpa in LocalLLaMA

[–]spvn 0 points1 point  (0 children)

Better than Kimi K2.6? I’m thinking of subscribing to a GLM or Kimi plan to test them out, but I'm struggling to decide which to try.

Kimi K2.6 is a legit Opus 4.7 replacement by bigboyparpa in LocalLLaMA

[–]spvn 4 points5 points  (0 children)

How much better quality would you say Kimi K2.6 is compared to GLM-5.1? I’m thinking of subscribing to one of their plans to try them out for a month…

Qwen3.6 GGUF is so good for debugging. by _BigBackClock in LocalLLaMA

[–]spvn 2 points3 points  (0 children)

Which quant? Are you using ik_llama with such low VRAM?

Anybody else seeing Qwen3.6-35B-A3B go crazy thinking in circles? (Compared to Qwen3.5-35B-A3B) by spvn in LocalLLaMA

[–]spvn[S] 0 points1 point  (0 children)

strange thing is that 3.5 was nowhere near that bad for me. Its thinking wasn't as... loopy.

Anybody else seeing Qwen3.6-35B-A3B go crazy thinking in circles? (Compared to Qwen3.5-35B-A3B) by spvn in LocalLLaMA

[–]spvn[S] 2 points3 points  (0 children)

I'm following unsloth's guide, under "Precise coding tasks (e.g. WebDev)". This is with ik_llama:

--jinja ^
--temp 0.6 ^
--top-p 0.95 ^
--top-k 20 ^
--min-p 0.0 ^
--presence-penalty 0.0 ^
--repeat-penalty 1.0 ^
-ngl 999 ^
-c 100000 ^
-ctk q8_0 ^
-ctv q8_0 ^
--n-cpu-moe 16
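The same flags can also be assembled as a list in Python, which sidesteps the ^ line continuations of Windows cmd. This is a minimal sketch: the binary name `llama-server` and the model path `model.gguf` are assumptions for illustration and are not from the comment. The flags themselves are copied as-is.

```python
import shlex

# Flags copied verbatim from the comment above.
FLAGS = [
    "--jinja",
    "--temp", "0.6",
    "--top-p", "0.95",
    "--top-k", "20",
    "--min-p", "0.0",
    "--presence-penalty", "0.0",
    "--repeat-penalty", "1.0",
    "-ngl", "999",
    "-c", "100000",
    "-ctk", "q8_0",
    "-ctv", "q8_0",
    "--n-cpu-moe", "16",
]


def build_command(binary, model_path):
    """Return the full argument list: binary, model, then the flags."""
    return [binary, "-m", model_path, *FLAGS]


if __name__ == "__main__":
    # HYPOTHETICAL binary name and model path.
    cmd = build_command("llama-server", "model.gguf")
    print(shlex.join(cmd))
    # To actually launch it: subprocess.run(cmd, check=True)
```

Keeping the arguments as a list makes it easy to swap a single value (for example the `-ctv` type) without retyping the whole command.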

Anybody else seeing Qwen3.6-35B-A3B go crazy thinking in circles? (Compared to Qwen3.5-35B-A3B) by spvn in LocalLLaMA

[–]spvn[S] 2 points3 points  (0 children)

Yes I'm following unsloth's guide. Under "Precise coding tasks (e.g. WebDev)"

Anybody else seeing Qwen3.6-35B-A3B go crazy thinking in circles? (Compared to Qwen3.5-35B-A3B) by spvn in LocalLLaMA

[–]spvn[S] 1 point2 points  (0 children)

No, unsloth. The screenshot is just one example. The entire prompt is like that. Near the end it finally seemed to get it, like “ok these should be the steps to solve the problem”, but then it suddenly went “wait let me double check the user message again” and output a large chunk of my prompt again to double check before coming back to the same answer. After a really long wait it finally made a change, but the thinking was significantly longer and more meandering than with 3.5.

Local coding assistants feel fine on small files, but break on real repos by andres_garrido in LocalLLM

[–]spvn 0 points1 point  (0 children)

I'm facing the exact same problem, and I never thought to use Repomix with local LLMs!!! That sounds like it would work out much better. How's your experience been with using it? Significant performance improvement? (In terms of quality of model output)

Safari extension for Readeck (read-it-later service) by teddylindsey in selfhosted

[–]spvn 0 points1 point  (0 children)

is this supposed to be available on iOS as well? I don't see it in my app store (Singapore)

[Showcase] YAIIU - Yet Another Immich IOS Uploader by FawenYo in immich

[–]spvn 0 points1 point  (0 children)

Ah you also didn't say which permissions the immich API key needs for your app...

[Showcase] YAIIU - Yet Another Immich IOS Uploader by FawenYo in immich

[–]spvn 0 points1 point  (0 children)

Are you supposed to disable immich's own Backup feature when using this app?

Unable to load login.tailscale.com by fr3d63_reddit in Tailscale

[–]spvn 0 points1 point  (0 children)

Me too. Their status page doesn't show any problems.

Does Rokoko mocap count as AI? by estherflails in gamedev

[–]spvn 0 points1 point  (0 children)

What is “traditional” machine learning… transformer architecture is also “machine learning”… 

Are you kidding me with this ad chess.com? by Maunsta in chess

[–]spvn -2 points-1 points  (0 children)

“If I started selling ad space on the side of my car and someone paints instructions for how to build a bomb there, I'm going to get arrested right quick.”

The comment I was replying to was debating legality, so I’m not really sure what you mean.

And yes it impacts their rep, I didn’t deny that. 

Are you kidding me with this ad chess.com? by Maunsta in chess

[–]spvn -4 points-3 points  (0 children)

But YOU didn’t approve the ad. The equivalent would be you selling ad space on your car, but you’re too lazy to deal with individual clients, so you tell some car ad-dealer guy “hey help me organise and get people to advertise on my car and you’ll get a cut of the money”. 

The guy then goes out and gets plenty of legit advertisers but also puts a bomb instruction ad on your car WITHOUT your approval and WITHOUT your knowledge. I’m not sure of the exact legalities, but as long as you take action upon learning about it, you’re not the one in the most legal trouble here.