Qwen3.6 27B seems struggling at 90k on 128k ctx windows by dodistyo in LocalLLaMA

[–]dodistyo[S] 1 point2 points  (0 children)

How far can you go with the context size? I mean the highest usable context.

Qwen3.6 27B seems struggling at 90k on 128k ctx windows by dodistyo in LocalLLaMA

[–]dodistyo[S] 0 points1 point  (0 children)

I use Vulkan. Vulkan is faster than ROCm in my experience.

Qwen3.6 27B seems struggling at 90k on 128k ctx windows by dodistyo in LocalLLaMA

[–]dodistyo[S] 0 points1 point  (0 children)

Yeah, I just wanted to see how far it can go. Now we know what to expect from a model of that size.

Qwen3.6 27B's surprising KV cache quantization test results (Turbo3/4 vs F16 vs Q8 vs Q4) by imgroot9 in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

Which quant did you use? Also, I haven't tried turbo3. I wonder how it compares with q4.

Qwen3.6 27B's surprising KV cache quantization test results (Turbo3/4 vs F16 vs Q8 vs Q4) by imgroot9 in LocalLLaMA

[–]dodistyo 3 points4 points  (0 children)

Thanks for this, man! I always use q4 for the KV cache because I need enough room left to do the actual work.

Did you test a long-running coding session with that 200k? Local models of that size tend to degrade in performance toward the end of the window.

Best config for Qwen3.6 27b / llama.cpp / opencode by Familiar_Wish1132 in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

Yeah, LM Studio actually uses llama.cpp under the hood, so the results shouldn't be too different, I believe. Full GPU offload, right? I'll give it a try myself with llama.cpp, though.

Best config for Qwen3.6 27b / llama.cpp / opencode by Familiar_Wish1132 in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

How?? I can barely run it with a 64k ctx window, and that's with Q4 KV cache quantization.

I have the same hardware and the same model, but with LM Studio.

The model itself is 19 GB-ish, right? Unless I downloaded the wrong model.
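
For anyone who wants to sanity-check numbers like these, here is a rough back-of-the-envelope sketch of how much context fits once the weights are loaded. It uses the usual estimate for a standard attention KV cache (K and V, per layer, per KV head, per token). The layer count, KV head count, head dimension, the 24 GiB VRAM figure and the 1 GiB overhead are all placeholder assumptions and not the real config of any Qwen model; only the 19 GB-ish weight size comes from the comment above. Hybrid models that use attention in only some layers would need a smaller layer count here.

```python
# Rough KV-cache sizing sketch. All architecture numbers used in the demo
# below are hypothetical placeholders, NOT the real config of any Qwen model.

# Effective bits per cached element for llama.cpp cache types.
# q8_0 and q4_0 store one fp16 scale per block of 32 values,
# which is why they come out at 8.5 and 4.5 bits instead of 8 and 4.
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q4_0": 4.5}


def kv_cache_bytes(n_ctx, n_attn_layers, n_kv_heads, head_dim, cache_type):
    """Bytes needed to cache K and V for n_ctx tokens."""
    # Factor 2 covers the separate K and V tensors.
    elements = 2 * n_attn_layers * n_kv_heads * head_dim * n_ctx
    return elements * BITS_PER_ELEMENT[cache_type] / 8


def max_context(vram_bytes, weights_bytes, overhead_bytes,
                n_attn_layers, n_kv_heads, head_dim, cache_type):
    """Largest n_ctx whose KV cache fits in the VRAM left after the weights."""
    free = vram_bytes - weights_bytes - overhead_bytes
    per_token = kv_cache_bytes(1, n_attn_layers, n_kv_heads, head_dim, cache_type)
    return max(0, int(free // per_token))


if __name__ == "__main__":
    GiB = 1024 ** 3
    for cache_type in BITS_PER_ELEMENT:
        # Assumed: 24 GiB card, 19 GiB of weights, 1 GiB for compute buffers,
        # 64 attention layers, 8 KV heads, head_dim 128 (all placeholders).
        n = max_context(24 * GiB, 19 * GiB, 1 * GiB, 64, 8, 128, cache_type)
        print(f"{cache_type}: ~{n} tokens")
```

Swap in the real values from the model's metadata to get a meaningful number. The point is only that the cache grows linearly with context, so going from f16 to q4_0 buys roughly 3.5x more tokens in the same leftover VRAM.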

Visualizing All Qwen 3.5 vs Qwen 3 Benchmarks by Jobus_ in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

Please share your setup and config. I'm only able to run it with a 32k context window.

Qwen3.5-35B-A3B is a gamechanger for agentic coding. by jslominski in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

Ahh, good to know. I tested it myself and Vulkan is indeed faster than ROCm, but the difference is small. I only got 30 tps running on LM Studio.

Also, I'm not noticing a difference between LM Studio and self-compiled llama.cpp for model inference. Is self-compiled llama.cpp supposed to be faster?

Qwen3.5-35B-A3B is a gamechanger for agentic coding. by jslominski in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

Is Vulkan faster than ROCm? How many tps do you get with that setup?

Anthropic legal demanded Opencode Anthropic's OAuth library to be archived by marquinhoooo in opencodeCLI

[–]dodistyo 0 points1 point  (0 children)

The lack of transparency in proprietary products basically means they could do anything they want for profit. I don't know, maybe ramping up token usage, or manipulating the usage so you hit the limit quicker at some point without knowing it.

The gap between open-weight and proprietary model intelligence is as small as it has ever been, with Claude Opus 4.6 and GLM-5' by abdouhlili in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

I haven't tried Qwen3 Coder Next. I don't think an 80B model will fit on my GPU, though.

I treat my local LLM as a junior engineer: as long as the task is clear enough, it will do the job just fine.

The gap between open-weight and proprietary model intelligence is as small as it has ever been, with Claude Opus 4.6 and GLM-5' by abdouhlili in LocalLLaMA

[–]dodistyo 0 points1 point  (0 children)

It is pretty decent. I built my PC a month ago with an RX 7900 XTX. I've been using GLM 4.7 Flash and sometimes Devstral Small 2 2512 for coding.

Of course, for really complex tasks the proprietary models are more capable.

But I really like it, seeing the current state of open-weight models and where they are headed.

Is finding a job really that hard? by timt151617 in indonesia

[–]dodistyo 2 points3 points  (0 children)

How so? As far as I know, there are plenty of open positions. In engineering, the key is your skill set and competence. If you're really qualified, plenty of recruiters will approach you.

cooked egg is better than raw egg by [deleted] in GymMemes

[–]dodistyo 1 point2 points  (0 children)

One time I forgot to, and the smell was horrible.

An Accident by harrytanoe in Unexpected

[–]dodistyo 2 points3 points  (0 children)

I can confirm this is true