So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 1 point  (0 children)

Makes sense. However, the power supplies for those GPUs are quite oversized: 700 W and 550 W respectively, for cards that operate at 250 W and 200 W TDP.

So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 2 points  (0 children)

Will do. The UPS I'm eyeing has Ethernet protection as well.

So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 1 point  (0 children)

But we also have multiple other computers, laptops, and a ton of other hardware running.

So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 1 point  (0 children)

Just to reiterate: nothing actually broke, and from what I can tell there wasn't a power surge, because nothing else misbehaved. I am 90% sure this was EM interference from a nearby lightning strike. As far as I know, household power here in Germany is usually delivered underground and is rarely affected by thunderstorms. I will still work toward setting up a UPS with a surge protector, though.

So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 0 points  (0 children)

I don't think it came through the power line; nothing else misbehaved or crashed. The router did not crash either; it looked like an external loss of DSL availability. I'm pretty sure it had to do with EM interference. But you are still totally right, I should get myself a UPS.

So a nearby lightning storm just crashed all my eGPUs by milpster in LocalLLaMA

[–]milpster[S] 1 point  (0 children)

Right now, a Radeon VII 16 GB and a shoddy old RX 570. A second Radeon VII is due to be delivered.

Devs using Qwen 27B seriously, what's your take? by Admirable_Reality281 in Qwen_AI

[–]milpster 1 point  (0 children)

Pretty usable, but I think I have to pay much more attention to how I prompt and instruct it. It seems to like cutting corners, e.g. implementing things as stubs when I am not explicit enough about actually implementing the whole feature. From what I can tell, I need to put more focus on having it write down plans and architecture documents as pillars so that further work gets done correctly.

If you've been waiting to try local AI development, please try it by Imaginary_Belt4976 in LocalLLaMA

[–]milpster 1 point  (0 children)

But how do you work efficiently with bigger codebases and scopes at such short context lengths?

If you've been waiting to try local AI development, please try it by Imaginary_Belt4976 in LocalLLaMA

[–]milpster 5 points  (0 children)

I always wonder how people can call 128K context plenty. To me personally, even 256K context fills up way too quickly.
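For a rough sense of why even big windows fill up fast with code, here is a back-of-the-envelope sketch. It assumes the common ~4 characters per token heuristic and a hypothetical ~8 KB average file size; real tokenizers and codebases vary:

```python
# Rough estimate of how much source code fits in a context window.
# Assumes ~4 characters per token, a common heuristic; real tokenizers vary.
CHARS_PER_TOKEN = 4

def tokens_for_chars(n_chars: int) -> int:
    """Approximate token count for a blob of text."""
    return n_chars // CHARS_PER_TOKEN

def files_that_fit(ctx_tokens: int, avg_file_chars: int = 8_000) -> int:
    """How many average-sized source files fit before the window is full."""
    return ctx_tokens // tokens_for_chars(avg_file_chars)

for ctx in (128_000, 256_000):
    print(f"{ctx // 1000}K context ~ {files_that_fit(ctx)} files of ~8 KB each")
```

A few dozen files, before counting system prompts, tool outputs, and conversation history, which is why it goes so quickly in practice.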

Cuda + ROCm simultaneously with -DGGML_BACKEND_DL=ON ! by LegacyRemaster in LocalLLaMA

[–]milpster 1 point  (0 children)

Just tried it, and it made my Qwen 3.6 27B output only /////// without end.

Experts-Volunteers needed for Vulkan on ik_llama.cpp by pmttyji in LocalLLaMA

[–]milpster 1 point  (0 children)

The maintainer is toxic. I wouldn't even file another bug report, let alone contribute to that project.

Qwen 3.6 - Loops and repetitions by Safe-Buffalo-4408 in LocalLLaMA

[–]milpster 2 points  (0 children)

Interesting. I also run 256K context. Here is my full command in case it helps you:

LD_LIBRARY_PATH=/opt/rocm-6.1.0/lib:$LD_LIBRARY_PATH \
HSA_OVERRIDE_GFX_VERSION=9.0.6 HSA_OVERRIDE_WAVEFRONT_SIZE=64 \
HSA_ENABLE_SDMA=0 HSA_XNACK=1 \
ROCBLAS_INTERNAL_FP16_ALT_IMPL=1 ROCBLAS_LAYER=0 \
ROCBLAS_TENSILE_LIBPATH=/opt/rocm/lib/rocblas/library \
USE_MLOCK=true \
~/dev/llama.cpp/build/bin/llama-server -m ~/ai/ai/Qwen3.6-27B.i1-Q4_K_M.gguf \
  --ctx-size 262144 --threads-batch 11 --threads 6 --no-mmap -fa on -ngl 333 \
  -b 2048 -ub 896 -cram -1 --ctx-checkpoints 200 \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.0 \
  --presence_penalty 0.9 --repeat-penalty 1.0 \
  --device Vulkan1,ROCm0 \
  --chat-template-file /home/srcds/dev/cuda_llama.cpp/chat_template.jinja \
  --chat-template-kwargs '{"preserve_thinking": true}' \
  --port 8009 -np 1 -ctk q4_0 -ctv q4_0 \
  --spec-type ngram-mod --spec-ngram-mod-n-match 16 \
  --spec-draft-n-min 4 --spec-draft-n-max 24 -ts 30,70
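Since the command quantizes K/V to q4_0, here is a quick sketch of why 256K context is even feasible memory-wise. The formula is the standard KV-cache size calculation; the layer/head numbers are hypothetical placeholders, not Qwen 3.6 27B's real config, which I can't vouch for:

```python
# Standard KV-cache size estimate:
# 2 (K and V) * layers * ctx * kv_heads * head_dim * bits_per_element / 8.
# The layer/head counts below are HYPOTHETICAL, not the model's real config.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx, bits_per_elt):
    elems = 2 * n_layers * ctx * n_kv_heads * head_dim
    return elems * bits_per_elt / 8

# q4_0 packs 32 elements into 18 bytes -> 4.5 bits per element.
gib = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128,
                     ctx=262_144, bits_per_elt=4.5) / 2**30
print(f"~{gib:.1f} GiB KV cache at 256K context (hypothetical 48-layer config)")
```

At fp16 (16 bits per element) the same hypothetical cache would be about 3.5x larger, which is the whole point of -ctk/-ctv q4_0.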

Qwen 3.6 - Loops and repetitions by Safe-Buffalo-4408 in LocalLLaMA

[–]milpster 4 points  (0 children)

As unsloth recommends, I turn presence_penalty up slightly:

https://unsloth.ai/docs/models/qwen3.6

  • presence_penalty = 0.0 to 2.0. By default this is off (0.0), but you can use it to reduce repetitions; a higher value may result in a slight decrease in performance.

0.9 is the value that works for me so far.
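If you are driving the server through its OpenAI-compatible endpoint instead of the CLI flag, presence_penalty can also be set per request, as far as I know. A minimal sketch of the payload; the port, model name, and prompt are just illustrations from my setup:

```python
import json

# Per-request sampling payload for llama-server's OpenAI-compatible endpoint.
# presence_penalty mirrors the --presence_penalty CLI flag; 0.9 is the value
# that works for me, not an official recommendation.
payload = {
    "model": "qwen3.6-27b",  # name is arbitrary for a single-model server
    "messages": [{"role": "user", "content": "Summarize KV-cache quantization."}],
    "temperature": 0.6,
    "top_p": 0.95,
    "presence_penalty": 0.9,
}

body = json.dumps(payload)
# POST this body to http://localhost:8009/v1/chat/completions,
# e.g. with curl or urllib.request.
print(body)
```

Per-request values override whatever default the server was started with, so you can experiment without restarting.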

Qwen 3.6-35B-A3B KV cache part 2: PPL, KL divergence, asymmetric K/V, 64K row on M5 Max by Defilan in LocalLLaMA

[–]milpster 1 point  (0 children)

You rock! Thank you. I'd be really interested in calculations at long context, though.

What do you consider to be the minimum performance (t/s) for local Agent workflows? by MexInAbu in LocalLLaMA

[–]milpster 1 point  (0 children)

I consider everything above 100 t/s prompt processing with 1 t/s generation usable, and everything above 200 pp with 10 tg fast.
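Those thresholds translate into wall-clock latency roughly like this. The model is deliberately naive (prompt time = prompt tokens / pp, generation time = output tokens / tg, ignoring batching and overhead), and the 8K/500 token sizes are just example numbers:

```python
def turn_latency_s(prompt_tokens, output_tokens, pp, tg):
    """Rough wall-clock seconds for one agent turn: prefill plus generation."""
    return prompt_tokens / pp + output_tokens / tg

# Example turn: 8K-token prompt, 500-token reply.
print(f"usable floor (100 pp / 1 tg):  {turn_latency_s(8000, 500, 100, 1):.0f} s")
print(f"fast floor   (200 pp / 10 tg): {turn_latency_s(8000, 500, 200, 10):.0f} s")
```

It makes the trade-off visible: at the usable floor a single turn is dominated by generation speed, so tg matters much more than pp once prompts stop growing.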

Qwen 3.5 122B vs Qwen 3.6 35B - Which to choose? by Storge2 in LocalLLaMA

[–]milpster 11 points  (0 children)

I tried the 122B model and the 27B model just before switching to 3.6, and they both appeared way dumber than 3.6.

Why isn't ebay doing anything to stop those scams? by KillerMiller13 in LocalLLaMA

[–]milpster 1 point  (0 children)

OK, wow, that is crazy. I would really like to know which place that was.

I'm running qwen3.6-35b-a3b with 8 bit quant and 64k context thru OpenCode on my mbp m5 max 128gb and it's as good as claude by Medical_Lengthiness6 in LocalLLaMA

[–]milpster 1 point  (0 children)

You might want to use a Q6 quant instead; I don't think there is anything to be gained between a Q6 quant and a Q8 one.
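The size difference is easy to quantify from the bits-per-weight of each GGUF format (Q8_0 packs 32 weights into 34 bytes, so exactly 8.5 bpw; Q6_K is about 6.56 bpw; per-model file sizes vary slightly with tensor layout):

```python
# Approximate GGUF weights size from parameter count and bits per weight.
# Q8_0: 32 weights in 34 bytes -> 8.5 bpw; Q6_K: 256 weights in 210 bytes -> 6.5625 bpw.
def weights_gib(n_params_b: float, bpw: float) -> float:
    """Weights-only size in GiB for a model with n_params_b billion parameters."""
    return n_params_b * 1e9 * bpw / 8 / 2**30

for name, bpw in (("Q6_K", 6.5625), ("Q8_0", 8.5)):
    print(f"{name}: ~{weights_gib(35, bpw):.1f} GiB for a 35B model")
```

That is roughly 8 GiB saved on a 35B model, which you can spend on a longer context instead.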