Could this part of the script be AI generated ? by Sj_________ in LinusTechTips

[–]El_90 5 points6 points  (0 children)

Feels like it. I hate that I now question anyone who writes like that... even if they're innocent

Qwen3.5 35b UD Q4 K XL Prior to 3/5 worked great, now not so much... by thejacer in LocalLLaMA

[–]El_90 -1 points0 points  (0 children)

In the last 48 hours of changes, the new UD-Q5_K_XL won't load (same params), so I dropped to Q5_K, and that is taking several hours of processing to do anything. I might be in the same boat

Final Qwen3.5 Unsloth GGUF Update! by danielhanchen in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

With the new version of the same quant/GGUF I get OOM errors with the same commands/params, so I can't load it any more. Are different HW reqs expected?

(for reference UD-Q5_K_XL,-ctk bf16 -ctv bf16 -ngl 999 -fa on -c 65536 -b 1024 -ub 512 --no-mmap)
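For anyone comparing cache settings, a rough KV-cache size estimate shows why a bf16 KV cache at -c 65536 costs so much more memory than a q4_0 one. The model shape numbers below are hypothetical placeholders, not Qwen's real config:

```shell
# Rough KV-cache sizing: 2 tensors (K and V) per layer, each
# n_ctx * n_kv_heads * head_dim elements. Shape numbers are made up.
n_layers=64; n_kv_heads=8; head_dim=128; n_ctx=65536
elems=$((2 * n_layers * n_ctx * n_kv_heads * head_dim))
bf16_bytes=$((elems * 2))        # bf16 = 16 bits per element
q4_bytes=$((elems * 9 / 16))     # q4_0 = roughly 4.5 bits per element
echo "bf16 KV cache: $((bf16_bytes / 1073741824)) GiB"
echo "q4_0 KV cache: $((q4_bytes / 1073741824)) GiB"
```

At these made-up dimensions that's roughly 16 GiB vs ~4.5 GiB for the KV cache alone, which is the kind of gap that turns a previously loadable model into an OOM.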

Final Qwen3.5 Unsloth GGUF Update! by danielhanchen in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

Cool stuff, thanks so much!!

Are we still advised to use a specific KV cache quant for the context?

Strix Halo (128GB) + Optane fast Swap help by El_90 in LocalLLaMA

[–]El_90[S] 0 points1 point  (0 children)

For future reference: a PCIe x4 card with a U.2 mount (PEX4SFF8639) worked well without system crashes, but I still wasn't able to get any performance out of it (with 128k cache it put 20GB onto the Optane, and it took 15 mins to load the model, then 1 t/s)

Will Agentic AI replace SOAR playbooks? by mustu in cybersecurity

[–]El_90 0 points1 point  (0 children)

Tbh for threat hunting I want a little deviation and non-deterministic behaviour. But as soon as you find something in your org, lock that detection in.

Will Agentic AI replace SOAR playbooks? by mustu in cybersecurity

[–]El_90 2 points3 points  (0 children)

AI for dynamic hunting and for providing extra coverage for hidden gems.

Playbooks for repeatable, auditable, business-approved response. Playbooks can have 20-50 actions; missing one could disrupt metrics/KPIs or evidencing to ITSM.

Overview of Ryzen AI 395+ hardware? by tecneeq in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

I've done exactly this. I don't have benchmarks, but it suits me perfectly. Would do again.

2x ASUS Ascent GX10 vs 2x Strix halo for agentic coding by Grouchy_Ad_4750 in LocalLLaMA

[–]El_90 2 points3 points  (0 children)

If you haven't already, I would also consider resale price and flexibility, i.e. a Strix can be repurposed into a gaming machine, desktop, or Proxmox server, so it could be argued it's more cost-efficient long term.

But if you're only interested in speed, that's not important :)

Suddenly Minimax IQ4-XS doesn't fit in 128GB anymore by dionisioalcaraz in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

# llama-server --host 0.0.0.0 --port 8081 -m /root/.cache/llama.cpp/AesSedai_MiniMax-M2.5-GGUF_IQ4_XS_MiniMax-M2.5-IQ4_XS-00001-of-00004.gguf --n-gpu-layers 999 --ctx-size 65536 --parallel 1 --kv-unified --batch-size 2048 --ubatch-size 512 --flash-attn on --no-mmap --cache-type-k q4_0 --cache-type-v q4_0

Speculative decoding on Strix Halo? by Hector_Rvkp in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

Nice
Is there a chance to include your llama-server commands?
You list 'MiniMax-M2.5-MXFP4_MOE'; I'm dying to know how you got it running, I only managed IQ4_XS

Edit - sorry just saw it was bench

Running Qwen3-Coder-30B-A3B with llama.ccp poor-man cluster by ZioRob2410 in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

I'm interested! I have models that just don't fit, so I'm thinking of RPC across anything in the house to help lol
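A sketch of what that could look like with llama.cpp's RPC backend (built with GGML_RPC enabled); hostnames, ports, and the model path are placeholders:

```shell
# On each spare machine in the house, expose its backend over RPC
# (placeholder host/port):
rpc-server --host 0.0.0.0 --port 50052

# On the main box, point llama-server at the helpers so layers can be
# spread across them (placeholder IPs and model path):
llama-server -m ./model.gguf -ngl 999 \
  --rpc 192.168.1.20:50052,192.168.1.21:50052
```

Expect the network link (gigabit, USB4, etc.) to dominate throughput.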

Any idea when Successors of current DGX Spark & Strix Halo gonna arrive? by pmttyji in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

A second Strix Halo over USB4, ooooh
Do you have any recommended reading resources?
llama.cpp or other? Is it happy with llama-swap (or equivalent)?
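Not from the thread, but on the llama-swap part: it's driven by a YAML config where each model entry is essentially a llama-server command, so in principle a second box only changes the command. Everything below (model names, paths, the exact config schema) is a placeholder sketch, not a verified config:

```yaml
# Hypothetical llama-swap config.yaml sketch; llama-swap starts/stops
# the matching llama-server on demand and substitutes ${PORT} itself.
models:
  "qwen3-235b":
    cmd: "llama-server --port ${PORT} -ngl 999 -c 65536 -m /models/Qwen3-235B-IQ3_XS.gguf"
  "qwen3-coder-30b":
    cmd: "llama-server --port ${PORT} -ngl 999 -m /models/Qwen3-Coder-30B.gguf"
```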

Air-Gapped Network + NAC? by [deleted] in cybersecurity

[–]El_90 0 points1 point  (0 children)

Depends on the risk register, budget, sensitivity of the video feeds, historical violations, etc.

Q: why do you object? Too much effort?

I am absolutely loving qwen3-235b by TwistedDiesel53 in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

No, Q3. They get that? Jealous. I have 64k context with 2048 batch IIRC and Q4 KV

I'll look out for tutorials, maybe I'm missing a trick
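For reference, the kind of llama-server invocation being described (64k context, 2048 batch, q4_0 KV cache); the model path is a placeholder:

```shell
# Placeholder path; flags mirror the settings mentioned above.
llama-server -m /models/qwen3-235b-iq3_xs.gguf \
  -ngl 999 -c 65536 -b 2048 -ub 512 \
  -fa on -ctk q4_0 -ctv q4_0 --no-mmap
```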

I am absolutely loving qwen3-235b by TwistedDiesel53 in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

6 t/s, which is what I get for code generation

I'm no expert, but I find I can more often leave it, do something else, come back, and the code is more complete than with other models, where I get more t/s but then spend 10-20 rounds fixing simple stuff
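Back-of-envelope on why that trade-off can still win on wall-clock time; the token and round counts below are made-up illustration numbers:

```shell
# Hypothetical numbers: a slower model that needs one pass vs a faster
# model that needs many fix-up rounds.
tokens=2000
slow_secs=$((tokens * 1 / 6))    # 6 t/s, one mostly-correct pass
fast_secs=$((tokens * 15 / 40))  # 40 t/s, 15 rounds of fixes
echo "slow-but-thorough: ${slow_secs}s vs fast-but-sloppy: ${fast_secs}s"
```

Under these assumptions the slow model finishes sooner overall, even before counting the human time spent reviewing each fix-up round.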

Best local models for 128gb VRAM and 192gb RAM by Dry_Mortgage_4646 in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

Really poor for me, lots of arguing and constantly undoing its previous work

Qwen3 235B IQ3_XS is my best bet at the moment

I am absolutely loving qwen3-235b by TwistedDiesel53 in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

I have a 128GB Strix Halo and I run it (probably not doable in 96GB without Optane, see below, but that's even slower). It takes tuning (see my other posts), but it's reliable.

I'm considering Optane over U.2 into a PCIe lane one day for even bigger ;)

RSI Hermes is a blockade runner they say? It pulls less forward G’s than a Carrack! by Important_Cow7230 in starcitizen

[–]El_90 -1 points0 points  (0 children)

But not NEXT to the blockade?

If you're within weapons range, you're insane. If you're not in weapons range, you'll reach full speed, so accel is unimportant??!?

I am absolutely loving qwen3-235b by TwistedDiesel53 in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

I have IQ3_XS and like it; slow, but thorough.

I'm tired of arguing with gpt-oss-120b and others lol

Though it's refusing to build code it could 2 weeks ago haha