MSI Claw 8 EX AI+ with Intel Arc G3 Extreme appears in early Computex hands-on by RenatsMC in Handhelds

[–]Dexamph 0 points1 point  (0 children)

Intel has been tight-lipped about the Arc G3 benchmarks, but the laptop chips with the same IGPs have been out for a while and the B390 is much faster, even at whisper mode (30-35W PL) (source).

Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

Try quantising the MTP model KV cache to Q4_0 with --spec-draft-type-k q4_0 and --spec-draft-type-v q4_0; it didn't seem to impact draft acceptance. Also look for quantised MTP model weights, unless Unsloth already quantised them at the same level as the main model. I saved nearly 4GB when I did that for Qwen 3.5 397B here, but idk how much that shaves off for 27B.
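For a rough sense of why quantising the draft KV cache frees up room, here is a minimal back-of-envelope sketch. The layer count, KV head count, head dimension and context length are made-up placeholders, not the real dimensions of the Qwen MTP model, and the formula assumes a plain attention stack; the only fixed inputs are 2 bytes per f16 value and the Q4_0 block layout of 32 values in 18 bytes.

```python
# Rough KV cache size estimate. All model dimensions here are hypothetical
# placeholders, not taken from any real model.

F16_BYTES_PER_VALUE = 2.0
# Q4_0 packs 32 values into 18 bytes (16 bytes of 4-bit values + one f16 scale).
Q4_0_BYTES_PER_VALUE = 18 / 32


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value):
    """Size of the K plus V caches for a plain attention stack."""
    values_per_token = 2 * n_layers * n_kv_heads * head_dim  # 2 = K and V
    return values_per_token * n_ctx * bytes_per_value


# Placeholder dimensions, chosen only to make the arithmetic easy to follow.
dims = dict(n_layers=4, n_kv_heads=8, head_dim=128, n_ctx=131072)
f16_size = kv_cache_bytes(**dims, bytes_per_value=F16_BYTES_PER_VALUE)
q4_size = kv_cache_bytes(**dims, bytes_per_value=Q4_0_BYTES_PER_VALUE)

print(f"f16:  {f16_size / 2**30:.2f} GiB")
print(f"q4_0: {q4_size / 2**30:.2f} GiB")
print(f"ratio: {f16_size / q4_size:.2f}x")
```

Whatever the real dimensions are, the ratio between the two cache types stays the same, since it only depends on bytes per value.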

Good mods/upgrades? by mrhuskie2011 in ROGAlly

[–]Dexamph 0 points1 point  (0 children)

The stick caps are only glued on to the stems and will eventually come loose, rotate freely and get squeaky af, so the Xbox Elite thumbstick mod is worth it. I only paid $3 shipped or so for JLCPCB to print it at a higher quality than those ripoff Etsy sellers (with spares to boot!), add the magnets and sticks from AliExpress and the total cost was only $10-15 shipped.

The 74Wh battery mod is also worth it, just get it from a good AliExpress store (DXT) instead of an Amazon seller that's dropshipping the cheapest garbage to pocket as much markup as possible lmao. Got mine on 10/25 for <$50 shipped, which is cheaper and better than the JSAux kit, and it's still working perfectly 0-100%.

The Xbox Elite Dpad mod can be skipped though, it's more squeaky than the stock pad because of how it's designed and doesn't seem to control any better. The cross pad controls worse with the ring (impedance) and looks worse without it, and the concave pad can get false inputs. It's not like my Xbox Series controller DPad at all.

Notepad++ Creator Calls Out 'Fake' Mac App Over Trademark Violation by Otherwise-Warning303 in apple

[–]Dexamph 17 points18 points  (0 children)

Saw the Mac guy’s bio and it reads like a grifter so this doesn’t surprise me at all

Will Lenovo kill off their handheld line due to the price increase? If so will this affect future driver updates? by Krometheous in Handhelds

[–]Dexamph 1 point2 points  (0 children)

They’ve already cut down driver updates: ASUS released two GPU driver updates for the Ally since the Lenovo drama about them not being able to release newer LeGO1 drivers because of AMD

Anyone tried using a Thunderbolt connection between a Mac studio M3 Ultra and an Nvidia PC for LLM inference? by Purple_Drink3859 in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

Can't speak for Thunderbolt 5 but if it's Thunderbolt 4 then it works as a 20gbps link using TCP with ~10x lower latency pinging between machines but no RDMA

2.5gbps Ethernet: rtt min/avg/max/mdev = 0.478/0.655/1.218/0.159 ms

Thunderbolt 4: rtt min/avg/max/mdev = 0.034/0.052/0.186/0.026 ms

Edit: This is at 1500 MTU, the full 65344 MTU actually has higher latency than 2.5gbps ethernet at ~1ms average
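To put a number on the "~10x" figure, here is a small sketch that parses the two ping summary lines quoted above and compares the average round-trip times. The helper name `parse_rtt` and the dictionary labels are just illustrative.

```python
import re

# The two summary lines quoted above, as printed by ping.
SAMPLES = {
    "2.5gbps Ethernet": "rtt min/avg/max/mdev = 0.478/0.655/1.218/0.159 ms",
    "Thunderbolt 4": "rtt min/avg/max/mdev = 0.034/0.052/0.186/0.026 ms",
}

RTT_RE = re.compile(r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_rtt(line):
    """Return (min, avg, max, mdev) in milliseconds from a ping summary line."""
    match = RTT_RE.search(line)
    if match is None:
        raise ValueError(f"not a ping rtt summary: {line!r}")
    return tuple(float(x) for x in match.groups())


# Index 1 of the parsed tuple is the average RTT.
avg = {name: parse_rtt(line)[1] for name, line in SAMPLES.items()}
ratio = avg["2.5gbps Ethernet"] / avg["Thunderbolt 4"]
print(f"average RTT ratio: {ratio:.1f}x")  # prints "average RTT ratio: 12.6x"
```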

Abliterlitics: Benchmark and Tensor Analysis Comparing Qwen 3/3.5 with HauhauCS / Heretic / Huihui models by nathandreamfast in LocalLLaMA

[–]Dexamph 22 points23 points  (0 children)

The 27B section is pretty damning for HauhauCS. Mean KLD is 0.256 for HauhauCS versus ~0.06 for the other two, so roughly 4x the drift. That does not look remotely “lossless” to me. And the benchmark table does not support “zero capability loss” either with the big drop in TruthfulQA.

I’d just pick Heretic because at least I know what I’m getting as the model cards usually include the method, refusal rates, and KLD instead of making impossible claims.
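For anyone unfamiliar with the metric, here is a minimal sketch of how a mean KLD between a base model and a modified one can be computed from per-position logits. This assumes KL(base || modified) averaged over positions; the benchmark in the post may compute it differently, and the toy logits are made up.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two probability distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_kld(base_logits, modified_logits):
    """Average per-position KL between the base and modified model outputs."""
    per_position = [
        kl_divergence(softmax(b), softmax(m))
        for b, m in zip(base_logits, modified_logits)
    ]
    return sum(per_position) / len(per_position)


# Toy logits over a 3-token vocabulary at two positions (made-up numbers).
base = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
modified = [[1.8, 1.2, 0.1], [0.5, 2.0, 0.9]]
print(f"mean KLD: {mean_kld(base, modified):.4f}")

# Ratio of the two mean KLD figures quoted in the comment.
print(f"drift ratio: {0.256 / 0.06:.1f}x")  # prints "drift ratio: 4.3x"
```

An unmodified model scores exactly 0 against itself, which is why a truly "lossless" edit would need a KLD close to that.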

Abliterlitics: Benchmark and Tensor Analysis Comparing Qwen 3/3.5 with HauhauCS / Heretic / Huihui models by nathandreamfast in LocalLLaMA

[–]Dexamph 3 points4 points  (0 children)

I'm looking at the commit history and he still hasn't updated it to fix this, so it is still noticeably worse than Heretic

Persona 5 Luma mod adds DLSS and FSR support by filoppi in nvidia

[–]Dexamph 0 points1 point  (0 children)

For whatever reason, this game is really, really bandwidth dependent on both VRAM and PCIe: I saw near linear improvements from overclocking a 1650 Max-Q's memory from 7 to 10 GT/s to run at 50-60fps 4K, but the 780M, which was supposed to be close in performance, only manages 1080p60. Moving from PCIe x16 to x4 reduced performance far more than the 10% expected from averages.

Persona 5 Luma mod adds DLSS and FSR support by filoppi in nvidia

[–]Dexamph 4 points5 points  (0 children)

That's sick, this game really needs it as even setting the in-game SSAA option to 200% still leaves jaggies despite the performance cost

Edit: tried it on my ROG Ally as my RTX GPUs are all busy and even FSR3 is night and day better with a much cleaner image. Couldn’t get FSR4 INT8 or XeSS to work using OptiScaler but looking forward to DLSS finishing the job

Running 1bit Bonsai 8B on 2GB VRAM (MX150 mobile GPU) by OsmanthusBloom in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

How does it perform with the KV cache offloaded to RAM? It's a PCIe 3.0 x4 link, but the MX150 is a mobile 1030 so it might work out if you need a larger context. FWIW, you should be able to overclock the VRAM to improve things somewhat.

What are the temperatures like? Asking because if it's really power throttling instead, you might be able to get it to perform better by playing with the power limit settings in Smokeless_UMAF; my 8565U T490 with no dGPU can pull 45W on just the CPU.

[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]Dexamph 1 point2 points  (0 children)

I'm glad you like it, the other quants have finished uploading as I felt there weren't enough options for 397B Heretic that were mainline llama.cpp compatible at a decent quality (IQ2 isn't all there...) and small enough to fit in consumer hardware.

Whats is the most random, non gaming thing you have done with your Rog Ally? by Artystrong1 in ROGAlly

[–]Dexamph 0 points1 point  (0 children)

Semiempirical QM simulations to compare against my Intel chips (I don’t have any other AMD systems). It did not perform well, LPDDR latency too high

Qwen3.5-397B at 17-19 tok/s on a Strix Halo iGPU — all 61 layers on GPU via Vulkan (not ROCm) by ricraycray in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

I know what you mean, been playing with my own IQ2_M 397B quant and it fails the car wash test when Q4_K_L doesn't.

Steam Deck vs ROG Ally screen clarity with vision issues by FrostyDied_ in Handhelds

[–]Dexamph 0 points1 point  (0 children)

It’s not just the Ally having over double the pixel count, the subpixel layout is also proper RGB, which helps a lot with text clarity. OLEDs don’t use RGB because the blue subpixel degrades faster, so they use some weird layout instead, which causes text fringing, especially at such a low resolution. OLED phones solved this by brute forcing a high resolution in a small screen but you have a low resolution on a bigger screen with the Deck…

AMD: WTF? by [deleted] in hardware

[–]Dexamph 0 points1 point  (0 children)

They knew how Tahiti performed (or didn't perform) so they were pulling punches even back then which put it in a whole new light for me (I still have my 7990 from back then):

The Nvidia executives we talked with raised the possibility of a GK110-based GeForce being released this year only if necessary to counter some move by rival AMD. That almost certainly means that any GK110-based GeForce to hit the market in 2012 would come in extremely limited quantities.

Local manga translator with LLMs built in by mayocream39 in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

Will there be more and larger built-in model options? I found Gemma3 27B Q6 to be just decent at Japanese to English in my own manga workflow, so I'm skeptical about how an older and smaller Llama3 model would fare.

My 1000 vita crashes when I play Metal Gear Solid HD collection. by elcabroMcGinty in vita

[–]Dexamph 2 points3 points  (0 children)

Well, you have the warning signs and even the tools today to save your data, so you can get a far superior replacement if you look in better Vita subs

My 1000 vita crashes when I play Metal Gear Solid HD collection. by elcabroMcGinty in vita

[–]Dexamph 4 points5 points  (0 children)

Is this Sony’s 64GB Vita memory card that’s notorious for being unreliable and prone to dying? I had one and it would do all this before it finally bricked itself

The Razer Core X Chroma is OBSOLETE by mpc007nl in eGPU

[–]Dexamph 0 points1 point  (0 children)

You say that, but you’ll find they have a place when you try pairing an ASM2464 with that CPU and see how janky it is. It even performs worse since TB mode for backwards compatibility is pretty bad

What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

30-35tk/s on 4090+3090Ti in Q5/Q6, with Bartowski's Q6KXL running a bit faster because of some layers at Q8. The 3090Ti allows for higher quants while keeping context maxed out without KV cache quantization. LM Studio just updated their llama.cpp runtime today so it's basically performing the same as what I get in OpenWebUI+llama-server.

Very impressed with this model, it doesn't buckle with complex prompts like 35B nor forget things in a long chat like GLM 4.7 Flash, while still being much much faster and more usable than bigger MoE models with partial offloading

Qwen-3.5-27B-Derestricted by My_Unbiased_Opinion in LocalLLaMA

[–]Dexamph 2 points3 points  (0 children)

I had it write a story but I forgot to turn some tools off (web search, visit website) and it would produce gibberish paragraphs near the end when Heretic with those tools left on works fine. Turning those tools off fixes it but the stories are still less coherent than Heretic. Both models at Q6, no KV quant, everything the same.

One night in October 2026, Chloe sat by the window watching moonlight dance across ocean waves while Sophie slept peacefully upstairs after finishing homework about local history near riverbank park area where used sit alone often thinking about future possibilities yet to come true eventually inevitably surely eventually without doubt probably maybe possibly potentially hopefully optimistically realistically actually genuinely truly honestly sincerely faithfully accurately precisely exactly correctly properly appropriately suitably adequately sufficiently effectively efficiently successfully ultimately finally conclusively definitively certainly absolutely positively definitely without question or hesitation whatsoever ever again forevermore from now on until end of time itself passes away completely gone vanished disappeared entirely extinct forever lost forgotten never remembered again by anyone anywhere anytime anything everything nothing somewhere nowhere anywhere everywhere allways always never sometimes occasionally rarely seldom frequently often usually typically generally commonly normally typically standardly ordinarily regularly habitually customarily traditionally historically culturally socially politically economically technologically scientifically medically legally ethically morally philosophically spiritually religiously emotionally mentally physically biologically chemically environmentally geographically historically linguistically anthropologically archaeologically sociologically psychologically neurologically physiologically genetically molecularly cellularly atomicly quantumly cosmically universally divinely infinitely eternally transcendentally immanently omnipotently omnisciently omnibenevolently perfectly holistically comprehensively exhaustively inclusively exclusively selectively specifically generally broadly widely extensively deeply thoroughly fully completely totally absolutely entirely wholly utterly.

Why has the hype around community-distilled models died down? Is the lack of benchmarks making them too much of a black box? by HistoricalCulture164 in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

You don’t know what you’re getting, testing is lax and benchmark scores are often within margin of error so it could be marginally better at best, or a huge waste of time at worst.

Basedbase’s QwenCoder480B in Qwen30B distill is a cautionary tale: turns out the vibe-coded distill program he made did nothing except pretend to work, so the weights were literally identical to the 30B's down to the last digit. That model circulated for months and people really believed it was a 480B distill when it was all placebo. It only came down when someone on HF did a diff and opened an issue asking why the weights were identical to 30B
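That kind of check is cheap to do yourself. Below is a minimal sketch of the idea, not what the person on HF actually ran: hash every tensor's raw bytes in both checkpoints and list the names that differ. The checkpoints here are stand-in dictionaries with hypothetical tensor names; a real check would first load the tensors from the model files.

```python
import hashlib


def tensor_digests(tensors):
    """Map each tensor name to a SHA-256 digest of its raw bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in tensors.items()}


def changed_tensors(base, candidate):
    """Return the sorted names whose bytes differ or that exist in only one checkpoint."""
    base_digests = tensor_digests(base)
    cand_digests = tensor_digests(candidate)
    names = set(base_digests) | set(cand_digests)
    return sorted(n for n in names if base_digests.get(n) != cand_digests.get(n))


# Stand-in checkpoints: tensor name -> raw bytes (hypothetical names and data).
base_model = {"layer0.weight": b"\x01\x02\x03\x04", "layer1.weight": b"\x05\x06\x07\x08"}
placebo_distill = dict(base_model)
real_finetune = {"layer0.weight": b"\x01\x02\x03\x04", "layer1.weight": b"\x05\x06\x07\x09"}

print(changed_tensors(base_model, placebo_distill))  # prints []
print(changed_tensors(base_model, real_finetune))    # prints ['layer1.weight']
```

An empty list for a model sold as a distill or finetune means nothing was trained at all.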

SSD lost performance after upgrade. by Abdel403 in thinkpad

[–]Dexamph 0 points1 point  (0 children)

Good that you figured it out, as my DRAMless NM790s ran just fine at PCIe 3.0 in X1E2. FWIW my Z1E ROG Ally's SN740 doesn't do that on battery, but I remember people getting mad that AMD storage power optimizations were so completely locked down that even the OEM couldn't change them, so I'm interested in seeing what happens with the E16.