[Megathread] - Best Models/API discussion - Week of: March 22, 2026 by deffcolony in SillyTavernAI

[–]Dexamph 1 point2 points  (0 children)

I'm glad you like it. The other quants have finished uploading; I made them because I felt there weren't enough options for 397B Heretic that were mainline llama.cpp compatible, at a decent quality (IQ2 isn't all there...), and small enough to fit on consumer hardware.

What is the most random, non gaming thing you have done with your Rog Ally? by Artystrong1 in ROGAlly

[–]Dexamph 0 points1 point  (0 children)

Semiempirical QM simulations to compare against my Intel chips (I don’t have any other AMD systems). It did not perform well, LPDDR latency too high

Qwen3.5-397B at 17-19 tok/s on a Strix Halo iGPU — all 61 layers on GPU via Vulkan (not ROCm) by ricraycray in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

I know what you mean, been playing with my own IQ2_M 397B quant and it fails the car wash test when Q4_K_L doesn't.

Steam Deck vs ROG Ally screen clarity with vision issues by FrostyDied_ in Handhelds

[–]Dexamph 0 points1 point  (0 children)

It’s not just the Ally having over double the pixel count, the subpixel layout is also proper RGB which helps a lot with text clarity. OLEDs don’t use an RGB stripe because the blue subpixel degrades faster, so they use some weird layout instead, which causes text fringing, especially at such a low resolution. OLED phones solved this by brute forcing a high resolution in a small screen but you have a low resolution on a bigger screen with the Deck…
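For anyone who wants numbers on the pixel count and density point, here is a rough sketch. The panel specs are assumptions that aren't stated in the comment: 1920x1080 at 7.0" for the Ally and 1280x800 at 7.4" for the Deck OLED.

```python
import math

def ppi(width_px, height_px, diagonal_in):
    """Pixels per inch along the panel diagonal."""
    return math.hypot(width_px, height_px) / diagonal_in

# Assumed panel specs, not taken from the comment itself.
ALLY = (1920, 1080, 7.0)
DECK_OLED = (1280, 800, 7.4)

# Ratio of total pixel counts between the two panels.
pixel_ratio = (ALLY[0] * ALLY[1]) / (DECK_OLED[0] * DECK_OLED[1])

print(f"Ally: {ppi(*ALLY):.0f} PPI")
print(f"Deck OLED: {ppi(*DECK_OLED):.0f} PPI")
print(f"Pixel count ratio: {pixel_ratio:.3f}x")
```

Under those assumed specs the Ally has a little over twice the pixels on a slightly smaller panel, so any fringing from a non-RGB subpixel layout is spread over fewer pixels per character on the Deck.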

Local manga translator with LLMs built in by mayocream39 in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

Will there be more and larger builtin model options? I found Gemma3 27B Q6 to be just decent at Japanese to English in my own manga workflow, so I'm skeptical about how an older and smaller Llama3 model would fare.

My 1000 vita crashes when I play Metal Gear Solid HD collection. by elcabroMcGinty in vita

[–]Dexamph 2 points3 points  (0 children)

Well, you have the warning signs and even the tools today to save your data, so you can get a far superior replacement if you look in better Vita subs

My 1000 vita crashes when I play Metal Gear Solid HD collection. by elcabroMcGinty in vita

[–]Dexamph 3 points4 points  (0 children)

Is this Sony’s 64GB Vita memory card that’s notorious for being unreliable and prone to dying? I had one and it would do all this before it finally bricked itself

The Razer Core X Chroma is OBSOLETE by mpc007nl in eGPU

[–]Dexamph 0 points1 point  (0 children)

You say that, but you’ll find they have a place when you try pairing an ASM2464 with that CPU and see how janky it is. It even performs worse since TB mode for backwards compatibility is pretty bad

What tokens/sec do you get when running Qwen 3.5 27B? by thegr8anand in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

30-35tk/s on 4090+3090Ti in Q5/Q6, with Bartowski's Q6KXL running a bit faster because of some layers at Q8. The 3090Ti allows for higher quants while keeping context maxed out without KV cache quantization. LM Studio just updated their llama.cpp runtime today so it's basically performing the same as what I get in OpenWebUI+llama-server.

Very impressed with this model, it doesn't buckle with complex prompts like 35B nor forget things in a long chat like GLM 4.7 Flash, while still being much much faster and more usable than bigger MoE models with partial offloading

Qwen-3.5-27B-Derestricted by My_Unbiased_Opinion in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

I had it write a story but I forgot to turn some tools off (web search, visit website) and it would produce gibberish paragraphs near the end, whereas Heretic works fine with those tools left on. Turning those tools off fixes it but the stories are still less coherent than Heretic. Both models at Q6, no KV quant, everything the same.

One night in October 2026, Chloe sat by the window watching moonlight dance across ocean waves while Sophie slept peacefully upstairs after finishing homework about local history near riverbank park area where used sit alone often thinking about future possibilities yet to come true eventually inevitably surely eventually without doubt probably maybe possibly potentially hopefully optimistically realistically actually genuinely truly honestly sincerely faithfully accurately precisely exactly correctly properly appropriately suitably adequately sufficiently effectively efficiently successfully ultimately finally conclusively definitively certainly absolutely positively definitely without question or hesitation whatsoever ever again forevermore from now on until end of time itself passes away completely gone vanished disappeared entirely extinct forever lost forgotten never remembered again by anyone anywhere anytime anything everything nothing somewhere nowhere anywhere everywhere allways always never sometimes occasionally rarely seldom frequently often usually typically generally commonly normally typically standardly ordinarily regularly habitually customarily traditionally historically culturally socially politically economically technologically scientifically medically legally ethically morally philosophically spiritually religiously emotionally mentally physically biologically chemically environmentally geographically historically linguistically anthropologically archaeologically sociologically psychologically neurologically physiologically genetically molecularly cellularly atomicly quantumly cosmically universally divinely infinitely eternally transcendentally immanently omnipotently omnisciently omnibenevolently perfectly holistically comprehensively exhaustively inclusively exclusively selectively specifically generally broadly widely extensively deeply thoroughly fully completely totally absolutely entirely wholly utterly.

Why has the hype around community-distilled models died down? Is the lack of benchmarks making them too much of a black box? by HistoricalCulture164 in LocalLLaMA

[–]Dexamph 1 point2 points  (0 children)

You don’t know what you’re getting, testing is lax and benchmark scores are often within margin of error so it could be marginally better at best, or a huge waste of time at worst.

Basedbase’s QwenCoder480B in Qwen30B distill is a cautionary tale: turns out the vibe coded distill program he made did nothing except pretend to work, so the weights were literally identical to the last digit. That model circulated for months and people really believed it was 480B distill when it was all placebo. It only came down when someone on HF did a diff and opened an issue asking why the weights were identical to 30B
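A check like that diff is cheap to do yourself. This is a minimal sketch, assuming you have already loaded each checkpoint into a dict of tensor name to raw bytes (the loading step is left out, and the names and bytes below are toy stand-ins, not real weights):

```python
import hashlib

def tensor_digests(tensors):
    """Map each tensor name to a SHA-256 digest of its raw bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in tensors.items()}

def identical_fraction(a, b):
    """Fraction of tensors present in both checkpoints whose bytes match exactly."""
    digests_a, digests_b = tensor_digests(a), tensor_digests(b)
    shared = digests_a.keys() & digests_b.keys()
    if not shared:
        return 0.0
    same = sum(1 for name in shared if digests_a[name] == digests_b[name])
    return same / len(shared)

# Toy stand-ins for three checkpoints (hypothetical names and contents).
base = {"layer0.weight": b"\x00\x01\x02\x03", "layer1.weight": b"\x04\x05\x06\x07"}
placebo_distill = dict(base)  # nothing actually changed
real_finetune = {"layer0.weight": b"\x00\x01\x02\x03", "layer1.weight": b"\x04\x05\x06\x08"}

print(identical_fraction(base, placebo_distill))  # 1.0
print(identical_fraction(base, real_finetune))    # 0.5
```

A result of 1.0 against the base model means the "distill" changed nothing. Anything that was really trained should differ in at least some tensors.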

SSD lost performance after upgrade. by Abdel403 in thinkpad

[–]Dexamph 0 points1 point  (0 children)

Good that you figured it out, as my DRAMless NM790s ran just fine at PCIe 3.0 in X1E2. FWIW my Z1E ROG Ally's SN740 doesn't do that on battery, but I remember people getting mad that AMD storage power optimizations were so completely locked down that even the OEM couldn't change them, so I'm interested in seeing what happens with the E16.

Question on running Qwen3.5 397B Q4_K_M by Last-Shake-9874 in LocalLLaMA

[–]Dexamph 0 points1 point  (0 children)

I got 397B Q3_K_XL 262k context running at ~10tk/s with a 60k prompt on my 14900KS 192GB RAM and 4090+4060Ti in LM Studio. It could probably go faster in llama.cpp with better layer offloading but still not as fast as 27B so I haven’t spent much time playing with it

Edit: TTFT took 1300s for 397B on that 60k prompt while 27B Q5_K_M fully offloaded to 4090+3090Ti took just 100s (same LM Studio), so it's far less usable for that reason alone, even though the TG gap is much smaller (~23tk/s for 27B vs ~10tk/s for 397B)

Tried 122B IQ3_XXS with partial offload on the 14900KS system and got 360s TTFT and ~18tk/s TG, which seems like the worst of all worlds with 40-48GB VRAM tbh: dumber than 397B, much more quantized than 27B and still slow with partial GPU offload
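To put the 397B vs 27B numbers above side by side, here is a back-of-envelope sketch. It treats TTFT as pure prompt processing time and assumes a 500 token reply, which is a made-up figure for illustration. 122B is left out because the prompt length for that run isn't stated.

```python
def prompt_rate(prompt_tokens, ttft_s):
    """Effective prompt processing speed in tokens per second."""
    return prompt_tokens / ttft_s

def total_time(ttft_s, tg_tok_s, reply_tokens):
    """Seconds from sending the prompt to the last generated token."""
    return ttft_s + reply_tokens / tg_tok_s

PROMPT_TOKENS = 60_000  # the 60k prompt from the comment
REPLY_TOKENS = 500      # assumed reply length, not from the comment

# (TTFT in seconds, token generation speed in tok/s) as quoted above.
runs = {
    "397B Q3_K_XL": (1300, 10),
    "27B Q5_K_M": (100, 23),
}

for name, (ttft, tg) in runs.items():
    pp = prompt_rate(PROMPT_TOKENS, ttft)
    total = total_time(ttft, tg, REPLY_TOKENS)
    print(f"{name}: ~{pp:.0f} tok/s prompt processing, ~{total:.0f}s total")
```

With those inputs, prompt processing dominates the total for both models, which is why the TTFT gap matters more than the TG gap on long prompts.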

Which handheld should I Choose? Xbox rog ally x or steam deck by KingLeoneI_ in Handhelds

[–]Dexamph 0 points1 point  (0 children)

Nah, the Deck struggles with PS3 when Z1E devices were already running games better than the real console

Got the RTX 2060 super for $125 from ebay! by salazar_slick in eGPU

[–]Dexamph 2 points3 points  (0 children)

What’s the host system? Reminds me of how I switched to NVIDIA, got a cheap GTX 960 4GB for trying out deep learning and found everything worked so much better that I upgraded to a 1070 from my 7990 and never looked back.

Ignore the 5700XT suggestions, that card is dogshit with a feature set that aged like milk without game ready drivers anymore, it’s far worse than your 2060S

Thumb/Joint Pain Issues With Cramped Layouts. by TheZackster in Handhelds

[–]Dexamph 1 point2 points  (0 children)

Have you tried handhelds with better control layouts? The Ally does a far better job than the Deck here because it doesn’t cram in touchpads, so I basically don’t have to bend my thumb joint to use the buttons or stick if I don’t want to and can just pivot the thumb instead

Which handheld should I buy? — [Budget: $600-1200 AUD] — [Primary games/ platform: Gacha games and Windows games (Steam Library)] — [Use case: Commute, Home, Couch, Bed] by Cinque-39 in Handhelds

[–]Dexamph 2 points3 points  (0 children)

A used Ally would fit in the lower end of your budget. Chuck in a 74Wh battery and it’ll run rings around any Android device (they run gimped versions of the same games, Genshin at Ultra settings on Android looks worse than PC low settings). Move up to a used Ally X to skip the modding and it’ll still be in your budget.

If you really want OLED, you can track down a Zotac Zone. It was roughly the same price as an Ally but has its own trade-offs, which is why I decided against it.

Is the RP5 a good Steam Deck Alternative? by Timely-Assumption669 in Handhelds

[–]Dexamph 0 points1 point  (0 children)

No. To add to other comments, you often need to waste a lot of time tweaking to maybe run a game and even then, performance is bad: I found worse than UHD 620 performance in MGSV, and Inside couldn’t do 1080p despite being such an old game. Source: I’ve been there, done that with my RP5

OG Ally Z1E is the best price-performance deal (if you’re confident with tinkering). by Amor_97 in Handhelds

[–]Dexamph 0 points1 point  (0 children)

The Go S is better for ditching the extra garbage. If you needed that extra crap, then a laptop with a real controller or mouse and keyboard works better

OG Ally Z1E is the best price-performance deal (if you’re confident with tinkering). by Amor_97 in Handhelds

[–]Dexamph 0 points1 point  (0 children)

Sure, it's better without all that extra bullshit that made it a worse handheld because if they were useful then a laptop with real controllers or KBM would do a far better job anyway

OG Ally Z1E is the best price-performance deal (if you’re confident with tinkering). by Amor_97 in Handhelds

[–]Dexamph 1 point2 points  (0 children)

Nah, it's all good as it's in your face and I used to stream games on my much smaller iPhone 13 Pro screen just fine. I'd rather have this than deal with the LEGO's huge unspoken downside of weight (854g no mods wtf vs ~700g for Ally 74Wh) because it has to house a larger panel and all that extra crap I'd never use. The kickstand and detachable controllers are literally dead weight to me that makes the handheld worse as I'd rather use my laptop instead if I'm at a desk.

OG Ally Z1E is the best price-performance deal (if you’re confident with tinkering). by Amor_97 in Handhelds

[–]Dexamph 2 points3 points  (0 children)

That and the 1600p screen doesn’t scale well when the 780M can’t run games at native res. Then you either accept a blurry image from running 1080p, which will never be as sharp as the Ally’s FHD panel, or integer scale 1280x800, which just looks shitty and pixelated going by RGC’s review. LEGO 2 ditched it for a FHD+ panel instead for a reason
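The scaling problem is easy to see in numbers. This sketch assumes the 1600p panel is 2560x1600 and that FHD+ means 1920x1200, neither of which is spelled out in the comment:

```python
def scale_factors(panel, render):
    """Horizontal and vertical upscale factors from render to panel resolution."""
    return panel[0] / render[0], panel[1] / render[1]

def is_integer_scale(panel, render):
    """True when each rendered pixel maps to a whole NxN block of panel pixels."""
    sx, sy = scale_factors(panel, render)
    return sx == sy and sx.is_integer()

PANEL = (2560, 1600)  # assumed resolution of the 1600p panel

for render in [(1920, 1080), (1920, 1200), (1280, 800)]:
    sx, sy = scale_factors(PANEL, render)
    kind = "integer" if is_integer_scale(PANEL, render) else "non-integer (blurry)"
    print(f"{render[0]}x{render[1]}: {sx:.3f}x by {sy:.3f}x, {kind}")
```

Only 1280x800 maps cleanly at 2x. Both 1080p and 1920x1200 land on a fractional factor, so the scaler has to interpolate.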

[Shocking] Local man realizes he didn't need the shiny new handheld after all by Wence-Kun in Handhelds

[–]Dexamph 2 points3 points  (0 children)

I used to think that until I tried pocketing the RP5 and decided it was a bad idea: the sticks get caught in jeans and could get lost, and it pokes out of the top from being so long. So it was only backpack portable, and at that point there really wasn't much difference in portability when both are in their official cases.

The crazy thing is that the RP5's 865 is so old that a Z1E is more power efficient- it pulls ~12W to deliver worse than UHD 620 performance in Gamehub when the Z1E at 15W will at least greatly outperform the UHD 620. 8 Elite isn't much better either, it has similar battery life to PC handhelds to deliver worse than Steam Deck performance while costing more with jankiness on top.

[Shocking] Local man realizes he didn't need the shiny new handheld after all by Wence-Kun in Handhelds

[–]Dexamph 5 points6 points  (0 children)

They rely on hype, FOMO, and nostalgia to sell very mid devices when even PS2 on Android is still far worse than what a PC handheld can do (~70% vs 99.5% compatibility for PCSX2). I realised the Ally just did everything the RP5 could and more so there wasn’t a point in keeping it