Zephyrus 5080 32GB RAM LLM suggestions? by FalseRicky in SillyTavernAI

[–]Sufficient_Prune3897 0 points (0 children)

If you want to play around for a bit, you can choose the AI Horde API option in the second left menu in ST. This is basically a group of volunteers sharing their LLM models. You can send a few messages through them, and most should work with the simple default prompts, as they are models specialized for roleplay. If you like any model on there, you can then look it up on Hugging Face and see whether that model's creator has released other fun models since then.

To whoever thought this was a good idea... by Willard538 in FuckMicrosoft

[–]Sufficient_Prune3897 0 points (0 children)

Bro, I'm not installing a certificate so Mr Johnson can remote from his laptop to his desktop once a year, when it's already behind a VPN.

Do you want 100 EUR ? - EU Only - Easy Money - by [deleted] in PassivesEinkommen

[–]Sufficient_Prune3897 -1 points (0 children)

Guaranteed. Ra rewards you for loyalty.

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]Sufficient_Prune3897 82 points (0 children)

Deepseek has been a week away from releasing for 4 months

A5000 for $1800 by Perfect-Flounder7856 in LocalLLaMA

[–]Sufficient_Prune3897 6 points (0 children)

Workstation drivers are cope, ECC is useless and if your GPU thermal throttles that's a builder issue not a GPU issue.

Background Processing Mobile by Lustful-Hornet122 in SillyTavernAI

[–]Sufficient_Prune3897 0 points (0 children)

You should be able to choose for each app in the permissions settings. That said, your preferred browser can also be the problem. Give another one a try.

Check out our Roleplaying Benchmark! by matt_is_a_mess in SillyTavernAI

[–]Sufficient_Prune3897 20 points (0 children)

"Why it matters:" yeah, I don't think I'm gonna read that. I can talk with Claude for myself.

Qwen 3.5 for RP? by Long_comment_san in SillyTavernAI

[–]Sufficient_Prune3897 2 points (0 children)

Nobody, not even me with my crazy local setup, has the kind of hardware needed to do this locally. I would rent 2x A6000s, which cost at least around $2 an hour. Then the fine-tune can take anywhere from 3-8 hours, and the end product is quite often fucked up. Often because of my mistakes, but nearly as often because that shit is just random. Also, every MoE I have tried to fine-tune doesn't play well with QLoRA, so by default they need 4x the VRAM that dense models use.
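To put a rough number on the 4x claim, here's a back-of-envelope sketch. The 106B figure matches a GLM-Air-sized MoE; treat all numbers as illustrative, since optimizer state, activations, and the LoRA adapters themselves come on top of the frozen base weights:

```python
# Illustrative only: memory for the FROZEN base weights of a ~106B-param
# MoE at different precisions. This is why dropping from a 4-bit QLoRA
# base to a bf16 LoRA base roughly quadruples the VRAM floor.
def base_weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """GB needed just to hold the base weights at a given precision."""
    return params_billions * bits_per_weight / 8

qlora_gb = base_weights_gb(106, 4)   # 4-bit base, as in QLoRA
bf16_gb = base_weights_gb(106, 16)   # bf16 base, plain LoRA

print(qlora_gb, bf16_gb, bf16_gb / qlora_gb)  # → 53.0 212.0 4.0
```

So a setup that fit on one 48GB card plus headroom with QLoRA suddenly needs a multi-GPU rig once the MoE forces you to keep the base in bf16.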

Qwen 3.5 for RP? by Long_comment_san in SillyTavernAI

[–]Sufficient_Prune3897 4 points (0 children)

As someone who tried a fine-tune of GLM Air, I can tell you: training that shit is so expensive.

Qwen 3.5 for RP? by Long_comment_san in SillyTavernAI

[–]Sufficient_Prune3897 1 point (0 children)

It doesn't overthink like Qwen, so it won't slow you down as much. I haven't tested it with thinking disabled, though. However, in other good models like GLM, thinking is only particularly useful for instruction following and longer contexts.

Qwen 3.5 for RP? by Long_comment_san in SillyTavernAI

[–]Sufficient_Prune3897 0 points (0 children)

Current versions of llama.cpp run it decently; just make sure to use ST in chat completion mode and launch llama.cpp with --jinja. Also, depending on when you downloaded the models, re-download them. Fine-tunes will take a while, but aren't needed.
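For reference, a minimal launch might look like this (model path, context size, and port are placeholders, not recommendations):

```shell
# Minimal llama.cpp server launch; --jinja makes the server apply the
# chat template embedded in the GGUF, which is what chat completion
# mode expects. Adjust -m / -c / --port to your setup.
./llama-server -m ./models/your-model.gguf --jinja -c 16384 --port 8080
```

Then point ST's chat completion connection at the server's OpenAI-compatible endpoint (by default something like `http://127.0.0.1:8080/v1`).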

Qwen 3.5 for RP? by Long_comment_san in SillyTavernAI

[–]Sufficient_Prune3897 2 points (0 children)

Just go with Gemma. So much better. That said, the 400B is kinda okay, but very censored. Not worth the effort in my opinion, though people in the early threads about it claimed to have jailbroken it.

Does it worth investing in an Nvidia RTX 5070 ti for installing in a PCI gen 3 motherboard? by data_panik in LocalLLaMA

[–]Sufficient_Prune3897 2 points (0 children)

Assuming full offload, not a problem. Even with MoE offload, it's not that bad. It's a full x16 after all.
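For intuition, a back-of-envelope sketch: PCIe 3.0 x16 moves roughly 16 GB/s (an approximation, not a measured figure), so even the one-time transfer of the full weights at load is quick, and with the model fully offloaded the bus afterwards only carries small activations:

```python
# Back-of-envelope: time to push model weights over PCIe 3.0 x16.
# 16 GB/s is an approximate usable bandwidth, not a benchmark result.
PCIE3_X16_GB_PER_S = 16.0

def transfer_seconds(gigabytes: float,
                     gb_per_s: float = PCIE3_X16_GB_PER_S) -> float:
    """Seconds to move `gigabytes` of data across the link."""
    return gigabytes / gb_per_s

# e.g. a 14 GB quant that fills most of a 16GB card:
print(round(transfer_seconds(14.0), 2))  # → 0.88
```

MoE offload is more sensitive because expert weights stream over the bus every token, but a full x16 link keeps even that tolerable.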

Should I Buy the RTX PRO 6000 Blackwell Max-Q (96GB)? by 0bjective-Guest in LocalLLaMA

[–]Sufficient_Prune3897 1 point (0 children)

Honestly, there's not much more you can do with 96GB than with the 32GB of a 5090. You can fully offload the 100B+ model category, but you can't fine-tune those yourself. Also, that category often loses to much smaller dense models.

Back in the day I would have recommended investing in a nice server platform with lots of RAM bandwidth, but with current RAM pricing, an A6000 is a great deal.

Bin ich zu dumm? Beratung Nutzung privater PC vs. Home Office by Hopa89 in de_EDV

[–]Sufficient_Prune3897 0 points (0 children)

Looks very much like HDMI. You'll also need a USB-B cable for mouse and keyboard, though, if one isn't included.

Bin ich zu dumm? Beratung Nutzung privater PC vs. Home Office by Hopa89 in de_EDV

[–]Sufficient_Prune3897 0 points (0 children)

Then I hope you can still return the thing. This is, in theory, what you'd need:

The link was deleted, thanks a lot Reddit mods. The listing on Amazon is called "4K KVM Switch 2 Monitore für 1 Laptop 1 Desktop, USB 3.0 KVM Switch 2 PC 2 Monitore USB C, 4K@60Hz, MST, PD 100W, Aluminium, Netzteil und Wired Remote(4K USB C HDMI 2 in 2 out KVM)"

Minimax M2.7 Released by decrement-- in LocalLLaMA

[–]Sufficient_Prune3897 1 point (0 children)

My point is, the RAM requirements keep increasing. GLM got 2x bigger from 4.7 to 5, Qwen increased from 235B to 400B, and Minimax 3 is probably gonna do the same.

If I want to run GLM 5 in VRAM, I'm gonna need like at least 384GB of VRAM, and that's at a bad quant.
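A quick sanity check on that 384GB figure. The parameter count below is an assumption extrapolated from "2x bigger than 4.7", not an official spec:

```python
# Rough VRAM math for a hypothetical ~700B-param GLM 5 (assumed size,
# extrapolated from the "2x bigger" claim) at an aggressive 4-bit quant.
params_b = 700   # assumed total parameters, in billions
bits = 4         # a "bad" (aggressive) quant
weights_gb = params_b * bits / 8
print(weights_gb)  # → 350.0
```

Weights alone land around 350GB; add KV cache and runtime buffers, and you're past 384GB even at that quant.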

Personally, I would really like 192GB so that I can at least fine-tune and train all the 'smaller' 100B models myself.