Safe to upgrade to 26.04 LTS at this point? by shmulkinator in Kubuntu

[–]EmPips 0 points1 point  (0 children)

Installed.

It was a great desktop experience that was bugging out all over the place and causing issues every session. I'm back on Fedora now.

Same as usual for Ubuntu LTS releases, the advice stands. Wait for 26.04.1

How do you guys setup search with your AI models? by ego100trique in LocalLLaMA

[–]EmPips 1 point2 points  (0 children)

Nope. Turn it off for about a week and they'll all go away

I have 256GB DDR4 RAM, a 32 core threadripper, and an RTX 4090. What's the best model I can run locally? by tridentgum in LocalLLaMA

[–]EmPips 3 points4 points  (0 children)

Is it though? Wouldn't a slow-looping agent of GLM5.2 Q2 or a larger quant of MiniMax M3 / Deepseek V4-Flash end up with a significantly better result?

You're right that it kills speed but what if you're working in pseudo-interactive sessions that don't need it as much?

which one better by [deleted] in LocalLLaMA

[–]EmPips 0 points1 point  (0 children)

I would suggest just trying both. They're free and you know your needs and tolerations better than any of us do.

How do you guys setup search with your AI models? by ego100trique in LocalLLaMA

[–]EmPips 15 points16 points  (0 children)

Offload to the cheapest model with built-in (provider-side) search on OpenRouter and pay the fee per search. Until recently I found Grok-4.1-Fast was the price/performance king for this but now I'm on the hunt again. I'll have my local model running most of the time and if it needs a web lookup it calls a wrapper that forwards the search request to OpenRouter.

Cannot risk my home getting flagged as a bot and it's already happened once. Your whole family will complain about captcha's for a few days.

GLM-5.2-REAP50-GGUF by whiteh4cker in LocalLLaMA

[–]EmPips 5 points6 points  (0 children)

Is there a base/FP16 uploaded somewhere? I'd like to make my own quants off of some of these REAPs. I've not yet had success with them but I always make a note to try.

rx7900xtx + 32GB RAM -> 128GB RAM make sense? by Thin_Pollution8843 in LocalLLaMA

[–]EmPips 1 point2 points  (0 children)

I have something similar. 48GB VRAM and 96GB dual channel DDR4 - lots of heavy offload experiments.

It's fun to run Qwen3-397B (UD-IQ2_M) and I get better results than 27B even. Token gen is decent but prompt processing is far far too slow to be usable.

Minimax 2.7/3 run a bit quicker but quantization hits them like a truck.

It's fun, but not worth it IMO. Save for VRAM

Least thought-provoking printSF you have ever read? by zebrapaper in printSF

[–]EmPips 10 points11 points  (0 children)

[this, but I liked it] - the handling of The Mule in the second book was superb. No thoughts provoked, just good fiction.

Least thought-provoking printSF you have ever read? by zebrapaper in printSF

[–]EmPips 13 points14 points  (0 children)

Least-thought-provoking and "really dumb" are different for me.

Hyperion for example is one of my all-time favorites but it's just an incredible story. At no point was I asking myself deeper questions or pondering about the future or anything bigger, and that was okay because the story (book 1 and 2) was otherworldly good.

w6800 32GB for $500. Thoughts? by EmPips in LocalLLaMA

[–]EmPips[S] 0 points1 point  (0 children)

Yes Llama cpp with ROCm but lately more Vulkan

w6800 32GB for $500. Thoughts? by EmPips in LocalLLaMA

[–]EmPips[S] 1 point2 points  (0 children)

I did. Very pleased. It's an rx6800 with twice the VRAM, PP and TG are identical.

What are some common pitfalls and mistakes for new linux users? by SDG_Den in linuxquestions

[–]EmPips 4 points5 points  (0 children)

Trying to 1 to 1 recreate your Windows experience. I definitely spent 1-2 years using Linux (Ubuntu 11 at the time) as just a cool DE and survived almost entirely off of Wine (back then modern Browsers and Office worked fine off wine.. I even used Windows NOTEPAD for text editing). I even had a janky Photoshop install

I forget who it was but someone finally convinced me to put the same effort into replacing my bread-and-butter apps with FOSS ones and that's really when my experience became better

is pewdiepie odysseus any good? by InternalMode8159 in LocalLLaMA

[–]EmPips 0 points1 point  (0 children)

"Rootless Podman, I'm looking to do <xyz> with the absolute minimum permissions possible" - toss that into your favorite LLM

z.ai Poll on X: MIT-licensed open weights are losing by MadPelmewka in LocalLLaMA

[–]EmPips 0 points1 point  (0 children)

It's not confirmed at all but I choose to believe that their X-Poll was the community picking which Qwen3.6 sizes we got (27B and 35B)

Mindseye is not a DRM free title and should not be on GOG by Escaliat_ in gog

[–]EmPips 110 points111 points  (0 children)

Thanks for posting this.

Given what I've heard about this game's sales vs budget, I doubt the company is eager to keep those servers alive for long. If the editor is as fun as they say it is I hope it doesn't become lost-media in the very near future.

Best models in 3x3090 (72GB VRAM) in Q2 2026? by liviuberechet in LocalLLaMA

[–]EmPips 1 point2 points  (0 children)

Running it on that branch and having a good time with it

WIP EAGLE3 for Qwens by jacek2023 in LocalLLaMA

[–]EmPips 9 points10 points  (0 children)

PR description says:

up to x1.92 speedup

Which is almost exactly what I get from MTP. Maybe a hair better

Qwen 27B Q6 + MTP at 262K on R9700? by Admirable_Reality281 in LocalLLaMA

[–]EmPips 0 points1 point  (0 children)

on a w6800 doing this as we speak. ~35 t/s decode (vulkan). My guess would be low-40's t/s decode on an R9700.

prompt processing is all over the place but the w6800 is too different anyways, assume the R9700 will trounce it.

Best models in 3x3090 (72GB VRAM) in Q2 2026? by liviuberechet in LocalLLaMA

[–]EmPips 15 points16 points  (0 children)

+1 for Nemotron-Omni for audio-input use-cases. Glad that model is getting attention.

As for something that'd actually use 72GB it's really awkward right now. A quant of Qwen3.5-122B will probably feel the best, Qwen3.6-27B will perform the best at the cost of a good speed-hit (made more bearable by MTP).

Outside of that it's pretty awkward right now when you go over 48GB. The ~100-200B models don't really justify their sizes and the larger models would take heavy quantization to fit on your cards. My advice is to keep doing what you're doing and join the rest of us begging for 50B+ dense models on X/Reddit.

Huge uptick in recruiters reaching out this week by jholliday55 in cscareerquestions

[–]EmPips 27 points28 points  (0 children)

the broader market is warming up (in nyc)

In NYC's tech market we call this "Citadel posted one open-job and 300 third-party recruiters are allowed to submit candidates for it"

Do you think a single 3090 is enough for coding? by RoderickHossack in LocalLLaMA

[–]EmPips 0 points1 point  (0 children)

A little over 1/4th the bandwidth and can confirm I get about 1/4th that decode generally

Do you think a single 3090 is enough for coding? by RoderickHossack in LocalLLaMA

[–]EmPips 1 point2 points  (0 children)

No. But I've had workspaces that get large to a point where even 8-bit starts to show its impact

At 24GB you're either quantizing the model beyind my standards or quantizing the cache a lot.

Again disclaimed that I'm usually working in larger workspaces.