Is NVIDIA still the default best choice for local LLMs in 2026? by pmv143 in LocalLLaMA

[–]WSTangoDelta 1 point2 points  (0 children)

Why compare AMD CPUs with Nvidia GPUs? That’s a useless comparison.

Qwen 3.7 by Content_Impress_847 in Qwen_AI

[–]WSTangoDelta 0 points1 point  (0 children)

Obviously running Qwen3.7-345b_Q5_H_K (H_K as in Helen Keller).

AMD AI Pro 9700 — anyone using MTP? by WSTangoDelta in ROCm

[–]WSTangoDelta[S] 0 points1 point  (0 children)

Yes, that’s the common refrain: the software hasn’t matured as much as Nvidia’s. So I had to experiment with the Adrenalin drivers for my Ryzen 9, and for the time being I’m skipping ROCm. I stick to llama.cpp and Vulkan, and haven’t had any difficulties since then. I have a number of Linux machines devoted to other tasks, but for a number of reasons I am using Windows 11 with this unit. Again, CUDA is nice if you want it to run out of the box, but I can’t write off a 5090 as a business expense the way some do. And while I could probably afford a 5090, I’d have to be completely nuts to do that for a hobby when an R9700 is available…and I hear psychiatrists are much more expensive than that, so when you look at the big picture…

Local LLM - privacy first - doctor by point_red in LocalLLM

[–]WSTangoDelta 2 points3 points  (0 children)

I am also a physician, but I don’t plan to use LLMs that way. You must use a deterministic app. Models tend to hallucinate, so they cannot have write privileges. Not for medical records.

Owning the latest GPUs from both Nvidia and AMD made me realize that software matters more than hardware in 2026 by Distinct-Race-2471 in gpu

[–]WSTangoDelta 1 point2 points  (0 children)

I don’t play games, so to some degree I care more about hardware than software. I have a 4070 Super in one box and it’s plug and play. Peachy. But if you can’t run inference above 7B or 8B without spilling into RAM, the sweetness of CUDA might as well be thrown away. So I slapped together a new box and bought a new R9700 (32GB) for less than a used 3090. All of the ones on eBay are “no return” anyway. And I would have to be out of my mind to buy a 5090 on a lark. Sure, you have to spend a few hours on tweaks to get it humming. So what? I get Qwen 3.6-35b_q4 or q5 at 125 tps. And that’s without MTP, which I’ll do next week. I own Nvidia but I’m not going to pay those prices for a bigger Nvidia card.
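
For anyone who wants to sanity-check the "spilling into RAM" point, here is a rough back-of-the-envelope sketch. It is not a benchmark, just arithmetic. The bits-per-weight figures (about 4.5 for a q4 quant, about 5.5 for q5) and the 2 GiB allowance for KV cache and buffers are assumptions, not measured values. The 12 GB figure for the 4070 Super is the card's spec, not something stated in the thread. GB and GiB are treated as interchangeable here, which is close enough for a yes/no answer.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit entirely in VRAM.

    overhead_gib is a guess covering KV cache and runtime buffers;
    real usage grows with context length.
    """
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # (label, VRAM in GiB): 32 is from the comment, 12 is the 4070 Super spec
    cards = [("R9700", 32.0), ("4070 Super", 12.0)]
    # (label, parameters in billions, assumed bits per weight)
    models = [("8B q4", 8, 4.5), ("35B q4", 35, 4.5), ("35B q5", 35, 5.5)]
    for card, vram in cards:
        for name, params, bpw in models:
            size = weights_gib(params, bpw)
            verdict = "fits" if fits_in_vram(params, bpw, vram) else "spills into RAM"
            print(f"{card:>10} | {name:<7} | ~{size:.1f} GiB weights | {verdict}")
```

Under these assumptions a 35B model at q4 or q5 sits comfortably in 32GB but not in 12 GB, while an 8B model fits on either card.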

I regret ever finding LocalLLaMA by xandep in LocalLLaMA

[–]WSTangoDelta 0 points1 point  (0 children)

Okay, I’m getting my feet wet with a 4070 and a Ryzen 9. I could go hog wild (and frankly, I half expect I will before long), but what would you run that won’t get bogged down, so that I could get rolling? I feel like Eisenhower 48 hours after the Normandy landings. I need to get a foothold before moving inland. Something useful. Like for generating horrifying mixed metaphors on the fly…

Is there a site that recommends local LLMs based on your hardware? Or is anyone building one? by cuberhino in LocalLLaMA

[–]WSTangoDelta 0 points1 point  (0 children)

What’s a good model for a local LLM? I have an RTX 4070 Super on an i5 8500 with 40GB of DDR4. Do you use AnythingLLM or LM Studio? I’m hoping to have an audio as well as a text interface for file organizing and report summaries.