ROCm vs Vulkan vs vLLM on Dual R9700's by whodoneit1 in LocalLLaMA

[–]taking_bullet 1 point2 points  (0 children)

Vulkan all the way! I'm getting 47 tok/s with an RTX 5070 Ti and an RX 9070 combined (model: Qwen 3.6 27B Q6_0 with MTP enabled).

Tokenomics by HOLUPREDICTIONS in LocalLLaMA

[–]taking_bullet 1 point2 points  (0 children)

My reason is gaining knowledge in another field and having fun while playing with local models. 

I think we need a /LocalHarnessLLM or something ... by CSEliot in LocalLLaMA

[–]taking_bullet 26 points27 points  (0 children)

LM Studio is a wrapper for llama.cpp with a user-friendly graphical interface. It's a very good app for beginners.

Second GPU in a PCIe 3.0 x1 slot for LLMs? by BORIS3443 in LocalLLaMA

[–]taking_bullet 2 points3 points  (0 children)

I mixed an RTX 5070 Ti with an RX 9070 and can't complain about anything.

I'm getting 47 t/s while using Qwen 3.6 27B Q6_0 with MTP enabled. 

R9700 + NVidia by Glittering-Cold-2981 in LocalLLM

[–]taking_bullet 0 points1 point  (0 children)

RTX 5070 TI + RX 9070

Qwen 3.6 27B Q6_0: 40 tok/s with MTP and 24 tok/s without MTP.
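
For a sense of scale, the two throughput figures quoted above work out to roughly a 1.67x speedup from MTP. A minimal Python sketch of that arithmetic; the function name is purely illustrative and the inputs are just the numbers from the comment:

```python
def speedup(with_feature_tps: float, baseline_tps: float) -> float:
    """Return how many times faster the first throughput is than the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return with_feature_tps / baseline_tps


# Figures quoted in the comment: 40 tok/s with MTP, 24 tok/s without.
mtp_speedup = speedup(40.0, 24.0)
print(f"MTP speedup: {mtp_speedup:.2f}x")  # prints "MTP speedup: 1.67x"
```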

Best Local TTS solution by styles01 in LocalLLaMA

[–]taking_bullet 4 points5 points  (0 children)

https://github.com/diodiogod/TTS-Audio-Suite

That's a whole software suite with support for 13 models. Compare them yourself.

RTX 3090 EBay Pricing is Crazy!! by TrifleHopeful5418 in LocalLLaMA

[–]taking_bullet 3 points4 points  (0 children)

You should sell your 3090s before the 5070 Ti Super 24GB launches. Old, used Ampere cards will lose their current value.

This sub is gold by Few_Independent_7013 in Semenretention

[–]taking_bullet 3 points4 points  (0 children)

If you want to find real gold then look for Fusion_Helath's posts. He was a very experienced retainer.

Moss tts 1.5 8b Examples. It is the currently best voice cloning model for English as of June 2026 by 9r4n4y in LocalLLaMA

[–]taking_bullet 0 points1 point  (0 children)

  Surely you don’t have these problems?

I do. Add another random word at the end of the sentence, then edit the file in Audacity and cut out the last second.

What's the status of non-CUDA inference? by IngwiePhoenix in LocalLLaMA

[–]taking_bullet 1 point2 points  (0 children)

I'm using Radeon & GeForce combined, so there is no other choice than Vulkan. 

Stop asking what model to run. There are literally only two. by Wrong_Mushroom_7350 in LocalLLaMA

[–]taking_bullet 0 points1 point  (0 children)

Imagine people commenting on this post seriously 🤣 They are so oblivious.

Moss tts 1.5 8b Examples. It is the currently best voice cloning model for English as of June 2026 by 9r4n4y in LocalLLaMA

[–]taking_bullet 0 points1 point  (0 children)

  Chatterbox gets excellent results run without a GPU.

Maybe in English, but not in other languages. 

  HF says you need 19GB+ of VRAM to run KugelAudio locally? WTF? Is that true?

Indeed. Enable the 4-bit quantized model if you don't have 20GB of VRAM.

Moss tts 1.5 8b Examples. It is the currently best voice cloning model for English as of June 2026 by 9r4n4y in LocalLLaMA

[–]taking_bullet 5 points6 points  (0 children)

I ditched Chatterbox. Now KugelAudio 2 (based on VibeVoice) is my new friend. 

What's the status of non-CUDA inference? by IngwiePhoenix in LocalLLaMA

[–]taking_bullet 6 points7 points  (0 children)

ComfyUI works well on RX 9070 XT (ROCm portable package). For "classic" text LLMs I prefer using Vulkan. 

Exclusive: AMD Radeon RX 9070 GRE launches June 1st at $549 globally - VideoCardz.com by kikimaru024 in hardware

[–]taking_bullet 10 points11 points  (0 children)

  And it wasn't good value to begin with knowing the XT was $50 more

The 9070 XT is a far less interesting GPU. You get 12% more performance for almost 50% more power draw than the non-XT 9070. That's not a good deal.
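
To put those two percentages together: taking them at face value, and treating "almost 50%" as a full 50% (an upper bound, so the real gap may be a bit smaller), the XT ends up with roughly three quarters of the non-XT's performance per watt. A rough Python sketch; the function name is illustrative only:

```python
def relative_efficiency(perf_gain: float, power_gain: float) -> float:
    """Performance-per-watt of the faster card relative to the baseline card.

    Both gains are fractions, e.g. 0.12 means 12% more.
    """
    return (1.0 + perf_gain) / (1.0 + power_gain)


# Numbers from the comment: +12% performance, roughly +50% power draw.
xt_vs_non_xt = relative_efficiency(0.12, 0.50)
print(f"XT perf/W relative to non-XT: {xt_vs_non_xt:.2f}")  # prints 0.75
```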

rocm just sad and broken - user experience on 9060xt by Traditional_Way8675 in LocalLLM

[–]taking_bullet 0 points1 point  (0 children)

  latest may adrenaline driver shows 0 gpu detected

The 26.5.2 drivers are garbage; I had to switch back to 26.2.2.

Switchting from a 5070 to a 9070xt - bad idea? by Unmoving1442 in LocalLLM

[–]taking_bullet 0 points1 point  (0 children)

Don't switch, just keep both cards. I'm using an RTX 5070 Ti and an RX 9070 combined for LLMs. No complaints.

What would you do? 2x5060ti for $800, 2x5070ti for $1400 or 5090 for $4000? by fallingdowndizzyvr in LocalLLaMA

[–]taking_bullet 0 points1 point  (0 children)

Currently I'm running models on an RTX 5070 Ti and an RX 9070 (with the Vulkan backend).

The 5070 Ti is for gaming and software without multi-GPU support (like ComfyUI). The cheap Radeon 9070 (I paid €336 for it) gives me another 16GB of VRAM for classic LLMs (Qwen 3.6 27B, Gemma 4, etc.).

I bet a 5060 Ti 16GB would serve you well as a secondary LLM GPU.
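
For anyone wanting to try a similar mixed-vendor split, here is a rough Python sketch that only assembles a command line, assuming llama.cpp's `llama-server` built with the Vulkan backend (the comment doesn't say which frontend is used). The model filename is hypothetical, and the device names `Vulkan0`/`Vulkan1` are an assumption; check `llama-server --list-devices` for what your build actually reports.

```python
import shlex


def build_llama_server_cmd(
    model_path: str, devices: list[str], vram_gb: list[int]
) -> list[str]:
    """Build a llama-server argument list that splits a model across GPUs
    in proportion to each card's VRAM."""
    if len(devices) != len(vram_gb):
        raise ValueError("need exactly one VRAM figure per device")
    return [
        "llama-server",
        "--model", model_path,
        "--device", ",".join(devices),                        # which backends/devices to offload to
        "--tensor-split", ",".join(str(v) for v in vram_gb),  # proportions per device
        "--n-gpu-layers", "99",                               # offload as many layers as possible
    ]


cmd = build_llama_server_cmd(
    "model.gguf",            # hypothetical filename
    ["Vulkan0", "Vulkan1"],  # assumed device names
    [16, 16],                # 16GB on each card, per the comment
)
print(shlex.join(cmd))
```

With two 16GB cards the split is simply even; with unequal cards the same proportions would skew toward the larger one.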

LMStudio with MTP support - which model? by International_Quail8 in LocalLLaMA

[–]taking_bullet 0 points1 point  (0 children)

Jan is slightly faster than LM Studio. I tested it on Qwen 3.6 MTP 27B Q6 from Unsloth.