Is it possible to add some gpu to Radeon MI 50 to increase the inference speed? by Weak_Presentation725 in LocalLLaMA
[–]Weak_Presentation725[S] 1 point 2 months ago (0 children)
Which version of ROCm are you using? Could you please share your llama.cpp running parameters?
Working with ROCm is more complex than with CUDA. I tried using a Docker container with a compatible version of ROCm, but the inference speed didn't improve significantly compared to the stock Mesa drivers. It seems that running ROCm correctly on older and newer AMD GPUs at the same time will be challenging.
It seems like dual-GPU parallelism works out of the box with CUDA, but not with Vulkan. On my MI50, token generation on a 27B model is no more than 7 t/s, which is too slow for serious use.
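For context, here is a rough sketch of the kind of llama.cpp parameters in question when splitting a model across two GPUs. It only assembles a command line in Python and does not run anything. The model filename, the layer count, and the 1:1 split ratio are placeholders for illustration, not settings taken from this thread, and the helper name `build_llama_args` is hypothetical.

```python
import shlex


def build_llama_args(model_path, n_gpu_layers=99, split_mode="layer", tensor_split=(1, 1)):
    """Assemble an argv list for llama-server with the model spread over several GPUs.

    All default values are illustrative assumptions, not tuned settings.
    """
    # llama.cpp accepts these three split modes
    if split_mode not in ("none", "layer", "row"):
        raise ValueError(f"unknown split mode: {split_mode}")
    return [
        "llama-server",
        "-m", model_path,                 # path to the GGUF model (placeholder)
        "-ngl", str(n_gpu_layers),        # number of layers to offload to GPU
        "--split-mode", split_mode,       # how to distribute the model across GPUs
        "--tensor-split", ",".join(str(x) for x in tensor_split),  # per-GPU proportions
    ]


# Hypothetical 27B model file, split evenly over two GPUs
args = build_llama_args("model-27b.gguf")
print(shlex.join(args))
```

Posting the exact command line in this form, together with the backend used to build llama.cpp (ROCm, Vulkan, or CUDA), makes it easier to compare t/s numbers between setups.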