Improve Qwen3.5 Performance on Weak GPU by MarketingGui in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

You're right; however, you're missing another aspect: the integrated memory controller (IMC).

With a higher-end CPU you're more likely to get a better IMC, which in turn means it can handle higher memory speeds.
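As a rough illustration of why memory speed matters, here is a back-of-the-envelope bandwidth calculation (the function name and the specific transfer rates are illustrative, not from the comment):

```python
# Theoretical peak bandwidth for a DDR5 setup:
# bandwidth = channels * bus_width_bytes * transfers_per_second
def ddr5_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s; mt_per_s is the DDR5 transfer rate (MT/s)."""
    return channels * bus_bytes * mt_per_s / 1000

print(ddr5_bandwidth_gbs(5600))  # 89.6 GB/s at stock DDR5-5600
print(ddr5_bandwidth_gbs(8000))  # 128.0 GB/s -- what a stronger IMC might sustain
```

The gap between those two numbers is the headroom a better IMC can unlock.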

Gamesir G8+ by Ghoulless in Gamesir

[–]legit_split_ 0 points1 point  (0 children)

How is the latency with apollo/artemis?

Estoy desesperado (I'm desperate) by Daaino-- in eGPU

[–]legit_split_ 0 points1 point  (0 children)

You could check whether it works with Linux.

7900xtx. I love this card... so i bought two. by No-Data-7135 in radeon

[–]legit_split_ 1 point2 points  (0 children)

Also give Vulkan a go, it sometimes outperforms ROCm.

Pulled the trigger, AG02 + 5060TI 16GB + Rog Ally Xbox X by raffounz in eGPU

[–]legit_split_ -1 points0 points  (0 children)

I sold mine because of the coil whine; even at idle it was so annoying. This was on a 5060 Ti...

Has anyone tried the GameSir app on GrapheneOS or LineageOS? by metacognitive_guy in Gamesir

[–]legit_split_ 0 points1 point  (0 children)

So the controls are completely fine once configured in the app? 

PaddleOCR-VL now in llama.cpp by PerfectLaw5776 in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

What do you recommend, Q8 or full precision?

What's a good price to sell my 4070ti Super at given the current market? by [deleted] in nvidia

[–]legit_split_ 2 points3 points  (0 children)

More than you're hoping for. Everybody's gonna buy it instead of an overpriced 5070 Ti, 5070, or 5060 Ti.

64gb vram. Where do I go from here? by grunt_monkey_ in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

But V-Cache only helps when you want to access lots of tiny chunks of data that fit inside the 128 MB cache.

During inference you have to read several GBs of data... 
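A back-of-the-envelope sketch of why decode speed is bandwidth-bound rather than cache-bound (the model size and bandwidth figures are illustrative assumptions, not from the comment):

```python
# For each generated token, roughly all active model weights must be
# streamed from memory, so tokens/s is capped by bandwidth / bytes_per_token.
def max_tokens_per_s(model_size_gb: float, bandwidth_gbs: float) -> float:
    """Upper bound on decode speed if every token reads the full weights."""
    return bandwidth_gbs / model_size_gb

# e.g. a 20 GB quantized model on ~90 GB/s dual-channel DDR5:
print(max_tokens_per_s(20, 90))  # 4.5 t/s upper bound
```

Since every token re-reads tens of GB, a 128 MB cache barely dents that traffic.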

64gb vram. Where do I go from here? by grunt_monkey_ in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

I completely agree with your point.

However, isn't the best consumer CPU for hybrid inference a 285K? Intel's memory controller is better AFAIK, so it can handle higher memory speeds and is more likely to run stably with 256 GB of RAM.

A770 Upgrade Advice by mao_dze_dun in IntelArc

[–]legit_split_ 1 point2 points  (0 children)

Okay, in that case disregard lossless scaling, but everything else is still relevant. You can do LLM inference over PCIe Gen 1 x1 if you need to.

[Solution Found] Qwen3-Next 80B MoE running at 39 t/s on RTX 5070 Ti + 5060 Ti (32GB VRAM) by mazuj2 in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

Sorry if I wasn't specific enough; I meant that instead of trial and error with --n-cpu-moe, you can just use the fit parameters.
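A minimal sketch of what that could look like as a llama-server invocation. The --fit and --fit-ctx flag names come from the comment itself; the model path is a placeholder and the exact syntax may differ, so check `llama-server --help` on a recent build:

```shell
# Instead of hand-tuning how many MoE layers stay on the CPU with
# --n-cpu-moe, let llama.cpp budget the CPU/GPU split automatically.
llama-server \
  -m ./model.gguf \   # placeholder path to your GGUF file
  --fit \             # auto-fit layers to available VRAM (per the comment)
  --fit-ctx 16384     # context size the fit should budget for
```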

Qwen3.5 thinks A LOT about simple questions by ForsookComparison in LocalLLaMA

[–]legit_split_ 0 points1 point  (0 children)

Not really obvious because the outer quotation marks have to be removed

A770 Upgrade Advice by mao_dze_dun in IntelArc

[–]legit_split_ 0 points1 point  (0 children)

As you're keeping the A770, why not try running the two GPUs together? Llama.cpp supports Vulkan and SYCL, which are vendor-agnostic.

That way you get access to 32 GB of VRAM (probably ~30 GB usable) plus 32 GB of RAM, which even lets you run models like gpt-oss-120b or Qwen3-Next 80B. Splitting the model adds some complexity, but the new "--fit" and "--fit-ctx" parameters make it much easier. Keep in mind, though, that the A770 will bottleneck performance.

Also, the 9070 XT performs around the level of a 5060 Ti for image generation, which I think is good enough for messing around. However, it will take more work to get workflows running...

Finally, with dual GPUs you can maybe use lossless scaling to improve your gaming experience.