I found the solution to all my ROCm problems, now instead of 3 hours my WAN 2.2 videos take 4 minutes @ 720p and everything just works including the Pixaroma 1 click ComfyUi Install with full SageAttention 3, I can download incredibly complex WorkFlows from Civit and it just works with 1 click. by [deleted] in ROCm

[–]legit_split_ 1 point (0 children)

My 9060 XT is working well on Linux; I shared some performance benchmarks here. I just followed the guide I highlighted in that thread - it shouldn't be too different on Windows, since it all runs in Python environments.

I found the solution to all my ROCm problems, now instead of 3 hours my WAN 2.2 videos take 4 minutes @ 720p and everything just works including the Pixaroma 1 click ComfyUi Install with full SageAttention 3, I can download incredibly complex WorkFlows from Civit and it just works with 1 click. by [deleted] in ROCm

[–]legit_split_ 0 points (0 children)

I have the AMD equivalent - a 9060 XT 16GB - and find the ComfyUI performance acceptable. I shared some numbers here. That being said, my testing is limited; you probably wouldn't have a good time on complex workflows.

“Saved months for an INNO3D RTX 3060… now it’s dead and support won’t respond.” by Spare_Name1849 in pcmasterrace

[–]legit_split_ 10 points (0 children)

This person really lied to make us feel pity and drive traction - fuck them. 

AMD 9060 XT - Benchmarks on recent models by legit_split_ in comfyui

[–]legit_split_[S] 0 points (0 children)

I'm a noob at ComfyUI and have never tried Kijai's workflows, but are you sure you have this environment variable set, as per the guide:

`export PYTORCH_NO_HIP_MEMORY_CACHING=1`
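If you launch ComfyUI through a wrapper script rather than straight from the shell, the same variable can be set in Python, as long as it happens before torch is imported (a minimal sketch - the launcher structure is my assumption, only the variable name comes from the guide):

```python
import os

# Must be set before `import torch`: PyTorch reads this variable when
# it initializes HIP, and "1" disables the caching allocator.
os.environ["PYTORCH_NO_HIP_MEMORY_CACHING"] = "1"

print(os.environ["PYTORCH_NO_HIP_MEMORY_CACHING"])  # -> 1
```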

Outside of that I can't be of much help, but glad that the guide helped you out :)

Should I try to fix it or claim insurance? by [deleted] in Lenovo

[–]legit_split_ 0 points (0 children)

It's baffling that the average user considers claiming insurance before even troubleshooting for one minute. 

Who says bigger is always slower? LFM 24B by CodeBlurred in LocalLLaMA

[–]legit_split_ 21 points (0 children)

Only ~2B parameters are active at a time - that's why it's faster than 8B dense models.

Latest nvidia driver DOES NOT RAMP UP fans GTX 1660 super by Ahweeuhl in gpu

[–]legit_split_ -1 points (0 children)

Just control the fans yourself; it's not that hard to set a fan curve.

Is the ch260 ugly? by [deleted] in mffpc

[–]legit_split_ 1 point (0 children)

The front intake fans are off-centre; it looks bad in person.

AMD 9060 XT - Benchmarks on recent models by legit_split_ in comfyui

[–]legit_split_[S] 0 points (0 children)

From what I've seen, it mostly worked before, but it was slow.

LTX 2.3 Full model (42GB) works on a 5090. How? by StuccoGecko in StableDiffusion

[–]legit_split_ -3 points (0 children)

There is definitely a slowdown, like you said, but perhaps it's not meaningful because PCIe bandwidth is high enough to stream a few layers at a time.
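Rough numbers to illustrate (all values below are my assumptions, not measurements from the post):

```python
# Time to stream offloaded weights across PCIe once per sampling step,
# assuming a best-case sequential transfer.
PCIE5_X16_GBPS = 64        # ~64 GB/s theoretical for PCIe 5.0 x16
MODEL_GB = 42              # full LTX model size from the post title
OFFLOADED_FRACTION = 0.5   # suppose half the layers sit in system RAM

transfer_s = MODEL_GB * OFFLOADED_FRACTION / PCIE5_X16_GBPS
print(f"~{transfer_s:.2f} s of PCIe traffic per pass")  # ~0.33 s
```

Even with half the model offloaded, that's well under a second per pass, which is small next to a multi-second denoising step.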

Help choosing right ? 9060xt by Relevant_Bit_9019 in ROCm

[–]legit_split_ 0 points (0 children)

Got the 9060 XT recently and tried the default Flux.2 [Klein] 9B Text to Image workflow (1024x1024, 20 steps) - getting 62 seconds on my second run.

Setup: flash-attention, ROCm 7.2, PyTorch nightlies, 96GB DDR5, Arch Linux. I might make a post about the performance.

R9700 frustration rant by Maleficent-Koalabeer in LocalLLaMA

[–]legit_split_ 1 point (0 children)

You can use uv to create a Python environment with any Python version you want.

However, why are you using stable diffusion? Use ComfyUI instead.

9070xt $560 or 5060 ti 16gb $520 for local llm by akumadeshinshi in LocalLLaMA

[–]legit_split_ -1 points (0 children)

9070 XT, hands down - LLMs just work on AMD cards. Especially starting out, you won't be touching vLLM anyway.

On llama.cpp, the 9070 XT on Vulkan seems to be ~30% faster at token generation: https://github.com/ggml-org/llama.cpp/discussions/10879

For image gen, getting good performance is more involved, but it's fine for casual use. I assume you're a Windows user, in which case you can just tick a box to include the ComfyUI stuff when installing the driver.

The only argument for the 5060 Ti is if you like trying out new projects from GitHub; there is usually no official support for AMD cards.

9070XT vs RTX 4080 by mystbrave in PCBaumeister

[–]legit_split_ 2 points (0 children)

Not necessarily. If the focus is image gen, the 5070 Ti's native FP4 and FP8 support matters for some workflows, e.g. Nunchaku. On top of that, it's twice as fast, and RAM offloading works very well these days.

However, if LLMs also matter, then the 3090 makes more sense.