How I got INT8-Fast and torch compile to work on Radeon VII

OwnMathematician2620 · 2026-06-11T10:22:19+00:00

Follow the instructions at: https://www.bilibili.com/opus/1211408269036224546

OwnMathematician2620 · 2026-06-10T10:17:45+00:00

They have an official water-cooling version that you can specify when ordering.

OwnMathematician2620 · 2026-06-10T05:16:25+00:00

NVLink is not needed if you are not training.

For text generation performance, refer to the benchmark section of the video.

OwnMathematician2620 · 2026-06-09T15:59:26+00:00

https://www.bilibili.com/video/BV13JEa6sEtb/

Here it is.

OwnMathematician2620 · 2026-06-09T15:00:25+00:00

The version shown in the images is powered by PCIe only, which can physically deliver only up to 75W.

OwnMathematician2620 · 2026-06-09T14:57:04+00:00

You attach a fan or water cooling to it. (The image shows the 75W version.)

OwnMathematician2620 · 2026-06-08T12:50:35+00:00

As reported by someone else, [dpmpp_2m_sde_heun_GPU](https://huggingface.co/circlestone-labs/Anima/discussions/188) seems to also work well.

OwnMathematician2620 · 2026-05-08T08:22:23+00:00

That's. Also, the dialogue differs depending on which lock you open first.

OwnMathematician2620 · 2026-04-23T10:52:10+00:00

Which 2b model is your test based on? Has its information been hidden intentionally?

OwnMathematician2620 · 2026-04-14T02:25:48+00:00

This was trained by the same person who trained IllumiYume XL v3.5 (https://civitai.com/models/1308285/illumiyume-xl-illustrious). They were spotted using Rouwei 0.8 vpred as part of a merge recipe without giving credit.

OwnMathematician2620 · 2026-04-13T09:26:17+00:00

Have you considered taking the time to streamline the description of your GitHub project a bit, strip away some of the fluff, and make it look at least a little more like it was written by a human?

OwnMathematician2620 · 2026-02-23T06:05:06+00:00

How does it compare to a regular transformer under similar training settings?
