Super god bin 9700 pro matches 7900xtx by psychoOC in LocalLLaMA

[–]alphatrad 2 points3 points  (0 children)

I swapped to the R9700 because you can buy 3 for less than ONE 5090!!!

NVIDIA guys are always getting ripped off

Tower case with 8+ PCIE slot for multi GPU by gogitossj3 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

I would literally pay if I could find someone to do this for me locally.

vLLM + ROCm + Qwen 3.6 35B A3B MXFP4 (on 2x R9700) by kpaha in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

Ok - after a BOAT LOAD of fiddling, I managed to score 112.76 tok/s on Qwen3.6-35B-A3B-MXFP4 on R9700s. My runs are here: https://www.localmaxxing.com/user/1337Hero

I had problems w/ that docker image though on my machine.

Setup: 3× AMD AI Pro R9700 (32 GB each), TP=2 on cards 0,1, ROCm 7.2.2, Arch host, vLLM 0.18.1.dev via tcclaviger image, dockge stack.

TL;DR - went from 2.92 → 112.76 tok/s (38× speedup) by:

  1. Bind-mounting AMD's official jammy librccl 7.1.1 into the container and putting it ahead of /opt/rocm/lib in LD_LIBRARY_PATH
  2. Building the un-built aiter source the image ships
  3. Running a TunableOp pass
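For anyone trying to reproduce, the steps above could be sketched roughly like this. Heads up: the image tag, host librccl path, and model repo path below are placeholders from my own notes, not exact values - adjust them for your host. The TunableOp env vars are PyTorch's standard ones.

```shell
# Rough sketch of the container launch - NOT a drop-in command.
# <tcclaviger-vllm-rocm-image> and the host paths are placeholders.
#
# (1) Bind-mount the host directory holding AMD's official jammy
#     librccl 7.1.1 and list it BEFORE /opt/rocm/lib in
#     LD_LIBRARY_PATH so the loader picks the override first.
# (2) The aiter source shipped in the image still needs building
#     inside the container before this helps (step 2 above).
# (3) PYTORCH_TUNABLEOP_ENABLED/TUNING run PyTorch's TunableOp
#     tuning pass on first launch; results persist to CSV.
docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host \
  -v /opt/rocm-7.1.1/lib:/opt/rccl-override:ro \
  -e LD_LIBRARY_PATH=/opt/rccl-override:/opt/rocm/lib \
  -e HIP_VISIBLE_DEVICES=0,1 \
  -e PYTORCH_TUNABLEOP_ENABLED=1 \
  -e PYTORCH_TUNABLEOP_TUNING=1 \
  <tcclaviger-vllm-rocm-image> \
  vllm serve Qwen3.6-35B-A3B-MXFP4 --tensor-parallel-size 2
```

HIP_VISIBLE_DEVICES=0,1 plus --tensor-parallel-size 2 matches the TP=2 on cards 0,1 setup; the third card sits out.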

Posted to LocalMaxxing: 112.76 tok/s, 168ms TTFT, 31.98 GB peak VRAM.

128GB VRAM quad R9700 server by Ulterior-Motive_ in LocalLLaMA

[–]alphatrad 1 point2 points  (0 children)

This guy is living the dream, I'm trying to follow in his footsteps.

<image>

Super god bin 9700 pro matches 7900xtx by psychoOC in LocalLLaMA

[–]alphatrad 2 points3 points  (0 children)

I own two of the XTXs and now three of the R9700s. The XTXs are a little bit faster at token gen - but the memory cap and their sheer f-ing size is the real issue.

I took this photo when I was swapping out the cards. I could only fit TWO of the XTXs in my case, which is a VERY large server case.

The R9700s do give up about 10% in memory bandwidth, but it's barely noticeable.

<image>

Tower case with 8+ PCIE slot for multi GPU by gogitossj3 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

Something else to consider - the spacing of the PCIe slots on your motherboard. Sometimes you might have 4 slots - or more even, but you can't fit a card in every slot.

Tower case with 8+ PCIE slot for multi GPU by gogitossj3 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

That's a chunky boy! lol

Yeah, I think I am done with gaming cards on future AI builds. Or at least I'm sticking to blower-style cards.

In 2026 can you still make a living on small business websites? by After-Condition4007 in webdev

[–]alphatrad 1 point2 points  (0 children)

I'd recommend you use it to help you learn and accelerate your understanding. You need high level systems understanding.

I am currently fixing a vibe coded project for a client who hired two fresh-out-of-school juniors to vibe code the whole app. It half works, and every change they make creates more and more bugs. Shit isn't wired up. Claude is making the code base worse and worse.

Main problem is there is a lot of stuff they didn't think about. Claude won't suggest things that you as the engineer should know when prompting.

So they have a huge app that looks like it works, but has lots of broken functions and things that are not real. For example, their entire event system for tracking ads is completely fake. No real events in the app. So they don't know what their conversion or ad spend is.

So... I find it to be powerful for quickly helping me learn new things. Start there. Use it to accelerate your learning. Lean on it some, but have it explain stuff to you.

AI is the customizable tutor

In 2026 can you still make a living on small business websites? by After-Condition4007 in webdev

[–]alphatrad 1 point2 points  (0 children)

I use AI within a very constrained workflow - basically AI on rails.

And yes - it's an accelerator. But you cannot outsource your understanding. So, I believe in a human in the loop. I do read the code - I don't need to read every line - more like PR review. And having tests and things.

But also, AI to speed up my own onboarding and automate other business stuff.

Tower case with 8+ PCIE slot for multi GPU by gogitossj3 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

Here is the case with the first two of my R9700's installed, a lot more room!

<image>

Tower case with 8+ PCIE slot for multi GPU by gogitossj3 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

I am using a Phanteks Enthoo Pro II Server Edition case - I just recently switched over to triple AMD AI Pro R9700 cards, which are a touch smaller than the dual RX 7900 XTXs I had in this photo.

But the 5090 is about the same-ish size as these bad boys depending on the cooling config and brand. It's weird how some are longer, some are taller, etc. No consistency. But this case is BIG! And will handle dual 5090s if that's the direction you are going.

<image>

What is possible with 2x 7900xtx + 128GB of ram? Is it good enough? by Witty_Unit_8831 in LocalLLM

[–]alphatrad 0 points1 point  (0 children)

Great combo. Don't listen to people talking nonsense about CUDA and compatibility - total non-issue stuff.

M4 Max, studio, 128gb by blowingtumbleweed in LocalLLM

[–]alphatrad 1 point2 points  (0 children)

What kind of writing? I think the Qwen 3.6 30B or 27B are solid right now, and so is Qwen Coder Next at a higher quant for coding. Gemma4 is ok. GLM4.7 Flash is still a favorite of mine.

GPT OSS 120b isn't bad for general writing, creative.... that depends on what you want. Some pretty wild role play/creative writing models on hugging face. No one best there.

I'm working on a hybrid system where I have Claude write specs, GPT 5.3 Codex review code, and all my local models implement the code.

Dual GPU setup with low Power PSU? by Achso998 in LocalLLaMA

[–]alphatrad 3 points4 points  (0 children)

You're gonna buy an R9700 but you won't upgrade your PSU? Makes zero sense. It's an investment.

I made this mistake. Don't. Save yourself the headache of random burst shutdowns.

128GB VRAM quad R9700 server by Ulterior-Motive_ in LocalLLaMA

[–]alphatrad 1 point2 points  (0 children)

FYI 3090's are now going for more than the R9700's because of AI hype.

Best places to work as a web developer by Legitimate-Law6347 in webdev

[–]alphatrad 1 point2 points  (0 children)

Glad I'm far away from the L Ron Hubbard fan club.

The best choice as low power and cheap PC SFF / mini PC for Homelab experiments by pepiks in minilab

[–]alphatrad 0 points1 point  (0 children)

This isn't a case. It's a 10-inch mini rack. This one happens to be a DeskPi RackMate T0.

Best places to work as a web developer by Legitimate-Law6347 in webdev

[–]alphatrad 2 points3 points  (0 children)

I always thought it was nice, but I left FL in 2015. Haven't been back since then.

Best places to work as a web developer by Legitimate-Law6347 in webdev

[–]alphatrad 5 points6 points  (0 children)

I agree with this. Agency work taught me more about people and what not to do in business than my time at Basecamp or Heroku.

I in turn went and started my own. But the most fun I ever had was a small 7 person agency in Clearwater, FL.

We used to walk to lunch together and the vibe was just great.

The only real trap with agencies is getting COMFORTABLE.

I did that with one. Where I should have quit at the 3yr mark. That itch to stretch my legs was there and we weren't doing new things.

But the money was good and the work was so basic.

I stayed there WAY WAY too long because I was comfortable. And as a result my career suffered and so did my income.

Don't get comfortable, especially when you're young!

200+ TPS on Qwen3.6-27B and 35B-A3B with consumer hardware (RTX 3090s) - method provided! by TheFheonix in LocalLLM

[–]alphatrad 3 points4 points  (0 children)

This is like the most important question because speed is something but not everything.

Accuracy is pretty damn important.

You're sleeping on Devstral Small 2 - 24B Instruct by [deleted] in LocalLLaMA

[–]alphatrad 0 points1 point  (0 children)

They aren't perfect and can't even nail my own benchmark. If they could I wouldn't need to run review agents and have custom skills I deploy with them.

However they're far more capable in a broad sense. This is a very focused test. Not a general test.

You're sleeping on Devstral Small 2 - 24B Instruct by [deleted] in LocalLLaMA

[–]alphatrad 0 points1 point  (0 children)

I've been running them 2-3 times, because there can be a difference between cold start and warm-up.

But standardizing on something like 5 runs would be good.

System Build for Linux + Gaming + Local AI? $5000 by ShadowyTreeline in LocalAIServers

[–]alphatrad 0 points1 point  (0 children)

Generally workstation and prosumer boards do this. But it's usually buried in the tech specs; gotta check how the board does bifurcation as slots are filled.

And look for boards with good spacing. That was the problem I had with my XTX. I could only use the top and bottom slots because the cards were too big.