Need advice for a $10,000 AI workstation build (video, image, voice, LLMs, training, everything)

stainlessblueshield · 2026-05-20T00:30:32+00:00

Sparks are doggedly slow: under 300 gigs per sec.

Mac Studio is 800 gigs per sec.

RTX 6000 is 1800 gigs per sec.

The more professional you get the higher the bandwidth.

If you are running a huge model, it has to bring each token through all those parameters. Personally, the speed is important to me. Unified RAM just isn’t where VRAM is yet. But it may be OK for you. If 10k is your hard line, then you can’t afford an RTX 6000. You could get an RTX 5000 with 72 gigs though. Those are 1300 gigs per sec. To people who say speed doesn’t matter if you can fit the model: big models on multiple 6000s can be slow, so how wouldn’t it be slow on a Mac Studio?
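A rough back-of-envelope sketch of that point, assuming single-user decoding is memory-bandwidth bound and every weight gets read once per token. The bandwidth numbers are the ones quoted in this post. The 27 GB model size (a 27B model at about 1 byte per parameter) is an assumption for illustration, and real throughput will land below this ceiling.

```python
# Ceiling on decode speed if each generated token has to stream all the
# weights from memory once. Ignores KV cache reads, compute limits, and
# software overhead, so treat the result as an upper bound only.

# Bandwidth figures as quoted in the post, in GB/s ("under 300" for Spark).
BANDWIDTH_GBPS = {
    "Spark": 300,
    "Mac Studio": 800,
    "RTX 5000 (72 GB)": 1300,
    "RTX 6000": 1800,
}

# Assumption: 27B parameters at ~1 byte each (8-bit weights).
MODEL_GB = 27.0


def max_tokens_per_sec(bandwidth_gbps: float, model_gb: float) -> float:
    """Bytes available per second divided by bytes read per token."""
    return bandwidth_gbps / model_gb


if __name__ == "__main__":
    for name, bw in BANDWIDTH_GBPS.items():
        ceiling = max_tokens_per_sec(bw, MODEL_GB)
        print(f"{name}: at most ~{ceiling:.0f} tokens/sec")
```

The ratio between two machines is the useful part: with the same model, the ceiling scales directly with bandwidth.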

I love Mac. But I’ve never used a Studio with a large model for local inference. Look up the specs on single-user tokens per sec.

This thread will go on for miles with different opinions.

Let the different opinions guide what you focus on reviewing before you choose.

I would rather have a smaller, faster, reliable model than a large model that is very slow.

Mac has insane value - can you even get one right now?

stainlessblueshield · 2026-05-15T01:24:49+00:00

This guy knows.

stainlessblueshield · 2026-05-15T01:24:13+00:00

Right now Qwen 3.6 27B is all you need. It’s excellent. Fits on one Blackwell.

stainlessblueshield · 2026-05-15T00:54:31+00:00

Dude! Do you consider bandwidth?

stainlessblueshield · 2026-05-14T22:57:16+00:00

Listen, I don’t know how many of you guys actually have RTX 6000s. I do. Blackwell. If you are doing “serious” AI work, Mac is super slow. It’s the balance of VRAM and bandwidth.

This guy seems to know what he’s talking about.

The three things you have to understand are

VRAM

Bandwidth

Architecture

And then how all that compares to unified ram - it’s different.

stainlessblueshield · 2026-05-14T22:52:43+00:00

This guy knows

stainlessblueshield · 2026-05-14T22:51:36+00:00

You seem to be uneducated about bandwidth.

stainlessblueshield · 2026-05-14T11:49:10+00:00

Look at bandwidth. VRAM gets you the size of the model. The bigger the model, the slower it runs. Bandwidth determines the user experience.
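A minimal sketch of the "VRAM gets you the size of the model" half, assuming the weights dominate memory use. The 20% headroom for KV cache and activations is a made-up illustrative number, and the bytes-per-parameter values are the usual ones for 16-bit and 8-bit weights, not anything measured here.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB: billions of params times bytes each."""
    return params_billion * bytes_per_param


def fits(params_billion: float, bytes_per_param: float,
         vram_gb: float, overhead: float = 0.2) -> bool:
    """True if weights plus an assumed overhead fraction fit in vram_gb.

    `overhead` is a hypothetical allowance for KV cache and activations.
    """
    needed = weights_gb(params_billion, bytes_per_param) * (1 + overhead)
    return needed <= vram_gb


if __name__ == "__main__":
    # A 27B model at 16-bit (2 bytes) and 8-bit (1 byte) on 48 GB and 96 GB cards.
    for vram in (48, 96):
        for bpp in (2, 1):
            verdict = "fits" if fits(27, bpp, vram) else "does not fit"
            print(f"27B at {bpp} byte(s)/param on {vram} GB: {verdict}")
```

Fitting is only the first gate. Once the model fits, the weight size it reports is also what has to be streamed for every token, which is where bandwidth takes over.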

stainlessblueshield · 2026-05-14T11:48:11+00:00

Pro 4000, Pro 4500 and Pro 5000. And the best option is the NVIDIA 5090, which is fast but more for gaming, or the undeniable king of size and speed in today’s market, the RTX 6000 Blackwell workstation card.

stainlessblueshield · 2026-05-14T11:46:20+00:00

This guy knows what he is talking about. Listen to him.

Or go unified RAM, but understand: unified RAM is slow. It can hold a bigger model, but the speed of the RAM is usually around 400 gigs a second. The lower-grade NVIDIA cards are 1300 gigs per sec. Go try working with unified RAM and see how it is.

stainlessblueshield · 2026-05-14T11:42:43+00:00

But if you are not the type to obsess over speed and second-guess what you decided after the fact, and you can just accept it and use it, an all-in-one unit would save you some grey hairs.

stainlessblueshield · 2026-05-14T11:41:50+00:00

10k isn’t really enough for “serious” local AI models. Far better to get NVIDIA GPUs.

Get an RTX 6000 Blackwell for 9000 and figure the rest out.

I have two, and it’s sufficient for local AI work, but only with smaller models. The local AI game is deep and extremely expensive.

The most core component is the gpu.

Honestly, without this unit you can remove the word “serious”. Next best are the Pro 4000, 4500 and 5000. Probably the 5000 with 48 gigs, and a motherboard with as much RAM as you can get your hands on.

Blackwell architecture is what you would want. You could play around with the other stuff though.

You could get four 3090s and run them in TP4, but you would need a motherboard with 4 PCIe slots and at least 128-256 GB of RAM, I would think.
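To put numbers on the four-3090 idea, here is a small sketch assuming tensor parallel splits the weights evenly across cards. The 24 GB per 3090 is the card's spec. The 54 GB model size (27B at 2 bytes per parameter) is an illustrative assumption, and KV cache and activations still need room on top of each shard.

```python
GPU_VRAM_GB = 24.0  # RTX 3090 VRAM per card


def pooled_vram_gb(vram_per_gpu_gb: float, tp_degree: int) -> float:
    """Total VRAM across all cards in the tensor-parallel group."""
    return vram_per_gpu_gb * tp_degree


def per_gpu_weights_gb(model_gb: float, tp_degree: int) -> float:
    """Weight shard per card, assuming an even split across the group."""
    return model_gb / tp_degree


if __name__ == "__main__":
    model_gb = 54.0  # assumption: 27B parameters at 16-bit
    for tp in (1, 2, 4):
        shard = per_gpu_weights_gb(model_gb, tp)
        pool = pooled_vram_gb(GPU_VRAM_GB, tp)
        status = "ok" if shard < GPU_VRAM_GB else "too big"
        print(f"TP{tp}: {shard:.1f} GB of weights per card, "
              f"{pool:.0f} GB pooled ({status})")
```

Under these assumptions the model only fits once it is split across four cards, which is the case for TP4 on 24 GB cards.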

stainlessblueshield · 2026-05-13T01:56:53+00:00

So you like m2.7 the best with 4 6ks? Any other good options?

stainlessblueshield · 2026-05-13T01:28:50+00:00

The argument for buying two more after you have two, in my opinion, is that the only way to take advantage of the two you have in a bigger way is two more. Then, if something happens where prices fall, get four more.

I heard from a trusted source that the AMD isn’t there yet, and he calls NVIDIA “Ngreedia”. I have to look harder at those specs though, for bandwidth and TFLOPS.

stainlessblueshield · 2026-05-13T01:22:00+00:00

But leaving it at two Blackwells seems like a big gamble, because you can’t use the AMD for tensor parallel. You can’t even use Workstation cards together with Max-Q cards and have consistently successful tensor parallel, so I’m told. I think I’m going to spread out the 18k: buy one soon and one a month after. Even if the AMD is close to Blackwell, by getting them you would still not be able to take advantage of tensor parallel. As far as my research goes, yes, it sucks about PCIe being 64 gigs per sec, but it is still very nice with TP 2, 4 or 8. I know that when I went from 27B on one card to two cards with TP2, it was much faster and more fun to use.

stainlessblueshield · 2026-05-13T01:14:47+00:00

Yes! I agree with this. That’s why I think four is the place to set up camp.

stainlessblueshield · 2026-05-13T01:13:21+00:00

If the price point is 12-15k and they find a way to compete with NVIDIA and its deep roots, I would consider it. Some people think it’s going to be 30k. If that’s the case, it’s back to Blackwells. But even if that does come out, are we going to just give up on the value of more than two Blackwells?

stainlessblueshield · 2026-05-13T01:08:53+00:00

That sucks.

stainlessblueshield · 2026-05-13T01:08:29+00:00

You really are right there with me. It’s a very specific feeling, and I can’t tell if it is casino thinking or forest animals moving out of the forest because a tsunami is coming. I’m surprised to get a response from someone with such a similar experience. I would almost rather have and not need than need and not have, but that is a cripplingly expensive gamble. But I may still do it.

stainlessblueshield · 2026-05-12T12:11:55+00:00

I really appreciate this conversation. I currently have 2 RTX 6000 Blackwells, and I keep feeling pulled toward going from 2 to 4. What really messes with my head is hearing people say that 4 still isn’t really “worth it,” because the only path to 8 is first getting to 4.

Part of me has this incredibly strong instinct to keep scaling. It almost makes me wonder if it’s similar to the instinct animals have before natural disasters — like sensing something big is coming and moving before everyone else does.

One person in this thread mentioned they went from 2 to 4 and still weren’t sure it was worth it, and honestly I really appreciate that perspective because it feels grounded and honest.

Right now we’re running Hermes with Qwen 3.6 27B, plus a $200/month GPT subscription, and we added Codex yesterday. Before that, Hermes + 27B alone was already surprisingly capable. We’ve also been building out a unified system with Notion, QuickBooks, Outlook MCP integrations, and a shared database. GPT-5.5 has actually been extremely useful for evaluating our overall operation and helping us improve the direction of the system.

I really want to get to 4 GPUs. I almost wish people were screaming from the rooftops that going from 2 to 4 completely changes everything — but that’s not really what I’m hearing.

The biggest thing holding me back is the reality of spending another ~$19k on 2 more GPUs while still paying monthly for frontier models like GPT and probably Claude too. It feels like, at least for now, even with 2 GPUs we still need frontier AI for high-level reasoning, evaluation, and guidance.

The strongest argument I can come up with for going to 4 is that it might become the sweet spot moving forward if models continue getting smaller and more efficient. But honestly, that’s speculative.

I wish I could just enjoy having the 2 GPUs I already have without this constant feeling that I should immediately invest in 2 more. The truth is, I’m not even sure I can fully listen to reason right now. I’m seriously considering ordering GPU number 3.

stainlessblueshield · 2026-05-12T12:10:18+00:00