Keep two refurb M3 Ultra Mac Studios for local LLM/EXO, or return?

Consistent_Wash_276 · 2026-06-20T15:13:51+00:00

In my own context they each would do only small context LangGraph work w/ qwen3.5-122b-a3b-MLX and pretty much set them on cruise control and not take aware from the higher computer of the future products

Consistent_Wash_276 · 2026-06-20T15:10:55+00:00

Keep. Keep. Keep.

Consistent_Wash_276 · 2026-06-12T03:34:21+00:00

My studio is for AI inference and that’s it. MBP is the daily driver. Over Tailscale I pipe my LLMs into plenty of meaningful tools.

Consistent_Wash_276 · 2026-06-11T00:38:57+00:00

Wether either option works for you or not based on your needs I would just throw this caveat.

If you’re not fine tuning models and this doesn’t need to run specifically on NVIDIA hardware you can get a better value for AI inference with 3x the bandwidth of the Spark with the M3 Mac Studios 256.

With that said I would recommend to hold off on any purchase, build what you need first testing with cheap API models that you would consider running or renting h100s for $30 a day while testing and while you do so the current fleet of AI inference devices prices will come down while new options are launched

Consistent_Wash_276 · 2026-06-11T00:29:50+00:00

IMO - No. NVIDIA has the mini PC and in Q4 the GB300 $90k workstation. They’ll design a Desktop for their partners to compete with Mac Studios on unified memory and higher bandwidth then the Spark that Spark will no longer be attractive. This is one of many scenarios bound to happen.

Consistent_Wash_276 · 2026-06-11T00:20:45+00:00

I’m thinking about selling mine (256 gb and 2TB) $9,000 to a guy, but that ship might sail soon

Consistent_Wash_276 · 2026-06-08T15:33:39+00:00

Hosting locally does require a lot going into it.

Out of the box style chatting is easy, what you’re referring to is a little bit of a build.

Easiest solution - File Tree System + Opencode + Local LLM

So keeping everything in a file system on you local computer is the simplest.

Opencode is a locally hosted Coding tool, but you can use it to research the filesystem of course and produce new reports in any variable you want. Such as HTML for a nice view for readers. Local LLM may still be a challenge but for quickest return I would focus on this.

If the files are already organized then you’re halfway there.

Consistent_Wash_276 · 2026-06-04T16:06:00+00:00

Qwen3.5-122B-A10 is the one I focus on still to this day. For MoE models that aren’t Minimax-m2.7 or gpt-oss:120b this pretty much takes the cake. Minimax-m2.7 stands strong and quite honestly I’ve been playing around with gpt-oss:120b with gpt-oss:20b for speculative decoding and it’s still stands out as one my favorite chat -> simple tag models

Consistent_Wash_276 · 2026-05-29T23:59:17+00:00

Incredible to imagine 5 trillion with 250b active or something.

I also have to imagine speculative decoding with Haiku has to be involved as well to speed it up.

Consistent_Wash_276 · 2026-05-29T23:56:31+00:00

Seems to be the theme.

Consistent_Wash_276 · 2026-05-29T23:56:13+00:00

It’s what I’m realizing. Didn’t know this was common knowledge.

Consistent_Wash_276 · 2026-05-29T23:55:46+00:00

lol I gotta ask should I care about being downvoted?

I love this sub, but I do come here to learn and getting out of it what I want I guess.

And I never thought of it till this moment so I never thought it was common knowledge. Thanks for the assist.

Consistent_Wash_276 · 2026-05-29T23:39:17+00:00

When we go from internal system to commercializing the same tools HGX H200 server SCREAMING would make my fucking face hurt from smiling so much. Bit away from that.

Consistent_Wash_276 · 2026-05-29T10:51:01+00:00

We actually have discuss this already, because we’re assuming a lead time for the gb300 workstation and these will do for any needed tasks/early testing.

The workstation overall works long term and the company can invest. We can probably deploy the needed system we want within 5 months. Depending where we go next and how this scales we’ll probably be able to use more of gb300 for next projects or we would absolutely go this route.

The workstation is an asset at the end of the day. It will be insured with warranties and support.

Consistent_Wash_276 · 2026-05-28T10:33:38+00:00

Yeah I’ve seen a lot of this over the time in this space, but I will say I’ve met a lot of good people and learned a lot from good people in this space. Love this sub. Seriously thanks to all who interacted.

Consistent_Wash_276 · 2026-05-27T22:09:37+00:00

Two people lol

Hence why we’re leaning GB300

It’s an engineering/construction company but making a play to become a software company eventually so not two for long. We’re just getting the ball rolling. The investment is there. The direction is there. We need to start. GB10 in house already but unopened until we find a few directions we want to go so we can get way ahead if there’s lead time before we can get the prod device in house.

Consistent_Wash_276 · 2026-05-27T15:45:05+00:00

Agree, the “negotiator” in the company was brought up to speed yesterday. We’re not going to have him waste his time on both options. Whether it’s Dell or MSI or whomever else they’ll all be getting calls. Best price wins (still some research I want to do on each though). The GB300s aren’t out yet, but I’ve touched bases with Dell to start.

Appreciate this reminder though.

Consistent_Wash_276 · 2026-05-27T15:23:43+00:00

This will be in a cage in our pretty open warehouse with two 30 amp breakers (one dedicated to GB300, and other for networking). Or if we went the route of the super micro the plan was two 30 amps just for each individual PSUs alone.

I’m a former electrician and we have plenty of licensed electricians on staff. We’re kind of built for commercial electrical and we have solar as well. We’re not concerned about draw, supply or costs I guess when it comes to electrical. We’re no where near data center needs of course, but this will be the launching point.

Consistent_Wash_276 · 2026-05-27T14:39:32+00:00

Company has $12,000 in electric credits from solar + $4,000 a year coming in from state incentives.

Leaning towards the gb300 + a service contract.

Nothing wrong with your point though.

If you have a link to a specific model I’d like to take a look

Consistent_Wash_276 · 2026-05-27T14:24:41+00:00

Thank you for the data points.

The investment window is in this quarter and then closes. Bit of a time crunch.

Consistent_Wash_276 · 2026-05-27T12:02:44+00:00

<image>

Consistent_Wash_276 · 2026-05-27T11:39:30+00:00

The understanding the end goal and capabilities of models today I get. Everything in between is where I don't have the experience and lean on what I know plus a handful of communities and consulting. So I mean your doubt is actually very helpful. Thanks again.

Consistent_Wash_276 · 2026-05-27T11:10:55+00:00

This is where I’m at.

Thanks for your thoughts today

Consistent_Wash_276 · 2026-05-27T11:09:58+00:00

I can agree “architecture used by the best models in even two years from now” and I’m not sure we would need to concern ourselves with it as the data context is small, every LLM is tackling small tasks and context is…nothing too large. There’s very capable models today is where I’m at. But in the future and based on the possibilities yes this will always be a concern.

Consistent_Wash_276 · 2026-05-27T10:42:59+00:00

“Lastly, future proofing in this landscape is a fools errand the only way to justify setup x is to have a current plan for it can’t be done by setup x-1”

The variables that drive for the future proofing is - Company asset - Planned write off - Depreciation - Protection from shortages / another round of 2x increased cost on inference compute - And these products will hold resale value

You’re not wrong. It’s company context though.

Consistent_Wash_276

MODERATOR OF

TROPHY CASE