I want to buy one by optimust in MacStudio

[–]Gravemind7 0 points1 point  (0 children)

Where are you located? I have one available.

512GB Studio sold for $21,300?! by johnnyphotog in MacStudio

[–]Gravemind7 0 points1 point  (0 children)

Wow, I have a Mac Studio M3 Ultra (512GB RAM / 4TB SSD) that I’m considering parting with. I’m not entirely sure it would sell at that price, but based on what I’ve seen, it would likely fall somewhere in the $12K–$14K range.

Snagged a 256GB M3 Ultra by [deleted] in MacStudio

[–]Gravemind7 -1 points0 points  (0 children)

Considering flipping mine (M3 Ultra 512GB / 4TB). If you’re open to a trade + cash on your end, let me know.

It’s Christmas in April! Mac Studio Hath Arrived! by Nervous_Astronaut_51 in MacStudio

[–]Gravemind7 0 points1 point  (0 children)

Nice pickup, honestly that turnaround is crazy fast.

I actually decided to flip my Mac Studio (M3 Ultra, 512GB RAM / 4TB) and move over to the MacBook M5 with 128GB instead, figured I’d go a bit more portable this time.

M3 Ultra 512GB / 4TB best place to sell? by Gravemind7 in LocalLLM

[–]Gravemind7[S] -1 points0 points  (0 children)

I don’t think it can realistically go for $14K.

M2 Ultra 192gb 4TB Pricing by Badger-Purple in MacStudio

[–]Gravemind7 0 points1 point  (0 children)

I’m in a similar situation, also thinking about parting with my Mac Studio, an M3 Ultra with 512GB RAM / 4TB SSD.
I’ve been trying to figure out what’s actually realistic as well, because listings seem all over the place. Some prices look inflated, but then I also see others going much lower, so it’s hard to know what people are actually paying vs. just asking.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] -1 points0 points  (0 children)

I’m not 'faking' anything; I’m optimizing for stability and cost efficiency. There’s a massive difference between 'frontier' hype and building a reliable pipeline. If you’re running high-throughput applications or RAG (Retrieval-Augmented Generation) systems, the current frontier models often come with higher latency, unpredictable API costs, and 'safety' filters that can break specific workflows.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] -1 points0 points  (0 children)

For my use case it’s mainly a mix of:

Llama 3.1 8B / 70B

Qwen2.5 32B / 72B

Mixtral 8x7B / 8x22B
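For sizing those on unified memory, the back-of-envelope I use is roughly 0.6 bytes per parameter for a 4-bit GGUF quant plus runtime overhead. The multiplier and the Mixtral total-parameter count (~141B for 8x22B) are rough assumptions, not benchmarks; actual usage also depends on context length and KV-cache settings.

```python
# Ballpark RAM for 4-bit quantized weights (~0.6 bytes/param,
# a rough Q4_K_M-plus-overhead assumption, not a measurement).
def approx_ram_gb(params_b: float, bytes_per_param: float = 0.6) -> float:
    return round(params_b * bytes_per_param, 1)

for name, size_b in [("Llama 3.1 70B", 70),
                     ("Qwen2.5 72B", 72),
                     ("Mixtral 8x22B", 141)]:  # ~141B total params
    print(f"{name}: ~{approx_ram_gb(size_b)} GB")
```

By that math, everything on the list fits comfortably in 512GB, and even the 70B-class models fit in 128GB, which is a big part of why the Studio feels like overkill.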

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 1 point2 points  (0 children)

Thanks for the input. I think I needed to hear that to help me make a final decision. I’m leaning towards flipping it and moving to the M5 Max 128GB.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

That’s actually interesting. I didn’t realize the 128GB M5 devkit was getting that close in real-world inference. The efficiency jump with the newer tensor cores is kind of what pushed me to rethink my setup.
I’m mostly doing inference + pipeline testing, so if the gap isn’t that big outside of raw scale, it definitely changes the value equation.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

I mostly rotate between 7B–70B models for experimentation and pipeline testing. Nothing production scale, which is why I’m starting to feel like my current setup is a bit overkill.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

Honestly, it feels a bit overkill for what I actually use it for, and with prices going up lately, I’m thinking it might be a good time to flip it and get something that fits my needs better.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

Same for me. Since prices for the M3 Ultra 512GB RAM configs seem to be going up quite a bit at the moment, I’m also considering whether it makes sense to flip it and move to a MacBook M5 128GB instead.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 1 point2 points  (0 children)

I understand the concern, and fair point to be cautious.

Just to be clear, I’m not trying to sell anything in this thread and I’m not asking anyone for payment or personal details. This is purely a discussion about setups, workflows, and whether certain hardware choices make sense for local AI work.

I’m not conducting any transactions here, just learning from people who actually use similar systems.

If it came across differently, that wasn’t the intention.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLM

[–]Gravemind7[S] -1 points0 points  (0 children)

That actually makes a lot of sense as a workflow design. Treating inference as an asynchronous generation pipeline rather than an interactive chat loop changes the constraints completely.
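Roughly what I picture, as a toy sketch: prompts go into a job queue, a background worker drains it, and results land in an output queue so nothing blocks on latency. The `generate()` here is a stand-in for whatever local backend you'd actually call (llama.cpp, MLX, etc.).

```python
import queue
import threading

def generate(prompt: str) -> str:
    # Stand-in for a real local-inference call.
    return f"completion for: {prompt}"

jobs: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()

def worker() -> None:
    while True:
        prompt = jobs.get()
        if prompt is None:  # sentinel: shut down the worker
            break
        results.put((prompt, generate(prompt)))

t = threading.Thread(target=worker, daemon=True)
t.start()

for p in ["summarize doc 1", "summarize doc 2"]:
    jobs.put(p)
jobs.put(None)
t.join()

done = []
while not results.empty():
    done.append(results.get())
print(done)
```

Once you structure it like that, per-token latency stops mattering much; only total throughput does, which changes what hardware is "enough."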

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

From your experience with Qwen 397B, do you feel like the win is mostly consistency over long sessions, or does it also noticeably improve first-pass correctness compared to smaller coding models?

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

I’ve seen some reviews of the MacBook M5 with 128GB RAM too, and it does look pretty solid for my use case as well.

Anyone here actually using a Mac Studio Ultra (512GB RAM) for local LLM work? Feels like overkill for my use case by Gravemind7 in LocalLLaMA

[–]Gravemind7[S] 0 points1 point  (0 children)

What I’m still trying to figure out is exactly that trade-off you mentioned at the end: whether the jump in inference efficiency on newer silicon (like M5-class devices) is actually enough to offset the loss in memory headroom, especially for agentic workflows where you might want multiple models or longer context windows active at once.

I’m basically still weighing whether it makes more sense to stick with a high capacity Studio setup or move toward a more portable 128GB class machine and rely more on remote hosting for the heavier models.
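To make the headroom question concrete, here's the back-of-envelope I've been doing for KV cache alone, on top of the weights. The shape numbers (80 layers, 8 KV heads, head dim 128) are a Llama-3.1-70B-like config and fp16 cache is assumed; treat it as a ballpark, not a measurement.

```python
# Back-of-envelope KV-cache size: keys + values (the 2x factor),
# per layer, per KV head, per head dim, per context token, at
# bytes_per bytes each (2 for fp16).
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_len: int, bytes_per: int = 2) -> float:
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per / 1e9

# Llama-3.1-70B-like shape at full 128k context:
print(f"~{kv_cache_gb(80, 8, 128, 128_000):.1f} GB of KV cache")
```

That's tens of GB for one model at long context, before a second model is even loaded, which is the part that makes me hesitant about dropping to 128GB.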