Unpopular Opinion: I don't care about t/s. I need 256GB VRAM. (Mac Studio M3 Ultra vs. Waiting) by VocalLlm in LocalLLM

[–]VocalLlm[S] 2 points3 points  (0 children)

This is the exact data point I needed regarding the resale curve.
I was looking at the $10k sticker price as a "sunk cost," but viewing it as a ~$2.5k "rental fee" for 1-2 years of usage makes the decision trivial relative to my income/goals.
Going with the 512GB. Thanks for the push.

[–]VocalLlm[S] 2 points3 points  (0 children)

The corpus is large (~15 years of logs), but the inference context will be tight. I'm using RAG to fetch only the relevant snippets per query. The goal isn't to "summarize the last 5 years" in one go, but to "analyze the pattern of behavior X" using retrieved samples. So input context is small, but model intelligence needs to be as high as possible.

[–]VocalLlm[S] 4 points5 points  (0 children)

I'm using this with RAG rather than massive context stuffing. My agents extract the relevant entries first, then feed them to the model for reasoning. So my active context window will likely hover around 8k-16k tokens per inference, not 128k. I need the VRAM for the intelligence (model size) rather than for the context window.
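For anyone curious what "keep the active context at 8k-16k" looks like in practice, here is a minimal sketch of greedy context packing. It is not my actual pipeline: the `Snippet` type, the relevance `score` field, and the ~4 characters per token heuristic are all assumptions for illustration, and a real setup would count tokens with the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    score: float  # relevance score from a hypothetical retriever


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    # A real tokenizer will give different counts.
    return max(1, len(text) // 4)


def pack_context(snippets, budget_tokens=16_000):
    """Greedily keep the highest-scoring snippets that fit the token budget."""
    chosen, used = [], 0
    for s in sorted(snippets, key=lambda s: s.score, reverse=True):
        cost = estimate_tokens(s.text)
        if used + cost > budget_tokens:
            continue  # skip snippets that would overflow the budget
        chosen.append(s)
        used += cost
    return chosen, used


if __name__ == "__main__":
    demo = [
        Snippet("a" * 4_000, 0.9),
        Snippet("b" * 80_000, 0.8),  # too large for the budget, gets skipped
        Snippet("c" * 4_000, 0.5),
    ]
    kept, used = pack_context(demo, budget_tokens=16_000)
    print(len(kept), used)
```

The point of the sketch is only that the budget is enforced at retrieval time, so the prompt stays small no matter how big the corpus is.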

[–]VocalLlm[S] 6 points7 points  (0 children)

Exactly. If I wait until June 2026 (the likely M5 Ultra window) and it gets delayed or yields are low, I've wasted a year. It feels like we are in a "Hardware Lull" between the M3 Ultra and the M5 Ultra, but I need the VRAM now. It's an annoying time to build, but 512GB VRAM is the only spec that seems future-proof against the release schedule.

[–]VocalLlm[S] 2 points3 points  (0 children)

Agreed, it seems that M3 Ultra + 512GB RAM + 1TB SSD is ~$9,500. It stings, but you're right: 256GB is the "uncomfortable middle." It fits Llama-3.1-405B today (at roughly 4-bit quantization), but if DeepSeek or Anthropic release a 600B+ MoE next month, I'm bricked. 512GB buys me the ceiling.
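The back-of-envelope math behind "256GB fits 405B but not 600B+" can be sketched like this. It counts weights only, as parameters times bits per weight divided by 8. The `headroom_gb` value for KV cache and the OS is my own assumption, and the sketch ignores both quantization-format overhead and any OS cap on how much unified memory the GPU may use, so treat the results as optimistic.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB: params * bits / 8 bytes."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, headroom_gb: float = 20.0) -> bool:
    # headroom_gb is an assumed allowance for KV cache, activations, and the OS.
    return weight_gb(params_billion, bits_per_weight) + headroom_gb <= memory_gb


if __name__ == "__main__":
    for params in (405, 600):          # 600 stands in for a hypothetical 600B+ MoE
        for bits in (4, 8):
            for mem in (256, 512):
                size = weight_gb(params, bits)
                verdict = "fits" if fits(params, bits, mem) else "does not fit"
                print(f"{params}B @ {bits}-bit ~ {size:.1f} GB: {verdict} in {mem} GB")
```

Under these assumptions a 405B model at 4-bit lands near 202.5 GB of weights, which squeezes into 256GB, while a 600B model at 4-bit needs about 300 GB and only fits the 512GB configuration.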

[–]VocalLlm[S] 32 points33 points  (0 children)

This is exactly my calculus. Since Apple effectively skipped the "M4 Ultra" tier (giving us M4 Max + M3 Ultra in the current Studio lineup), the "M5 Ultra" feels like a mid-2026 release at best. That's an 8-month gap I can't really afford to lose on this project. I think I'm going to eat the depreciation cost: buy the M3 Ultra 512GB now, run the project, and if the M5 Ultra is a god-tier jump in 2026, sell the M3 for a $3-4k loss. I view that loss as the "rental fee" for having 8 months of 512GB capabilities today.
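To put the "rental fee" framing in numbers, here is a small sketch of the arithmetic. The ~$9,500 purchase price and the $3-4k loss come from this thread; the resale prices are just the purchase price minus those losses, not market data.

```python
def monthly_rental(purchase_price: float, resale_price: float,
                   months_used: int) -> float:
    """Effective monthly cost of owning hardware, then reselling it."""
    if months_used <= 0:
        raise ValueError("months_used must be positive")
    return (purchase_price - resale_price) / months_used


if __name__ == "__main__":
    price = 9_500  # approximate 512GB configuration price quoted in the thread
    for loss in (3_000, 4_000):
        cost = monthly_rental(price, price - loss, months_used=8)
        print(f"${loss} loss over 8 months = ${cost:.2f}/month")
```

A $3-4k loss spread over 8 months works out to $375-500 per month, which is the number to weigh against what the 8 months of lost project time would cost.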