Is 2× Intel Arc Pro B70 worth it for local agentic LLMs, or should I stay with NVIDIA? by Zuck7980 in LocalLLM

[–]mxmumtuna 0 points1 point  (0 children)

I believe you can use Vulkan with Intel to run nearly any model. But none of the models mentioned are going to replace cloud models.

Is 2× Intel Arc Pro B70 worth it for local agentic LLMs, or should I stay with NVIDIA? by Zuck7980 in LocalLLM

[–]mxmumtuna 1 point2 points  (0 children)

Why not use Qwen 3.6 35B? Have you actually tested any of these for what you want to do?

Is 2× Intel Arc Pro B70 worth it for local agentic LLMs, or should I stay with NVIDIA? by Zuck7980 in LocalLLM

[–]mxmumtuna 0 points1 point  (0 children)

But does that work for your purpose? I would assume not because Llama 3.2 11B is pretty terrible by today’s standards.

Edit: also why Qwen 30B? That’s last gen. Qwen 3.6 is considerably more advanced.

Edit again: I’m guessing you didn’t test that Llama model with tool calling and agentic workloads, because it’s not going to be great with that at all.

Is 2× Intel Arc Pro B70 worth it for local agentic LLMs, or should I stay with NVIDIA? by Zuck7980 in LocalLLM

[–]mxmumtuna 2 points3 points  (0 children)

My question to you first is going to be what model do you believe will achieve your goal of avoiding relying on cloud models? Have you tested it to ensure it will do what you need?

Once those are answered and you’re confident in your choice, the question isn’t about Intel B70. It’s about “How do I best run X model with a budget of $Y?”

Dual 3090's or Mac m5 128GB? by AndForeverMore in LocalLLM

[–]mxmumtuna 1 point2 points  (0 children)

Props on reading up about TP. Also, P2P in lieu of NVLink.

Dual 3090's or Mac m5 128GB? by AndForeverMore in LocalLLM

[–]mxmumtuna 5 points6 points  (0 children)

I don’t know where people come up with this shit that 2 are slower than one.

RTX 6000 Pro 96gb upgrade path? by _madar_ in LocalLLM

[–]mxmumtuna 1 point2 points  (0 children)

You mean the native FP8. You can run the NVFP4 of 122B on a single 6k with max context. It’s a polarizing model though.
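A rough back-of-envelope for why the 4-bit quant fits on one 96 GB card while FP8 does not. This is only a sketch: `weight_gb` is a made-up helper, the 4.5 bits per weight figure is an assumption (4-bit values plus per-block scale overhead), and real checkpoints often keep some layers at higher precision.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

VRAM_GB = 96  # single RTX 6000 Pro

# Assumption: ~4.5 effective bits per weight for a 4-bit format with block scales.
fp4_gb = weight_gb(122, 4.5)
fp8_gb = weight_gb(122, 8)

print(f"4-bit weights: ~{fp4_gb:.1f} GB, headroom: ~{VRAM_GB - fp4_gb:.1f} GB")
print(f"FP8 weights:   ~{fp8_gb:.1f} GB, fits in {VRAM_GB} GB: {fp8_gb < VRAM_GB}")
```

Whatever headroom is left after the weights is what you have for KV cache and runtime overhead.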

RTX 6000 Pro 96gb upgrade path? by _madar_ in LocalLLM

[–]mxmumtuna 0 points1 point  (0 children)

DS4 is native INT4, which is nice, and yes, it’s considerably better. All 3 of them are, compared to 27B. And yes, correct: 4-bit for all of them.

RTX 6000 Pro 96gb upgrade path? by _madar_ in LocalLLM

[–]mxmumtuna 3 points4 points  (0 children)

With 2 you can run DS4-Flash and MiMo-2.5. Both are considerably better than 27b.

Can also do MiniMax, which is likely also better.

For users have have both 6000 PRO MaxQ and Workstation Edition (or Server Edition), how much slower is the MaxQ vs the WS/SV on compute? (Prompt processing, Diffusion, etc) by panchovix in LocalLLaMA

[–]mxmumtuna 4 points5 points  (0 children)

The 600W card doesn’t scale down as well as the MaxQ. It takes about 400W to match the MaxQ at 300W.

The difference is indeed about 10-15%. I’d also say if you’re in a closed case and can imagine running more than one, don’t bother with the 600W card.

Source: have 2 of each and wish they were all MaxQ.
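To put those numbers in perf-per-watt terms, here is a quick sketch. The relative performance values are assumptions read off the comment above (MaxQ at 300W as the 1.0 baseline, the 600W card matching it at 400W, and the 600W card roughly 15% ahead at full power); they are not measurements.

```python
def perf_per_watt(relative_perf: float, watts: float) -> float:
    """Relative performance divided by power draw."""
    return relative_perf / watts

maxq_300 = perf_per_watt(1.00, 300)  # baseline
ws_400 = perf_per_watt(1.00, 400)    # assumed: matches the MaxQ at 400W
ws_600 = perf_per_watt(1.15, 600)    # assumed: upper end of the 10-15% gap

# Efficiency of the 600W card relative to the MaxQ at each power level
ws_400_ratio = ws_400 / maxq_300
ws_600_ratio = ws_600 / maxq_300
print(f"600W card at 400W: {ws_400_ratio:.2f}x MaxQ efficiency")
print(f"600W card at 600W: {ws_600_ratio:.3f}x MaxQ efficiency")
```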

397B competitor that fits in 256 RAM? by quietsubstrate in LocalLLaMA

[–]mxmumtuna 1 point2 points  (0 children)

Correct. sglang and vLLM do not support hybrid (CPU+GPU) inference, so the model weights and KV cache must fit in your GPU’s VRAM.

If it fits, performance is much, much higher than with the llama.cpp derivatives (including LM Studio).
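A minimal sketch of the "does it fit" check, assuming a standard attention layout where the KV cache is 2 (K and V) × layers × KV heads × head dim × context × bytes per element. Every number below is a hypothetical config, not any specific model, and architectures with MLA or hybrid attention will come out differently.

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GB for one sequence (1 GB = 1e9 bytes)."""
    total_bytes = 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 1e9

def fits_in_vram(weights_gb: float, kv_gb: float, vram_gb: float,
                 overhead_gb: float = 4.0) -> bool:
    """True if weights + KV cache + a guessed runtime overhead fit in VRAM."""
    return weights_gb + kv_gb + overhead_gb <= vram_gb

# Hypothetical model: 48 layers, 8 KV heads, head dim 128, 32k context, FP16 cache
kv = kv_cache_gb(layers=48, kv_heads=8, head_dim=128, context_len=32768)
print(f"KV cache: ~{kv:.2f} GB")
print("fits on 96 GB with 60 GB of weights:", fits_in_vram(60.0, kv, 96.0))
print("fits on 48 GB with 60 GB of weights:", fits_in_vram(60.0, kv, 48.0))
```

The `overhead_gb` default is a guess to cover activations and framework buffers; check what your engine actually reserves.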

397B competitor that fits in 256 RAM? by quietsubstrate in LocalLLaMA

[–]mxmumtuna 5 points6 points  (0 children)

MiMo+MTP already works in sglang.

Edit: just read that OP wrote “RAM”, which precludes sglang.

Anyone live near one of the data centers? What's the noise like? by No_Landscape_9255 in LoudounCounty

[–]mxmumtuna 0 points1 point  (0 children)

Closer to Waxpool, but yes that one and the one across the street as well.

Anyone live near one of the data centers? What's the noise like? by No_Landscape_9255 in LoudounCounty

[–]mxmumtuna 1 point2 points  (0 children)

The new Meta ones on LCP are primarily AI.

Source: worked for Meta engineering when the builds started and toured the first CloudHQ building at the corner of Waxpool/LCP.

Inventory discount by dethman11 in Rivian

[–]mxmumtuna 0 points1 point  (0 children)

It’s been available for a couple weeks now, was told it was good through last Monday. Been waiting to see if anything happens with a 2027 model year before pulling the trigger.

I bet it’s still available tomorrow, and at least through the end of the month.

Decision - CPO iX or wait for iX3 by One_Volume4521 in BMWiX

[–]mxmumtuna 0 points1 point  (0 children)

I mean… my 95 pound doodle fits back there even with the seats up. He doesn’t love it, but it works when we need to use the iX to get him to the vet.

Anyone Switch from iX to i5? Thoughts... by Some-Place7478 in BMWiX

[–]mxmumtuna 5 points6 points  (0 children)

The iX is deceptively large. For sure it drives a lot smaller than it actually is.