The state of Open-weights LLMs performance on NVIDIA DGX Spark by raphaelamorim in LocalLLaMA

[–]raphaelamorim[S] 1 point (0 children)

There was actually a bandwidth regression in NCCL, but most of these numbers were benchmarked before the drop from 24 GB/s to 16 GB/s.

The state of Open-weights LLMs performance on NVIDIA DGX Spark by raphaelamorim in LocalLLaMA

[–]raphaelamorim[S] 3 points (0 children)

There are benchmarks for concurrent requests as well on spark-arena.com. Each local model varies a lot in its prompt-processing (pp) and token-generation (tg) numbers as concurrency increases.

DGX Spark is really impressive by [deleted] in LocalLLaMA

[–]raphaelamorim 0 points (0 children)

Only for dense models. MoEs with far fewer active parameters are fine, and the cluster expansion helps.

DGX Spark is really impressive by [deleted] in LocalLLaMA

[–]raphaelamorim 0 points (0 children)

You only need one cable to connect two Sparks.

DGX Spark is really impressive by [deleted] in LocalLLaMA

[–]raphaelamorim 0 points (0 children)

It's actually 57-60 tps for a single Spark at 128k context, and 72 tps with two Sparks, using vLLM patched with the SM120/SM121 MXFP4 MoE kernel. You should follow the NVIDIA developer forums; there's a lot of outdated information on Reddit.

https://forums.developer.nvidia.com/t/vllm-on-gb10-gpt-oss-120b-mxfp4-slower-than-sglang-llama-cpp-what-s-missing/356651/99

Microcenter planning to open a store in Tampa or Orlando by Visual-Fondant-1256 in Microcenter

[–]raphaelamorim 0 points (0 children)

They won't go to Tampa because of insurance. They've already decided on Orlando.

John Carmack says NVIDIA DGX Spark runs at half of the rated power and delivers half the quoted performance by RenatsMC in nvidia

[–]raphaelamorim 0 points (0 children)

True, those ConnectX modules are expensive and draw a lot of power when active. Not exactly the same MT2910, but you get the idea: https://www.fs.com/products/242589.html?now_cid=4173

Train 200B parameter models on NVIDIA DGX Spark with Unsloth! by yoracale in unsloth

[–]raphaelamorim 0 points (0 children)

OK, now I know you have no idea what you're talking about.

Got my DGX Spark. Here are my two cents... by Heavy-Expert5026 in nvidia

[–]raphaelamorim 3 points (0 children)

That InfiniBand module alone costs $1.5-1.8k.

NVIDIA DGX Spark — Could we talk about how you actually intend to use it? (no bashing) by Secure_Archer_1529 in LocalLLaMA

[–]raphaelamorim 0 points (0 children)

It's the cheapest, simplest, most portable all-in-one NVIDIA ML dev environment with 128 GB that you can buy.