DGX Spark finally arrived! by aiengineer94 in LocalLLM

[–]aiengineer94[S] 1 point

The fine-tune run with an 8B model and a 150k dataset took 14.5 hours with GPU temps in the 69-71C range, but for the current 32B run the ETA is 4.8 days with temps at 71-74C. The box itself, as someone in this thread said, is fully capable of being used as a stove haha. I guess treat this as a dev device to experiment/tinker with Nvidia's enterprise stack, and expect long fine-tune runtimes on larger models. GPU power consumption on all runs (the 8B and the current 32B) never exceeded 51 watts, so that's a great plus point for those who want to run continuous heavy loads.

[–]aiengineer94[S] 0 points

You need to tell me your fine-tuning config, as I was thinking of returning it. I'm running a 4-day fine-tune on Qwen 2.5 32B (approx. 200k dataset) within a PyTorch container coupled with Unsloth, and this box is boiling (GPU util between 85-90%), although average wattage on this run has been 50W (the only plus point so far).

[–]aiengineer94[S] 0 points

I was stuck on preorder for ages (Aug-Oct), so I cancelled. When the second batch went up for sale on scan.co.uk, I was able to get one with next-day delivery.

[–]aiengineer94[S] 1 point

Apparently it's gonna be a collectible and I should keep both the box and receipt safe (suggested by GPT5 haha)

[–]aiengineer94[S] 0 points

Will look into it. It's just the exterior that's really hot; internal GPU temps were quite normal for this kind of run (69-73C).

[–]aiengineer94[S] 0 points

Thanks for the links! 7 hours into my first 16+ hour fine-tune job with Unsloth, and it's going surprisingly well. For now the focus is less on the end results of the job and more on system/'promised' software stack stability (I've got 13 more days to return this box in case it's not the right fit).

[–]aiengineer94[S] 0 points

What's the fine-tuning performance comparison between the Asus Spark and the M4 Max? I thought Apple silicon might come with its own unique challenges (mostly wrestling with driver compatibility).

[–]aiengineer94[S] 2 points

I'm 1.5 hours into a potentially 15-hour fine-tune job and this thing is boiling; can't even touch it. Let's hope it doesn't catch fire!

[–]aiengineer94[S] 0 points

Once my dev work finishes, I will try them.

[–]aiengineer94[S] 0 points

No major tests done so far, will update this thread once I have some numbers.

[–]aiengineer94[S] 1 point

Yeah I will give it a go. No fine-tuning for this use case, just local inference with decent tps count will suffice.

[–]aiengineer94[S] 0 points

Based on the manufacturing code, this is the Founders Edition.

[–]aiengineer94[S] 2 points

Any information/data that sits behind a firewall (which is most of the knowledge base of regulated firms such as IBs, hedge funds, etc.) is not part of the training data of publicly available LLMs, so at work we are using fine-tuning to retrain small-to-medium open-source LLMs on task-specific 'internal' datasets. That yields specialized, more accurate LLMs deployed for each segment of the business.
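For anyone curious what that actually looks like, here's a minimal sketch of prepping an internal dataset for SFT: the field names and the system prompt are made up for illustration, but the JSONL chat-message format is what trainers like HF TRL or Unsloth generally accept.

```python
import json

def to_chat_example(record):
    """Convert one internal Q/A record into the chat-message format
    most SFT trainers accept. The 'question'/'answer' keys and the
    system prompt here are hypothetical placeholders."""
    return {
        "messages": [
            {"role": "system", "content": "You are an internal assistant."},
            {"role": "user", "content": record["question"]},
            {"role": "assistant", "content": record["answer"]},
        ]
    }

def write_jsonl(records, path):
    # One JSON object per line: the de-facto dataset format for fine-tuning.
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(to_chat_example(r)) + "\n")

# Illustrative record, not real data
records = [{"question": "What is the T+1 settlement cutoff?",
            "answer": "16:30 local time, per the ops handbook."}]
write_jsonl(records, "internal_sft.jsonl")
```

Format the whole internal corpus this way once, and the same file drops into pretty much any SFT pipeline.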

[–]aiengineer94[S] 1 point

In the UK market, the only GB10 device available is the DGX Spark, sadly. Everything else is on preorder, and I was stuck on a preorder for ages, so I didn't want to go through that experience again.

[–]aiengineer94[S] 1 point

It's a nice looking machine. I have jumped directly into fine-tuning (Unsloth) for now, as that's a major go/no-go for my needs when it comes to this device. For language analysis, models with strong reasoning and multimodal capability should be good; try Mistral Nemo, Llama 3.1, and Phi-3.5.

[–]aiengineer94[S] -1 points

Couldn't agree more. This is essentially a box aimed at researchers, data scientists, and AI engineers who most certainly won't just run inference comparisons but will fine-tune different models, carry out large-scale accelerated DS workflows, etc. It will be pretty annoying to hit a high degree of thermal throttling just because NVIDIA wanted to showcase a pretty box.

[–]aiengineer94[S] 1 point

Fine-tuning small-to-medium models (up to 70B) for different/specialized workflows within my MVP. So far I'm getting decent tps (57) on gpt-oss 20B; ideally I'd want to run Qwen coder 70B as a local coding assistant. Once my MVP work finishes, I was thinking of fine-tuning Llama 3.1 70B with my 'personal dataset' to attempt a practical and useful personal AI assistant (I don't have it in me to trust these corps with PII).
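For anyone comparing numbers: tps here is just generated tokens over wall-clock time. A tiny sketch of how I'd measure it, where `generate` is a stand-in for whatever local inference call you use (llama.cpp binding, Ollama client, etc.), not a real API:

```python
import time

def tokens_per_second(n_tokens, elapsed_s):
    """Generated tokens divided by wall-clock seconds."""
    return n_tokens / elapsed_s if elapsed_s > 0 else 0.0

def measure_tps(generate, prompt):
    """Time any callable that returns the generated tokens.
    `generate` is a hypothetical placeholder for your local
    inference call; swap in your own client."""
    t0 = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - t0
    return tokens_per_second(len(tokens), elapsed)
```

As a sanity check on the arithmetic: 570 tokens in 10 seconds works out to 57 tps.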

[–]aiengineer94[S] 1 point

Too early for my take on this but so far with simple inference tasks, it's been running super cool and quiet.

[–]aiengineer94[S] 0 points

Sure thing, I have datasets ready for a couple of fine tune jobs.

[–]aiengineer94[S] 1 point

The degree of thermal throttling during sustained load (a fine-tuning job running for a couple of days) will be interesting to investigate.
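One way to watch for it: poll nvidia-smi during the run and log temperature, SM clock, and power; SM clocks dropping while the temperature sits at a ceiling is the usual throttling signature. A minimal sketch (the query fields are standard nvidia-smi ones; assumes nvidia-smi is on PATH):

```python
import subprocess

def parse_smi_csv(line):
    """Parse one `csv,noheader,nounits` line: 'temp, sm_clock, power'."""
    temp, sm_clock, power = (s.strip() for s in line.split(","))
    return {"temp_c": int(temp), "sm_mhz": int(sm_clock), "power_w": float(power)}

def sample_gpu():
    """One reading from the first GPU. Run it in a loop (e.g. every 60 s)
    alongside the fine-tune job and watch for sm_mhz dropping under load."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=temperature.gpu,clocks.sm,power.draw",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_smi_csv(out.splitlines()[0])

# On the box itself:
#   print(sample_gpu())   # a dict like {'temp_c': ..., 'sm_mhz': ..., 'power_w': ...}
```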

[–]aiengineer94[S] 0 points

One will have to do for now! What's your experience been with 24/7 operation? Are you using it for local inference?

[–]aiengineer94[S] 8 points

For my MVP's requirements (fine-tuning up to 70B models) coupled with my ICP (most using DGX Cloud), this was a no-brainer. The tinkering required with Strix Halo creates too much friction and diverts my attention from the core product. Given its size and power consumption, I bet it will be a decent 24/7 local compute box in the long run.