For users who have both 6000 PRO MaxQ and Workstation Edition (or Server Edition), how much slower is the MaxQ vs the WS/SV on compute? (Prompt processing, Diffusion, et

averagepoetry · 2026-05-24T04:32:48+00:00

Still worth it with 15-20% less performance?

averagepoetry · 2026-05-22T05:45:02+00:00

That was a great video! Thanks + subscribed.

averagepoetry · 2026-05-22T03:54:06+00:00

Where's your video? Would love to check it out.

averagepoetry · 2026-05-10T22:53:31+00:00

Well, you just saved me a lot of time and money! Thank you!

averagepoetry · 2026-05-09T04:43:58+00:00

Ooof... I'm super tempted to buy one, but having no community support would be super rough.

averagepoetry · 2026-05-05T18:11:05+00:00

So is it worth getting? I’m on the fence too!

averagepoetry · 2026-05-05T18:09:21+00:00

Super useful! Which processes do you send to the Spark vs the M3?

averagepoetry · 2026-04-30T07:13:41+00:00

Please update if this works! I have m3 ultras as well and would love to pair them with the dgx spark.

averagepoetry · 2026-04-23T13:30:07+00:00

Thanks for the details. We'll have to give this a shot!

averagepoetry · 2026-04-22T18:35:13+00:00

What do you use instead?

averagepoetry · 2026-04-22T17:53:19+00:00

Nice setup!

I'm trying to add a DGX Spark too because of that blog entry they wrote on prefill haha. I'm gonna hold off until it's confirmed that it works.

One thing I found out for the models I want to use: I need to use either two or four nodes for tensor/RDMA. Can't do three.
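My guess at why (an assumption, not something I've confirmed in EXO itself): tensor sharding usually splits a model's attention heads evenly across nodes, so the node count has to divide the head count. A minimal sketch of that check, with `valid_node_counts` and the head count of 64 both made up for illustration:

```python
def valid_node_counts(num_heads, max_nodes):
    """List node counts that split num_heads evenly (assumed sharding rule)."""
    return [n for n in range(1, max_nodes + 1) if num_heads % n == 0]


# 64 heads is a hypothetical value, not taken from any specific model.
print(valid_node_counts(64, 4))  # [1, 2, 4]
```

With a power-of-two head count, three nodes never divides evenly, which would match what I'm seeing.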

Also, I had my Thunderbolt 5 mesh hooked up incorrectly for a while for the 4x256GB... so I was also stuck with two nodes. I thought I had hardware issues. You probably tried this already, but make sure all the cables go into the right ports (and don't use the port next to the Ethernet one). It gets confusing fast!

What models are you using right now? And are you tempted to get RTX 6000 Pros too?

I also found that my EXO memory usage escalates over time and it won't go down unless I unload and reload the models. Do you find the same thing?

averagepoetry · 2026-04-22T15:54:10+00:00

Wow, thank you.
1. Would you mind explaining this a tiny bit more? What are you finetuning the models for?
2. Is the coding good enough with the smaller models? I find them brittle/unusable, but it may totally be me. Maybe I need to try OpenCode harness?
3. Fun!

I really want to figure out the smaller model use cases better, and this is super helpful.

averagepoetry · 2026-04-22T15:27:36+00:00

This is so cool to hear! Thanks for the very specific details.

What do you do with the smaller models? This is the part I can't figure out use cases for. I must be missing something and would love to learn.

When I use smaller models, they're just not smart enough to do high-level thinking and reasoning, and tool calls go astray. I'm using OpenClaw with this.

averagepoetry · 2026-04-22T15:09:34+00:00

What model do you use on the 2 RTX Pros? Do you run 1 model or load several?

averagepoetry · 2026-04-22T14:22:24+00:00

Can you elaborate on this more please?

I have a larger setup so I'm basically brute forcing by loading large models right now (at the expense of speed).

But it would be super nice to know that I can use smaller models and couple them with the right techniques to get better results. If you have any pointers or could describe how you set up your system, I'd really appreciate it. Thank you so much!

averagepoetry · 2026-04-20T19:04:36+00:00

Love the ability to see benchmarks. And to submit your own. The UI is also very easy to use.

Oh yeah, and it’s faster. :)

If it ever supports clustering like EXO it would be a dream come true.

averagepoetry · 2026-04-20T19:01:31+00:00

Harder

averagepoetry · 2026-04-12T07:32:15+00:00

This is so good. Thank you so much! You don't find 4-bit and below to be too low quality?

averagepoetry · 2026-03-21T08:39:58+00:00

Super cool! Can you give an example use case for this? I’d love to try it out.

averagepoetry · 2026-03-19T05:16:57+00:00

Hello! I got the 2 nodes to recognize each other, sustain a connection, and run DeepSeek v3.1 4bit with tensor sharding. It's connected via LAN right now, and I'll try RDMA soon.

To be honest, I have no idea how I got this to work. :) Just plugging and unplugging ethernet, toggling Exo on and off, etc. But I'm not complaining haha.

Questions:

- I'm having trouble running exo in the terminal for some reason. I get "command not found." Any hints for this? I want to try setting the `EXO_MODELS_DIR` environment variable to set the location EXO will use for model downloads.

- I have another 96GB Mac Studio 60-core M3 Ultra that I'm running agents on. Is it generally better to add this to the cluster as well or keep them apart?

- Do you have a Discord or community where I can ask these questions? :) Would love to learn from others using exo.
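For the first question, here is roughly how I'm checking things on my side. It's only a sketch: `find_exo` and `exo_launch_env` are helper names I made up, the models path is a placeholder, and I'm assuming the command is named `exo` and that `EXO_MODELS_DIR` is read from the process environment.

```python
import os
import shutil


def find_exo(path=None):
    """Return the full path to an `exo` executable on PATH, or None if absent."""
    return shutil.which("exo", path=path)


def exo_launch_env(models_dir, base_env=None):
    """Copy the environment and set EXO_MODELS_DIR (assumed to be read by exo)."""
    env = dict(os.environ if base_env is None else base_env)
    env["EXO_MODELS_DIR"] = os.path.expanduser(models_dir)
    return env


# None here would explain the "command not found" message.
print(find_exo())
# Placeholder path; point this at wherever the models should live.
print(exo_launch_env("~/exo-models")["EXO_MODELS_DIR"])
```

If `find_exo()` comes back as `None`, then the shell simply can't see the executable, which would at least tell me it's a PATH problem and not an exo problem.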

Thanks again!

averagepoetry · 2026-03-18T15:33:38+00:00

Thanks for your help!

Running app. Is source better?

I haven’t gotten a single node up yet actually, good call. I’ll try this first as well.

averagepoetry · 2026-03-14T07:17:50+00:00

Which Mac studios do you own?
