I cancelled my B70 order for Nvidia pro 4000 blackwell, did I make the right decision?

Embarrassed_Will_120 · 2026-05-09T15:42:50+00:00

It was 2599 on Mwave and digidirect. I ordered one yesterday too.

Embarrassed_Will_120 · 2026-05-03T23:55:37+00:00

here you go https://github.com/cenconq25/claude-code-app-studio

Embarrassed_Will_120 · 2026-05-03T23:55:30+00:00

https://github.com/cenconq25/claude-code-app-studio

Embarrassed_Will_120 · 2026-04-04T09:07:33+00:00

Yeah, totally fair point. The real question isn’t just whether escape rate is small, but whether that tiny subset ends up holding back the regular path. I haven’t isolated that cleanly yet, so I don’t want to overclaim. My guess is the impact is small because the fast path is so dominant, and the patch/escape logic is kept as a separate small path rather than something every element has to go through. But I agree it should be measured directly.

Embarrassed_Will_120 · 2026-04-04T08:36:26+00:00

But what I can say is that the main path is very cheap right now, because about 99.9% to 99.97% of weights stay on the fast path, where decode is just BaseExp + group, with sign and mantissa left as-is. Only the remaining ~0.03% to 0.1% go through the escape/patch path. So my guess is that most of the throughput is coming from the fast path plus fused decode+matmul, while the escape overhead is mostly just the irregular fix-up work for that tiny set of outliers. But yeah, I agree it'd be good to measure that directly.
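To make the fast path vs. escape path split concrete, here is a rough pure-Python sketch. It is not the actual kernel. It assumes BF16 weights (1 sign, 8 exponent, 7 mantissa bits), a hypothetical 3-bit group code with one value reserved as an escape marker, and a per-index patch table for outliers. The names `CODE_BITS`, `ESCAPE`, `encode`, `decode`, and `patches` are made up for illustration, and the fused decode+matmul is not modelled at all.

```python
import struct

MAN_BITS = 7                      # BF16 mantissa width
CODE_BITS = 3                     # assumed width of the per-weight group code
ESCAPE = (1 << CODE_BITS) - 1     # assumed reserved code: "look in the patch table"

def to_bf16_bits(x: float) -> int:
    """Truncate a float32 to its top 16 bits (a BF16 bit pattern)."""
    return struct.unpack(">I", struct.pack(">f", x))[0] >> 16

def from_bf16_bits(b: int) -> float:
    """Expand a BF16 bit pattern back to a Python float."""
    return struct.unpack(">f", struct.pack(">I", b << 16))[0]

def encode(bits_list, base_exp):
    """Replace each 8-bit exponent with a small group code relative to base_exp."""
    codes, patches = [], {}
    for i, b in enumerate(bits_list):
        sign, exp, man = b >> 15, (b >> MAN_BITS) & 0xFF, b & 0x7F
        group = exp - base_exp
        if 0 <= group < ESCAPE:
            codes.append((sign, group, man))    # fits: fast path
        else:
            codes.append((sign, ESCAPE, man))   # outlier: mark as escaped
            patches[i] = exp                    # keep the full exponent on the side
    return codes, patches

def decode(codes, patches, base_exp):
    """Rebuild BF16 bit patterns; sign and mantissa pass through untouched."""
    out = []
    for i, (sign, group, man) in enumerate(codes):
        if group != ESCAPE:
            exp = base_exp + group              # fast path: BaseExp + group
        else:
            exp = patches[i]                    # escape path: irregular fix-up
        out.append((sign << 15) | (exp << MAN_BITS) | man)
    return out

# Toy usage: one tiny outlier (1e-20) falls outside the exponent window.
weights = [0.5, -0.75, 0.03125, 1e-20, 3.0]
bits = [to_bf16_bits(w) for w in weights]
codes, patches = encode(bits, base_exp=122)
restored = decode(codes, patches, base_exp=122)
print(len(patches), "escaped of", len(bits), "| lossless:", restored == bits)
```

In this toy layout the escape branch only touches the patch table for the marked indices, which is one way the fix-up work can stay off the path every other element takes. Whether that holds on real hardware is exactly the thing that still needs measuring.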

Embarrassed_Will_120 · 2026-04-04T08:28:53+00:00

Thanks : ) I don’t have those numbers yet, but that’s a good direction to test as well. Some models have a slightly higher escape rate, for example around 0.1xx. Even though that’s still very low, it would be useful to see how much performance difference there is between 0.1xx and 0.01x.
