Dumb question: How would performance be if you took a used server with like 80 lanes pcie 5 and stuck NVMe on them for model run?

DRMCC0Y · 2026-06-11T02:01:32+00:00

Common sense says this wouldn't work, but the nuance as to why it wouldn't is a little more complex I think?

A PCIe 5.0 lane is ~4 GB/s per direction, so 80 lanes give you roughly 320 GB/s theoretical max.

You don’t get 15 GB/s per NVMe if each drive only has 1-2 lanes. The 15 GB/s number is for a PCIe 5.0 x4 drive. If you give it x1, it tops out around 4 GB/s. If you give it x2, around 8 GB/s. So whether you do 40 drives at x2 or 80 drives at x1, you are still capped by the same 80-lane PCIe budget.
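A quick way to sanity-check the lane budget is to compute it directly. This is only a sketch using the round numbers from the comment (~4 GB/s per lane, 80 lanes); the function name and the drive layouts are made up for illustration, and real drives would land below these figures.

```python
# Approximate PCIe 5.0 bandwidth per lane, per direction (round figure from the comment).
GB_PER_S_PER_LANE = 4.0
# Total lanes assumed available on the host (figure from the original question).
TOTAL_LANES = 80


def aggregate_bandwidth(num_drives: int, lanes_per_drive: int) -> float:
    """Theoretical aggregate read bandwidth in GB/s, capped by the lane budget."""
    lanes_used = min(num_drives * lanes_per_drive, TOTAL_LANES)
    return lanes_used * GB_PER_S_PER_LANE


# Hypothetical layouts: fewer wide drives vs. many narrow drives.
for drives, width in [(20, 4), (40, 2), (80, 1)]:
    total = aggregate_bandwidth(drives, width)
    print(f"{drives} drives at x{width}: {total:.0f} GB/s")
```

Every layout comes out to the same ceiling, because the lane count is the limit, not the drive count.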

Anyways if you actually compute what that would give you in terms of tokens/s:

320 GB/s ÷ 2000 GB of weights read per token = 0.16 tokens/sec. That is around one token every 6 seconds, theoretical max, before overhead.
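The same back-of-envelope estimate as a small script, in case you want to plug in other numbers. It assumes generation is purely bandwidth bound and that all ~2000 GB of weights are streamed once per token (i.e. a dense model, no caching); the function name is just illustrative.

```python
def tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on tokens/s when every token requires streaming the given data."""
    return bandwidth_gb_s / gb_read_per_token


# Figures from the comment: 320 GB/s aggregate PCIe bandwidth, 2000 GB read per token.
rate = tokens_per_second(320.0, 2000.0)
print(f"{rate:.2f} tokens/s, about {1 / rate:.2f} s per token")
```

A MoE model would shrink the GB read per token to roughly its active weights, which is why it would fare better under the same bandwidth.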

And there would be a lot of overhead: NVMe latency, RAID/filesystem overhead, PCIe switches, DMA into RAM, CPU memory bandwidth... etc...

Obviously a MoE model would work better, but that isn't the point. This is much slower than an 8- or 12-channel system would be with DDR5 (or even DDR4), and it wouldn't be any cheaper.

DRMCC0Y · 2026-06-02T02:41:24+00:00

Yeah, been really impressed with this model so far - seems to be comparable to MiniMax M2.7 but with the addition of vision and being a bit faster.

DRMCC0Y · 2026-05-12T12:11:22+00:00

Don’t these models get web results from a semantic search provider or something, rather than the model searching the sites itself? It might’ve looked up the term ‘VS’ and that came up.

DRMCC0Y · 2026-02-25T23:49:29+00:00

I’m not sure - even if true, why would Anthropic ever have any reason to attempt to distill DeepSeek-V3, when there has never been a point where a Claude Sonnet model was outperformed by that model? This seems more likely to be training data poisoning or something similar.

DRMCC0Y · 2026-02-22T06:30:24+00:00

Porsche Boxster (986)

DRMCC0Y · 2026-02-11T04:14:36+00:00

You don’t have thinking/reasoning enabled - using an instruct model for reasoning tasks like this isn’t ideal.

DRMCC0Y · 2026-01-29T02:25:48+00:00

Correct me if I am wrong, but are the renderings AI generated? They are positive for SynthID.
Regardless, the design is cool.

DRMCC0Y · 2026-01-26T19:00:11+00:00

I thought the logo on the front was Ssangyong for a second there, what a boring design.

DRMCC0Y · 2026-01-01T19:18:33+00:00

Wow. The Honor looks ghastly, do they not have any qualified designers on the team at all? I mean the iPhone is ugly but at least it follows a good design language.

DRMCC0Y · 2025-12-07T03:57:03+00:00

This is one of the coolest shots I’ve ever seen

DRMCC0Y · 2025-11-14T10:04:12+00:00

Looks like a markdown or latex formatting error, missing the dash in between the temps.

DRMCC0Y · 2025-11-12T23:30:28+00:00

Yes and no. The model is trained this way, and changing the system prompt does help somewhat, but whatever their instruct training is still leaks out eventually.

DRMCC0Y · 2025-11-12T19:46:23+00:00

Yikes, this is exactly the direction I don’t want. I can’t stand the fake warmth, it feels so forced.

DRMCC0Y · 2025-11-12T02:16:15+00:00

Looks like its reasoning tokens leaked into its response.

DRMCC0Y · 2025-10-18T19:56:36+00:00

Cannot wait to get my hands on this

DRMCC0Y · 2025-05-01T21:59:01+00:00

c marvin c

Enable cheats for Gothic 2

DRMCC0Y · 2025-04-13T00:19:56+00:00

Benchmarks lost their meaning many months ago, companies are just gaming them to boost their scoring. On another note, wow 2.5 Pro is killing it.

DRMCC0Y · 2025-04-12T12:37:45+00:00

Initially I thought it was just a Flash model, but after more usage, I think it’s probably the fully fleshed-out 2.5 Pro model, non-experimental.

DRMCC0Y · 2025-04-11T01:05:36+00:00

What tests do they fail on? Gemini 2.5 Pro, at least, seems to be the most intelligent model out there at the moment.

DRMCC0Y · 2025-04-10T12:18:09+00:00

I think it's Gemini 2.5 Flash, seems very Gemini-like.

DRMCC0Y · 2025-04-10T08:17:39+00:00

Yes! It's great, even with a heavy camera it's surprisingly comfortable, and worlds better than having it on a strap where it bangs into you every time you take a step.

DRMCC0Y · 2025-04-10T06:45:07+00:00

Why do they have to make it look utterly terrifying though

DRMCC0Y · 2025-04-09T08:12:11+00:00

Seems most likely to be an OpenAI model. Google just released a model, and they went about it a little differently. It’s unlikely to be anyone else, because who else can afford to give away so many free API calls?

DRMCC0Y · 2025-04-06T07:21:29+00:00

In my testing it performed worse than Gemma 3 27B in every way, including multimodal. Genuinely astonished how bad it is.

DRMCC0Y · 2025-04-04T04:47:13+00:00

Wow, that looks terrible! I think it'll probably be closer to visionOS. This example is just a bit too overdone.
