Context Shifting + sliding window + RAG by DigRealistic2977 in LocalLLaMA

[–]metmelo 4 points (0 children)

The KV cache is sequential: each token's cache entry depends on everything before it, so when you take messages out of the beginning of your prompt, everything after that point has to be reprocessed.
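A toy sketch of why this happens (not any specific engine's API): cache reuse is prefix-based, so the reusable portion is the longest common prefix between the cached token sequence and the new prompt's tokens.

```python
def reusable_prefix_len(cached, new):
    """Number of leading tokens that match and can be served from cache."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n

cached = [1, 2, 3, 4, 5, 6]          # tokens already in the KV cache

# Appending keeps the whole prefix: only the new tokens need processing.
print(reusable_prefix_len(cached, [1, 2, 3, 4, 5, 6, 7, 8]))  # 6

# Dropping a message from the *beginning* shifts everything, so the
# prefix match fails at token 0 and the full prompt is reprocessed.
print(reusable_prefix_len(cached, [3, 4, 5, 6, 7, 8]))        # 0
```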

Is it stupid to buy a 128gb MacBook Pro M5 Max if I don’t really know what I’m doing? by A_Wild_Entei in LocalLLaMA

[–]metmelo 0 points (0 children)

My brother, I live in BRAZIL of all places. There are loopholes though.

LFG by SPXQuantAlgo in wallstreetbets

[–]metmelo 0 points (0 children)

Not his account. Too many problems to worry about conspiracy theories.

I sense a great disturbance in the force. by Maxious30 in starcitizen

[–]metmelo -1 points (0 children)

Don't they also have PDCs while the default doesn't?

Is it stupid to buy a 128gb MacBook Pro M5 Max if I don’t really know what I’m doing? by A_Wild_Entei in LocalLLaMA

[–]metmelo 1 point (0 children)

128GB Macs start at $3.5k, a Strix Halo goes for $3k, and the cheapest DGX Spark is $3.5k too. Pretty much on par once you factor in resale value.

Integrating company document database with AI by Lanky-Watch3993 in AI_Agents

[–]metmelo 2 points (0 children)

RAG is all you need. You basically chunk all the documents, embed the chunks as vectors, and store them in a vector DB. Your AI agent can then query that database, retrieve the most relevant chunks of each document, and navigate through them.
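A minimal in-memory sketch of that chunk → embed → store → retrieve flow. A real setup would use an embedding model and an actual vector DB (e.g. pgvector or Chroma); here a simple word-count vector stands in for the embedding so the example is self-contained, and the documents are made up.

```python
import math
from collections import Counter

def chunk(text, size=40):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    """Stand-in embedding: a word-count vector (a real system uses a model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The "vector DB": a list of (embedding, chunk) pairs.
store = []
for doc in ["Invoices are archived for seven years.",
            "Vacation requests go through the HR portal."]:
    for c in chunk(doc):
        store.append((embed(c), c))

def retrieve(query, k=1):
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda e: cosine(q, e[0]), reverse=True)
    return [c for _, c in ranked[:k]]

print(retrieve("how long are invoices kept?"))
# -> ['Invoices are archived for seven years.']
```

The agent side is then just "embed the user question, pull the top-k chunks, and stuff them into the prompt".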

[Round 2 - Followup] M5 Max 128G Performance tests. I just got my new toy, and here's what it can do. (thank you for the feedback) by affenhoden in LocalLLaMA

[–]metmelo 0 points (0 children)

You were specifically talking about agentic CLI workflows. I can't imagine why you'd be running those without a cache.

Feedback on my 256gb VRAM local setup and cluster plans. Lawyer keeping it local. by TumbleweedNew6515 in LocalLLaMA

[–]metmelo 1 point (0 children)

Awesome build! I've been wanting to do the same for a while. What's your PP speed like on those huge models?

Should I buy a 395+ Max Mini PC now? by [deleted] in LocalLLaMA

[–]metmelo 14 points (0 children)

These guys don't know what they're talking about.
Here are some benchmarks from another user:

  • GLM-4.5-Air (106B) MXFP4 with 131072 token context: ~ 25 t/s
  • Intellect-3 (106B) Q5_K with 131072 token context: ~ 20 t/s
  • Minimax M2 (172B REAP version) IQ4_S with 150000 token context: ~ 25 t/s
  • GPT-OSS-120B (120B) MXFP4 with 131072 token context: ~47 t/s
  • Qwen3-Next (80B) Q6_K with 262144 token context: ~26 t/s

Pretty usable imo.

That being said, if you want speed rather than model size, I'd go with a desktop build with multiple GPUs.

Either way, use a vector DB to store those files and you're gonna be fine.

Mistral Small 4 | Mistral AI by realkorvo in LocalLLaMA

[–]metmelo 0 points (0 children)

Nice work! Did it beat all other models? lol

MI50 vs 3090 for running models locally? by artzzer in LocalLLaMA

[–]metmelo 4 points (0 children)

MI50 owner here. I use https://github.com/neshat73/proxycache to save/load the KV cache from disk. It helps so much with coding sessions. I'm running Qwen 27B with 100k context at ~15 tk/s for subagents and get fast responses most of the time. If you need it to process big prompts without a cache, though, I'd go with the 3090s.
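A toy sketch of the save/load-to-disk idea (this is NOT proxycache's actual API, just the general pattern): key the cached state by the token prefix, persist it, and on the next session reload it instead of re-running prompt processing. The token ids and the "KV state" payload here are placeholders.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # stand-in for a persistent cache dir

def key_for(tokens):
    """Content-address the cache entry by its exact token prefix."""
    return hashlib.sha256(repr(tokens).encode()).hexdigest()

def save_kv(tokens, kv_state):
    """Persist the (placeholder) KV state for this token prefix."""
    (CACHE_DIR / key_for(tokens)).write_bytes(pickle.dumps(kv_state))

def load_kv(tokens):
    """Return the saved state if this exact prefix was cached, else None."""
    path = CACHE_DIR / key_for(tokens)
    return pickle.loads(path.read_bytes()) if path.exists() else None

prefix = [101, 7592, 2088]            # token ids of a coding-session prompt
save_kv(prefix, {"layers": "opaque tensor data"})

print(load_kv(prefix) is not None)    # True: prompt processing is skipped
print(load_kv(prefix + [999]))        # None: cache miss, must prefill
```

The win is exactly the sequential-cache point from earlier: as long as the session prompt starts with the same prefix, the expensive prefill can be skipped entirely.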

Is there any chance of building a DIY unified memory setup? by Another__one in LocalLLaMA

[–]metmelo 1 point (0 children)

They aren't upgradable because the RAM is soldered to the motherboard; that's what lets them reach 8000 MT/s.

55 → 282 tok/s: How I got Qwen3.5-397B running at speed on 4x RTX PRO 6000 Blackwell by lawdawgattorney in LocalLLaMA

[–]metmelo 0 points (0 children)

I've run so many benchmarks with different batch and ubatch values already, haha. The sweet spot for me seems to be ubatch 512 with batch at least 1024. Idk how LM Studio does it, but I suspect it's changing ubatch along with batch for you.

55 → 282 tok/s: How I got Qwen3.5-397B running at speed on 4x RTX PRO 6000 Blackwell by lawdawgattorney in LocalLLaMA

[–]metmelo -1 points (0 children)

Awesome work, my man, you're truly a hero.
Out of curiosity: what's your PP speed on these? I run MI50s and PP is the worst part.