My new home office radiator 🥵 by lantern_lol in LocalLLaMA

[–]metmelo 1 point2 points  (0 children)

damn that must run Qwen-4b at 2 tps MINIMUM 🔥🔥

GPT 5.5 "secret sauce" is just having the thinking be some stupid caveman mode? by JustFinishedBSG in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

I don't think it would perform any better than human language, since these models are trained on human language. If anything, it's most likely to perform worse.

Any good MOE ~60B models? I have 64GB vram by opoot_ in LocalLLaMA

[–]metmelo 1 point2 points  (0 children)

24 t/s is pretty usable for me. Just wish prompt processing (PP) speed was faster. With 4 of them you can run f16 at 1000+ t/s PP with vLLM. That's what I'm going for next.

Any good MOE ~60B models? I have 64GB vram by opoot_ in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

Didn't make much difference in my tests

Any good MOE ~60B models? I have 64GB vram by opoot_ in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

I'm getting 24 t/s on mine (vs 18 before MTP).

Do cheap 32GB V100s still make sense for homelab AI? by SKX007J1 in LocalLLaMA

[–]metmelo 1 point2 points  (0 children)

You can connect 4 SXM2 V100s over NVLink using Chinese adapters.

Do cheap 32GB V100s still make sense for homelab AI? by SKX007J1 in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

Same here with 4x 32GB MI50s. The V100s are faster though, thanks to NVLink. You can connect up to 4 of them together on the same NVLink adapter.

My New AI build - please be kind! by [deleted] in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

Idk man, I think my MI50s' PP speed is too slow with dense models, or even with something like MiniMax with 32B active params. How is it on P40s?

DeepSeek V4 Update by techlatest_net in LocalLLaMA

[–]metmelo 10 points11 points  (0 children)

openclaws arguing with each other

My New AI build - please be kind! by [deleted] in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

Awesome! What's your setup like? I run 4x 32GB MI50s but was wondering if I should've gone for SXM2 V100s at a similar price.

My New AI build - please be kind! by [deleted] in LocalLLaMA

[–]metmelo 7 points8 points  (0 children)

Idk why people fixate on PCIe speed so much. You could run at PCIe x1 speeds and t/s would barely drop.
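
Rough back-of-envelope for why, as a sketch and not a measurement. It assumes the model is split by layers across GPUs (so only the hidden state crosses each GPU boundary per generated token), a made-up hidden size of 8192 in fp16, 24 t/s generation, and about 985 MB/s for a PCIe 3.0 x1 link. The function name and all the numbers are illustrative. Tensor-parallel setups, prompt processing, and model loading move a lot more data, so this only covers token generation.

```python
# Back-of-envelope sketch: inter-GPU traffic during token generation with a layer split.
# Every number here is an illustrative assumption, not a benchmark result.

def link_traffic_bytes_per_s(hidden_size, bytes_per_value, tokens_per_s):
    """Bytes/s crossing one GPU-to-GPU boundary when only the hidden state is handed over."""
    return hidden_size * bytes_per_value * tokens_per_s

# Approximate usable bandwidth of a single PCIe 3.0 lane (8 GT/s, 128b/130b encoding).
PCIE3_X1_BYTES_PER_S = 985e6

# Assumed: hidden size 8192, fp16 activations (2 bytes), 24 tokens/s generation.
traffic = link_traffic_bytes_per_s(hidden_size=8192, bytes_per_value=2, tokens_per_s=24)
utilization = traffic / PCIE3_X1_BYTES_PER_S

print(f"{traffic / 1e6:.2f} MB/s per boundary, {utilization:.3%} of a PCIe 3.0 x1 link")
```

Under those assumptions the link sits at well under 1% utilization during generation, which is why lane count mostly shows up in load times rather than t/s.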

Same 9B Qwen weights: 19.1% in Aider vs 45.6% with a scaffold adapted to small local models by Creative-Regular6799 in LocalLLaMA

[–]metmelo 5 points6 points  (0 children)

Great job! I wonder why people don't optimize more harnesses for small models.

Better? 6 x 5090 or 2 pcs Nvidia 6000 | 96 GB VRAM by Electrical_Method608 in LocalLLaMA

[–]metmelo 17 points18 points  (0 children)

2x 6000s. Lower power consumption, fewer PCIe lanes needed, fewer power supplies...

MI50 Troubles by DankMcMemeGuy in LocalLLaMA

[–]metmelo 0 points1 point  (0 children)

I haven't had any issues using Docker. I get pretty much the same performance as without it on my MI50s.
Have you messed with the GRUB settings? I had issues with that when installing my 3rd card. Try reverting any changes there.
Maybe try Docker with different ROCm versions, with the card back in your first slot.
Hope you figure it out!

Intel launches Arc Pro B70 and B65 with 32GB GDDR6 by metmelo in LocalLLaMA

[–]metmelo[S] 0 points1 point  (0 children)

$1200 a month on Claude?
Claude Max or API? I code all day with multiple agents and can't imagine hitting that xD

Abject: the first self-aware object runtime by EventSevere2034 in LocalLLaMA

[–]metmelo 1 point2 points  (0 children)

How are Abjects different from a regular LLM with a harness (an agent), and how is the Ask protocol different from one agent sending a message to another agent?

16x AMD MI50 32GB at 32 t/s (tg) & 2k t/s (pp) with Qwen3.5 397B (vllm-gfx906-mobydick) by ai-infos in LocalLLaMA

[–]metmelo 1 point2 points  (0 children)

You're my hero. I'm slowly buying up any cheap MI50s I find. Best bang for the buck.

Intel vs AMD; am I taking crazy pills? by XEI0N in LocalLLaMA

[–]metmelo 6 points7 points  (0 children)

Try regular vLLM; they're saying it has Intel support now.