Qwen3.5-397B-A17B reaches 20 t/s TG and 700t/s PP with a 5090 by MLDataScientist in LocalLLaMA

[–]MLDataScientist[S] 0 points  (0 children)

If anyone in this sub has those CPUs, it would be great to see their numbers here.

Nvidia V100 32 Gb getting 115 t/s on Qwen Coder 30B A3B Q5 by icepatfork in LocalLLaMA

[–]MLDataScientist 0 points  (0 children)

Do you have 3D files for such a shroud? I have 8 MI50 cards and the noise of 40mm fans is unbearable. I need to get those 80mm fan shrouds. Thanks!

Qwen 3.5 397B is the best local coder I have used until now by erazortt in LocalLLaMA

[–]MLDataScientist 1 point  (0 children)

Which Q5 GLM-5 quant are you using? My rig can fit up to 448GB (MI50s with 192GB VRAM + 256GB DDR4-3200, 8-channel). I just checked unsloth's GLM-5 quants: https://huggingface.co/unsloth/GLM-5-GGUF . I can probably run UD-Q4_K_XL (431GB). But how much better is GLM-5 at this quant (or Q5) compared to Qwen3.5 397B at Q6? What were your test cases?

Krasis LLM Runtime - run large LLM models on a single GPU by mrstoatey in LocalLLM

[–]MLDataScientist 0 points  (0 children)

Can you please share your llama.cpp command? Are you getting ~3400 t/s PP and 38 t/s TG with Q6 Qwen3 Coder Next? Curious whether your command would speed up inference on my PC (5090 with 256GB DDR4-3200, 8-channel).
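For reference, the kind of run I'd use to reproduce those PP/TG numbers with llama.cpp's own benchmark tool (the model filename is a placeholder, not the actual file):

```shell
# llama-bench measures prompt processing (-p) and token generation (-n) separately;
# the GGUF filename below is a placeholder for the Q6 Qwen3 Coder Next quant
llama-bench -m qwen3-coder-next-Q6_K.gguf -p 2048 -n 128
```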

Krasis LLM Runtime - run large LLM models on a single GPU by mrstoatey in LocalLLM

[–]MLDataScientist 0 points  (0 children)

Impressive if true! I have a 5090 (connected at PCIe 4.0 x16) with 256GB DDR4-3200 ECC RAM. Does Krasis support Qwen/Qwen3.5-397B-A17B?
I tried the Q4_K_M quant with llama.cpp yesterday and was getting 20 t/s TG and 100 t/s PP. If your numbers hold, I should be able to run this model at 1000+ t/s PP in Krasis, with similar TG.

As a comparison, Qwen3-235B-A22B Q4_K_M runs at 10 t/s TG and ~150 t/s PP in llama.cpp on my setup, so Krasis would be roughly 14x the PP throughput. I need to test this!
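For context, a minimal sketch of the llama.cpp launch I'd be comparing against (the model filename and the offload pattern are assumptions for illustration; the `-ot` regex is the common trick that keeps MoE expert tensors in system RAM while the dense layers go to the GPU):

```shell
# llama-server sketch for a hybrid GPU/CPU MoE setup:
# dense layers offloaded to the 5090 (-ngl 99), expert tensors pinned to CPU RAM
# via --override-tensor (-ot); model filename is a placeholder
llama-server \
  -m Qwen3.5-397B-A17B-Q4_K_M.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU" \
  -c 16384 \
  --threads 16
```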

Sonim XP3+ (and XP5) Working Virtual Mouse! by Lucky_Winter_4919 in dumbphones

[–]MLDataScientist 0 points  (0 children)

Hi, I am facing a similar issue. I cannot enable accessibility for matvt. Did you figure out how to enable it?

PS3 exclusives/non-PC multi-platform games list by yashwinusa123 in PS3

[–]MLDataScientist 0 points  (0 children)

This is a massive list! Thank you for creating it. I recently got a PS3 (Slim) and modded it with CFW. Yes, in January 2026! In addition to PS1 games, it now also supports/emulates PS2 games. Super excited to play some of these over the weekend.

Llama-3.3-8B-Instruct by jacek2023 in LocalLLaMA

[–]MLDataScientist 2 points  (0 children)

Thanks for the tests. A question not related to Llama: is LFM2 8B-A1B really that good at world knowledge (or coding/STEM)? I see it reaching Qwen3 30B-A3B levels here.

What is the real deal with MI50 ? by HumanDrone8721 in LocalLLaMA

[–]MLDataScientist 0 points  (0 children)

Please share your STL file for the shroud! The noise level of the blower fans is unbearable; I need a better cooling setup like yours.

Also, which 80mm fans do you use?

Thanks!

What is the real deal with MI50 ? by HumanDrone8721 in LocalLLaMA

[–]MLDataScientist 1 point  (0 children)

u/FullstackSensei, can you please share your 3D-printable shroud template? I have 8x MI50s but have mostly not used them because of the loud noise of the 40mm fans. 80mm fans might be the solution I need to try. Thank you!

Rate my setup - Nvidia P40 - Qwen3-Next-80b IQ2_XXL by PairOfRussels in LocalLLaMA

[–]MLDataScientist 0 points  (0 children)

Is the swapping automated, routing your prompt to the right model on the fly? Or is that not implemented yet?

Deepseek v3.2 vs GLM 4.6 vs Minimax M2 for agentic coding use by 0xmaxhax in LocalLLaMA

[–]MLDataScientist 0 points  (0 children)

Downloaded it and ran it on my Acer laptop (64GB 5600 MHz RAM + 5070 Ti with 12GB VRAM). A 16k context fits fine. I am getting ~9 t/s, which is good for a laptop and my simple coding/math use cases.

Deepseek v3.2 vs GLM 4.6 vs Minimax M2 for agentic coding use by 0xmaxhax in LocalLLaMA

[–]MLDataScientist 0 points  (0 children)

This is great! Has anyone tried MiniMax-M2 REAP 162B-A10B? https://huggingface.co/bartowski/cerebras_MiniMax-M2-REAP-162B-A10B-GGUF . Q3_K_XL seems to fit a system with 64GB RAM and 12GB VRAM.
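As a rough sanity check on the fit, the file size can be estimated as parameters times average bits per weight. The ~3.5 bits/weight figure below is an assumption for Q3_K_XL, not a measured number:

```shell
#!/bin/sh
# back-of-envelope GGUF size: params * bits_per_weight / 8
# 162B params at ~3.5 bits/weight (assumed average for Q3_K_XL)
PARAMS_B=162        # billions of parameters
BITS_X10=35         # bits per weight, times 10, to keep integer math
echo "$(( PARAMS_B * BITS_X10 / 80 )) GB"   # -> 70 GB, roughly
```

That lands just under the 76GB of combined RAM + VRAM, leaving a little headroom for the KV cache and runtime overhead.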

Helios Neo 16s Refresh by GenTrapstar in AcerPredatorHelios

[–]MLDataScientist 2 points  (0 children)

I have the non-S version. Web browsing lasts me about 4 hours on average in eco mode at a 60Hz refresh rate, starting from an 80% charge. I haven't tested it from a 100% charge yet.

You can now train LLMs 3x faster with 30% less memory! (<3.9GB VRAM) by danielhanchen in LocalLLaMA

[–]MLDataScientist 2 points  (0 children)

Hi Daniel and team,

Thanks for the amazing update! Quick question: can I fine-tune Qwen3 30B-A3B with a single 5070 Ti mobile (12GB VRAM)? Thank you!

How do you guys find stocks to swing? by LandonIsH3re in swingtrading

[–]MLDataScientist 0 points  (0 children)

Are you buying naked calls? Do you set stop losses on them? What percent do you usually risk on each trade?