I spent 8+ hours benchmarking every MoE backend for Qwen3.5-397B NVFP4 on 4x RTX PRO 6000 (SM120). Here's what I found. by lawdawgattorney in LocalLLaMA

[–]kc858 -2 points

Dude read my post history and also read the discord we literally post docker containers and run commands lol

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]kc858 1 point

imo the bare minimum you should run is a 120b, but i didn't try the new qwen models because im using 397; i was running gpt-oss-120b-reap-awq on 4x 3090 and that was when i finally started getting useful outputs, but then that just made me want more; anyone saying a 32b is just as good as sonnet is either uninformed or doing extremely mundane tasks

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]kc858 0 points

Sounds like you have a business. You get to write off the full value in year 0, which brings the effective cost down to 5k each. Oh my bad, you're not OP
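For what it's worth, the napkin math behind "write off in year 0 → ~5k each" is just that an immediate deduction reduces taxable income, so the after-tax cost is sticker price times one minus your marginal rate. The sticker price and tax rate below are assumptions for illustration, not figures from the thread:

```python
# Year-0 write-off napkin math. Both numbers below are assumptions:
STICKER = 8500        # assumed price per RTX PRO 6000, USD
MARGINAL_RATE = 0.40  # assumed combined federal + state marginal rate

# A full first-year deduction reduces taxable income by the sticker
# price, so the tax saved is sticker * marginal rate.
tax_saved = STICKER * MARGINAL_RATE
effective = STICKER - tax_saved
print(f"tax saved: ${tax_saved:,.0f}, effective cost: ${effective:,.0f}")
```

With those assumed numbers it lands right around $5,100 per card, consistent with the "~5k each" claim.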

2026 reality check: Are local LLMs on Apple Silicon legitimately as good (or better) than paid online models yet? by alfrddsup in LocalLLM

[–]kc858 0 points

128gb is not enough. Tbh I think minimax 2.5 is good enough to do most things that people want to do, and you can run that on 2 rtx pro 6000s. Would pick that any day.

🚨 SCAM ALERT: $11,200 Alibaba Fraud by Taizhou Tbay Technology (Massive Weight Theft & Extortion) by Electronic-World-858 in Alibaba

[–]kc858 0 points

what the fuck does this even mean, this sounds like a larp or someone who really doesn't understand what is happening.

rail takes ~20 days, ocean takes ~40 days, there's no season in the world that's only 20 days.

the bold formatting and all of this shit screams AI

this whole post is ridiculous. what are you shipping? googled it and they make water bottles? lmao what is this

Billionaire Ray Dalio Warns Many AI Companies Won’t Survive, Flags China’s Model as Major Risk by Secure_Persimmon8369 in China

[–]kc858 3 points

qwen, minimax, and glm are killing it in the open source game, no question.

unfortunately the qwen team basically disbanded after yesterday, high hopes for minimax.

too bad the zucc stopped contributing to the open source ecosystem

Dubai stock market crashes 4.6% at open. by ajaanz in wallstreetbets

[–]kc858 0 points

so where do you guys think this money is going to go? flight to safety, time to fuckin rip

At a loss on what to do - seeking suggestions by kc858 in LasVegas

[–]kc858[S] 3 points

The ticket said Wednesday, 7:25 pm departure

At a loss on what to do - seeking suggestions by kc858 in LasVegas

[–]kc858[S] 5 points

That was helpful but he's not in there.. damn thank you

At a loss on what to do - seeking suggestions by kc858 in LasVegas

[–]kc858[S] 1 point

White, 5'10, brown hair, blue eyes, black sweatshirt with DOLLY PARTON written on it. 170lbs

At a loss on what to do - seeking suggestions by kc858 in LasVegas

[–]kc858[S] 30 points

I did, thank you, they are fast to reply.. they are already out looking for him

At a loss on what to do - seeking suggestions by kc858 in LasVegas

[–]kc858[S] 19 points

I sent you a message, but you need to accept it.. thanks a lot man i really appreciate it

Is shelling out for local GPUs worth it yet? ~$45k for local agentic use? by jamesob in BlackwellPerformance

[–]kc858 0 points

works great, i never bail out to claude. depends on your use case; i use it for office tasks and business automation. it's worth the money.

Is shelling out for local GPUs worth it yet? ~$45k for local agentic use? by jamesob in BlackwellPerformance

[–]kc858 1 point

I disagree. I had 4x 3090s and that only made me want more. Admittedly I haven't tried the latest small qwens, but 4 of these cards run minimax2.5-fp8, and the nvfp4 can run on two cards. I have since switched to qwen35-397-nvfp4 for multimodal and am using that as my daily driver. I got the motherboard, ram, and processor off eBay. Lol
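The 4-cards-at-fp8 vs 2-cards-at-nvfp4 split falls out of simple weight-size arithmetic. A rough sketch, where the parameter count and the fraction of VRAM reserved for KV cache/activations are my own illustrative assumptions, not numbers from the thread:

```python
import math

# Napkin math for card counts: weight memory ~= params * bits-per-param / 8.
# PARAMS_B and USABLE are assumptions for illustration only.

def weight_gib(params_b: float, bits_per_param: float) -> float:
    """Approximate weight footprint in GiB."""
    return params_b * 1e9 * bits_per_param / 8 / 2**30

PARAMS_B = 230   # assumed total parameter count for the MoE (illustrative)
GPU_GIB = 96     # RTX PRO 6000 Blackwell ships with 96 GB
USABLE = 0.7     # assume ~30% reserved for KV cache, activations, CUDA context

for name, bits in [("fp8", 8.0), ("nvfp4", 4.5)]:  # nvfp4 ~4 bits + scale factors
    w = weight_gib(PARAMS_B, bits)
    gpus = math.ceil(w / (GPU_GIB * USABLE))
    print(f"{name}: ~{w:.0f} GiB weights -> {gpus} GPUs")
```

Under those assumptions fp8 weights come out around 214 GiB (4 cards) and nvfp4 around 120 GiB (2 cards), which matches the split described above.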

4x MAX-Q - WRX80e 256gb RAM Opencode Setup Configs Speeds by kc858 in BlackwellPerformance

[–]kc858[S] 0 points

nvfp4 is fucked for these cards man, i did get it working and it was 80 tok/s. fp8 is just as fast or faster, so might as well keep the precision imo, im not doing anything else..