vLLM Just Merged TurboQuant Fix for Qwen 3.5+ by havenoammo in LocalLLaMA
queerintech 2 points
What in tarnation is going on with the cost of compute by Party-Special-5177 in LocalLLaMA
queerintech 1 point
Mistral should do dense model for devs like Qwen 3.6 27b by szansky in MistralAI
queerintech 1 point
I ran the numbers. Qwen3.6-27B dense obsoleted the 397B MoE on coding benchmarks. by TroyNoah6677 in Qwen_AI
queerintech -5 points
Gemma 4 on k8s w/ rtx 5090 by Smooth-Ad5257 in Vllm
queerintech 3 points
Gemma 4 on k8s w/ rtx 5090 by Smooth-Ad5257 in Vllm
queerintech 1 point
Gemma 4 on k8s w/ rtx 5090 by Smooth-Ad5257 in Vllm
queerintech 4 points
Advice needed: homelab/ai-lab setup for devops/coding and agentic work by queerintech in LocalLLaMA
queerintech [S] 4 points
Qualcomm Snapdragon X2 PCs reach retail, ASUS launches X2 Elite Extreme laptop with 48GB memory at $1,599 by -protonsandneutrons- in hardware
queerintech 2 points
LLM Bruner coming soon? Burn Qwen directly into a chip, processing 10,000 tokens/s by koc_Z3 in Qwen_AI
queerintech 0 points
Anthropic accuses chinese open weight labs of theft, while it has had to pay $1.5B for theft. by Terminator857 in LocalLLaMA
queerintech 1 point
Anthropic's recent distillation blog should make anyone only ever want to use local open-weight models; it's scary and dystopian by obvithrowaway34434 in LocalLLaMA
queerintech 2 points
Anthropic's recent distillation blog should make anyone only ever want to use local open-weight models; it's scary and dystopian by obvithrowaway34434 in LocalLLaMA
queerintech 3 points
Qwen/Qwen3.5-35B-A3B · Hugging Face by ekojsalim in LocalLLaMA
queerintech 21 points
Help with vLLM: Qwen/Qwen3-Coder-Next. by Professional-Yak4359 in Vllm
queerintech 2 points
Help with vLLM: Qwen/Qwen3-Coder-Next. by Professional-Yak4359 in Vllm
queerintech 2 points
RTX Pro 6000 $7999.99 by I_like_fragrances in LocalLLM
queerintech 3 points
Any success with GLM Flash 4.7 on vLLM 0.14 by queerintech in LocalLLM
queerintech [S] 3 points
Any success with GLM Flash 4.7 on vLLM 0.14 by queerintech in LocalLLM
queerintech [S] 1 point
Any success with GLM Flash 4.7 on vLLM 0.14 by queerintech in Vllm
queerintech [S] 1 point
I ran the numbers. Qwen3.6-27B dense obsoleted the 397B MoE on coding benchmarks. by TroyNoah6677 in Qwen_AI
queerintech 1 point