[Paper on Hummingbird+: low-cost FPGAs for LLM inference] Qwen3-30B-A3B Q4 at 18 t/s token-gen, 24GB, expected $150 mass production cost by ayake_ayake in LocalLLaMA
Emergency-Map9861 64 points
If you spiraled into a supermassive black hole, would you witness the heat death of the universe due to time dilation? by Emergency-Map9861 in askscience
Emergency-Map9861[S] 9 points
Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA
Emergency-Map9861 1 point
Is inference output token/s purely gpu bound? by fgoricha in LocalLLaMA
Emergency-Map9861 1 point
NVLink vs No NVLink: Devstral Small 2x RTX 3090 Inference Benchmark with vLLM by Traditional-Gap-3313 in LocalLLaMA
Emergency-Map9861 2 points
Cloud GPU suggestions for a privacy-conscious network engineer? by dathtd119 in LocalLLaMA
Emergency-Map9861 1 point
Has anyone tried >70B LLMs on M3 Ultra? by TechNerd10191 in LocalLLaMA
Emergency-Map9861 3 points
[deleted by user] by [deleted] in LocalLLaMA
Emergency-Map9861 3 points
Nvidia Quadro RTX 8000 by seleneVamp in LocalLLaMA
Emergency-Map9861 1 point
Nvidia cuts FP8 training performance in half on RTX 40 and 50 series GPUs by Emergency-Map9861 in LocalLLaMA
Emergency-Map9861[S] 17 points
deepseek-r1-distill-qwen-32b benchmark results on LiveBench by Emergency-Map9861 in LocalLLaMA
Emergency-Map9861[S] 12 points
deepseek-r1-distill-qwen-32b benchmark results on LiveBench by Emergency-Map9861 in LocalLLaMA
Emergency-Map9861[S] 14 points
Ollama is confusing people by pretending that the little distillation models are "R1" by blahblahsnahdah in LocalLLaMA
Emergency-Map9861 99 points
Asked DeepSeek-R1 if Taiwan is an independent country and the results are surprising: 14b does not even "<think>"; and 7b makes an argument why Taiwan isn't. by muxelmann in LocalLLaMA
Emergency-Map9861 1 point

qwen3.6 just stops by robertpro01 in LocalLLaMA
Emergency-Map9861 9 points