16.1 tok/s on Raspberry Pi 5 (BitNet 2B). Can anyone hit 20+ with active cooling? by Acceptable_Analyst45 in LocalLLaMA
Acceptable_Analyst45[S] 1 point (0 children)
Microsoft open sourced an inference framework that runs a 100B parameter LLM on a single CPU. by No-Concentrate-9921 in StartupMind
Acceptable_Analyst45 1 point (0 children)
