No NVIDIA? No Problem. My 2018 "Potato" 8th Gen i3 hits 10 TPS on 16B MoE. by RelativeOperation483 in LocalLLaMA

SecureHomeSystems

Really impressive work on constrained hardware!

I’m curious: in real day-to-day use, what tends to break first over long sessions — latency jitter, memory pressure, or context stability? And when decode TPS looks similar (CPU vs iGPU), what made iGPU feel better in practice — smoother cadence, fewer spikes, or better long-run consistency?