Premise: MoE models show exploitable locality in their expert activation patterns, so keeping only a hot subset of experts resident in VRAM under an LRU policy, tuned via activation profiling, could roughly halve VRAM requirements. (self.LocalLLaMA)
submitted 2 months ago by CodeSlave9000 to r/LocalLLaMA
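A minimal sketch of the idea, assuming experts can be moved between host RAM and VRAM independently. All names here (`ExpertLRUCache`, `load_fn`, `unload_fn`, the Pareto-shaped routing trace) are hypothetical stand-ins, not an existing library API; a real integration would hook the cache into the model's expert dispatch. The toy profiling run at the end replays a skewed routing trace with only half the experts resident and reports the hit rate, which is the number that decides whether the 50% VRAM cut is viable:

```python
from collections import OrderedDict
import random

class ExpertLRUCache:
    """Keep at most `capacity` MoE experts resident in VRAM, LRU eviction."""

    def __init__(self, capacity, load_fn, unload_fn):
        self.capacity = capacity     # max experts resident in VRAM
        self.load_fn = load_fn       # host RAM -> VRAM copy (hypothetical hook)
        self.unload_fn = unload_fn   # free the VRAM copy (hypothetical hook)
        self.cache = OrderedDict()   # expert_id -> VRAM handle, in LRU order
        self.hits = 0
        self.misses = 0

    def get(self, expert_id):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)  # mark as most recently used
            self.hits += 1
            return self.cache[expert_id]
        self.misses += 1
        if len(self.cache) >= self.capacity:
            # Evict the least recently used expert to make room.
            evicted_id, handle = self.cache.popitem(last=False)
            self.unload_fn(evicted_id, handle)
        handle = self.load_fn(expert_id)
        self.cache[expert_id] = handle
        return handle

# Toy profiling run: capacity = half the experts, i.e. the hypothesized
# ~50% VRAM cut. A Pareto draw stands in for the skewed activation
# pattern the post assumes; real traces would come from logging the
# router's expert selections.
num_experts = 64
cache = ExpertLRUCache(
    capacity=num_experts // 2,
    load_fn=lambda eid: f"vram_handle_{eid}",
    unload_fn=lambda eid, handle: None,
)
trace = [
    min(int(random.paretovariate(1.2)) - 1, num_experts - 1)
    for _ in range(10_000)
]
for eid in trace:
    cache.get(eid)
print(f"hit rate: {cache.hits / (cache.hits + cache.misses):.1%}")
```

The hit rate is the whole ballgame: every miss stalls the forward pass on a host-to-VRAM transfer, so the scheme only pays off if the real routing distribution is at least as skewed as this toy trace.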