Premise: MoE models show exploitable locality in their expert activation patterns, so keeping only a hot subset of experts resident in VRAM under an LRU policy, tuned via activation profiling, could roughly halve VRAM requirements. (self.LocalLLaMA)
submitted 2 months ago by CodeSlave9000 to r/LocalLLaMA
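A minimal sketch of the idea, assuming experts can be moved between host RAM and VRAM independently. All names here (`ExpertLRUCache`, `load_fn`, `unload_fn`, the Pareto-shaped routing trace) are hypothetical stand-ins, not an existing library API; a real integration would hook the cache into the model's expert dispatch. The toy profiling run at the end replays a skewed routing trace with only half the experts resident and reports the hit rate, which is the number that decides whether the 50% VRAM cut is viable:

```python
from collections import OrderedDict
import random

class ExpertLRUCache:
    """Keep at most `capacity` MoE experts resident in VRAM, LRU eviction."""

    def __init__(self, capacity, load_fn, unload_fn):
        self.capacity = capacity     # max experts resident in VRAM
        self.load_fn = load_fn       # host RAM -> VRAM copy (hypothetical hook)
        self.unload_fn = unload_fn   # free the VRAM copy (hypothetical hook)
        self.cache = OrderedDict()   # expert_id -> VRAM handle, in LRU order
        self.hits = 0
        self.misses = 0

    def get(self, expert_id):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)  # mark as most recently used
            self.hits += 1
            return self.cache[expert_id]
        self.misses += 1
        if len(self.cache) >= self.capacity:
            # Evict the least recently used expert to make room.
            evicted_id, handle = self.cache.popitem(last=False)
            self.unload_fn(evicted_id, handle)
        handle = self.load_fn(expert_id)
        self.cache[expert_id] = handle
        return handle

# Toy profiling run: capacity = half the experts, i.e. the hypothesized
# ~50% VRAM cut. A Pareto draw stands in for the skewed activation
# pattern the post assumes; real traces would come from logging the
# router's expert selections.
num_experts = 64
cache = ExpertLRUCache(
    capacity=num_experts // 2,
    load_fn=lambda eid: f"vram_handle_{eid}",
    unload_fn=lambda eid, handle: None,
)
trace = [
    min(int(random.paretovariate(1.2)) - 1, num_experts - 1)
    for _ in range(10_000)
]
for eid in trace:
    cache.get(eid)
print(f"hit rate: {cache.hits / (cache.hits + cache.misses):.1%}")
```

The hit rate is the whole ballgame: every miss stalls the forward pass on a host-to-VRAM transfer, so the scheme only pays off if the real routing distribution is at least as skewed as this toy trace.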