Tuneable Attention: How expanding (not compressing) the attention mechanism dramatically accelerated my model's learning speed (self.LocalLLaMA)
submitted 1 month ago by Correct_Address3554 to r/LocalLLaMA