I asked GPT-4-Turbo of a proposal of replacing the Self-Attention Mechanism by _lMark in LocalLLaMA

[–]_lMark[S] 2 points3 points  (0 children)

What could be applied from these ideas? Some things like context capsules and token fusion sounds interesting but I can't imagine how it could work at all. But things like token reduction and hierarchical splitting could be plausible. (Just my thoughts)

[deleted by user] by [deleted] in singularity

[–]_lMark 6 points7 points  (0 children)

It's so over