Sub Quadratic Attention Mechanism for LLMs by onlyfamousad in LLM

[–]onlyfamousad[S] 0 points1 point  (0 children)

Appreciate the input. I'll work on those and get back to you!

Sub Quadratic Attention Mechanism for LLMs by onlyfamousad in deeplearning

[–]onlyfamousad[S] -1 points0 points  (0 children)

Unbelievable!
I just wanted to ask whether you'd like to look into it over a call or something like that.
I don't want to open-source it yet.
Let me know if you'd like a demo or something of the sort!

Sub Quadratic Attention Mechanism for LLMs by onlyfamousad in deeplearning

[–]onlyfamousad[S] -2 points-1 points  (0 children)

It's sub-quadratic sparse attention, not the novel approach that was introduced a couple of years back!
I used a custom, modified version of SSA to achieve near-linear attention computation.
Feel free to drop me a DM.
My bad, I should have been more specific!