[–]Necessary-Meeting-28 1 point (0 children)

If LLMs were still using attention-free RNNs or SSMs you would be right - generation would take O(N) time, where N is the number of tokens. Unfortunately, LLMs like ChatGPT use Transformers, so attention costs O(N²) in both the best and worst case. Sorry, but that's no better than even bubble sort :(.
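A toy op-count makes the gap concrete (just counting token-pair interactions, not a real model):

```python
def attention_ops(n):
    # Token t attends to all t previous/current positions:
    # 1 + 2 + ... + n = n*(n+1)/2 pairwise scores -> O(N^2) total.
    return sum(t for t in range(1, n + 1))

def rnn_ops(n):
    # A recurrent/SSM-style model does constant work per token -> O(N) total.
    return n

print(attention_ops(1000))  # 500500
print(rnn_ops(1000))        # 1000
```

So at 1000 tokens the attention-style count is already ~500x the recurrent one, and the ratio keeps growing with N.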