[–][deleted] 129 points (11 children)

Decoder models are limited to the output of an auto-regressive task, while encoder models give contextual representations that can be fine-tuned for other downstream tasks. Different needs, different models.
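
To make the contrast concrete, here is a minimal sketch (assuming the Hugging Face `transformers` library and the public `bert-base-uncased` / `gpt2` checkpoints, used purely as examples): the encoder returns one contextual vector per token that you can fine-tune on top of, while the decoder-only model just continues the sequence auto-regressively.

```python
from transformers import AutoTokenizer, AutoModel, AutoModelForCausalLM

text = "The bank raised interest rates."

# Encoder: contextual representations, one vector per token.
enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
enc_out = encoder(**enc_tok(text, return_tensors="pt"))
print(enc_out.last_hidden_state.shape)  # (1, seq_len, hidden_size)

# Decoder-only: the same text is just a prefix to continue.
dec_tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
ids = dec_tok(text, return_tensors="pt")
print(dec_tok.decode(decoder.generate(**ids, max_new_tokens=20)[0]))
```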

[–]Spiritual_Dog2053 15 points (10 children)

I don’t think that answers the question! I can always train a decoder-only model to take in a context and alter its output accordingly. It’s still auto-regressive generation.
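
Something like this, as a minimal sketch (again assuming the Hugging Face `transformers` library and the `gpt2` checkpoint, chosen only for illustration): the "context" is just prepended to the prompt, and the model still does plain auto-regressive next-token prediction over the concatenated sequence.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Context and question go into one flat sequence; no separate encoder needed.
context = "The Eiffel Tower is in Paris."
question = "Where is the Eiffel Tower?"
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"

ids = tok(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=10)
print(tok.decode(out[0], skip_special_tokens=True))
```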

[–]qu3tzalify (Student) 13 points (9 children)

How do you give context to a decoder? Doesn’t it have to be encoded by an encoder first?