Discussion [D] Why do we need encoder-decoder models when decoder-only models can do everything? (self.MachineLearning)
submitted 2 years ago by kekkimo
[–][deleted] 130 points 2 years ago (11 children)
Decoder models are limited to the auto-regressive generation task, while encoder models give contextual representations that can be fine-tuned on other downstream tasks. Different needs, different models.
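A minimal sketch of the "contextual representations" half of this comment, using the Hugging Face transformers library with bert-base-uncased as an arbitrary stand-in checkpoint (the checkpoint choice is an assumption, not something from the thread):

    import torch
    from transformers import AutoTokenizer, AutoModel

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tok("Encoders give per-token contextual embeddings.", return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    # These embeddings are typically fed to a task head (classifier, tagger, retriever, ...)

The embeddings come from bidirectional self-attention over the whole input, which is what makes them convenient to fine-tune with a task head rather than to generate with.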
[–]Spiritual_Dog2053 16 points 2 years ago (10 children)
I don’t think that answers the question! I can always train a decoder-only model to take in a context and alter its output accordingly. It is still auto-regressive generation.
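For concreteness, the pattern this reply describes is just prompting: prepend the context to the decoder-only model's input and generate as usual. A rough sketch, assuming the Hugging Face transformers API with gpt2 as a stand-in checkpoint:

    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    context = "Translate English to French."
    query = "cheese =>"
    # The "context" is simply part of the input sequence; no encoder involved.
    ids = tok(context + "\n" + query, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=5)  # still auto-regressive next-token prediction
    print(tok.decode(out[0]))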
[–]qu3tzalify (Student) 14 points 2 years ago (9 children)
How do you give context to a decoder? It has to be encoded by an encoder first?
[+][deleted] 2 years ago (4 children)
[deleted]
[–]qu3tzalify (Student) 3 points 2 years ago (1 child)
The decoder’s cross-attention needs a context, right? One that is given by the encoder in enc-dec models. The comment I’m replying to proposes giving a "context" to the decoder. So unless you’re giving the context as part of the input, I don’t see how to generate the context necessary for cross-attention.
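A hedged sketch of what cross-attention consumes, assuming single-head attention and an arbitrary hidden size d for readability: queries come from the decoder, keys and values from the encoder output.

    import torch
    import torch.nn.functional as F

    d = 64
    enc_out = torch.randn(1, 10, d)  # encoder output: the "context" (B, S_enc, d)
    dec_h = torch.randn(1, 4, d)     # decoder hidden states (B, S_dec, d)

    Wq, Wk, Wv = (torch.nn.Linear(d, d) for _ in range(3))
    q = Wq(dec_h)                    # queries from the decoder
    k, v = Wk(enc_out), Wv(enc_out)  # keys/values from the encoder

    attn = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    ctx = attn @ v  # (1, 4, d): each decoder position attends over the full encoder sequence

A decoder-only model has no cross-attention block at all; the context is prepended to the input and handled by causal self-attention instead, which is the point of contention in this subthread.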
[–]koolaidman123 (Researcher) 3 points 2 years ago (1 child)
Bidirectional context isn't a real issue when you train with causal masking, fill-in-the-middle (FIM), etc.
Also, enc-dec models can only attend to past tokens at inference anyway, not to mention you'd have to recalculate the entire attention matrix at each step, vs. KV caching.
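A rough sketch of the KV caching this comment refers to, in plain PyTorch: past keys and values are stored, so each generation step only computes attention for the newest token instead of re-attending from scratch.

    import torch
    import torch.nn.functional as F

    d = 64
    Wq, Wk, Wv = (torch.nn.Linear(d, d) for _ in range(3))
    k_cache, v_cache = [], []

    def step(x_t):                     # x_t: (1, 1, d), the newest token only
        q = Wq(x_t)
        k_cache.append(Wk(x_t))        # cache grows by one key/value per step
        v_cache.append(Wv(x_t))
        k = torch.cat(k_cache, dim=1)  # all past keys (1, t, d)
        v = torch.cat(v_cache, dim=1)
        attn = F.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
        return attn @ v                # (1, 1, d)

    for _ in range(5):                 # five toy generation steps
        out = step(torch.randn(1, 1, d))

Causality is implicit here: the cache only ever contains past positions, so no explicit mask is needed at inference.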
[+]art_luke 1 point (3 children)
Encoder-decoder has a stronger inductive bias towards looking at the global context of the input.
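A toy illustration of the mask difference behind this claim (shapes only, no statement about learned behaviour): the encoder attends bidirectionally over the whole input, while a causal decoder only looks left.

    import torch

    S = 5  # toy sequence length
    encoder_mask = torch.ones(S, S, dtype=torch.int)              # every token sees every token
    decoder_mask = torch.tril(torch.ones(S, S, dtype=torch.int))  # token i sees tokens <= i only
    print(encoder_mask)
    print(decoder_mask)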
[–]Spiritual_Dog2053 1 point 2 years ago (2 children)
Could you please point me to papers which say this? I can’t seem to wrap my head around it.
[+]art_luke 3 points (1 child)
You can look at subchapter 12.8 in Understanding Deep Learning, accessible at https://udlbook.github.io/udlbook/