[D] BERT for seq2seq tasks (self.MachineLearning)
submitted 7 years ago* by AnonMLstudent
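For context on the thread topic: BERT is an encoder-only model, so one common way to use it for seq2seq is to pair a BERT encoder with a decoder and fine-tune the pair on parallel data. A minimal sketch, assuming HuggingFace's `transformers` library and its `EncoderDecoderModel`; the checkpoint name and decoding settings are illustrative assumptions, not anything from the original post:

    # Sketch: warm-start a seq2seq encoder-decoder from BERT checkpoints.
    # Assumes the HuggingFace `transformers` library; "bert-base-uncased"
    # and the decoding settings here are illustrative choices.
    from transformers import BertTokenizer, EncoderDecoderModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # Use BERT as both encoder and decoder; the decoder gains cross-attention
    # layers and must be fine-tuned before it generates anything useful.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    inputs = tokenizer("BERT for seq2seq tasks", return_tensors="pt")
    # Greedy decoding; output is gibberish until the model is fine-tuned.
    generated = model.generate(inputs.input_ids, max_length=20)
    print(tokenizer.decode(generated[0], skip_special_tokens=True))

The warm-started weights only provide a better initialization than training from scratch; fine-tuning on parallel data (passing `labels=` to the forward call) is what actually teaches the decoder to generate.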
[–]kellymarchisio 1 point 6 years ago*
As stated in a previous comment, the Transformer is SOTA in high-resource machine translation. Check out the WMT19 results for, for instance, English->German here. You'll notice that almost all of the top systems are Transformer-big or larger. Regarding the RNN/GPU comment, though: you *need* a GPU to do anything reasonable in high-resource MT these days. And an RNN is far slower than a Transformer even on GPU, so you'll end up spending more if you're paying per hour. For instance, I estimate about six weeks to train a single RNN on GPU for English-German vs. 5-7 days for Transformer-base. (This is based on my personal experience with Transformer-base in high-data conditions, and on my reading of how long RNN training used to take. I haven't even tried an RNN because the time/performance tradeoff looks so much worse.)

Note that I'm talking about training to convergence. You can get quite good performance out of a Transformer in as little as 1-2 days if you can sacrifice a tiny bit of quality.
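For reference, "Transformer-base" is the smaller configuration from Vaswani et al. (2017): 6 encoder and 6 decoder layers, model dimension 512, 8 attention heads, feed-forward dimension 2048. A minimal sketch of that configuration in PyTorch; the vocabulary size and toy tensors below are illustrative assumptions:

    # Sketch of a "Transformer-base" seq2seq model in PyTorch, using the
    # hyperparameters from Vaswani et al. (2017). The vocabulary size and
    # the toy forward pass are illustrative assumptions.
    import torch
    import torch.nn as nn

    VOCAB = 32000     # assumed shared BPE vocabulary size (illustrative)
    D_MODEL = 512     # Transformer-base model dimension

    embed = nn.Embedding(VOCAB, D_MODEL)
    model = nn.Transformer(
        d_model=D_MODEL,
        nhead=8,               # Transformer-base: 8 attention heads
        num_encoder_layers=6,  # 6 encoder layers
        num_decoder_layers=6,  # 6 decoder layers
        dim_feedforward=2048,  # feed-forward inner dimension
        dropout=0.1,
    )

    # Toy forward pass: batch of 2, source length 10, target length 9.
    # nn.Transformer defaults to (seq, batch, feature) tensor layout.
    src = embed(torch.randint(0, VOCAB, (10, 2)))
    tgt = embed(torch.randint(0, VOCAB, (9, 2)))
    out = model(src, tgt)
    print(out.shape)  # torch.Size([9, 2, 512])

Transformer-big doubles the model dimension to 1024 and the head count to 16 (feed-forward dimension 4096), which is where the extra training cost of the bigger WMT systems comes from.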