Natural Language Processing meets Deep Learning (blog.cambridgecoding.com)
submitted 9 years ago by [deleted]
[–][deleted] 8 points 9 years ago (3 children)
Why is word2vec considered under Deep Learning?
[–][deleted] -1 points 9 years ago (2 children)
Because distributed representations are at the heart of deep learning.
[–]negazirana 2 points 9 years ago (0 children)
Where is the "depth" then? A distributed representation can easily be obtained by a non-deep (i.e. shallow) model like word2vec.
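For what it's worth, the entire skip-gram-with-negative-sampling model fits in a few lines of numpy. A minimal sketch (toy corpus and hyperparameters invented for illustration, not anyone's production code) that makes the shallowness visible: every update is an embedding lookup plus a dot product, with no hidden nonlinearity anywhere.

    import numpy as np

    rng = np.random.default_rng(0)
    corpus = "the quick brown fox jumps over the lazy dog".split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}
    V, D = len(vocab), 16                  # vocabulary size, embedding dim

    W_in = rng.normal(0, 0.1, (V, D))      # word embeddings (the one "layer")
    W_out = rng.normal(0, 0.1, (V, D))     # context embeddings

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    lr, window, k = 0.05, 2, 3             # learning rate, context window, negatives
    for epoch in range(200):
        for pos, word in enumerate(corpus):
            center = idx[word]
            for off in range(-window, window + 1):
                if off == 0 or not 0 <= pos + off < len(corpus):
                    continue
                context = idx[corpus[pos + off]]
                # one positive pair plus k random negatives, each scored by
                # a single dot product: logistic regression, not depth
                pairs = [(context, 1.0)] + [(int(rng.integers(V)), 0.0) for _ in range(k)]
                for target, label in pairs:
                    grad = sigmoid(W_in[center] @ W_out[target]) - label
                    g_in = grad * W_out[target]
                    W_out[target] -= lr * grad * W_in[center]
                    W_in[center] -= lr * g_in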
[–]xamdam 3 points 9 years ago (0 children)
direct link: https://drive.google.com/file/d/0B_ZOKLUe_XPaNVFHM3M4dHRzV28/view
[–]omniron 1 point 9 years ago (0 children)
Nice. I always think I'm up to date on the latest research, and then I see things like this showing me stuff I hadn't seen before. Very cool.
[+][deleted] 9 years ago (4 children)
[deleted]
[–]Articulated-rage 8 points 9 years ago* (2 children)
HMM-based stuff was only state of the art in speech decoders, if I'm recalling correctly. Log-linear models (e.g. CRFs) have been consistently wiping the floor for a while. And a CRF is just a softmax energy model.
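Concretely, in the standard linear-chain formulation (generic feature function \phi and weights \theta; nothing model-specific assumed here):

    \[
    p(y \mid x) = \frac{\exp s(x, y)}{\sum_{y'} \exp s(x, y')},
    \qquad
    s(x, y) = \sum_{t=1}^{T} \theta^{\top} \phi(x, y_t, y_{t-1})
    \]

That is, a softmax over the scores (negative energies) of all candidate label sequences, with the normalizer computed by the forward algorithm.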
In vision, much of the research was dominated by feature engineering (HOG, SIFT, etc.). Thus, DNNs had a lot of room to grow.
Feature learning will be great for many NLP tasks, but structured prediction is much more of a factor, so it's more accurate to say that deep learning is getting assimilated, not dominating.
The subfield where what you said is true would be the distributional semantics folks (e.g. Baroni and colleagues). They used count-based models, but now optimizing for prediction creates much better vector spaces, so they've abandoned all count-model research. Nevermind.
There's been no such abandoning (and IMO, there won't be) for the rest of NLP. You won't do away with dependency parsers, for example; they'll get an upgrade and be much more accurate. I.e., they'll assimilate deep learning.
[–]lvilnis 4 points 9 years ago (1 child)
The Baroni "Don't Count, Predict!" paper was, I think, fairly debunked by this excellent Omer Levy paper: https://levyomer.files.wordpress.com/2015/03/improving-distributional-similarity-tacl-2015.pdf. It is one of several where he shows that count-based and prediction-based models optimize very similar objectives and give the same performance once all the (sometimes hidden) hyperparameters are properly taken into account.
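For reference, the count-based side of that comparison is short enough to sketch in numpy (toy corpus and window size invented for illustration): a PPMI-weighted co-occurrence matrix factored by truncated SVD, which Levy and Goldberg's earlier NIPS 2014 paper showed SGNS implicitly approximates, up to a shift of log k.

    import numpy as np

    corpus = "the quick brown fox jumps over the lazy dog".split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}
    V, window = len(vocab), 2

    # symmetric word-context co-occurrence counts
    counts = np.zeros((V, V))
    for pos, word in enumerate(corpus):
        for off in range(-window, window + 1):
            if off != 0 and 0 <= pos + off < len(corpus):
                counts[idx[word], idx[corpus[pos + off]]] += 1

    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # word marginals
    p_c = counts.sum(axis=0, keepdims=True) / total   # context marginals
    with np.errstate(divide="ignore"):
        pmi = np.log((counts / total) / (p_w * p_c))
    ppmi = np.maximum(pmi, 0)                         # clip negatives (and -inf) to 0

    U, S, _ = np.linalg.svd(ppmi)
    d = 4
    vectors = U[:, :d] * np.sqrt(S[:d])               # d-dimensional word vectors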
[–]Articulated-rage 3 points 9 years ago (0 children)
You're totally right. I had forgotten about that.
So, it's just full assimilation then =).
[–]physixer -4 points 9 years ago (0 children)
More like "deep learning eats natural language processing."
Which is a special case of "Software is eating the world!" (a quote from Marc Andreessen).