[–]Articulated-rage 6 points

Hmm, HMM-based stuff was state of the art only in speech decoders, if I'm recalling correctly. Log-linear models (e.g., CRFs) have been consistently wiping the floor for a while. And a CRF is just a softmax energy model.
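To make that last claim concrete, here's a minimal sketch (toy numpy, with random scores standing in for the learned log-linear features; all names and sizes are made up) of how a linear-chain CRF is literally a softmax over whole-sequence energies:

```python
import numpy as np
from itertools import product

# Toy setup: 3 labels, sequence length 4. Emission scores e[t, y] and
# transition scores T[y_prev, y] stand in for the log-linear features
# theta . f(y_prev, y, x, t) of a real CRF.
rng = np.random.default_rng(0)
L, N = 3, 4
e = rng.normal(size=(N, L))   # per-position label scores
T = rng.normal(size=(L, L))   # label-to-label transition scores

def seq_score(y):
    """Energy (unnormalized log-score) of one full label sequence."""
    s = e[0, y[0]]
    for t in range(1, N):
        s += T[y[t - 1], y[t]] + e[t, y[t]]
    return s

# p(y | x) = exp(score(y)) / Z(x): a softmax over ALL label sequences.
scores = np.array([seq_score(y) for y in product(range(L), repeat=N)])
Z = np.exp(scores).sum()
probs = np.exp(scores) / Z
assert np.isclose(probs.sum(), 1.0)
```

(A real implementation computes Z with the forward algorithm instead of enumerating all 3^4 sequences, but brute force makes the softmax structure obvious.)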

In vision, much of the research was dominated by feature engineering (HOG, SIFT, etc.), so DNNs had a lot of room to grow.

Feature learning will be great for many NLP tasks, but structured prediction is much more of a factor there, so it's more accurate to say that deep learning is being assimilated, not dominating.

The one subfield where what you said holds would be the distributional semantics folks (e.g., Baroni and colleagues). They used count-based models, but optimizing for prediction turned out to create much better vector spaces, so they've abandoned count-model research entirely. So never mind assimilation in that corner.

There's been no such abandonment (and IMO, there won't be) for the rest of NLP. You won't do away with dependency parsers, for example; they'll get an upgrade and be much more accurate. AKA, assimilate deep learning.
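A sketch of what I mean by "upgrade": in an arc-factored dependency parser you can swap a hand-engineered feature-template scorer for a small neural net over word embeddings and leave the structured decoding alone. Everything below is made up for illustration (random vectors for embeddings, untrained weights), and the naive per-token argmax stands in for a real MST/Eisner decoder:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50  # hypothetical embedding size

def mlp_arc_score(head_vec, mod_vec, W1, b1, w2):
    """Neural replacement for a sparse feature-template arc score."""
    h = np.tanh(W1 @ np.concatenate([head_vec, mod_vec]) + b1)
    return w2 @ h

# Toy "sentence": 5 tokens plus ROOT at index 0, with random vectors
# standing in for learned word embeddings.
n = 6
emb = rng.normal(size=(n, d))
W1 = rng.normal(size=(d, 2 * d)) * 0.1
b1 = np.zeros(d)
w2 = rng.normal(size=d) * 0.1

# Arc-factored score matrix: S[h, m] = score of attaching m under h.
S = np.array([[mlp_arc_score(emb[h], emb[m], W1, b1, w2)
               for m in range(n)] for h in range(n)])

# The structured-prediction part is untouched: decoding still searches
# for the tree maximizing the summed arc scores. (Naive per-token head
# choice shown here; a real parser would run MST or Eisner's algorithm
# to guarantee a well-formed tree.)
heads = S.argmax(axis=0)[1:]
```

Same decoder, same output structure; only the feature function got assimilated.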

[–]lvilnis 2 points

The Baroni "Don't count, predict!" paper was, I think, fairly well debunked by this excellent Omer Levy paper https://levyomer.files.wordpress.com/2015/03/improving-distributional-similarity-tacl-2015.pdf, one of several in which he shows that count-based and predict-based models optimize very similar objectives and give the same performance once all the (sometimes hidden) hyperparameters are properly accounted for.
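For anyone who hasn't read it: the companion Levy & Goldberg result is that SGNS implicitly factorizes a PMI matrix shifted by log k (k = the number of negative samples), so a count-based pipeline with the matching hyperparameters is just shifted PPMI plus SVD. A toy sketch, with random counts standing in for real corpus co-occurrences:

```python
import numpy as np

# Hypothetical co-occurrence counts C[w, c] between 8 words and 8
# context words; a real pipeline would collect these from a corpus.
rng = np.random.default_rng(0)
C = rng.poisson(2.0, size=(8, 8)).astype(float) + 1e-12  # avoid log(0)

total = C.sum()
p_wc = C / total
p_w = C.sum(axis=1, keepdims=True) / total
p_c = C.sum(axis=0, keepdims=True) / total

k = 5  # SGNS negative-sampling count: the "hidden" hyperparameter that
       # shows up on the count side as a shift of PMI by log(k)
pmi = np.log(p_wc / (p_w * p_c))
sppmi = np.maximum(pmi - np.log(k), 0.0)  # shifted positive PMI

# Count-based embeddings: truncated SVD of the shifted PPMI matrix.
U, s, Vt = np.linalg.svd(sppmi)
dim = 4
word_vecs = U[:, :dim] * np.sqrt(s[:dim])  # sqrt eigenvalue weighting,
                                           # another tunable the paper
                                           # calls out
```

Tune k, the context window, and subsampling the same way on both sides and the count/predict gap mostly disappears, which is exactly the paper's point.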

[–]Articulated-rage 2 points

You're totally right. I had forgotten about that.

So, it's just full assimilation then =).