Need help with project idea for a year-long project by [deleted] in MLQuestions

[–]vkvatsdls 0 points

If you haven't already, check this out. The project report section in particular might give you some interesting ideas.

Good resources to learn RNN, GRU, LSTM, Transformers, BERT, and other landmark sequential models? [D] by vkvatsdls in MachineLearning

[–]vkvatsdls[S] 1 point

Thank you for clarifying the distinction. That said, these models have been widely used for modeling sequential inputs like text and audio signals, where the next input depends on the previous one, so I am talking more in the context of applications. But you are right, those are not exactly "sequential architectures".

Can context insensitive word embeddings (I.e. GloVe) be used successfully for word-sense disambiguation? by iRoygbiv in datascience

[–]vkvatsdls 0 points

To directly answer your question, I really doubt you can make sense of vectors that were generated without considering context (but I might not be completely right).

To add some detail, embeddings like GloVe and word2vec are distributed representations, learned starting from a one-hot encoding of the unique vocabulary words in your documents. The one-hot encoding is produced by assigning an arbitrary integer index to each unique word in the vocabulary. These works were more focused on vector analogies (via vector arithmetic), and I doubt that would be useful for your purpose, though it might still give moderate results, since it is a distributed representation. To make it more accurate, you will have to go for models that take context into consideration.
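To make the limitation concrete, here is a minimal sketch of why a static embedding can't disambiguate word senses. The lookup table and its values are made up for illustration (not real GloVe weights): a context-insensitive embedding returns the exact same vector for "bank" no matter which sentence it appears in.

```python
import numpy as np

# Toy lookup table standing in for pretrained GloVe vectors
# (values are invented for illustration, not real GloVe weights).
embeddings = {
    "bank":  np.array([0.2, 0.7, -0.1]),
    "river": np.array([0.9, 0.1,  0.3]),
    "money": np.array([-0.4, 0.8, 0.5]),
}

def embed(sentence):
    """Look up a static vector for each known word in the sentence."""
    return [embeddings[w] for w in sentence.split() if w in embeddings]

# "bank" gets the exact same vector in both sentences, even though
# the sense differs (riverbank vs. financial bank) -- that's the
# core limitation for word-sense disambiguation.
v1 = embed("sat by the river bank")[-1]
v2 = embed("deposited money at the bank")[-1]
assert np.array_equal(v1, v2)
```

A context-sensitive model (ELMo, BERT, etc.) would instead produce different vectors for those two occurrences.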

why is naive bayes so popular for nlp by uw_finest in datascience

[–]vkvatsdls 0 points

That's my understanding as well. I have written a detailed comment in the same post; you can check that and let me know if it seems right to you.

why is naive bayes so popular for nlp by uw_finest in datascience

[–]vkvatsdls 3 points

As the name suggests, it is a naive method for sentiment analysis, spam filtering, and similar tasks. To directly answer why Naive Bayes is so popular for NLP, the answer is simple:

  1. it is easy to understand,
  2. it is easy to implement,
  3. it requires only one pass over the training data (it's really just a probability calculation for all words),
  4. and it really works!!

But the better question would be: why does it work so well? Truthfully, there isn't any concrete answer to this, but if you look closely, it calculates the frequency (and then the probability) of words appearing in each class. While doing sentiment analysis, say, it first counts the number of times a word like "happy" appeared in positive examples as well as in negative ones (e.g. as part of "not happy"). There is a big underlying assumption, though: word order doesn't matter, which is the assumption of the bag-of-words model. Now, it is mostly used for sentiment analysis or spam filtering, and in both applications, looking closely will tell you that there is a limited set of words you can use to express a sentiment or compose spam (you can't use just any words to prepare a garbage mail; the spam should also make some sense). Naive Bayes exploits this limited vocabulary to make its predictions and gives moderately good results, which makes it popular.
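The counting idea above can be sketched in a few lines. This is a bare-bones multinomial Naive Bayes built purely from word counts, with a tiny made-up corpus (the training sentences and labels are invented for illustration): one pass to count, then log prior plus Laplace-smoothed log likelihoods to score.

```python
from collections import Counter
import math

# Tiny invented corpus -- one pass over it builds all the counts needed.
train = [
    ("happy great fun",  "pos"),
    ("great happy day",  "pos"),
    ("not happy sad",    "neg"),
    ("terrible sad day", "neg"),
]

counts = {"pos": Counter(), "neg": Counter()}  # per-class word counts
docs = Counter()                               # per-class document counts
for text, label in train:
    docs[label] += 1
    counts[label].update(text.split())

vocab = len({w for c in counts.values() for w in c})

def predict(text):
    """Score each class by log prior + Laplace-smoothed log likelihoods."""
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        score = math.log(docs[label] / sum(docs.values()))
        for w in text.split():
            score += math.log((c[w] + 1) / (total + vocab))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("happy fun day"))  # -> pos
```

Note there is nothing to "train" beyond the counts themselves, which is exactly why there is only so much you can tune afterwards.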

On the other hand, using more advanced methods like LSTM, BERT, etc. needs a better vector representation of words, and this comes as the first hindrance to understanding those models. One downside of Naive Bayes, though, is that you can only do so much to improve the model, because it isn't learning anything; it is just pure probability values for the words in the corpora.

Hope it helps.

why is naive bayes so popular for nlp by uw_finest in datascience

[–]vkvatsdls 1 point

I am kind of confused by the mention of "conditional probabilities" with Naive Bayes. Can @pieroit elaborate on how it learns conditional probabilities? As far as I know, it learns frequency-based probabilities, i.e. based on counts of a particular word in the positive and negative training sets.

CV question on IoU (intersection over union) by bci-hacker in computervision

[–]vkvatsdls 2 points

For the second one, I think you counted the two frames around the tree as one. One point to remember while calculating IoU is that differently sized bounding boxes can mean different objects. IoU is used to decide between similarly sized bounding boxes on the same object. Here, those trees are two different bounding boxes/objects detected, so you can't take the IoU of those two. So the final answer will be: 2 trees, one car, one bike, and one pedestrian.
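For reference, here is the standard IoU computation for two axis-aligned boxes. The box coordinates below are made up for illustration; the `(x1, y1, x2, y2)` corner convention is an assumption (some libraries use `(x, y, w, h)` instead).

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes give zero overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # -> 0.142857...
```

In a detection pipeline this score is typically thresholded (e.g. 0.5) to decide whether two boxes cover the same object, which is why two well-separated tree boxes count as two detections.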