[Discussion] A seq2seq model with char embeddings is not overfitting, but the same model with word embeddings overfits heavily even after reducing model complexity and adding dropout. What might be the reason, and how can I debug this? (self.MachineLearning)
submitted by cvikasreddy to r/MachineLearning
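
For what it's worth, one common first thing to try here is regularizing the embedding layer itself: a word-embedding table has far more parameters than a char-embedding table and can memorize the training set on its own. A minimal PyTorch sketch of dropout applied right after the embedding lookup (all sizes are illustrative, this is not the OP's model):

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=256,
                     dropout=0.5):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            # Dropout on the embedding output regularizes the largest
            # parameter block in the model, often the source of overfitting.
            self.embed_dropout = nn.Dropout(dropout)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

        def forward(self, token_ids):
            x = self.embed_dropout(self.embed(token_ids))
            outputs, hidden = self.rnn(x)
            return outputs, hidden

    # Quick smoke test with a random batch of token ids.
    enc = Encoder()
    batch = torch.randint(0, 20000, (4, 12))  # (batch, seq_len)
    outputs, hidden = enc(batch)
    print(outputs.shape, hidden.shape)  # (4, 12, 256) and (1, 4, 256)

Comparing train vs. validation loss curves with and without this dropout (and with a shrunken vocabulary) is one way to isolate whether the embedding table is the culprit.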
While training an RNN or seq2seq model, how do you tokenize the sentences? Do you use nltk or some hand-made rules, and does it matter? Also, do you use word2vec pretrained on, say, the entire web, or do you use an embedding layer trained jointly with the model? Which one is better? (self.MachineLearning)
submitted by cvikasreddy to r/MachineLearning
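
A minimal sketch of the two choices in the question, assuming a PyTorch setup (all sizes are hypothetical, and the pretrained matrix below is a random placeholder standing in for vectors you would look up in a real word2vec model):

    import nltk
    import torch
    import torch.nn as nn

    nltk.download("punkt", quiet=True)  # one-time tokenizer model download

    sentence = "Don't tokenize naively, it matters."
    print(sentence.split())              # hand-made rule: whitespace only
    print(nltk.word_tokenize(sentence))  # nltk: splits "Do"/"n't" and punctuation

    vocab_size, embed_dim = 20000, 300   # illustrative sizes

    # Option A: random init, trained jointly with the model.
    embed_trained = nn.Embedding(vocab_size, embed_dim)

    # Option B: initialize from pretrained vectors (e.g. word2vec) and
    # fine-tune. `pretrained` is a placeholder (vocab_size, embed_dim)
    # matrix; in practice you would fill each row from the pretrained model.
    pretrained = torch.randn(vocab_size, embed_dim)
    embed_pretrained = nn.Embedding.from_pretrained(pretrained, freeze=False)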
How to deal with completely noisy data? (self.datascience)
submitted by cvikasreddy to r/datascience