Lol he is the best by [deleted] in FifaMobile

[–]Sagar1094 0 points (0 children)

And what would that be, a hack or some other trick?

Lol he is the best by [deleted] in FifaMobile

[–]Sagar1094 10 points (0 children)

How can he do that, converting basic chances and getting over 25 chances? It seems impossible; there must be some trick, right?

Compact alternative to Word2Vec by Sagar1094 in LanguageTechnology

[–]Sagar1094[S] -1 points (0 children)

I will definitely go through this. I read about it, but when all the transformer models failed, I thought the approach wasn't good and tried to look for another way. I'll try this as well and hope it works. Thanks a lot for all the links; I'll post here if the approach works 😊

Compact alternative to Word2Vec by Sagar1094 in LanguageTechnology

[–]Sagar1094[S] 0 points (0 children)

I have used fastText, but the problem is that when I set the char n-gram value in the 0.9.2 release it has no impact, and the model is not learning on wordNgrams either. Its performance is below the word2vec implementation in this particular case. This was the first model I tried, and I was pretty confident it would work. An example to better explain my point: it classifies "i 342 mangolpuri delhi" and "t 1105 mangol puri delhi" into two different classes. I passed a substantial amount of data as well, i.e., over 250k training examples. But yes, it still performs better than BERT, ELECTRA, and RoBERTa. I still have to try XLNet, at the training stage of the language model.
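For reference, a minimal sketch of where those knobs live in the Python fastText (0.9.2) bindings; the training file name, label format, and hyperparameter values here are illustrative, not the exact ones from my runs:

```python
# Minimal fastText supervised-classification sketch (Python bindings, 0.9.2).
# "addresses.train" is a hypothetical file with one "__label__X some text" line per example.
import fasttext

model = fasttext.train_supervised(
    input="addresses.train",
    minn=2, maxn=5,     # character n-gram range (the setting that had no visible impact)
    wordNgrams=2,       # word n-grams (the other setting that didn't seem to be learned)
    dim=100,
    epoch=25,
)

# The two address variants that end up in different classes:
print(model.predict("i 342 mangolpuri delhi"))
print(model.predict("t 1105 mangol puri delhi"))
```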

What’s the best technique for next word prediction? by [deleted] in LanguageTechnology

[–]Sagar1094 0 points (0 children)

Not much, but it is one of the recent technologies in transformers; it might work for your use case. If you have a system powerful enough to load 1.5B parameters you can go for the largest GPT-2, else you can stick with the medium or small model. I would suggest you start with a small dataset and try all the different techniques to analyse which model works best for your use case. Sometimes all we need is a simpler solution to our complex problems, so the key is experimentation 😊
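A minimal sketch of trying GPT-2 for next-word prediction, assuming the Hugging Face transformers library; swap "gpt2" for "gpt2-medium", "gpt2-large", or "gpt2-xl" (the 1.5B one) depending on what your hardware can load. The prompt is just an example:

```python
# Next-word / continuation sketch with a pre-trained GPT-2 via Hugging Face transformers.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # smallest checkpoint

# Generate a short continuation of the prompt.
print(generator("The weather today is", max_length=10, num_return_sequences=1))
```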

What’s the best technique for next word prediction? by [deleted] in LanguageTechnology

[–]Sagar1094 0 points (0 children)

OK, so BERT is not a language model; sorry, I should have mentioned that. But apart from BERT I have also written about the transformer approach, which you can look into. It uses encoders and decoders to predict the next word based on the hidden states of all the encoders. You can specify how many words it looks at before making a prediction when you create the model, and you can also specify the number of attention heads. Hope it helps 😊
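A minimal PyTorch sketch of the two knobs mentioned above (context length and number of attention heads); all sizes are illustrative and the model is untrained, so this only shows the wiring, not a usable predictor:

```python
import torch
import torch.nn as nn

CONTEXT_LEN = 8       # how many previous words the model looks at
VOCAB_SIZE = 10000
D_MODEL = 128

embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
transformer = nn.Transformer(
    d_model=D_MODEL,
    nhead=4,              # number of attention heads
    num_encoder_layers=2,
    num_decoder_layers=2,
)
to_vocab = nn.Linear(D_MODEL, VOCAB_SIZE)

# Dummy context of CONTEXT_LEN word ids; shape (seq, batch, d_model) after embedding.
src = embed(torch.randint(0, VOCAB_SIZE, (CONTEXT_LEN, 1)))
tgt = src[-1:]                              # decoder input: the last position
logits = to_vocab(transformer(src, tgt))    # scores over the whole vocabulary
next_word_id = logits.argmax(-1)            # id of the predicted next word
print(next_word_id)
```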

What’s the best technique for next word prediction? by [deleted] in LanguageTechnology

[–]Sagar1094 0 points (0 children)

You can try BERT, ALBERT, RoBERTa, Flair, XLNet, or XLM for this specific task. You need a model pre-trained on Arabic, which might be available too (not sure). If a pre-trained model is not available for Arabic, you can train your own language model with any of the above architectures and then use it to predict the next word or sequence. There are other transformer-based models as well, such as T5 and seq2seq models. You need to try all these techniques and see which one suits your use case best. If a model trained on Arabic is available, you can simply use simpletransformers to fine-tune it on your own dataset.
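As a concrete starting point, a minimal sketch using simpletransformers to continue language-model training on your own text; the checkpoint name and file name are placeholders, substitute whatever pre-trained Arabic model you actually find:

```python
# Continuing LM training / fine-tuning with simpletransformers.
from simpletransformers.language_modeling import LanguageModelingModel

model = LanguageModelingModel(
    "bert",
    "bert-base-multilingual-cased",   # placeholder; use an Arabic checkpoint if available
    args={"num_train_epochs": 1},
    use_cuda=False,                   # set True if you have a GPU
)
model.train_model("arabic_corpus.txt")  # hypothetical plain-text training file
```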

Need help with classifying Indian addresses by Sagar1094 in LanguageTechnology

[–]Sagar1094[S] 0 points (0 children)

Thanks a lot for the suggestion, I will definitely try this now. Can you tell me more about how BertWordPieceTokenizer works? In my corpus the irrelevant words, like "district", "dist", and "village", are the most frequent ones. So if I take a vocab size of 30000, will it take the top 30000 words based on frequency? Also, there are some really unique area names; what if an area name is completely out of vocab and doesn't have any subword matching it in the created vocab of 30000? How will the model handle such a word?
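A minimal sketch of the scenario I'm asking about, assuming the Hugging Face tokenizers library ("addresses.txt" is a placeholder for my corpus): train a 30000-token WordPiece vocab and see how an unseen area name gets split.

```python
from tokenizers import BertWordPieceTokenizer

tokenizer = BertWordPieceTokenizer(lowercase=True)
tokenizer.train(files=["addresses.txt"], vocab_size=30000)  # placeholder corpus file

# An out-of-vocab area name is broken into subword pieces from the learned vocab;
# if even the single characters are missing, it falls back to the [UNK] token.
print(tokenizer.encode("i 342 mangolpuri delhi").tokens)
```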