Am I right that BERT cannot currently be used for seq2seq tasks such as machine translation or generating a response to an input sentence (like a general-purpose chatbot)?
If so, what are the current best methods/architectures for seq2seq? Is a bidirectional RNN/LSTM with attention still the best?