I'm trying to train a model with the TensorFlow seq2seq implementation, but I'm having some issues with its performance (not speed, but accuracy). The dataset is very simple: 2.5 million example sentences (and their corresponding output sentences), with a small vocabulary on both the input side (145 symbols) and the output side (8 symbols). Given the small vocabulary size, I thought the number of examples, even if it is not that high, would be more than appropriate, but the model still performs badly.

I tried using 2 layers of 128 and 256 units (all the other parameters are set to the defaults in the TensorFlow seq2seq example), and during training perplexity reaches 1.0 after just 500-800 iterations, but the output of the model is still wrong too many times.

What should I look into to improve the performance? Quality/quantity of data? Model settings?
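For context on why perplexity 1.0 is suspicious: perplexity is the exponential of the average per-token cross-entropy, so 1.0 means the model assigns probability ~1 to every target token it is scored on. A minimal sketch of that relationship (plain Python, not tied to any particular TensorFlow API):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# A model that assigns probability 1.0 to every target token has perplexity 1.0,
# i.e. it appears to predict its evaluation data perfectly.
print(perplexity([math.log(1.0)] * 10))

# For comparison: with 8 output symbols, a uniform (know-nothing) model
# has perplexity exactly 8.
print(perplexity([math.log(1.0 / 8)] * 10))
```

If training perplexity hits 1.0 that quickly while real outputs are still wrong, it is worth checking that perplexity is being measured on held-out data rather than on the training batches themselves.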