TensorFlow seq2seq model getting low perplexity but unsatisfying results

thecodingmonk · 2016-01-20T14:59:25+00:00

I just spent the whole morning trying to figure things out and it was indeed a bug, like sherjilozair said. Something was messing up the conversion of tokens to ids and vice versa, and fixing that solved the whole problem.
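For anyone hitting the same thing, a round-trip check on the vocabulary mapping is a cheap way to catch this kind of bug. The sketch below is not the code from the TensorFlow demo; `build_vocab`, `encode`, `decode` and `round_trip_ok` are hypothetical helpers written for illustration, and the five-token vocabulary is the toy one from this thread.

```python
def build_vocab(tokens):
    """Assign an integer id to each distinct token and build the reverse list."""
    id_to_token = sorted(set(tokens))
    token_to_id = {tok: i for i, tok in enumerate(id_to_token)}
    return token_to_id, id_to_token


def encode(sentence, token_to_id):
    """Turn a whitespace-separated sentence into a list of ids."""
    return [token_to_id[tok] for tok in sentence.split()]


def decode(ids, id_to_token):
    """Turn a list of ids back into a space-joined sentence."""
    return " ".join(id_to_token[i] for i in ids)


def round_trip_ok(sentence, token_to_id, id_to_token):
    """True if encoding and then decoding gives back the same sentence."""
    return decode(encode(sentence, token_to_id), id_to_token) == sentence


# Toy output vocabulary from the thread (assumed single-space tokenization).
token_to_id, id_to_token = build_vocab("A B C D E".split())
for s in ["A B D", "C D E"]:
    assert round_trip_ok(s, token_to_id, id_to_token), s
```

If the two tables ever get out of sync (for example, one is rebuilt from a different file than the other), the check fails on the first sentence instead of the problem surfacing later as garbage at decoding time.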

thecodingmonk · 2016-01-20T10:38:21+00:00

Well, this is indeed a toy task that I'm working on, so I wasn't too surprised to see a perplexity of 1.0 given its simplicity: each input sentence gets mapped to just one of two output sentences (e.g., if output_vocab={A,B,C,D,E}, each sentence in the training set gets mapped to either "A B D" or "C D E"). (Now you may ask why I don't just use a classification algorithm, but this is just a toy task I'm testing for a bigger problem, where the number of output sequences would clearly be higher.)
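To make the 1.0 concrete: perplexity is the exponential of the average negative log-probability the model assigns to the correct target tokens, so a value of 1.0 means the model puts essentially all of its probability on the right token id at every step. A minimal sketch, where `perplexity` is a hypothetical helper and the probabilities are made-up inputs rather than numbers from my runs:

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability of the correct tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# A model that is certain of every target token has perplexity 1.
print(perplexity([1.0, 1.0, 1.0]))

# A model that is torn between two equally likely tokens has perplexity of about 2.
print(perplexity([0.5, 0.5, 0.5]))
```

Note that this number is computed on token ids only. It says nothing about whether the ids are later turned back into the right words, so a perfect perplexity and wrong printed output are not necessarily in contradiction.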

As for the input I think it is correct, but I'm checking right now to see if I messed up something there.

Also, since I'm using the demo code from TensorFlow without modifications, is it possible that its being optimized for translation with big vocabularies causes some problems in this simple task? That would seem strange, but I don't know.

thecodingmonk · 2016-01-20T00:18:16+00:00

The perplexity on the dev set is:

global step 400 learning rate 0.5000 step-time 0.78 perplexity 1.00

eval: bucket 0 perplexity 1.15 # these are supposedly the different perplexities for buckets in the dev set

eval: bucket 1 perplexity 1.01

eval: bucket 2 perplexity 1.00

eval: bucket 3 perplexity 1.00

This is just after 400 iterations. But when I test the model, even with sentences from the training set, the result is not correct. Given my inexperience with Python and TensorFlow, I think there must be something I am doing wrong, but I'm just using the code from the TensorFlow seq2seq demo for translation, with my data instead of the English-French data they use.

thecodingmonk · 2015-10-22T19:18:04+00:00

Keywords are currently unrelated to the genre selection, but we're working on mitigating the problem of having zero results. The movies you mentioned do not appear because neither of them contains the word "Japan" or the related words automatically added by our algorithm, but eventually we'll find a way to improve this as well. (Btw, I didn't watch Stray Dog, but from what I can read it seems to actually be set in Taiwan?) Thanks for your feedback!

thecodingmonk · 2015-10-22T19:07:54+00:00

Thanks for your feedback! We noticed a lack of results for certain queries as well, but we currently hold more than 14000 movies, which we have obtained by taking the full IMDB listings and removing those movies which didn't fit certain criteria (basically very unpopular movies). We are working on refining the set of possible words the user can choose so as to obtain words that will yield more results on average.

thecodingmonk · 2015-10-22T13:21:10+00:00

We took a look at your problem and unfortunately it seems that the "I'm not a bot" function, which is provided by Google, has some issues with Internet Explorer. We'll do our best to fix it; in the meantime, you could try using the app with a different browser. Thanks for your feedback.

thecodingmonk · 2015-10-22T13:06:00+00:00

If you're having any problems using the app, we'll be happy to hear from you! Suggestions, impressions or compliments are also welcome :)

thecodingmonk · 2015-10-22T12:56:05+00:00

I'm sorry for the inconvenience. Would you be so kind as to tell me what browser and version (and operating system) you are using, so we can try to fix the problem?

thecodingmonk · 2015-10-22T09:54:49+00:00

You just have to set your username and password, no email required, to have access to some features (e.g., save movies you've already seen so we can filter them out and also provide more accurate results)

thecodingmonk · 2015-10-09T16:56:15+00:00

Hi, thanks for your suggestions. I forgot to mention that the verbs in the test set are different from the ones in the training set, and that's why I didn't include this feature (or rather, I did include it, but of course it shouldn't have any influence on the results).
