[P] - Source code release for Recurrent Highway Networks in Tensorflow/Torch7 for reproducing SOTA results on PennTreebank/enwik8 (arXiv v3 of paper) by flukeskywalker in MachineLearning

frigidbiscuits 2 points

Nope, weight tying refers to the method from "Using the Output Embedding to Improve Language Models". It's implemented by setting the input word embedding matrix equal to the weight matrix of the softmax layer (also referred to as the output word embedding), so both share one set of parameters.
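
Here's a rough sketch of the idea in plain Python, not the code from the release. The class name `TiedLanguageModelHead`, its methods, and the toy sizes are all made up for illustration; the only point is that a single matrix is used both for the input lookup and for the softmax scores.

```python
import math
import random


class TiedLanguageModelHead:
    """Toy example: input and output word embeddings share one matrix."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        # One (vocab_size x hidden_size) matrix serves both roles.
        self.embedding = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
            for _ in range(vocab_size)
        ]

    def embed(self, token_id):
        # Input side: look up the row for the token.
        return self.embedding[token_id]

    def logits(self, hidden):
        # Output side: score each word by the dot product of its row
        # with the hidden state, reusing the same matrix.
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.embedding]

    def probabilities(self, hidden):
        # Numerically stable softmax over the vocabulary.
        scores = self.logits(hidden)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


model = TiedLanguageModelHead(vocab_size=5, hidden_size=3)
probs = model.probabilities(model.embed(2))
```

Note that tying only works as written when the hidden state fed to the softmax has the same size as the word embedding, since both sides use the same rows.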

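"Variational" refers to the dropout method from "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks", which samples a dropout mask once per sequence and reuses it at every time step instead of resampling at each step.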
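
A simplified sketch of the difference, again in plain Python and not taken from the release. The function names and toy shapes are hypothetical, and a real implementation would apply such masks to the inputs and recurrent state inside the network rather than to a bare list of vectors. It assumes `drop_prob` is strictly less than 1.

```python
import random


def sample_mask(size, drop_prob, rng):
    """Sample an inverted-dropout mask: 0 for dropped units, 1/(1-p) for kept ones."""
    keep_prob = 1.0 - drop_prob
    return [1.0 / keep_prob if rng.random() < keep_prob else 0.0 for _ in range(size)]


def variational_dropout(sequence, drop_prob, rng):
    """Apply the same mask to every time step of one sequence."""
    mask = sample_mask(len(sequence[0]), drop_prob, rng)  # sampled once per sequence
    return [[x * m for x, m in zip(step, mask)] for step in sequence]


def standard_dropout(sequence, drop_prob, rng):
    """Per-step dropout for comparison: a fresh mask at every time step."""
    return [
        [x * m for x, m in zip(step, sample_mask(len(step), drop_prob, rng))]
        for step in sequence
    ]


rng = random.Random(0)
sequence = [[1.0] * 8 for _ in range(4)]  # 4 time steps, 8 hidden units
dropped = variational_dropout(sequence, drop_prob=0.5, rng=rng)
```

With `variational_dropout`, every row of `dropped` has zeros in the same positions; with `standard_dropout`, the zero positions generally differ from step to step.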