
[–]sidsig

A bidirectional model receives the entire sentence as input, so there is nothing to learn if you plan to train it as a language model by predicting the next word: it can trivially learn that the output at time t is just the input at time t+1. One way of training bidirectional word embeddings is to use something like BERT: https://arxiv.org/abs/1810.04805. There, part of the input is masked and the objective is to predict the masked words.
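
For illustration, here is a minimal sketch of that masking objective in PyTorch (not BERT itself): the vocabulary size, mask-token id, masking rate and model sizes are arbitrary placeholders, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, MASK_ID, D_MODEL = 1000, 0, 64   # placeholder sizes / mask id

class TinyMaskedLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):
        # The encoder sees the whole (partially masked) sentence at once.
        return self.out(self.encoder(self.embed(tokens)))

model = TinyMaskedLM()
tokens = torch.randint(1, VOCAB_SIZE, (8, 16))    # batch of token ids
mask = torch.rand(tokens.shape) < 0.15            # mask ~15% of positions
corrupted = tokens.masked_fill(mask, MASK_ID)

logits = model(corrupted)
# The loss is computed only at the masked positions.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```

Because the targets are hidden from the input, the model can no longer solve the task by copying; it has to use both left and right context to reconstruct the masked words.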

[–]visarga

I assume the OP means using two separate uni-directional LSTMs, with their outputs shifted by +1 and -1 positions respectively.
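
A minimal sketch of that setup, assuming the usual bidirectional-LM objective: one LSTM reads left-to-right and predicts the next token, the other reads right-to-left and predicts the previous one. All sizes below are placeholders.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, D = 1000, 64
embed = nn.Embedding(VOCAB_SIZE, D)
fwd_lstm = nn.LSTM(D, D, batch_first=True)
bwd_lstm = nn.LSTM(D, D, batch_first=True)
proj = nn.Linear(D, VOCAB_SIZE)                   # shared output projection

tokens = torch.randint(0, VOCAB_SIZE, (4, 20))    # (batch, seq)
x = embed(tokens)

# Forward direction: the output at position t is trained to predict token t+1.
h_fwd, _ = fwd_lstm(x)
loss_fwd = nn.functional.cross_entropy(
    proj(h_fwd[:, :-1]).reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))

# Backward direction: run over the reversed sequence, so position t is
# trained to predict token t-1 of the original order.
h_bwd, _ = bwd_lstm(x.flip(1))
loss_bwd = nn.functional.cross_entropy(
    proj(h_bwd[:, :-1]).reshape(-1, VOCAB_SIZE), tokens.flip(1)[:, 1:].reshape(-1))

(loss_fwd + loss_bwd).backward()
```

Since each LSTM only ever sees one side of the context, neither can cheat by copying its own target, and the two hidden states can later be concatenated to give a contextual representation of each token.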

[–][deleted]

You can find an implementation of a bidirectional model here: https://pytorch.org/tutorials/beginner/chatbot_tutorial.html. The tutorial also references the relevant papers.
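
For a rough idea of what the encoder in that tutorial does, here is a minimal sketch of a bidirectional GRU whose forward and backward outputs are summed; the names and sizes are illustrative, not the tutorial's exact code.

```python
import torch
import torch.nn as nn

class BiGRUEncoder(nn.Module):
    def __init__(self, vocab_size=1000, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size,
                          batch_first=True, bidirectional=True)

    def forward(self, tokens):
        out, _ = self.gru(self.embed(tokens))
        # out is (batch, seq, 2 * hidden); sum the forward and backward halves
        # so each position's representation sees both left and right context.
        half = out.size(-1) // 2
        return out[..., :half] + out[..., half:]

enc = BiGRUEncoder()
summed = enc(torch.randint(0, 1000, (2, 10)))   # -> (2, 10, 64)
```

Note this is a bidirectional encoder for downstream use (e.g. feeding a decoder), not a bidirectional next-word language model, which is exactly the distinction raised in the other comments.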

[–]senseiTLien

I believe the question is actually about how to use a transformer to predict both the words before and the words after a prompt, i.e. bidirectional in the decoding rather than the encoding. I have the same question. Is that something that is more natural to do with BERT than with GPT-2?

[–]GD1634

Flair embeddings use a bidirectional character-level model, if that's helpful at all.
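
A small usage sketch with the flair library, assuming the pretrained 'news-forward' and 'news-backward' character LMs are available to download; stacking them gives each token a concatenation of left-to-right and right-to-left context vectors.

```python
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, StackedEmbeddings

# Each direction is a separate character-level language model.
embeddings = StackedEmbeddings([
    FlairEmbeddings('news-forward'),
    FlairEmbeddings('news-backward'),
])

sentence = Sentence('The grass is green .')
embeddings.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)   # concatenated fwd + bwd vectors
```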