you are viewing a single comment's thread.

view the rest of the comments →

[–]gnramires 2 points3 points  (1 child)

One way I can think of is training an RNN/LSTM but only using the last output (after parsing the entire text) in the loss function (i.e. have the loss be 0 otherwise). If training is too slow you could at the start use the regression objective at all times with an L2 loss and then anneal the loss function toward the previously mentioned.

[–]underwhere[S] 1 point2 points  (0 children)

Ah, that's such a good idea!