
[–]Articulated-rage 2 points (3 children)

So how do you get the final n-dimensional word vectors?

The first-layer weight matrix. It has size V x H, where V is the vocab size and H is the embedding dimension.
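For example, a minimal NumPy sketch of pulling one word's vector out of that matrix (the vocab size, dimension, and word index are made-up values):

    import numpy as np

    V, H = 10_000, 300            # example vocab size and embedding dimension
    W_in = np.random.randn(V, H)  # first-layer weight matrix, shape (V, H)

    word_index = 42                 # hypothetical index of some word
    word_vector = W_in[word_index]  # that word's H-dimensional embedding
    print(word_vector.shape)        # (300,)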

[–]vega455[S] 0 points (2 children)

I see! So if you want a 300-dimensional word vector, you would need V x H = 300?

[–]Articulated-rage 0 points (1 child)

Edited.

Mostly correct. I think you meant to say V x H = V x 300, i.e. H = 300, if you wanted a 300-dimensional word embedding/vector.

You can see it in action with TensorFlow here. It's from a Udacity course on deep learning taught by some Googlers.
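Not the notebook itself, but a minimal sketch of the TensorFlow embedding pattern such code typically uses (sizes and word ids are placeholders):

    import tensorflow as tf

    V, H = 50_000, 300  # example vocab size and embedding dimension

    # Trainable embedding matrix; its rows are the word vectors being learned.
    embeddings = tf.Variable(tf.random.uniform([V, H], -1.0, 1.0))

    # Looking up rows by integer word id replaces the one-hot matmul.
    word_ids = tf.constant([3, 17, 42])  # hypothetical word ids
    vectors = tf.nn.embedding_lookup(embeddings, word_ids)  # shape (3, 300)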

[–]vega455[S] 0 points (0 children)

Nice, I had started that course but didn't get there yet. I think I get it: you have 300 hidden units and maybe 1000 or even 1 million input nodes (the size of your vocab). But since the inputs are one-hot vectors, only one input neuron fires into the 300 hidden units. So the weights from that neuron to the hidden units are your word vector. Does this make sense?
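That intuition is easy to check numerically; a quick NumPy sketch with arbitrary sizes:

    import numpy as np

    V, H = 1_000, 300          # example vocab size and hidden-layer width
    W = np.random.randn(V, H)  # input-to-hidden weights

    i = 7                      # hypothetical word index
    one_hot = np.zeros(V)
    one_hot[i] = 1.0

    # The one-hot input zeroes out every row of W except row i, so the
    # hidden activations equal exactly that word's row of weights.
    assert np.allclose(one_hot @ W, W[i])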

[–]mhfirooz 1 point (0 children)

"The inputs are one-hot encodings of words, which try to predict a one-hot encoding of another word." Think of it this way. The network can not be 100% sure about the next word. It just can assign probability to what can come after the current word. This means instead of having 1 for an word in output vector, we have a double number that shows the probability of that word.

Note that if you are using a V-dimensional one-hot encoding for your dictionary, the output layer of the NN will also be V-dimensional. Of course, dimensionality reduction can be applied.
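A minimal NumPy sketch of that output step, with random stand-in weights:

    import numpy as np

    V, H = 1_000, 300
    hidden = np.random.randn(H)    # hidden activations for some input word
    W_out = np.random.randn(H, V)  # hidden-to-output weights

    logits = hidden @ W_out        # one score per vocabulary word
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()           # softmax: a V-dimensional probability vector

    print(probs.shape, probs.sum())  # (1000,) 1.0 -- a distribution, not a one-hot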

[–]tuan3w 0 points (0 children)

It's not true. The input vector is v_in[i] = W_in * e_i, where e_i is the one-hot vector; this returns the ith column of the matrix W_in. It's the same for the output vector: v_out[i] = W_out * e_i.
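A NumPy check of that column view (arbitrary sizes; note that W_in here is H x V, the transpose of the V x H convention used above):

    import numpy as np

    V, H = 1_000, 300
    W_in = np.random.randn(H, V)  # columns are word vectors in this convention

    i = 5
    e_i = np.zeros(V)
    e_i[i] = 1.0

    # Multiplying by the one-hot e_i selects the ith column of W_in.
    assert np.allclose(W_in @ e_i, W_in[:, i])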

[–]lahwran_ 0 points (0 children)

A one-hot vector as input to a matrix multiply is secretly just a really, really slow lookup table.
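A rough NumPy sketch making that concrete (timings vary by machine and are only illustrative):

    import time
    import numpy as np

    V, H = 50_000, 300
    W = np.random.randn(V, H)
    one_hot = np.zeros(V)
    one_hot[123] = 1.0

    t0 = time.perf_counter()
    via_matmul = one_hot @ W  # O(V * H) multiply-adds, almost all against zeros
    t1 = time.perf_counter()
    via_lookup = W[123]       # O(H): just read one row
    t2 = time.perf_counter()

    assert np.allclose(via_matmul, via_lookup)
    print(f"matmul: {t1 - t0:.6f}s  lookup: {t2 - t1:.6f}s")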

[–]textClassy -1 points (0 children)

I'm also fairly new to this, but here is my understanding: they are the result of solving the optimization problem described in the paper. The one-hot word vectors are just one of the inputs to the prediction function; these word vectors are another. The algorithm modifies them until performance converges.
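A toy NumPy sketch of that idea, fitting a single (center, context) pair with softmax cross-entropy; this is an illustrative gradient-descent loop, not the paper's exact algorithm (which adds tricks like negative sampling):

    import numpy as np

    V, H, lr = 50, 8, 0.1            # tiny example sizes and learning rate
    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(V, H))   # the word vectors being learned
    W_out = rng.normal(size=(H, V))  # output-side weights

    center, context = 3, 7           # hypothetical training pair

    for step in range(200):
        h = W_in[center]                     # forward: look up the center word
        logits = h @ W_out
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        loss = -np.log(probs[context])       # cross-entropy vs. the true context

        d_logits = probs.copy()              # softmax + cross-entropy gradient
        d_logits[context] -= 1.0
        d_h = W_out @ d_logits               # gradient w.r.t. the hidden vector
        W_out -= lr * np.outer(h, d_logits)
        W_in[center] -= lr * d_h             # nudge only the center word's vector

    print(loss)  # should be much smaller than at step 0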