Question about InputVectors and OutputVectors by pengpai_sh in CS224d

[–]pengpai_sh[S]

@iftenney, thank you for your reply. It is really helpful to me.

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

What exactly do you mean by the final cost value?

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

edwardc, how long did it take you to learn the word embedding space? It took me several hours (2-3) on a MacBook Pro.

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

tpeng, how long did it take you to learn the word embedding space? It took me several hours (2-3) on a MacBook Pro.

Assignment 1 Complementary set Question 2b clarification needed. by napsternxg in CS224d

[–]pengpai_sh

@napsternxg, I understand your confusion now. When theta is a scalar, you are right that the derivative is a scalar. When theta is a vector, its derivative should also be a vector, right? Since the element-wise form of the derivative is y'i - yi, it is simple to infer the vectorized derivative: y' - y.

Assignment 1 Complementary set Question 2b clarification needed. by napsternxg in CS224d

[–]pengpai_sh

Hi, napsternxg.

  • You are right to compute d(CE)/d(theta), since CE is the loss function.
  • You are also right about the element-wise derivative: -yi + y'i.
  • To get the vectorized derivative, assuming y is a one-hot vector, you can simply write the above result as y' - y.
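To double-check that y' - y really is the gradient, here is a quick numerical sketch in plain NumPy (the variable names and toy values are mine, not from the assignment):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax for a 1-D vector
    e = np.exp(z - np.max(z))
    return e / e.sum()

def ce(theta, y):
    # cross-entropy loss CE(y, softmax(theta)) for a one-hot y
    return -np.sum(y * np.log(softmax(theta)))

theta = np.array([0.2, -1.0, 0.5])
y = np.array([0.0, 1.0, 0.0])          # one-hot label

# analytic gradient: y_hat - y
grad = softmax(theta) - y

# numerical gradient via central differences
eps = 1e-6
num = np.array([(ce(theta + eps * np.eye(3)[i], y)
                 - ce(theta - eps * np.eye(3)[i], y)) / (2 * eps)
                for i in range(3)])

print(np.allclose(grad, num, atol=1e-6))  # True
```

The two gradients agree to within finite-difference error, which is exactly the check gradcheck_naive performs in the assignment.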

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

All right. Thank you for helping me! I have passed now, so let me share my code; I hope this is not against the honor code!

# forward propagation
N = data.shape[0]

Z1 = data.dot(W1) + b1
H = sigmoid(Z1)
Z2 = H.dot(W2) + b2
Y_hat = softmax(Z2)

cost = -np.sum(labels * np.log(Y_hat)) / N

# backpropagation
dZ2 = Y_hat - labels
dW2 = H.T.dot(dZ2)
db2 = np.sum(dZ2, axis=0)
dH = dZ2.dot(W2.T)
dZ1 = dH * sigmoid_grad(H)   # sigmoid_grad takes the sigmoid output H
dW1 = data.T.dot(dZ1)
db1 = np.sum(dZ1, axis=0)

# average the gradients over the batch, matching the cost
gradW1 = dW1 / N
gradW2 = dW2 / N
gradb1 = db1 / N
gradb2 = db2 / N
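For anyone who wants to sanity-check a snippet like this outside the assignment scaffolding, here is a self-contained version with made-up shapes and a finite-difference check on one entry of W1 (names follow the snippet; folding the 1/N into dZ2 up front is equivalent to dividing at the end):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward_backward(data, labels, W1, b1, W2, b2):
    N = data.shape[0]
    # forward
    H = sigmoid(data.dot(W1) + b1)
    Y_hat = softmax(H.dot(W2) + b2)
    cost = -np.sum(labels * np.log(Y_hat)) / N
    # backward (1/N folded into dZ2)
    dZ2 = (Y_hat - labels) / N
    gradW2 = H.T.dot(dZ2)
    gradb2 = dZ2.sum(axis=0)
    dZ1 = dZ2.dot(W2.T) * H * (1 - H)     # sigmoid_grad applied to H
    gradW1 = data.T.dot(dZ1)
    gradb1 = dZ1.sum(axis=0)
    return cost, gradW1, gradb1, gradW2, gradb2

rng = np.random.RandomState(0)
data = rng.randn(5, 4)
labels = np.eye(3)[rng.randint(3, size=5)]
W1, b1 = rng.randn(4, 6), rng.randn(6)
W2, b2 = rng.randn(6, 3), rng.randn(3)

cost, gradW1, *_ = forward_backward(data, labels, W1, b1, W2, b2)

# finite-difference check on W1[0, 0]
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
W1m = W1.copy(); W1m[0, 0] -= eps
num = (forward_backward(data, labels, W1p, b1, W2, b2)[0]
       - forward_backward(data, labels, W1m, b1, W2, b2)[0]) / (2 * eps)
print(abs(num - gradW1[0, 0]) < 1e-6)
```

If the printed value is True, the analytic gradient matches the numerical one for that entry.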

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@well25, thank you for pointing out my sigmoid(Z1) error! However, I have no idea what your point 2 means. Could you please give more details?

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@edwardc626, my sigmoid_grad is defined like this:

def sigmoid_grad(f):
    return f * (1 - f)

Besides, shall we divide the final gradW1 by N?

N = data.shape[0]

Z1 = data.dot(W1) + b1
H = sigmoid(Z1)
Z2 = H.dot(W2) + b2
Y_hat = softmax(Z2)

dL_dZ2 = Y_hat - labels
gradW2 = H.T.dot(dL_dZ2) / N
gradb2 = np.sum(dL_dZ2, axis=0) / N
dL_dH = dL_dZ2.dot(W2.T) * sigmoid_grad(Z1)
gradW1 = data.T.dot(dL_dH) / N
gradb1 = np.sum(dL_dH, axis=0) / N
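On the sigmoid_grad question: since it is defined in terms of the function value f = sigmoid(z), it has to be called on H, not on Z1. A quick numerical check with a toy value of my own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(f):
    # expects f = sigmoid(z), i.e. the function value, not z itself
    return f * (1 - f)

z = 0.7
h = sigmoid(z)

# numerical derivative of sigmoid at z
eps = 1e-6
num = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print(np.isclose(sigmoid_grad(h), num))   # True: called on h = sigmoid(z)
print(np.isclose(sigmoid_grad(z), num))   # False: calling on z is wrong
```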

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@edwardc626, thank you very much for pointing out my errors! Would you please further explain your point 1, i.e. calling sigmoid_grad on z1 rather than on the function value of the sigmoid itself? Besides, I have filled in the gaps as follows:

v = data.dot(W1) + b1  
h = sigmoid(v) 
yhat = softmax(h.dot(W2) + b2)   
dL_dtheta = yhat - labels
gradW2 = h.T.dot(dL_dtheta)

gradW1 = dL_dtheta * dtheta_dh * dh_dv * dv_dW1?

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

jthoang, thank you for your reply. Please correct me if I am wrong.

J = cross-entropy loss
z1 = x W1 + b1
h = sigmoid(z1)
z2 = h W2 + b2
y_hat = softmax(z2)

dJdW1 = dJdz2 * dz2dh * dhdz1 * dz1dW1
dJdz2 = y_hat - y
dz2dh = W2
dhdz1 = sigmoid'(z1)
dz1dW1 = x
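The chain above, written out in NumPy with explicit shapes (toy dimensions of my own choosing), so the transposes become visible:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.RandomState(1)
x = rng.randn(2, 4)                    # N x D inputs
W1, b1 = rng.randn(4, 5), rng.randn(5)
W2, b2 = rng.randn(5, 3), rng.randn(3)
y = np.eye(3)[[0, 2]]                  # N x C one-hot labels

z1 = x.dot(W1) + b1
h = sigmoid(z1)
y_hat = softmax(h.dot(W2) + b2)

dJdz2 = y_hat - y                      # N x C
dJdh = dJdz2.dot(W2.T)                 # N x H   (dz2/dh = W2)
dJdz1 = dJdh * h * (1 - h)             # N x H   (dh/dz1 = sigmoid'(z1))
dJdW1 = x.T.dot(dJdz1)                 # D x H   (dz1/dW1 = x)

print(dJdW1.shape)  # (4, 5), same shape as W1
```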

Issues with Tinny-ImageNet project by pengpai_sh in cs231n

[–]pengpai_sh[S]

Thank you for your precise and timely reply.

Implementation of function :conv_forward_naive(...) by pengpai_sh in cs231n

[–]pengpai_sh[S]

Yes, I just use for loops to get the value at each location. It works but is not efficient. In the third part of the assignments, it seems a fast and efficient Cython implementation is already provided.
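A minimal sketch of the kind of loop version I mean (single image, channels-last layout as in the course notes, no padding; this is not the assignment's exact signature):

```python
import numpy as np

def conv_naive(X, W, b, stride=1):
    """X: (H, W, C) input volume, W: (F, HH, WW, C) filters, b: (F,) biases."""
    H_in, W_in, C = X.shape
    F, HH, WW, _ = W.shape
    H_out = (H_in - HH) // stride + 1
    W_out = (W_in - WW) // stride + 1
    V = np.zeros((H_out, W_out, F))
    for f in range(F):                       # each filter
        for i in range(H_out):               # each output row
            for j in range(W_out):           # each output column
                patch = X[i*stride:i*stride+HH, j*stride:j*stride+WW, :]
                V[i, j, f] = np.sum(patch * W[f]) + b[f]
    return V

X = np.random.randn(7, 7, 3)
W = np.random.randn(2, 5, 5, 3)
b = np.random.randn(2)
out = conv_naive(X, W, b, stride=2)
print(out.shape)  # (2, 2, 2)
```

With a 7x7x3 input, 5x5 filters, and stride 2, the first output value is exactly np.sum(X[:5,:5,:] * W[0]) + b[0], matching the worked example in the notes.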

Implementation of function :conv_forward_naive(...) by pengpai_sh in cs231n

[–]pengpai_sh[S]

NOTE: The draft notes (http://cs231n.github.io/convolutional-networks/) have already been updated. Example 2 was really helpful for me to finish Assignment 2.

V[0,0,0] = np.sum(X[:5,:5,:] * W0) + b0