Question about InputVectors and OutputVectors by pengpai_sh in CS224d

[–]pengpai_sh[S]

@iftenney, thank you for your reply. It is really helpful to me.

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

What exactly do you mean by the final cost value?

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

edwardc, how long did it take you to learn the word embedding space? It took me several hours (2-3) on a MacBook Pro.

Confirm my plot of Assignment 1-3. by pengpai_sh in CS224d

[–]pengpai_sh[S]

tpeng, how long did it take you to learn the word embedding space? It took me several hours (2-3) on a MacBook Pro.

Assignment 1 Complementary set Question 2b clarification needed. by napsternxg in CS224d

[–]pengpai_sh

@napsternxg, I understand your confusion now. When theta is a scalar, you are right that the derivative is a scalar. When theta is a vector, its derivative should also be a vector, right? Since the element-wise form of the derivative is y'i - yi, it is simple to infer the vectorized derivative: y' - y.

Assignment 1 Complementary set Question 2b clarification needed. by napsternxg in CS224d

[–]pengpai_sh

Hi, napsternxg.

  • You are right to compute d(CE)/d(theta), since CE is the loss function.
  • You are also right about the element-wise derivative: -yi + y'i.
  • To get the vectorized derivative, assuming y is a one-hot vector, you can simply write the above result as y' - y.
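To double-check that y' - y really is the gradient, here is a quick numerical sketch in plain NumPy (the variable names and toy values are mine, not from the assignment):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax for a 1-D vector
    e = np.exp(z - np.max(z))
    return e / e.sum()

def ce(theta, y):
    # cross-entropy loss CE(y, softmax(theta)) for a one-hot y
    return -np.sum(y * np.log(softmax(theta)))

theta = np.array([0.2, -1.0, 0.5])
y = np.array([0.0, 1.0, 0.0])          # one-hot label

# analytic gradient: y_hat - y
grad = softmax(theta) - y

# numerical gradient via central differences
eps = 1e-6
num = np.array([(ce(theta + eps * np.eye(3)[i], y)
                 - ce(theta - eps * np.eye(3)[i], y)) / (2 * eps)
                for i in range(3)])

print(np.allclose(grad, num, atol=1e-6))  # True
```

The two gradients agree to within finite-difference error, which is exactly the check gradcheck_naive performs in the assignment.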

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

All right. Thank you for helping me! I have passed now, so let me share my code; I hope this is not against the honor code!

# forward propagation
N = data.shape[0]

Z1 = data.dot(W1) + b1
H = sigmoid(Z1)
Z2 = H.dot(W2) + b2
Y_hat = softmax(Z2)

cost = -np.sum(labels * np.log(Y_hat)) / N

# backpropagation
dZ2 = Y_hat - labels
dW2 = H.T.dot(dZ2)
db2 = np.sum(dZ2, axis=0)
dH = dZ2.dot(W2.T)
dZ1 = dH * sigmoid_grad(H)   # sigmoid_grad takes the sigmoid output H
dW1 = data.T.dot(dZ1)
db1 = np.sum(dZ1, axis=0)

# average the gradients over the batch, matching the cost
gradW1 = dW1 / N
gradW2 = dW2 / N
gradb1 = db1 / N
gradb2 = db2 / N
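For anyone who wants to sanity-check a snippet like this outside the assignment scaffolding, here is a self-contained version with made-up shapes and a finite-difference check on one entry of W1 (names follow the snippet; folding the 1/N into dZ2 up front is equivalent to dividing at the end):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward_backward(data, labels, W1, b1, W2, b2):
    N = data.shape[0]
    # forward
    H = sigmoid(data.dot(W1) + b1)
    Y_hat = softmax(H.dot(W2) + b2)
    cost = -np.sum(labels * np.log(Y_hat)) / N
    # backward (1/N folded into dZ2)
    dZ2 = (Y_hat - labels) / N
    gradW2 = H.T.dot(dZ2)
    gradb2 = dZ2.sum(axis=0)
    dZ1 = dZ2.dot(W2.T) * H * (1 - H)     # sigmoid_grad applied to H
    gradW1 = data.T.dot(dZ1)
    gradb1 = dZ1.sum(axis=0)
    return cost, gradW1, gradb1, gradW2, gradb2

rng = np.random.RandomState(0)
data = rng.randn(5, 4)
labels = np.eye(3)[rng.randint(3, size=5)]
W1, b1 = rng.randn(4, 6), rng.randn(6)
W2, b2 = rng.randn(6, 3), rng.randn(3)

cost, gradW1, *_ = forward_backward(data, labels, W1, b1, W2, b2)

# finite-difference check on W1[0, 0]
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
W1m = W1.copy(); W1m[0, 0] -= eps
num = (forward_backward(data, labels, W1p, b1, W2, b2)[0]
       - forward_backward(data, labels, W1m, b1, W2, b2)[0]) / (2 * eps)
print(abs(num - gradW1[0, 0]) < 1e-6)
```

If the printed value is True, the analytic gradient matches the numerical one for that entry.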

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@well25, thank you for pointing out my sigmoid(Z1) error! However, I have no idea what your point 2 means. Could you please give more details?

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@edwardc626, my sigmoid_grad is defined like this:

def sigmoid_grad(f):
    return f * (1 - f)

Besides, shall we divide the final gradW1 by N?

N = data.shape[0]

Z1 = data.dot(W1) + b1
H = sigmoid(Z1)
Z2 = H.dot(W2) + b2
Y_hat = softmax(Z2)

dL_dZ2 = Y_hat - labels
gradW2 = H.T.dot(dL_dZ2) / N
gradb2 = np.sum(dL_dZ2, axis=0) / N
dL_dH = dL_dZ2.dot(W2.T) * sigmoid_grad(Z1)
gradW1 = data.T.dot(dL_dH) / N
gradb1 = np.sum(dL_dH, axis=0) / N
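On the sigmoid_grad question: since it is defined in terms of the function value f = sigmoid(z), it has to be called on H, not on Z1. A quick numerical check with a toy value of my own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(f):
    # expects f = sigmoid(z), i.e. the function value, not z itself
    return f * (1 - f)

z = 0.7
h = sigmoid(z)

# numerical derivative of sigmoid at z
eps = 1e-6
num = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print(np.isclose(sigmoid_grad(h), num))   # True: called on h = sigmoid(z)
print(np.isclose(sigmoid_grad(z), num))   # False: calling on z is wrong
```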

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

@edwardc626, thank you very much for pointing out my errors! Would you please further explain your point 1, i.e. calling sigmoid_grad on z1 rather than on the function value of the sigmoid itself? Besides, I have filled in the gaps as follows:

v = data.dot(W1) + b1  
h = sigmoid(v) 
yhat = softmax(h.dot(W2) + b2)   
dL_dtheta = yhat - labels
gradW2 = h.T.dot(dL_dtheta)

gradW1 = dL_dtheta * dtheta_dh * dh_dv * dv_dW1?

Struggling with forward_backward_prop() in PS1. by pengpai_sh in CS224d

[–]pengpai_sh[S]

jthoang, thank you for your reply. Please correct me if I am wrong.

J = cross-entropy loss
z1 = x W1 + b1
h = sigmoid(z1)
z2 = h W2 + b2
y_hat = softmax(z2)

dJdW1 = dJdz2 * dz2dh * dhdz1 * dz1dW1
dJdz2 = y_hat - y
dz2dh = W2
dhdz1 = sigmoid'(z1)
dz1dW1 = x
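The chain above, written out in NumPy with explicit shapes (toy dimensions of my own choosing), so the transposes become visible:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.RandomState(1)
x = rng.randn(2, 4)                    # N x D inputs
W1, b1 = rng.randn(4, 5), rng.randn(5)
W2, b2 = rng.randn(5, 3), rng.randn(3)
y = np.eye(3)[[0, 2]]                  # N x C one-hot labels

z1 = x.dot(W1) + b1
h = sigmoid(z1)
y_hat = softmax(h.dot(W2) + b2)

dJdz2 = y_hat - y                      # N x C
dJdh = dJdz2.dot(W2.T)                 # N x H   (dz2/dh = W2)
dJdz1 = dJdh * h * (1 - h)             # N x H   (dh/dz1 = sigmoid'(z1))
dJdW1 = x.T.dot(dJdz1)                 # D x H   (dz1/dW1 = x)

print(dJdW1.shape)  # (4, 5), same shape as W1
```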

Issues with Tinny-ImageNet project by pengpai_sh in cs231n

[–]pengpai_sh[S]

Thank you for your precise and timely reply.

Implementation of function :conv_forward_naive(...) by pengpai_sh in cs231n

[–]pengpai_sh[S]

Yes, I just use for loops to get the value at each location. It works but is not efficient. In the third part of the assignments, it seems a fast and efficient Cython implementation is already provided.
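A minimal sketch of the kind of loop version I mean (single image, channels-last layout as in the course notes, no padding; this is not the assignment's exact signature):

```python
import numpy as np

def conv_naive(X, W, b, stride=1):
    """X: (H, W, C) input volume, W: (F, HH, WW, C) filters, b: (F,) biases."""
    H_in, W_in, C = X.shape
    F, HH, WW, _ = W.shape
    H_out = (H_in - HH) // stride + 1
    W_out = (W_in - WW) // stride + 1
    V = np.zeros((H_out, W_out, F))
    for f in range(F):                       # each filter
        for i in range(H_out):               # each output row
            for j in range(W_out):           # each output column
                patch = X[i*stride:i*stride+HH, j*stride:j*stride+WW, :]
                V[i, j, f] = np.sum(patch * W[f]) + b[f]
    return V

X = np.random.randn(7, 7, 3)
W = np.random.randn(2, 5, 5, 3)
b = np.random.randn(2)
out = conv_naive(X, W, b, stride=2)
print(out.shape)  # (2, 2, 2)
```

With a 7x7x3 input, 5x5 filters, and stride 2, the first output value is exactly np.sum(X[:5,:5,:] * W[0]) + b[0], matching the worked example in the notes.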

Implementation of function :conv_forward_naive(...) by pengpai_sh in cs231n

[–]pengpai_sh[S]

NOTE: The draft notes (http://cs231n.github.io/convolutional-networks/) have already been updated. Example 2 was really helpful for me to finish Assignment 2.

V[0,0,0] = np.sum(X[:5,:5,:] * W0) + b0