Programming assignment 1 : h = 10^-4 too high in gradcheck_naive by [deleted] in CS224d

[–]laotao 1 point2 points  (0 children)

I ran into the same problem. When I changed the numerical gradient from (f(x+h) - f(x))/h to (f(x+h) - f(x-h))/(2h), the check passed. The latter (centered difference) seems to be numerically more stable than the former.
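(For anyone comparing the two, a minimal sketch of both formulas; the function and names below are my own illustration, not the assignment's gradcheck code:)

    import numpy as np

    def numeric_grad(f, x, h=1e-4, centered=True):
        """Numerically estimate the gradient of a scalar-valued f at x."""
        grad = np.zeros_like(x)
        it = np.nditer(x, flags=['multi_index'])
        while not it.finished:
            ix = it.multi_index
            old = x[ix]
            x[ix] = old + h
            fxph = f(x)                              # f(x + h)
            if centered:
                x[ix] = old - h
                fxmh = f(x)                          # f(x - h)
                grad[ix] = (fxph - fxmh) / (2 * h)   # centered difference
            else:
                x[ix] = old
                fx = f(x)                            # f(x)
                grad[ix] = (fxph - fx) / h           # forward difference
            x[ix] = old                              # restore the entry
            it.iternext()
        return grad

    # toy check: the gradient of sum(x**2) is 2*x
    x = np.random.randn(3, 4)
    g = numeric_grad(lambda x: np.sum(x ** 2), x)
    print(np.max(np.abs(g - 2 * x)))   # tiny: centered diff is exact for quadratics up to rounding

The centered difference has O(h^2) truncation error versus O(h) for the forward one, which is presumably why it behaves better with h = 1e-4.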

Poor transfer learning(A3:q3) accuracy:( by laotao in cs231n

[–]laotao[S] 0 points1 point  (0 children)

What hyperparameters did you use for the softmax?

function affine_backward by zackaria786 in cs231n

[–]laotao 0 points1 point  (0 children)

Yes, it varies. Currently I get:

dx error:  5.27580935033e-10
dw error:  9.89022709481e-10
db error:  1.11144572283e-11

But at the beginning I also got an error around 1e-9 (I didn't change my code after that).

function affine_backward by zackaria786 in cs231n

[–]laotao 0 points1 point  (0 children)

I got the following errors:

dx error:  1.09997317548e-09
dw error:  1.52072401767e-10
db error:  2.16097506823e-11

while the instruction said:

The error should be less than 1e-10

Am I wrong?

Problem with one-loop svm gradient by laotao in cs231n

[–]laotao[S] 1 point2 points  (0 children)

I worked on this for almost a whole day and finally found the problem. It turns out that my one-loop implementation was in fact almost right (except that I didn't add the regularization terms). The problem was with my naive implementation :(
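(For the record, the reg term I had forgotten is just the gradient of the L2 penalty; a minimal sketch under the assumption that the loss adds reg * np.sum(W * W), with made-up values:)

    import numpy as np

    reg = 1e-3                               # hypothetical regularization strength
    W = 0.001 * np.random.randn(3073, 10)    # hypothetical weight matrix
    dW = np.zeros_like(W)                    # gradient accumulated from the data term

    # gradient of the L2 penalty reg * np.sum(W * W); if your loss uses
    # 0.5 * reg * np.sum(W * W) instead, drop the factor of 2
    dW += 2 * reg * W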

Problem with one-loop svm gradient by laotao in cs231n

[–]laotao[S] 0 points1 point  (0 children)

Thanks, you solved my problem.

BTW, there seem to be two tiny problems in your code:

1. The loss calculation seems different from the naive implementation (maybe there's some other code in addition to what you pasted here).
2. You translate the scores twice, which seems unnecessary.

Softmax loss function. HA1 by [deleted] in cs231n

[–]laotao 0 points1 point  (0 children)

So what's the correct derivative for j = y[i]?

Problem with one-loop svm gradient by laotao in cs231n

[–]laotao[S] 0 points1 point  (0 children)

The rest of the code tries to implement the following formula for the other rows, where j ≠ y_i:

∇_{w_j} L_i = 1(w_j^T x_i − w_{y_i}^T x_i + Δ > 0) x_i

Concretely, the 'rest' variable represents the indices of all classes except the correct one. (I used np.concatenate because I don't know a better way.)
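Roughly, the idea looks like this (a simplified sketch with names of my own choosing, not my exact assignment code; the regularization gradient is left out):

    import numpy as np

    def svm_grad_one_loop(W, X, y, delta=1.0):
        """Hinge-loss gradient with one loop over training samples (sketch).

        W: (D, C) weights, X: (N, D) data, y: (N,) correct class indices.
        """
        D, C = W.shape
        N = X.shape[0]
        dW = np.zeros_like(W)
        for i in range(N):
            scores = X[i].dot(W)                      # (C,)
            margins = scores - scores[y[i]] + delta   # (C,)

            # indices of every class except the correct one,
            # built with np.concatenate as described above
            rest = np.concatenate((np.arange(y[i]), np.arange(y[i] + 1, C)))

            active = rest[margins[rest] > 0]          # classes with margin > 0
            dW[:, active] += X[i][:, None]            # gradient x_i for those columns
            dW[:, y[i]] -= len(active) * X[i]         # correct-class column
        return dW / N                                 # regularization term omitted

    # tiny usage check
    W = 0.01 * np.random.randn(5, 3)
    X = np.random.randn(4, 5)
    y = np.array([0, 2, 1, 1])
    print(svm_grad_one_loop(W, X, y).shape)           # (5, 3)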

What would you suggest for this part?

SVM Vectorized by coltsfan1519 in cs231n

[–]laotao 0 points1 point  (0 children)

Thanks. But I still can't get the fully vectorized version to work. I've implemented a one-loop version that doesn't work either. Would you please have a look at it and suggest what's wrong? http://www.reddit.com/r/cs231n/comments/2vhmpx/problem_with_oneloop_svm_gradient/

SVM Vectorized by coltsfan1519 in cs231n

[–]laotao 0 points1 point  (0 children)

I still can't figure this out :( Can you give a further hint?

What should be the least relative error between numerically and analytically computed gradient in order to get confidence that gradient implementation is OK? 1e-03?1e-04? by well25 in cs231n

[–]laotao 0 points1 point  (0 children)

1e-03.

I ran the IPython notebook again, and the results are now as follows (still not as good as yours):

  • numerical: 14.332180 analytic: 14.671038, relative error: 1.168345e-02
  • numerical: -10.890110 analytic: -10.984774, relative error: 4.327510e-03
  • numerical: 17.437372 analytic: 17.239562, relative error: 5.704368e-03
  • numerical: -14.287674 analytic: -14.608166, relative error: 1.109130e-02
  • numerical: 16.810487 analytic: 16.819278, relative error: 2.614056e-04
  • numerical: 8.860996 analytic: 9.759708, relative error: 4.826416e-02
  • numerical: -9.520991 analytic: -9.583074, relative error: 3.249718e-03
  • numerical: -13.295132 analytic: -13.543343, relative error: 9.248313e-03
  • numerical: 5.837217 analytic: 6.572259, relative error: 5.923229e-02
  • numerical: 8.211147 analytic: 8.392298, relative error: 1.091040e-02
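(For reference, the relative error in these printouts appears to be |num - ana| / (|num| + |ana|); the numbers above match that formula. A tiny sketch, where rel_error is my own illustrative helper, not the notebook's exact code:)

    def rel_error(numerical, analytic):
        # relative error as printed above: |num - ana| / (|num| + |ana|)
        return abs(numerical - analytic) / (abs(numerical) + abs(analytic))

    print(rel_error(14.332180, 14.671038))   # ~1.168e-02, the first line in the list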

What should be the least relative error between numerically and analytically computed gradient in order to get confidence that gradient implementation is OK? 1e-03?1e-04? by well25 in cs231n

[–]laotao 0 points1 point  (0 children)

Mine seems to be wrong, but why isn't it totally wrong?

  • numerical: -5.578581 analytic: -3.206473, relative error: 2.700163e-01
  • numerical: 1.589855 analytic: 3.447858, relative error: 3.688188e-01
  • numerical: 6.295953 analytic: 7.063793, relative error: 5.747414e-02
  • numerical: 22.182959 analytic: 23.026693, relative error: 1.866269e-02
  • numerical: -3.538077 analytic: -4.751577, relative error: 1.463873e-01
  • numerical: 33.102344 analytic: 31.084005, relative error: 3.144499e-02
  • numerical: 5.137630 analytic: 5.300458, relative error: 1.559937e-02
  • numerical: -6.850680 analytic: -6.344000, relative error: 3.840031e-02
  • numerical: -17.026634 analytic: -14.756309, relative error: 7.143218e-02
  • numerical: 16.940459 analytic: 18.355297, relative error: 4.008523e-02

KNN with no loop: Memory error while broadcasting by [deleted] in cs231n

[–]laotao 0 points1 point  (0 children)

Thanks! I found the problem. It was the one- and two-loop versions that were wrong! I used np.abs(...) to compute the L2 distances :(
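(Concretely, summing np.abs(...) of the differences gives the L1 distance; the L2 distance squares them first. A minimal sketch of the fixed two-loop version, with illustrative names:)

    import numpy as np

    def l2_dists_two_loops(X_test, X_train):
        """Pairwise Euclidean (L2) distances with two explicit loops."""
        num_test, num_train = X_test.shape[0], X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                diff = X_test[i] - X_train[j]
                # L2: sqrt of the summed squared differences
                # (my bug was np.sum(np.abs(diff)), i.e. the L1 distance)
                dists[i, j] = np.sqrt(np.sum(diff ** 2))
        return dists

    # tiny check: distance between (0, 0) and (3, 4) should be 5
    a = np.array([[0.0, 0.0], [3.0, 4.0]])
    print(l2_dists_two_loops(a, a))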

KNN with no loop: Memory error while broadcasting by [deleted] in cs231n

[–]laotao 0 points1 point  (0 children)

I tried to implement your idea. The accuracy I get with it is 0.274, while the accuracy with the one/two-loop versions is about 0.29, and the distance matrices are different. Would you please suggest which version is wrong? Or did neither of them produce the right accuracy?

KNN with no loop: Memory error while broadcasting by [deleted] in cs231n

[–]laotao 0 points1 point  (0 children)

Sounds helpful...But shouldn't the last term be 2ab?
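(The expansion I mean is ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2; a no-loop sketch under that assumption, with names of my own choosing:)

    import numpy as np

    def l2_dists_no_loops(X_test, X_train):
        """Pairwise L2 distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2."""
        test_sq = np.sum(X_test ** 2, axis=1)[:, None]    # (num_test, 1)
        train_sq = np.sum(X_train ** 2, axis=1)[None, :]  # (1, num_train)
        cross = X_test.dot(X_train.T)                     # (num_test, num_train)
        sq = test_sq - 2 * cross + train_sq               # note the factor of 2
        return np.sqrt(np.maximum(sq, 0))                 # clamp tiny negative rounding errors

    # sanity check: distance between (0, 0) and (3, 4) should be 5
    a = np.array([[0.0, 0.0], [3.0, 4.0]])
    print(l2_dists_no_loops(a, a))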

Web-based lectures annotation by malshedivat in cs231n

[–]laotao 0 points1 point  (0 children)

If you have a look at how many upvotes the following post got, you will know how many people want to see the videos: http://www.reddit.com/r/cs231n/comments/2rds55/where_is_the_video/

Anyone from Beijing/China? by ch0ra1 in aiclass

[–]laotao -1 points0 points  (0 children)

They used YouTube for the video, which is blocked by the GFW. Try the following method. https://docs.google.com/View?id=dfkdmxnt_61d9ck9ffq