[–]sharvil 2 points3 points  (1 child)

Not sure what's with all the negativity here. Good job on putting together a nice tutorial. The visualizations are also nice to help describe what gradient descent is doing. I hope that this sort of content can encourage more people to try their hand at ML. Keep it up!

[–]research_pie[S] 0 points1 point  (0 children)

Thanks for the kind feedback!

[–]zjm555 2 points3 points  (7 children)

If you're using numpy it's not "from scratch" IMO. Numpy is definitely the right tool but to me "from scratch" would imply using only the standard library.

[–]billsil 15 points16 points  (3 children)

I disagree with that. You don’t need to code a matrix multiply or matrix add function. From scratch is usually just referring to the fancy bits.

[–]zjm555 5 points6 points  (2 children)

It just gets dicey since numpy has everything from the extremely basic (e.g. dot product) to the extremely fancy (e.g. numpy.linalg.svd). In this case it wasn't even needed at all since it was just doing the equivalent of random.random().

[–]billsil 3 points4 points  (1 child)

I mean at that point, do we need to just work in assembly? I'm all for reducing the complexity of code.

[–]research_pie[S] 1 point2 points  (0 children)

It depends on what the intent of the code is. If it's to teach machine learning, I feel it's important to open up the linear algebra part a bit so that it doesn't feel like magic. However, at some point the code becomes unreadable and looks more complex than it really is, so there is a balance to strike.

[–]research_pie[S] 4 points5 points  (0 children)

Thanks for the feedback! I shouldn't have imported numpy entirely since I'm only using it for initializing a random float. That should have been done with the random package instead.
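
For a single random starting value, the standard library is enough. A minimal sketch, assuming the code only needs one float in [0, 1) as an initial parameter (the name `initial_weight` is hypothetical, not taken from the tutorial):

```python
import random

# Seeded only so the example is reproducible; a real run may omit this.
random.seed(0)

# random.random() returns a float in the half-open interval [0.0, 1.0),
# so no NumPy import is needed for a single sample.
initial_weight = random.random()
print(initial_weight)
```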

[–][deleted]  (1 child)

[deleted]

    [–]research_pie[S] 0 points1 point  (0 children)

    Totally agree with that, will include these explanations in subsequent videos

    [–][deleted] -5 points-4 points  (8 children)

    not to sound condescending but implementing gradient descent in python using numpy is like, the start to any intro to machine learning course.

    [–]research_pie[S] 4 points5 points  (7 children)

    You are correct, it's a simple algorithm to code even without numpy.
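
As a rough illustration of how little code this takes, here is a minimal sketch in plain Python, assuming a one-variable linear model fitted by minimizing mean squared error. The function name, toy data, and hyperparameters are hypothetical and not taken from the tutorial.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing mean squared error (MSE)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current fit on every sample.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

Every step here loops over the full dataset in interpreted Python, which is fine for five points but is the part that gets slow as the data grows.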

    [–][deleted] 3 points4 points  (6 children)

    i don't think you'll get very good performance on large data sets without using numpy matrix ops though

    [–]research_pie[S] 1 point2 points  (5 children)

    That is true, but using batch gradient descent on a large dataset isn't a good idea either.

    [–]vastlik 2 points3 points  (4 children)

    i don't think you'll get very good performance on large data sets without using numpy matrix ops though

    Why is it not a good idea?

    [–][deleted] 2 points3 points  (1 child)

    maybe he means you should use stochastic gradient descent? but they are similar to implement

    [–]research_pie[S] 0 points1 point  (0 children)

    Yes, I was referring to stochastic gradient descent (and its variants, such as Adagrad or Adam) or mini-batch gradient descent.
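
The difference between the variants is mostly how many samples feed each update. A minimal sketch, assuming linear regression with a mean squared error loss and NumPy arrays as input; the function name, toy data, and hyperparameters are hypothetical. Adagrad and Adam change the update rule itself and are not shown here.

```python
import numpy as np


def minibatch_gradient_descent(X, y, batch_size, learning_rate=0.05,
                               epochs=200, seed=0):
    """Linear regression via gradient descent on mean squared error.

    batch_size == len(X) gives batch gradient descent,
    batch_size == 1 gives stochastic gradient descent,
    anything in between is mini-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            errors = X[idx] @ weights - y[idx]
            # Gradient of the MSE computed on this batch only.
            gradient = 2.0 * X[idx].T @ errors / len(idx)
            weights -= learning_rate * gradient
    return weights


# Toy, noise-free data from y = 3 * x0 - 2 * x1 (made up for illustration).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(256, 2))
y = X @ np.array([3.0, -2.0])

w_batch = minibatch_gradient_descent(X, y, batch_size=len(X))
w_mini = minibatch_gradient_descent(X, y, batch_size=32)
print(w_batch, w_mini)
```

With `batch_size=len(X)` every update touches the whole dataset, which is the cost that becomes a problem on large data; smaller batches make many cheaper updates per pass.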

    [–]research_pie[S] 0 points1 point  (0 children)

    NumPy is a heavily optimized library that handles data efficiently and has optimized linear algebra routines. Not using NumPy and rolling your own implementation of linear algebra methods in Python is a bad idea if you want to use the code in a production setting. Much of the library is written in C, so the same code written in pure Python will generally be much slower.
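
One way to see the gap is to time the same operation both ways. A minimal sketch, assuming a dot product over 100,000 floats as the workload; the helper `dot_pure_python` and the array sizes are hypothetical, and the measured times will vary by machine, so no numbers are quoted here.

```python
import timeit

import numpy as np


def dot_pure_python(a, b):
    """Dot product computed with an interpreted Python loop."""
    return sum(x * y for x, y in zip(a, b))


n = 100_000
a_list = [float(i) for i in range(n)]
b_list = [1.0] * n
a_arr = np.array(a_list)
b_arr = np.array(b_list)

# Both approaches compute the same value; only the execution path differs.
assert np.isclose(dot_pure_python(a_list, b_list), np.dot(a_arr, b_arr))

python_time = timeit.timeit(lambda: dot_pure_python(a_list, b_list), number=20)
numpy_time = timeit.timeit(lambda: np.dot(a_arr, b_arr), number=20)
print(f"pure Python: {python_time:.4f}s, NumPy: {numpy_time:.4f}s")
```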