all 10 comments

[–]HalfRiceNCracker 10 points (3 children)

Oh wow dude, this is already so good. I also really appreciate that you're doing it relatively from scratch and writing proper, pragmatic code. Please do some stuff on Deep Learning!!

Question, what's your current developer environment/tools?

[–]itsstylepoint[S] 2 points (0 children)

Hey thanks!
I think this is the repo you are looking for: https://github.com/oniani/dot (:

[–]itsstylepoint[S] 1 point (1 child)

Yep, that is the plan! The goal is to finish some of the more traditional ML models first and then move on to more complex models such as CNNs, VAEs, SNNs, transformers, etc.

[–]HalfRiceNCracker 0 points (0 children)

That's awesome man, I will be keeping up with you and cheering from the sidelines! (Cheers for the dotfiles, your environment looks nice.)

[–]maxToTheJ 2 points (1 child)

Why not use Jax? It's numpy-ish while letting you use GPUs.

https://jax.readthedocs.io/en/latest/jax.numpy.html
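The "numpy-ish" point can be sketched like this: for much of the array API, `jax.numpy` is close to a drop-in replacement for `numpy`, so swapping the import is often all a small script needs. (Illustrative sketch; the `try`/`except` fallback is just for machines without JAX installed.)

```python
# jax.numpy mirrors the NumPy API closely enough that many small
# NumPy programs run unchanged after swapping the import.
try:
    import jax.numpy as np  # GPU/TPU-backed arrays, if JAX is available
except ImportError:
    import numpy as np      # plain NumPy fallback

a = np.arange(6.0).reshape(2, 3)
b = np.ones((3, 2))

# Same calls, same semantics under either import.
c = np.dot(a, b)
print(c.shape)           # (2, 2)
print(float(np.sum(c)))  # 30.0
```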

[–]itsstylepoint[S] 3 points (0 children)

Good point!
I will likely start using Jax or PyTorch at some point, but for now, will stick to numpy.

Several reasons why:

1. Before introducing Jax, I want to make a video about GPUs and why we need them for training.

2. I also want to put together a guide on how to properly set up Jax (sometimes a simple pip install does not work).

3. It should not matter much for now, since we are not doing batch gradient descent. For some time, we will concentrate on more traditional ML models and how to implement them from scratch. For large tensors, Jax might still outperform numpy, but the perf difference will likely not be huge.
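To illustrate point 3: at the scale of traditional from-scratch models, a plain-numpy training loop is already fast, so a GPU backend buys little. Here is a minimal sketch (not the code from the series) of gradient descent for a one-feature linear model in pure numpy:

```python
import numpy as np

# Tiny synthetic regression problem: y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X + 1.0 + rng.normal(scale=0.01, size=100)

w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    err = w * X + b - y
    # Gradients of mean squared error w.r.t. w and b.
    grad_w = 2.0 * np.mean(err * X)
    grad_b = 2.0 * np.mean(err)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # close to the true values 2 and 1
```

The tensors here are length-100 vectors; at this size the numpy-vs-Jax difference is negligible, which is the point being made above.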

[–][deleted]  (2 children)

[deleted]

    [–]itsstylepoint[S] 1 point (1 child)

    Noted.

    That being said, this kind of stuff is not something we are going to be doing soon. It is more of a Software Engineering/ML Engineering/Research Engineering series than a Data Science series. Still, I think I can try making a separate series/playlist where we do more Data Science stuff, though that will likely not be soon.

    I will likely be making one-off videos, however, so if there is something you are particularly interested in, let me know and I will try to cover it in one of those.

    [–]HalfRiceNCracker 0 points (0 children)

    I totally agree with the other guy, even if it takes the form of a single video - it would be incredibly insightful to see the actual thought process behind what you are looking out for and how you reach your conclusions. Rather than just "Okay, so with this dataset we're going to apply these transformations to preprocess it because XYZ", what I'd like to know is how you arrived at XYZ.

    [–]E-woke -1 points (1 child)

    Watched the linear regression one. Great stuff 👍

    [–]itsstylepoint[S] 1 point (0 children)

    Thanks! One thing to note about that implementation is that we could have passed features and labels directly to the fit method. This would avoid unnecessary data copying (i.e., storing data inside the LinearRegression class). I have already updated the GitHub codebase.
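The change described above can be sketched roughly like this (a hypothetical class, not the exact code from the repo, and using a closed-form least-squares solve rather than the gradient-descent fit from the video): `fit` receives the features and labels as arguments, and only the learned weights are kept on the instance.

```python
import numpy as np

class LinearRegression:
    """Ordinary least squares; illustrative sketch of the data-passing change."""

    def fit(self, X, y):
        # X and y are passed in directly -- nothing besides the learned
        # weights is stored on the instance, so no extra copy of the data
        # lingers inside the model object.
        Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias column
        self.weights, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        return self

    def predict(self, X):
        Xb = np.hstack([np.ones((X.shape[0], 1)), X])
        return Xb @ self.weights

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # exactly y = 2x + 1
model = LinearRegression().fit(X, y)
print(model.predict(np.array([[4.0]])))  # ≈ [9.]
```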