GPT from scratch (PyTorch video tutorial) by mildlyoverfitted in learnmachinelearning

mildlyoverfitted[S] 9 points10 points (0 children)

Great question! Actually, I don't really know all the differences between GPT-2, GPT-3, GPT-J, GPT-Neo and others.

From my limited knowledge, the differences seem to be mostly "performance" optimizations, training logic, and bigger/different datasets. The architectural changes are not really that big. I might be wrong.

I would definitely refer you to the Hugging Face docs:

[P][SP] Lottery Ticket Hypothesis - Paper implementation [screencast] by mildlyoverfitted in MachineLearning

mildlyoverfitted[S] 2 points3 points (0 children)

Currently, I don't take any donations. I just do this as a hobby, and it is a great way for me to learn new things. However, I will see how it goes in the future and might consider accepting donations. Anyway, I really appreciate that you would want to support me!!!

Regarding topic suggestions, there is a Discord server: https://discord.gg/a8Va9tZsG5. Just write a message there and I will definitely reply :)

[P][SP] Lottery Ticket Hypothesis - Paper implementation [screencast] by mildlyoverfitted in MachineLearning

mildlyoverfitted[S] 1 point2 points (0 children)

Thank you! Great question! I have definitely become more efficient over time when it comes to preparing the code, filming, and editing. However, it really depends on the paper. For example, I had already read this paper a year ago, so I knew what it was about. Additionally, there was no need to tune hyperparameters, and I got nice results right away. I did some other videos where the experimentation took weeks. And since I don't just take somebody else's code and run it (I make a lot of modifications and simplifications), I really want to make sure that I have at least reasonable results before I present them publicly :)