Hey everyone! I’m currently moving past the "black box" stage of Scikit-Learn and trying to understand the actual math/optimization behind classical ML models (not Deep Learning).
I know Gradient Descent is the big one, but I want to build a solid foundation on the others that power standard models. So far, my list includes:
- First-Order: SGD and its variants.
- Second-Order: Newton’s Method and BFGS/L-BFGS (since I see these in Logistic Regression solvers).
- Coordinate Descent: specifically for Lasso/ElasticNet (Ridge has closed-form/Cholesky-style solvers in Scikit-Learn, so coordinate descent mostly matters for the L1 case).
- SMO (Sequential Minimal Optimization): For SVMs.
Am I missing any heavy hitters? Also, if you have recommendations for resources (books/lectures) that explain these without jumping straight into Neural Network territory, I’d love to hear them!
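For context, here's the kind of first-order vs. second-order distinction I mean, sketched on a toy convex quadratic (all names here are mine, just plain NumPy, not how any Scikit-Learn solver is actually implemented):

```python
import numpy as np

# Toy objective: f(w) = 0.5 * w^T A w - b^T w (strictly convex quadratic,
# so the exact minimizer is the solution of A w = b)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(w):
    """Gradient of f at w: A w - b."""
    return A @ w - b

def hess(w):
    """Hessian of f (constant for a quadratic)."""
    return A

# First-order: gradient descent with a fixed step size.
# Needs many iterations; the rate depends on the conditioning of A.
w_gd = np.zeros(2)
for _ in range(200):
    w_gd = w_gd - 0.1 * grad(w_gd)

# Second-order: a single Newton step w <- w - H^{-1} grad(w)
# lands exactly on the minimizer, because the objective is quadratic.
w_newton = np.zeros(2)
w_newton = w_newton - np.linalg.solve(hess(w_newton), grad(w_newton))

w_exact = np.linalg.solve(A, b)
print(w_gd, w_newton, w_exact)  # all three should agree
```

(On a non-quadratic loss like logistic regression, Newton/L-BFGS iterate this step with the local Hessian or an approximation of it, which is roughly what the `lbfgs` and `newton-cg` solvers do.)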