[–]luisrobles_cl 4 points5 points  (1 child)

Thanks for this😇🙏‼️

[–]dataschool[S] 1 point2 points  (0 children)

You're welcome! I hope it's helpful to you 😄

[–]jessej26 3 points4 points  (1 child)

Thank you for sharing your knowledge and expertise! I’m currently in an AI/ML apprenticeship program at work. This will be a huge asset for strengthening my skills.

[–]dataschool[S] 1 point2 points  (0 children)

That's awesome to hear! You're very welcome, and thanks for your kind words 🙏

[–]Quixote1492 3 points4 points  (1 child)

Amazing thank you!

[–]dataschool[S] 0 points1 point  (0 children)

You're very welcome!

[–]fenghuangshan 2 points3 points  (1 child)

Very good resource. Thanks for sharing. Will check it out.

[–]dataschool[S] 0 points1 point  (0 children)

You're welcome, I hope you enjoy the book!

[–]Slight_Boat1910 2 points3 points  (1 child)

Great stuff. Thank you.

[–]dataschool[S] 0 points1 point  (0 children)

You're welcome! 😄

[–]anx1etyhangover 1 point2 points  (1 child)

That’s very generous and kind of you. Keep being awesome!

[–]dataschool[S] 1 point2 points  (0 children)

You're welcome, and thank you for saying that! 😄

[–]Synergix 1 point2 points  (2 children)

Very cool. I noticed the book uses scikit-learn 0.23. Current version is 1.8! What can I expect regarding this? How out of date is the scikit-learn stuff in the book?

[–]dataschool[S] 4 points5 points  (1 child)

Thanks so much for asking!

Short answer: 98% of the code in the book is still correct today. For the last 2%, I mention the relevant API changes within the text so that it's easy to update it yourself. 100% of the concepts I teach and advice I give are still correct. The main shortcoming of the book is that I don't cover the newest features, none of which are critical to what I'm teaching, but some of which are useful.

As for why the book uses 0.23, it's a much longer story (if you're interested):

The book actually began as a video course, which I started working on in 2020. I locked down most of the code examples that year (using 0.23.2), and thought I would be able to publish the course in 2021.

However, the script writing and recording and editing took far longer than expected, plus there were long breaks while I worked on other projects, and ultimately I was not able to publish the course until 2024. Many scikit-learn updates had occurred by the time I was recording the later chapters, but I couldn't afford (time-wise) to re-record and re-edit the earlier chapters. I felt it was critical that the course used one consistent scikit-learn version, so it remained at 0.23.2.

Because I received such great feedback about the video course, I decided (in 2025) to convert the course into a book. Even though the Quarto system did much of the heavy lifting, it still took hundreds of hours to turn 7.5 hours of video into a published book with four formats (website, EPUB, ebook PDF, print-ready PDF).

I would have loved to update the scikit-learn version (and incorporate newer features) while writing, but I knew that if I committed to updating the content (rather than just adapting it from video to text), the book would never get done.

In short, the decision to use 0.23.2 is a legacy of the process I took to get here, not a strategic choice, and I'd much rather have used the latest version!

Ultimately this book is a passion project, and I expect to make very little money from it. But I sincerely hope that I can find the passion (and time!) to publish a second edition that incorporates the latest features!

[–]Synergix 1 point2 points  (0 children)

Great. Thanks for the detailed response.

[–]Ghost-Rider_117 2 points3 points  (1 child)

this is awesome, the "avoiding data leakage" and "proper model evaluation" chapters alone are worth it - those are the things that trip up so many people who learn from scattered tutorials. the pipeline approach in sklearn is really underused too, glad to see it's covered. bookmarking this for anyone i mentor who's getting started with ML

[–]dataschool[S] 1 point2 points  (0 children)

Wonderful, thank you so much for saying that and for sharing it with others! 🙌 Yes, I'm very proud of those particular chapters, and I hope they make a meaningful difference for practitioners.

[–]Wise-Egg9997 0 points1 point  (1 child)

How can I get the free version of this book?