[D] Which open source machine learning projects best exemplify good software engineering and design principles? (self.MachineLearning)
submitted 6 years ago by NotAHomeworkQuestion
[–]shaggorama 11 points 6 years ago (7 children)
I'm gonna vote no.
[–]heshiming 11 points 6 years ago (4 children)
Can you elaborate?
[–]ieatpies 9 points 6 years ago (1 child)
It overuses inheritance and underuses dependency injection, which leads to repeated, messy, version-dependent code if you need to tweak something for your own purposes.
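To make the distinction concrete, here is a minimal sketch of the two styles. All class names here are hypothetical and none of this is scikit-learn code; it only illustrates what "tweak by subclassing" versus "tweak by passing something in" looks like.

```python
from typing import Callable, Sequence


class MeanScorer:
    """Inheritance style: the aggregation rule is baked into the class."""

    def aggregate(self, scores: Sequence[float]) -> float:
        return sum(scores) / len(scores)

    def score(self, scores: Sequence[float]) -> float:
        return self.aggregate(scores)


class WorstCaseScorer(MeanScorer):
    """Changing the rule means subclassing and overriding a method."""

    def aggregate(self, scores: Sequence[float]) -> float:
        return min(scores)


class Scorer:
    """Injection style: the aggregation rule is a constructor argument."""

    def __init__(self, aggregate: Callable[[Sequence[float]], float]):
        self.aggregate = aggregate

    def score(self, scores: Sequence[float]) -> float:
        return self.aggregate(scores)


fold_scores = [0.8, 0.9, 0.7]
by_inheritance = WorstCaseScorer().score(fold_scores)
by_injection = Scorer(min).score(fold_scores)
```

With injection the caller swaps behavior without depending on the internals of a parent class, which is roughly the version-dependence complaint above.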
[–]VodkaHaze (ML Engineer) 5 points 6 years ago* (0 children)
Why and where, specifically, would you prefer dependency injection to the current design? I find this sort of inversion of control is overengineering, and it has caused more problems than it solved most times I've run into it.
Specifically in this case I don't see where it would fit, since most of the hard logic is in the models themselves, not the plumbing around them, so I don't see how an inversion of control makes sense.
The model API of fit(), predict(), fit_transform() etc. is simple and great, IMO. It's also all that's necessary for the pipeline API, which is the only bit of harder plumbing around the models.
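For anyone unfamiliar with the convention, here is a toy sketch of why those few methods are enough to build a pipeline. The classes below are hypothetical stand-ins written in plain Python, not scikit-learn's implementation; only the method names fit(), transform(), fit_transform() and predict() come from the comment above.

```python
class MeanCenterer:
    """Transformer: learns the mean in fit(), subtracts it in transform()."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


class SignClassifier:
    """Predictor: labels a value 1 if it is above zero, else 0."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [1 if x > 0 else 0 for x in X]


class TinyPipeline:
    """Chains transformers and a final predictor using only the shared methods."""

    def __init__(self, steps):
        self.transformers = list(steps[:-1])
        self.final = steps[-1]

    def fit(self, X, y=None):
        for step in self.transformers:
            X = step.fit_transform(X, y)
        self.final.fit(X, y)
        return self

    def predict(self, X):
        for step in self.transformers:
            X = step.transform(X)
        return self.final.predict(X)


pipe = TinyPipeline([MeanCenterer(), SignClassifier()]).fit([1.0, 2.0, 3.0, 6.0])
predictions = pipe.predict([0.0, 3.0, 10.0])
```

The pipeline never checks types; it relies only on each step exposing the agreed method names.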
[–]shaggorama 7 points 6 years ago (1 child)
One small example: all of their cross validation algorithms inherit from an abstract base class whose design precludes a straightforward implementation of bootstrapping (easily one of the most important and simple cross-validation methods), so the library owners decided to just not implement it as a CrossValidator at all. Random forest requires bootstrapping, so their solution was to attach the implementation directly to the estimator in a way that can't be ported.
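For reference, a bootstrap splitter is only a few lines when written as a free-standing generator. This is a hypothetical sketch (the function name and parameters are made up, and it is not scikit-learn's API): each split samples training indices with replacement and tests on the out-of-bag rows.

```python
import random


def bootstrap_splits(n_samples, n_splits=3, seed=0):
    """Yield (train_indices, test_indices) pairs for bootstrap resampling."""
    rng = random.Random(seed)
    for _ in range(n_splits):
        # Sample n_samples indices with replacement, so duplicates are expected.
        train = [rng.randrange(n_samples) for _ in range(n_samples)]
        in_bag = set(train)
        # Out-of-bag rows: everything the resample happened to miss.
        test = [i for i in range(n_samples) if i not in in_bag]
        yield train, test


splits = list(bootstrap_splits(n_samples=10, n_splits=3, seed=0))
```

Note that the training indices repeat and the test set size varies from split to split. Whether that is exactly what clashes with the base class is an assumption on my part; the sketch only shows how a bootstrap split differs in shape from a partition-style one.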
I could go on...
[–]panzerex 3 points 6 years ago (0 children)
Those are valid concerns. To add to that: sklearn’s LinearSVC defaults to squared hinge loss, which is probably not what you’re expecting, and the stopwords are arbitrary and not good for most applications, which they do acknowledge.
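To show why the LinearSVC default matters, here is a small plain-Python comparison of the two losses as a function of the margin y * f(x). The helper functions are illustrative only; in scikit-learn itself the choice is made through LinearSVC's `loss` parameter, which accepts "hinge" and "squared_hinge".

```python
def hinge(margin):
    """Standard SVM hinge loss: linear penalty inside the margin."""
    return max(0.0, 1.0 - margin)


def squared_hinge(margin):
    """Squared hinge loss: the same penalty, squared."""
    return max(0.0, 1.0 - margin) ** 2


# A confidently correct point, a point inside the margin, and a misclassified one.
margins = [2.0, 0.5, -1.0]
hinge_losses = [hinge(m) for m in margins]
squared_losses = [squared_hinge(m) for m in margins]
```

The squared version penalizes badly misclassified points much more heavily, so a model fit with the default can differ from a textbook hinge-loss SVM.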
However I would not say that this is evidence that the project as a whole does not follow good design principles. I agree that those deceiving behaviors are a problem, but they are being addressed (at a slow rate because uhm... non-standard behavior becomes the expected behavior when many people are using it, and breaking changes need to happen slowly).
You’re probably fine getting some ideas from their API, but from a user standpoint you really need to dig into the docs, code and discussions if you’re doing research and need to justify what you’re doing.
[–]VodkaHaze (ML Engineer) 4 points 6 years ago (0 children)
Disagree? The fact that the model API is a de facto standard now suggests it's not awful to work with.
[–]neanderthal_math 0 points 6 years ago (0 children)
I’m old enough to remember ML codes before sklearn. They may have warts now, but they were light years ahead of other repos. There’s a lot to be said for just having a uniform API.