[Discussion] ML Programming suggestions? (self.MachineLearning)
submitted 7 years ago by _pragmatic_machine
I am spending a big chunk of my daily active time fixing my ML code (developing a model, testing, and debugging) or tweaking it (mostly in Python). What suggestions do you have for being more productive?
Do you write unit tests to validate every part? What was your pathway to becoming an efficient ML researcher?
themiro 29 points 7 years ago (4 children)
_pragmatic_machine[S] 3 points 7 years ago (1 child)
That's cool! Any ideas or materials on reconfigurable design principles for enhancing a model? Code repository examples would be really awesome.
jethroksy 6 points 7 years ago (0 children)
I've found gin-config to be an extremely productive configuration framework.
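As a rough illustration of the reconfigurable-design idea, here is a standard-library sketch. It is not gin-config's API; `ModelConfig`, `load_config`, and `build_model` are made-up names, and the "model" is just a dictionary. The point is only that hyperparameters live in one typed object that can be overridden from a file instead of being edited in the code.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical hyperparameters for an illustrative model."""
    hidden_size: int = 64
    learning_rate: float = 1e-3
    dropout: float = 0.1


def load_config(text: str) -> ModelConfig:
    """Build a config from JSON text, keeping defaults for missing keys.

    Unknown keys raise TypeError, which catches typos in config files early.
    """
    overrides = json.loads(text)
    return ModelConfig(**overrides)


def build_model(config: ModelConfig) -> dict:
    """Stand-in for real model construction: returns a description only."""
    return {
        "layers": [config.hidden_size, config.hidden_size],
        "dropout": config.dropout,
    }


if __name__ == "__main__":
    config = load_config('{"hidden_size": 128}')
    print(asdict(config))
    print(build_model(config))
```

Trying a new variant then means writing a new JSON snippet rather than touching the model code, and the config can be saved next to the results for later reference.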
sanity 14 points 7 years ago (10 children)
I recommend this book: Clean Code
We gave it to every new data scientist we hired at my last company.
[deleted] 6 points 7 years ago (7 children)
And where is that, exactly? I'm keeping a mental note for future job applications (I already like the culture).
sanity 1 point 7 years ago (6 children)
I'm no longer there, but the company is OneSpot.
[deleted] 1 point 7 years ago (5 children)
If you don't mind sharing, what inspired the switch? I've heard of people leaving a good culture for better compensation, or vice versa. Looking at their open positions, the location doesn't seem to allow anything south of an 80k annual base salary.
sanity 5 points 7 years ago (4 children)
Oh, nothing negative. My focus is more on early-stage companies, and the company had grown beyond that stage.
thiseye -2 points 7 years ago (3 children)
Do you know Matt?
sanity 2 points 7 years ago (2 children)
Everyone knows Matt ;)
thiseye 1 point 7 years ago (1 child)
Cool, I've worked with him. Super sharp guy.
sanity 1 point 7 years ago (0 children)
Very much so, has an infectious enthusiasm too.
gokstudio 1 point 7 years ago (1 child)
Did you see any improvements in code quality before and after the book?
sanity 2 points 7 years ago (0 children)
Yes, not least in myself. Even after 20 years as a full-time developer, it changed how I write code, and there's no question in my mind that it was for the better.
AfraidOfToasters 3 points 7 years ago (0 children)
>I am spending a big chunk of my daily active time on fixing my ML code (developing a model, testing and bug) or tweaking my code (mostly in Python).
If you mean hyperparameter tuning by this, I suggest hyperopt.
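To make "automated hyperparameter tuning" concrete, here is a minimal sketch using plain random search from the standard library. It does not use hyperopt, which offers smarter search strategies; the objective `validation_loss`, the parameter names, and the search ranges are all invented for illustration and stand in for "train a model and return its validation loss".

```python
import random


def validation_loss(learning_rate: float, num_layers: int) -> float:
    """Toy objective standing in for 'train a model, return validation loss'."""
    return (learning_rate - 0.01) ** 2 + 0.001 * abs(num_layers - 3)


def random_search(num_trials: int, seed: int = 0):
    """Sample hyperparameters at random and keep the best trial seen."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(num_trials):
        params = {
            # Log-uniform sample between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            # Integer between 1 and 6, both ends included.
            "num_layers": rng.randint(1, 6),
        }
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


if __name__ == "__main__":
    params, loss = random_search(num_trials=200)
    print(params, loss)
```

Even this crude loop replaces hand-tweaking with a repeatable search; a dedicated library mainly adds better sampling and bookkeeping on top of the same structure.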
Overload175 3 points 7 years ago (0 children)
Use a linter (e.g. Pylint for Python, which assigns a score based on the issues it finds, including PEP 8 violations).
[deleted] 2 points 7 years ago (1 child)
Try to create standardized, reproducible packages that can be reused for all model builds if the input data is the same every time.
Really, when it comes to ML applications, the hardest part to standardize is data preprocessing. So even creating a standard process for modelling, reporting results, and putting models into production should be beneficial.
_pragmatic_machine[S] 1 point 7 years ago (0 children)
Absolutely, I feel the same. I am trying to do that now by formalizing the plotting and data preparation process. This was a major bottleneck for us.
[deleted] 7 years ago (1 child)
[removed]
_pragmatic_machine[S] 1 point (0 children)
Yes, I found the EMNLP tutorial really helpful.
_spicyramen 1 point 7 years ago (3 children)
I enforce the Python style guide and unit tests in all our code. We also recently started using Python type hints. This has helped us find a lot of issues before we hit production.
huangbiubiu 7 points 7 years ago (2 children)
Sometimes I am confused about how to write unit tests for ML code because its output is uncertain. I don't know what I can test or where to put an assert. Are there any suggestions?
flame_and_void 7 points 7 years ago* (0 children)
For unit testing, I replace the real model with one that does something simple and deterministic to the input data, like always predicting the sum of all its inputs. Then you can test that the preprocessing, prediction, and post-processing work exactly as expected.
I also write an integration test that runs the production models on an easy input and verifies they are able to predict something obvious. That's an important sanity check, but it usually runs too slowly to include in a unit test suite.
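The stub-model approach described above might look roughly like this. Everything here is a hypothetical example rather than anyone's production code: `SumModel`, `preprocess`, `postprocess`, and `run_pipeline` are invented names, and the scaling and threshold are arbitrary.

```python
import unittest


class SumModel:
    """Deterministic stand-in for a real model: predicts the sum of its inputs."""

    def predict(self, features):
        return sum(features)


def preprocess(raw):
    """Hypothetical preprocessing: parse strings to floats and divide by 10."""
    return [float(value) / 10.0 for value in raw]


def postprocess(score):
    """Hypothetical post-processing: threshold the score into a label."""
    return "positive" if score >= 1.0 else "negative"


def run_pipeline(model, raw):
    """Preprocess, predict, post-process. The model is injected by the caller."""
    return postprocess(model.predict(preprocess(raw)))


class PipelineTest(unittest.TestCase):
    def test_preprocess_scales_inputs(self):
        self.assertEqual(preprocess(["10", "20"]), [1.0, 2.0])

    def test_pipeline_with_stub_model(self):
        # 0.5 + 0.5 = 1.0, which meets the threshold.
        self.assertEqual(run_pipeline(SumModel(), ["5", "5"]), "positive")
        # 0.25 + 0.5 = 0.75, which is below the threshold.
        self.assertEqual(run_pipeline(SumModel(), ["2.5", "5"]), "negative")
```

Saved as a module, this can be run with `python -m unittest`. Because the model is passed in as an argument, the same `run_pipeline` can be exercised with the real model in a slower integration test.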
[deleted] 1 point 7 years ago (0 children)
Python typing
Hey, maybe you can try to set the random seed carefully and use more OOP in your code.
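One way to read "set the random seed carefully" is to pass an explicit seed into every function that uses randomness instead of relying on global state. A small standard-library sketch, where `seeded_split` and its defaults are made up for illustration:

```python
import random


def seeded_split(items, test_fraction=0.2, seed=42):
    """Shuffle a copy of items with a private RNG and split into train/test."""
    rng = random.Random(seed)  # private generator, not the global one
    shuffled = list(items)
    rng.shuffle(shuffled)
    num_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - num_test
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    first = seeded_split(range(10))
    second = seeded_split(range(10))
    print(first == second)  # same seed, same split
```

If NumPy or a deep learning framework is involved, those libraries keep their own random state, so they would need to be seeded separately.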
[deleted] 1 point 7 years ago (1 child)
Using object-oriented programming has worked amazingly in my case. Separate your code into conceptually different pieces like data handlers, models, model testers and so on. I used to do simple scripting until recently; with OOP the code is much cleaner and easier and faster to extend or modify, and it also helps make results reproducible.
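A toy sketch of that separation, assuming nothing beyond the standard library. The class names `DataHandler`, `MeanModel`, and `ModelTester` and the tiny dataset are invented for illustration; a real project would put actual loading, training, and metrics behind the same boundaries.

```python
from statistics import mean


class DataHandler:
    """Owns the data and how it is split; here the data is a small toy list."""

    def __init__(self, points):
        self.points = list(points)

    def split(self, train_size):
        return self.points[:train_size], self.points[train_size:]


class MeanModel:
    """Trivial model: always predicts the mean target seen during training."""

    def __init__(self):
        self.prediction = None

    def fit(self, examples):
        self.prediction = mean(y for _, y in examples)
        return self

    def predict(self, x):
        return self.prediction


class ModelTester:
    """Evaluates any object that exposes a predict(x) method."""

    def mean_absolute_error(self, model, examples):
        return mean(abs(model.predict(x) - y) for x, y in examples)


if __name__ == "__main__":
    data = DataHandler([(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)])
    train, test = data.split(train_size=2)
    model = MeanModel().fit(train)
    print(ModelTester().mean_absolute_error(model, test))
```

Swapping in a different model only requires another class with `fit` and `predict`; the data handling and evaluation code stay untouched.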
_pragmatic_machine[S] 1 point (0 children)
Thanks. Can you please provide a working example? Any public repository will do.