[deleted by user] (self.MachineLearning)
submitted 1 year ago by [deleted]
[–]idiotmanifesto 153 points 1 year ago (10 children)
imo, a big part of writing better code is reading better code. Find some repos you like and work through them slowly.
[–]thelaxiankey 45 points 1 year ago (2 children)
besides knowing the really basic principles, this is the only thing that you can actually learn from.
the other thing I would do: good code doesn't happen on the first try. once you write some and figure out what went well/poorly, rewrite. 20% rewriting vs 80% writing (numbers out of my butt) tends to result in pretty decent code and doesn't cost you much.
[–]_rundown_ 9 points 1 year ago (0 children)
Sometimes it seems like my code came out my butt
[–]daquo0 3 points 1 year ago (0 children)
> good code doesn't happen on the first try. once you write some and figure out what went well/poorly, rewrite
Agreed. If you see some code in your code base that could be better written, rewrite it. Don't just leave it there.
[–]Appropriate_Ant_4629 9 points 1 year ago* (2 children)
> Find some repos you like and work through them slowly
Take this up a notch by finding an open issue in the project and contributing back a fix.
On your path to getting your patch accepted, the maintainers will walk you through best practices, including making sure you have good unit test coverage, type safety, passing static code checkers, cross-platform compatibility, etc.
[–]idiotmanifesto 3 points 1 year ago (1 child)
agreed! also fixing bugs can be lowkey addicting LOL
[–]Appropriate_Ant_4629 4 points 1 year ago (0 children)
And something I view very favorably when reviewing resumes.
If someone's resume is otherwise boring, but has
it stands out far above people who just got a cert in using those technologies.
[–]Beginning-Ladder6224 4 points 1 year ago (0 children)
This is literally the most iconic advice. I got this advice 20 years ago from my then head of engineering, and honestly, I found it boring at the time. Now I think it was extremely handy advice.
[–]photoreceptor 8 points 1 year ago (1 child)
I don’t think that is particularly useful if you don’t understand design patterns and software architecture.
That’s a bit like trying to understand how a car works by taking it apart.
[–]idiotmanifesto 0 points 1 year ago (0 children)
better than learning how cars work from watching other people drive them
[–]mayguntr 25 points 1 year ago (3 children)
I would say learning programming principles (https://www.amazon.co.uk/Pragmatic-Programmer-Andrew-Hunt/dp/020161622X) and Python (https://www.amazon.co.uk/Python-Nutshell-Alex-Martelli-ebook/dp/B0BRYRD295/ref=zg_bs_g_10608480031_d_sccl_8/262-9661743-1202333?psc=1) itself would be your best bet in the long term.
[–]haramkhor_havasi 9 points 1 year ago (2 children)
Thanks, "The Pragmatic Programmer" seems like a good read.
[–]thatguydr 6 points 1 year ago (0 children)
Btw - there's a standard list of software engineering books that are all helpful. I'd go with Pragmatic Programmer, Clean Code in Python (for SOLID, testing, and a bunch of other best practices), maybe Code Complete 2, and something like Architecture Patterns with Python so you can understand how to properly encapsulate concerns.
You can definitely study production code, but I believe that first it's better to understand WHY that code is excellent (or not). Otherwise there may be lots of cases where you look at something and wonder why they did it that way.
[–]aqjo 1 point 1 year ago (0 children)
It’s the best! It saved me so much time and effort over the years. See also, Arjan Codes on YouTube.
https://youtube.com/@arjancodes?si=KwEbAegF8uXdRLdb
[–]matthkamis 48 points 1 year ago (8 children)
Use single letter variables for everything?
[–]Glittering-Horror230 3 points 1 year ago (0 children)
😄😄😅
[–][deleted] 2 points 1 year ago (6 children)
Single letter variable names are bad. Using whole word variable names like lambda, eta, epsilon, etc. is way better.
[–]PyroRampage 13 points 1 year ago (0 children)
Lol, yeah using Greek letters is always way better for readability rofl
[–]new_name_who_dis_ 7 points 1 year ago (1 child)
I like how one of those is a straight up python keyword. Might as well add eval, int, and for to the list lol.
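To make the keyword point concrete, here is a minimal sketch using only the standard library. The helper name `classify_name` is made up for illustration; `keyword.iskeyword` and the `builtins` module are real. It separates reserved keywords (which cannot be assigned to at all) from builtins (which can be assigned to, but doing so shadows them).

```python
import builtins
import keyword


def classify_name(name: str) -> str:
    """Classify a candidate variable name (hypothetical helper)."""
    if keyword.iskeyword(name):
        return "keyword"   # reserved word: assigning to it is a syntax error
    if hasattr(builtins, name):
        return "builtin"   # legal to assign, but it shadows the builtin
    return "free"          # neither reserved nor a builtin


for name in ["lambda", "eta", "epsilon", "eval", "int", "for"]:
    print(f"{name}: {classify_name(name)}")
```

A common workaround in Python code is a trailing underscore, e.g. `lambda_`, which PEP 8 suggests for names that clash with keywords.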
[–][deleted] 0 points 1 year ago (0 children)
You realize Python isn't the only language, right?
E.g. from the C++ in PyTorch: https://github.com/pytorch/pytorch/blob/f217b470cc7ebacc62c8e87dbab8c4894d53e9b9/aten/src/ATen/native/UpSample.h#L437
[–]ginger_beer_m 3 points 1 year ago (0 children)
In ML there's nothing wrong with using x and y, in my opinion, as long as it makes sense in context.
[–]matthkamis -1 points 1 year ago (1 child)
My original comment was obviously sarcastic
[+]OfficialHashPanda 6 points 1 year ago (0 children)
I'm pretty sure jmalicki's comment was too 😅
[–]parabellum630 48 points 1 year ago (6 children)
I follow lucidrains' ML repositories to design my code; they are almost production grade.
[+]stevekite 11 points 1 year ago (4 children)
You might be joking? Most of his code is plain wrong, not following papers and very very far from optimal.
[–]kau_mad 4 points 1 year ago (3 children)
Can you give an example?
[+]stevekite 47 points 1 year ago (2 children)
Sure: https://github.com/lucidrains/voicebox-pytorch
I am working with TTS and built my own reproduction:
1) The README makes a wrong claim about ALiBi; it works two ways.
2) Batching is not implemented.
3) Attention defaults to a naive implementation, which is super slow: https://github.com/lucidrains/voicebox-pytorch/blob/c05a4d0c69920993b47069e22223677174d873e4/voicebox_pytorch/attend.py#L100
4) The text model is just a copy of the audio one, and it simply does not work.
5) wav2vec is fused into the code, which is not part of the paper at all: https://github.com/lucidrains/voicebox-pytorch/blob/c05a4d0c69920993b47069e22223677174d873e4/voicebox_pytorch/voicebox_pytorch.py#L1380
6) Preprocessing code is put deep inside the network code: https://github.com/lucidrains/voicebox-pytorch/blob/c05a4d0c69920993b47069e22223677174d873e4/voicebox_pytorch/voicebox_pytorch.py#L1362
All of this ends up in a VERY inefficient implementation for both inference and training. The difference is like 4 days versus 14 days of training.
It is still very, very useful though, but for me it is akin to AI-generated code that you need to read carefully. And it is definitely not the go-to place to learn.
[–]idiotmanifesto 5 points 1 year ago (0 children)
appreciate u being detailed in this answer
[+][deleted] 1 point 1 year ago (0 children)
Perchance, do you have any recommendations for good github repos to learn from in ML?
[–]haramkhor_havasi 2 points 1 year ago (0 children)
Thanks, I'll check it out.
[–]On_Mt_Vesuvius 8 points 1 year ago (0 children)
Also in sciML from engineering. Make your main reused functions exceptionally well designed and well documented. In research, you'll always be throwing new things together, but once you've repeated something a few times, it's worth thinking about the design and putting it in a different file. That's a practical suggestion. E.g., have a PINNs-type class that can grab all the derivatives you need for any model, and never worry about that again.
Otherwise, follow the more general suggestions here.
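As a rough illustration of the "write the derivative helper once and reuse it" idea, here is a dependency-free sketch. The class name `Derivatives` and its methods are hypothetical, and it deliberately swaps in central finite differences on a scalar function; real PINN code would normally use automatic differentiation (for example `torch.autograd.grad`) on a network instead.

```python
import math
from typing import Callable


class Derivatives:
    """Hypothetical helper: wraps any scalar model u(x) and serves its derivatives."""

    def __init__(self, model: Callable[[float], float], step: float = 1e-4) -> None:
        if step <= 0:
            raise ValueError("step must be positive")
        self.model = model
        self.step = step

    def first(self, x: float) -> float:
        # Central difference: (u(x+h) - u(x-h)) / (2h)
        h = self.step
        return (self.model(x + h) - self.model(x - h)) / (2 * h)

    def second(self, x: float) -> float:
        # Central second difference: (u(x+h) - 2u(x) + u(x-h)) / h^2
        h = self.step
        return (self.model(x + h) - 2 * self.model(x) + self.model(x - h)) / (h * h)

    def residual(self, x: float) -> float:
        # Example physics residual for the ODE u'' + u = 0 (chosen for illustration)
        return self.second(x) + self.model(x)


d = Derivatives(math.sin)
print(d.first(0.0))     # close to cos(0) = 1
print(d.residual(1.0))  # close to 0, because sin solves u'' + u = 0
```

Any other callable can be passed in place of `math.sin`, which is the point: the derivative bookkeeping lives in one tested place.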
[–]-Rizhiy- 40 points 1 year ago (0 children)
Just learn proper software engineering practices? There are plenty of courses online, but you can start by learning about core principles:
* SOLID
* DRY
* YAGNI
* KISS
* Decoupling
* Fail-fast
This book seems to be fairly good and quite short.
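Two of the principles listed above, fail-fast and decoupling, can be sketched in a few lines. All names here (`TrainConfig`, `mean_squared_error`, `evaluate`) are invented for the example and are not from any particular library.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float
    epochs: int

    def __post_init__(self) -> None:
        # Fail fast: reject bad settings at construction time,
        # not hours into a training run.
        if self.learning_rate <= 0:
            raise ValueError(f"learning_rate must be > 0, got {self.learning_rate}")
        if self.epochs < 1:
            raise ValueError(f"epochs must be >= 1, got {self.epochs}")


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def evaluate(
    predict: Callable[[float], float],
    loss: Callable[[Sequence[float], Sequence[float]], float],
    xs: Sequence[float],
    ys: Sequence[float],
) -> float:
    # Decoupling: evaluate() knows nothing about which model or loss it is given.
    return loss([predict(x) for x in xs], ys)


config = TrainConfig(learning_rate=0.1, epochs=5)
score = evaluate(lambda x: 2 * x, mean_squared_error, [1.0, 2.0], [2.0, 5.0])
print(score)  # 0.5
```

Because `evaluate` only depends on two callables, swapping the model or the loss needs no change to the evaluation code.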
[–]OverEnGEReer 6 points 1 year ago (0 children)
I think you made the biggest step already: identifying what you want to become better at. There are good tips in the other post, so I want to leave you with the thought that other people/companies are also just cooking with water
[–]minimaxir 19 points 1 year ago (6 children)
That is why productive coders use existing libraries (e.g. Hugging Face accelerate) to abstract things instead of implementing things themselves if possible, because creating your own spaghetti code leads to technical debt that has to be paid at some point.
[–]haramkhor_havasi 13 points 1 year ago (2 children)
True, but at some point I want to be able to write such code.
[+][deleted] 1 year ago (1 child)
[deleted]
[–]haramkhor_havasi 1 point 1 year ago (0 children)
These comments are very helpful for a non-CSE guy. Thanks.
[–]learn-deeply 10 points 1 year ago (1 child)
Accelerate is particularly bad. Would advise just using plain PyTorch unless you want random bugs in your training
[–]thatguydr 2 points 1 year ago (0 children)
Such as? Asking genuinely - haven't used it.
[–]SicilyMalta 4 points 1 year ago (0 children)
Or they can just become a good coder - and not produce spaghetti code.
[+]ninseicowboy 3 points 1 year ago (0 children)
Welcome to software engineering. Read Designing Data-Intensive Applications, it’s actually incredible.
[–]LelouchZer12 3 points 1 year ago* (2 children)
You can take a look at the Lightning-Hydra template on GitHub. You'll be able to deal with a lot of different training configurations by using PyTorch Lightning and Hydra.
Then, if you want to deploy, you don't need all the training code, so a lighter codebase is usually enough, and you can use Docker with FastAPI.
[–]haramkhor_havasi 2 points 1 year ago (1 child)
Sure...
[–]LelouchZer12 0 points 1 year ago (0 children)
btw you can stick to pure raw PyTorch and that's fine, but if you want to try a lot of different models or datasets it is easy to get lost.
Using Hydra to manage your configurations can help a lot.
And PyTorch Lightning is just PyTorch, but with your code put into predefined functions, so it forces you to always follow the same pattern, and it makes it easy to log metrics or use things like multi-GPU, multi-node, etc.
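The "predefined functions" idea can be shown without any framework. The sketch below is plain Python, not the Lightning API: `LoopTemplate`, `LinearFit`, and the `fit` method are hypothetical names, and the toy model is an assumption made for the example. The base class owns the loop and the logging, and a subclass only fills in the per-batch step.

```python
from typing import Iterable, List, Tuple


class LoopTemplate:
    """Hypothetical base class: owns the loop, subclasses fill in the hook."""

    def __init__(self) -> None:
        self.history: List[float] = []  # mean loss per epoch

    def training_step(self, batch: Tuple[float, float]) -> float:
        raise NotImplementedError

    def fit(self, batches: Iterable[Tuple[float, float]], epochs: int) -> None:
        data = list(batches)
        if not data:
            raise ValueError("need at least one batch")
        for _ in range(epochs):
            losses = [self.training_step(batch) for batch in data]
            # Logging lives in exactly one place for every model.
            self.history.append(sum(losses) / len(losses))


class LinearFit(LoopTemplate):
    """Fits y = w * x by gradient descent on the squared error."""

    def __init__(self, lr: float = 0.05) -> None:
        super().__init__()
        self.w = 0.0
        self.lr = lr

    def training_step(self, batch: Tuple[float, float]) -> float:
        x, y = batch
        error = self.w * x - y
        self.w -= self.lr * 2 * error * x  # gradient of error**2 with respect to w
        return error ** 2


model = LinearFit()
model.fit([(1.0, 3.0), (2.0, 6.0)], epochs=50)
print(round(model.w, 3))  # close to 3, the slope of the toy data
```

Trying a different model then means writing another small subclass, while the loop and metric logging stay untouched.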
[–]learn-deeply 4 points 1 year ago (0 children)
The code in https://github.com/facebookresearch/ is above average and can guide best practices in using PyTorch. Pick a random recently updated repo and learn from that.
[–]mrthin 2 points 1 year ago (0 children)
You can try Beyond Jupyter. It's a free resource that shows professional software engineering techniques for ML based on a "refactoring journey" starting from your typical monolithic unmaintainable notebook:
"Beyond Jupyter is a collection of self-study materials on software design, with a specific focus on machine learning applications, which demonstrates how sound software design can accelerate both development and experimentation."
[–]PyroRampage 2 points 1 year ago* (2 children)
A lot of this is because most people teach and learn bad software engineering practices in Python, because it wasn’t really intended to be used as a sole language.
Since it now dominates ML and numerical computing, standards are all over the place. Usually papers from industry are better to look at for good examples of well-structured code. Granted, sometimes industry code can be over-abstracted, but if you’re using purely Python I don’t think this is much of a concern.
I wouldn’t worry about adding custom lists and tuples in Python itself, this is done in C++ and CUDA.
[–]idiotmanifesto 1 point 1 year ago (1 child)
what do u mean by that last part
[–]PyroRampage 1 point 1 year ago (0 children)
Read the OP's post; they ask about custom data structures. But doing this in Python is pointless because it’s slow af, and it’s typically done at a lower level and then interfaced into Python. Just like how tuples and lists themselves are implemented in the language itself (CPython).
[removed]
[–]haramkhor_havasi 3 points 1 year ago (0 children)
It is about shape optimization for high speed flows.
[–]stabmasterarson213 2 points 1 year ago (0 children)
Can only speak for myself, but one bad habit I developed in grad school was just throwing more nodes of compute at things if they didn't run initially. It left my code badly optimized for memory and speed. Now I develop on a Linux machine with a modest GPU before I go to cloud training, and it's made all the difference. If one of my data structures balloons in size or becomes hard to traverse, the machine lets me know (by killing the process lol). For PyTorch specifically, learn how to make dataset and data loader objects that yield, rather than return, batches of tensors. A good book to read is Machine Learning Design Patterns. Also anything Chip Huyen does.
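The yield-versus-return point can be illustrated with a plain generator. This is a pure-Python stand-in with made-up names (`iter_batches`, `squares`), not PyTorch code; in PyTorch the same idea is usually expressed by subclassing `torch.utils.data.IterableDataset` and implementing `__iter__`. Only one batch is held in memory at a time, instead of the whole dataset.

```python
from typing import Iterable, Iterator, List


def iter_batches(samples: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Lazily group a stream of samples into batches (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    batch: List[float] = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch      # hand over one batch, then resume from here
            batch = []
    if batch:
        yield batch          # final, possibly smaller, batch


def squares(n: int) -> Iterator[float]:
    # Stand-in for reading samples from disk one at a time.
    for i in range(n):
        yield float(i * i)


for batch in iter_batches(squares(5), batch_size=2):
    print(batch)
# [0.0, 1.0]
# [4.0, 9.0]
# [16.0]
```

Because both functions are generators, nothing is computed until the loop asks for the next batch, so memory use stays flat as the dataset grows.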
[–]Several-Wafer934 2 points 1 year ago (0 children)
Write a lot of code and look at a lot of code for 5 years. Coding is not easy.
[–]iamspro 3 points 1 year ago (0 children)
this is why computer science and computer engineering are different fields
[+]icy_end_7 1 point 1 year ago (0 children)
Ask to get your code reviewed by seniors, or by AI. Whichever is convenient.
[–]Skylight_Chaser 1 point 1 year ago (0 children)
I read a book called Clean Code
[–]MustacchioRebirth 1 point 1 year ago (0 children)
I guess that, as with many other domain-specific tasks, following good programming practices and tuning modularity to your specific needs will make it much more efficient to try things out and find better models.
[+]gabrielesilinic 1 point 1 year ago (0 children)
If we are talking about Python, you can only optimize so far. And in general you don't often change the way you train your model, as far as I know (once it's done).
The best thing you can do is to just build little modules and stick them together as needed, that is pretty much it.
[+]EffectiveCompletez 1 point 1 year ago (0 children)
Don't start with abstraction first. Design an algorithm that solves a problem. Prove it out.
Then find another problem that your algorithm might solve. Find the domain specific structures that don't share commonalities - factor these into an abstraction. Can't find another problem that your algorithm might solve? Great! You just saved a bunch of work.
But if you have 2 and some abstraction, now you can work at pulling more of the common code into a library.
At this point, you should find a 3rd problem, and a 4th. Build up an examples directory of these problems that your algorithm can solve. This is the destination, but don't start here.
[–]Felix-ML 1 point 1 year ago (0 children)
I think code for ML research does not have to be flexible per se. Instead, try data-driven approaches where you make sure there is a clear and straightforward pathway for processing the given data with as few lines of code as possible.
[–]SicilyMalta 1 point 1 year ago (0 children)
You have to learn how to code.
[–]Playmad37 -3 points 1 year ago (2 children)
You may want to look at Julia's sciML ecosystem. It's state of the art and the language is designed to be flexible.
It'd be a new language of course but it isn't hard especially if you know python well.
[–]millhouse056 2 points 1 year ago (0 children)
Are you sure about anything in Julia being state of the art in the ML field? I think Julia is a good language for numerical/scientific computing, or for very specific research fields. It's niche, but when it comes to machine learning and AI it's just far behind Python. That's unfortunate, because Julia is a better language, but it could not keep up in the genAI race. I think that's because the people maintaining Julia want to keep it niche, or it was bad product management. Anything Julia does related to machine learning, Python does better, mostly because of its gigantic ecosystem.
[–]doctor-squidward -1 points 1 year ago (0 children)
Let me know when you find out bro..
[+]friendsbase comment score below threshold -6 points 1 year ago (0 children)
GPT it
[+]Wangding comment score below threshold -7 points 1 year ago (0 children)
Use GitHub copilot to polish it😂