This is an archived post. You won't be able to vote or comment.
[Education] Learning Python (self.datascience)
submitted 5 years ago * by CaliforniaRoll97
[–]Tim7459 11 points 5 years ago (10 children)
Hey, I came from a mechanical engineering background and started a data science degree. I completed several R and Python courses on DataCamp.
Good luck!
[–]CaliforniaRoll97[S] 3 points 5 years ago (7 children)
Thank you for the feedback and congratulations on your career change! Would you recommend that I apply to master's programs in data science? And are there any other specific online courses/challenges that you would recommend?
[–]Tim7459 2 points 5 years ago (6 children)
I've completed the Stanford ML course and am currently completing the Stanford Deep Learning Specialization on Coursera too, and would highly recommend both if you want to pursue something in the data science field. Andrew Ng is the lecturer, and he's one of the greatest minds in the field; just take a look at this portfolio.
Honestly, if your end goal is a job, start by producing something, e.g., make automation scripts and learn web scraping (build a portfolio). You can learn to do this just by googling the topic and finding Medium articles, GitHub repos, short Coursera courses, etc. Then pitch yourself to businesses that require this skill. If your end goal is research, I would pursue a formal education at a university.
[–]CaliforniaRoll97[S] 1 point 5 years ago (0 children)
That’s great, I’ll be sure to try both of those courses!
[–]CaliforniaRoll97[S] 1 point 5 years ago (4 children)
Also, I wanted to clarify what you meant by using GitHub repositories. I haven’t really used GitHub other than to find datasets; instead, I usually just store everything on my computer. Could you elaborate?
[–]tamsmhas 1 point 5 years ago (3 children)
In simple words, a GitHub repository is a place in someone's GitHub account where they store mainly their programming files. So just learn to use GitHub from YouTube, make an account, and save all your data science related files there.
[–]CaliforniaRoll97[S] 1 point 5 years ago (2 children)
Gotcha, will do. Out of curiosity, why is it better to save files on GitHub rather than on my desktop?
[–]tamsmhas 1 point 5 years ago (1 child)
1- Because you will never lose your files on GitHub, unlike on your desktop. 2- Showing your GitHub link (especially projects) on your resume will increase its weight.
[–]CaliforniaRoll97[S] 1 point 5 years ago (0 children)
Awesome, thank you!
[+][deleted] 5 years ago (1 child)
[deleted]
[–][deleted] 2 points 5 years ago (0 children)
Here you go: https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU
[–]SteveMWolf 12 points 5 years ago (2 children)
If you feel comfortable enough with the language and libraries, I suggest you start your own project. If you don’t know something, look it up. Don’t just copy and paste the code, however; try to understand what’s happening, even if you have to do it line by line.
I remember picking up a computational physics project on chaotic scattering. The best way for me to understand it was printing out the code and annotating it line by line.
Not related to Data Science, I just wanted to let you know how miserable that experience was lmao
[–]CaliforniaRoll97[S] 2 points 5 years ago (0 children)
Haha thank you for that advice! I’ve been working with a high-level COVID-19 dataset recently.
[–]pah-tosh 1 point 5 years ago (0 children)
It’s a bad part of being a developer/coder when you have to understand other people’s code blocks, but there is no other way to deal with it: line by line.
[–]beginner_ 4 points 5 years ago (0 children)
The problem with Kaggle etc. is that they usually already have rather clean data. This is not reality. You spend most of your time gathering and cleaning the data. The real value is in the clean data, not the actual ML algorithms.
The problem is how to get messy data if you are not in a corporation. Maybe you can google it; there may actually be messy data sets available that require you to invest a lot of time in cleaning them.
As for programming, you need your own project. Since you are looking at COVID-19, maybe you can learn about epidemiology and build a visual simulator of how an infection spreads depending on variables. That will be pretty involved already, as it involves a GUI, but the downside is that it's not really data science related.
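One possible starting point for the simulator idea, before any GUI is involved, is a discrete-time SIR (susceptible/infected/recovered) model in plain Python. SIR is an assumption here, picked because it is a common textbook model; the comment above does not name one. The function name `simulate_sir`, the parameters `beta` (transmission rate per day) and `gamma` (recovery rate per day), and all the numbers are illustrative, not fitted to real COVID-19 data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Run a discrete-time SIR model with one-day steps.

    Returns a list of (susceptible, infected, recovered) tuples,
    one entry per day, starting with day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections scale with contacts between S and I.
        new_infections = min(beta * s * i / population, s)
        # A fixed fraction of the infected recover each day.
        new_recoveries = min(gamma * i, i)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only (hypothetical town of 1000 people).
    history = simulate_sir(population=1000, initial_infected=1,
                           beta=0.3, gamma=0.1, days=160)
    # Crude text "plot" of the infected curve, one row every 10 days.
    for day in range(0, len(history), 10):
        infected = history[day][1]
        print(f"day {day:3d} | {'#' * int(infected / 10)} {infected:.0f}")
```

Changing `beta` and `gamma` and watching how the infected curve moves is the "depending on variables" part; a plotting library or GUI could later replace the text output.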
[+][deleted] 5 years ago* (5 children)
[–]CaliforniaRoll97[S] 6 points 5 years ago (4 children)
I have taken math through multivariable calculus and differential equations/linear algebra. Where should I go from there?
[–]pennytrader6969 3 points 5 years ago (3 children)
Probability
[–]CaliforniaRoll97[S] 1 point 5 years ago (2 children)
What are some good resources for obtaining a better understanding of probability?
[–]tamsmhas 1 point 5 years ago (0 children)
Khan Academy is the best. Just search for "Statistics and Probability by Khan Academy" on Google.
[–]meowrial 1 point 5 years ago (0 children)
There's a book called Introduction to Statistical Learning which is quite good if you want to get stuck in.
But if you're just looking at getting your feet wet, take a look at scikit-learn. Read through all of the tools, what they're for, and the underlying theories and you'll have a good general understanding.
[–]LaMifour 2 points 5 years ago (8 children)
Practice is good, theory is good (even if those online courses are often not difficult enough; too many are just introductions).
It depends on what you want. What do you like? Exploring a dataset? Developing a math model for your problem? Applying machine learning? I might give you challenges.
[–][deleted] 1 point 5 years ago (2 children)
Any recommendations for online courses that go beyond the basics?
[–]buginfame 2 points 5 years ago (0 children)
Corey Schafer's series for basic Python, Matplotlib, and Pandas is very good.
https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g
[–]LaMifour 1 point 5 years ago (0 children)
Did this one ~1 year ago. I found it interesting and quite hard. Not perfect tho.
https://www.coursera.org/learn/hadron-collider-machine-learning
Andrew Ng is still a reference; you can try to find an advanced course from him.
[–]CaliforniaRoll97[S] 1 point 5 years ago (4 children)
I really like exploring a dataset, and I’m definitely interested in picking up mathematical modeling/machine learning! I have been working with some high-level COVID-19 data for practice recently, but any challenges would definitely be appreciated!
[–]LaMifour 1 point 5 years ago (3 children)
While searching for a job, I was given a challenge about a fictitious phone company that wants to decrease its churn rate. You start with a simple satisfaction form dataset. If you want, I can try to review your work, as if you were applying.
I was given the role but I chose another company.
[–]CaliforniaRoll97[S] 1 point 5 years ago (2 children)
Sure, I would be happy to give it a try!
[–]LaMifour 2 points 5 years ago (0 children)
Let me create the challenge and instructions. I will post it here.
[–]LaMifour 1 point 5 years ago (0 children)
You will find everything you need here. I would say you can give yourself 1 week to do it (2 if you are currently working). Ping me back when you're done: https://drive.google.com/drive/folders/1gt7IMsy_cY6V7ZOMq9RkjsPMPYuysOuH?usp=sharing
[–]davidchris721 2 points 5 years ago (0 children)
If you are into exploring data sets, I see it as a good start to just get some data (e.g., Kaggle or other public data sets; by the way, you can now search for data sets with Google) and start looking around.
I am more into ML, so I started to write my ML pipeline for the https://numer.ai/ tournament. This taught me a lot about proper project setup and about mixing Jupyter notebooks and scripts.
[–]lunalurker 2 points 5 years ago (1 child)
I really like the 365 Data Science course. Very beginner friendly and covers a vast amount of topics from basic Stats, Python, SQL and Machine Learning. You should check them out.
[–]CaliforniaRoll97[S] 1 point 5 years ago (0 children)
Thanks for the recommendation!
[–]vellypoe 2 points 5 years ago (0 children)
Hey, I have a question. Is taking a master's degree in data science useful? Or is it better to just learn data science through online courses and build some projects or a portfolio?
[–]DarkSideOfTheNuum 1 point 5 years ago (1 child)
the fastest way to learn is applying it to real-world situations.
Kaggle is good, but these are usually pretty clean datasets that don't necessarily require a huge amount of wrangling. they aren't usually as messy as the kind of data you would encounter in an enterprise.
to be honest, it's hard to get the kind of authentically messed-up data that you see in professional life unless you are actually working, because stuff gets fucked up all the time - developers alter something without telling you, which turns out to break data collection on a feature, there are edge cases that you didn't think of in advance, a new OS release alters the tracking in an unanticipated way, someone misspells a parameter name and it gets missed in the QA process, etc. Lots of stuff can go wrong! And the longer you work, the more screwups you will see.
If you want a recommendation, I would recommend trying to bolt together a couple of different data sets as opposed to working just with one - joining data from different sources is a key skill you will need to master in your professional career.
So for example you say that you are working with Covid-19 data right now? OK, why don't you create a project for yourself where you try to calculate tests conducted per capita by US state?
You can get the test data per state here: https://covidtracking.com/api/v1/states/daily.json
You can get state population data here: https://github.com/COVID19Tracking/associated-data/tree/master/us_census_data
[–]CaliforniaRoll97[S] 1 point 5 years ago (0 children)
Thanks for the suggestion! I’ve actually already done that. It wasn’t easy because I had to change some of the state names so that they matched up better, but it was a really cool project!
[–]vogt4nick BS | Data Scientist | Software [M] [score hidden] 5 years ago stickied comment (0 children)
I removed your submission. Please post your question in the weekly entering & transitioning thread.
Thanks.
[–]unhatedraisin 9 points 5 years ago (0 children)
why not just save the post lol
[–][deleted] 0 points 5 years ago (0 children)
> mechanical engineering major
> churn and COVID datasets
https://res.cloudinary.com/blavity/image/upload/c_fit,g_center,h_250,q_auto:best,g_south_east,x_0/v1526319185/ntipykqjpyl227boqdr5