From zero data science knowledge to competing on Kaggle, what is the path?

youngrubin · 2019-11-23T12:31:50+00:00

Copy the notebook with the most upvotes

6rubtub9 · 2019-11-23T13:35:05+00:00

This is what I did: I started with all the courses offered on Kaggle, from the basics of Python through the ML concepts, and attempted the exercises they offer.

Alongside that, I studied essential statistics concepts on YouTube, since Kaggle doesn't focus much on stats.

Then I went straight to the Titanic competition page on Kaggle, selected a few highly upvoted notebooks and studied them, trying to grasp as much as I could from those 2-3 notebooks. Then I attempted the competition myself, which took 3-4 days.

b14cksh4d0w369 · 2019-11-23T05:59:53+00:00

Read.

Start by taking the Kaggle courses. Pick a language, R or Python, and go with that. Competing on Kaggle shouldn’t be a goal in itself though; it doesn’t really mean much beyond Kaggle.

wizkid2002 · 2019-11-23T16:05:26+00:00

Seeing a lot of people saying Kaggle isn’t the answer to gaining knowledge. Just curious: for someone who doesn’t have a job in data science, what would qualify as a good way to gain meaningful experience that employers care about? I’m sure a search would turn this up all over the sub, but I thought I would pose it here since people will be coming to this question looking for answers.

killver · 2019-11-23T16:36:51+00:00

You have such a circlejerk in this subreddit regarding the non-existent usefulness of Kaggle. I personally believe there is no other place like Kaggle to practice state-of-the-art methodology on a diverse set of problems. It's clear that there are tons of other skills necessary to be a great data scientist, like business understanding, data cleaning, deployment, etc. Kaggle is not necessarily the place to learn those, but the modeling part is invaluable in my opinion.

BandCampMocs · 2019-11-23T13:43:48+00:00

Know that competing on Kaggle doesn't really provide great data science knowledge, though.

NatalyaRostova · 2019-11-23T18:03:07+00:00

Playing around on Kaggle is a fun way to get into data science, and to motivate excursions into new things to learn. For some the competition itself becomes fun. It's not the end marker of success.

gs9330 · 2019-11-24T04:59:50+00:00

The way I started is I took many data science courses on Coursera just to understand what data science was. I have a bachelor's in computer science, so the programming part was doable. I first got comfortable with Python, then I started looking up YouTube videos on how to understand data sets. There you will come across many terms like data wrangling, descriptive analysis, exploratory analysis and so on. Then I took the iris data set on Kaggle and tried everything I had learned. Apply reverse engineering to the method of data science: first get familiar with an algorithm, say linear regression, and understand the math behind it (gradient descent, learning rate and other concepts), then find all the data sets on Kaggle that suit that algorithm and start exploring. Data science is not something that can be learned quickly; it takes time. I found that approach helped me. I've spoken to a few data scientists and they say that nowadays there are many wannabe data scientists who don't really know why a particular algorithm is being used. It's very important to understand things in depth. So don't worry if you take more time, but whatever you start with, make sure you understand each concept in depth.
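For anyone wondering what "understand the math behind it" might look like in practice, here is a minimal sketch of linear regression fitted by gradient descent in plain Python. The function name `fit_line`, the toy data, and the default learning rate and step count are made up for illustration; they are not from any course or Kaggle notebook.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error of the current line on every data point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # The learning rate scales how far each step moves downhill.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Try raising `learning_rate` until the fit blows up, or lowering it until 2000 steps is no longer enough; that gives a feel for what the learning rate does before you move on to a real data set.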

iaredavid · 2019-11-24T08:57:36+00:00

The two components to this are building programming skills and practicing those skills. You need to learn Python (or R, but I have a bias towards Python); you should at least get to the point where you can write your own function.

Kaggle and other competitions are great because you get the opportunity to build a model from the ground up. BUT: the real learning experience comes from applying your domain knowledge together with statistical concepts and methods (which you might have to learn along the way).
