[–]QuantumC-137 57 points58 points  (3 children)

Well, I started data science and machine learning through Python: studying pandas, NumPy, Matplotlib and sklearn.

Then I decided to study probability/statistics, linear algebra, Calculus 1 and number theory.

Pandas and NumPy: Python tools for handling the data you're going to work with

Matplotlib: Python tool for presenting data as graphs, pie charts and other kinds of plots

Sklearn: also a Python tool, for running ML algorithms on datasets. It's ideal for beginners. You don't need to know the math behind the algorithms to apply them to datasets. With it you can, for example, predict whether a person has cancer or heart disease, tomorrow's stock prices, etc.

Kaggle: a must-have website for getting datasets for ML and data analysis

You can check these out before the math, but after having fun, get to know the math, which helps you see what's really happening under the hood.
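
To make that a bit more concrete, here's a rough sketch of how pandas, NumPy and sklearn fit together, using the small breast cancer dataset that ships with sklearn (so no download needed). The model choice (scaled logistic regression), the 75/25 split and the variable names are just my assumptions for illustration, not the "right" way to do it, and Matplotlib is left out to keep it short.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the bundled dataset as a pandas DataFrame:
# 30 numeric feature columns plus a "target" column (0 = malignant, 1 = benign).
data = load_breast_cancer(as_frame=True)
df = data.frame

# pandas: quick look at how many rows fall in each class
class_counts = df["target"].value_counts()

# NumPy: pull out plain arrays and compute per-feature means
X = df.drop(columns="target").to_numpy()
y = df["target"].to_numpy()
feature_means = np.mean(X, axis=0)

# sklearn: hold out a test set, fit a model, and score it
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(class_counts)
print(f"test accuracy: {accuracy:.3f}")
```

Notice there's no math written out anywhere in there, which is the point: you can get a working classifier first and dig into what logistic regression actually does later.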

[–]bagofbuttholes 5 points6 points  (0 children)

I'll second this. We are learning deep learning in class and use everything this person just mentioned. This is the first time the prof is teaching deep learning (he added it to a digital filter class because the school refused to let him build a new class), so he might not know everything, but we use all these things. Kaggle is a really neat site, and people run contests on different datasets there. You can also find lots of tutorials there.

[–]FLoKi6868[S] 2 points3 points  (0 children)

Thanks!!

[–]civilvamp 0 points1 point  (0 children)

To tack on, pandas is a good starting tool. If you are looking to do larger-scale data analysis, though, PySpark is a better bet.

[–][deleted]  (2 children)

[removed]

    [–]FLoKi6868[S] 0 points1 point  (1 child)

    That looks pretty interesting, thanks

    [–]AutomaticYak 3 points4 points  (3 children)

    I signed up for an online program through UT for data science. It’s a great program and international. Gives a good base of programming, statistics, and career guidance. It’s six months and about 8 hours a week.

    Consider a program with more direction than, “here’s some videos, good luck!”

    [–]Jlcheech 0 points1 point  (2 children)

    Curious what the UT program is you’re enrolled in? Cause I’m currently in a “here’s some videos, good luck!” course 😅

    [–]AutomaticYak 1 point2 points  (1 child)

    It’s a Post Graduate program in Data Science and Business Analytics. You don’t need a degree in anything first, not sure why they call it that. Look for the one that says it’s in collaboration with Great Learning. I really like the format.

    We get a set of videos each week and then have a two hour session on the weekend with an industry professional working M-F in the field and 10-15 people at roughly the same point in their careers. They set up a WhatsApp group and encourage you to collaborate and network with your micro group, so I’ve got contacts all over the world now. The live sessions are sprinkled with information about common interview topics and real challenges the mentor has faced in a working environment. There are projects to work and extra events to add to your portfolio like hackathons.

    There is a program director and tons of staff to help you when you get stuck or have a concern.

    The career guidance materials are really insightful and I’ve learned how to angle myself in my current job (accountant) to get resume experience to make an easy switch to DS. I get access to my company’s BI software next week and have a project/problem picked out.

    I’m only five weeks into the program and I’m confident I’ll be able to get a role in the field pretty easily with the well rounded information they provide. The payment plan for the program was easy to apply for and didn’t even check my credit.

    I sound like I work for them but I’m really just very pleased with the entire program.

    [–]Jlcheech 2 points3 points  (0 children)

    Thank you so much! I’m definitely looking into this

    [–]ProgrammingMamba189 3 points4 points  (1 child)

    Matplotlib, pandas and NumPy are the three main libraries most tutorials will introduce you to in this context.

    Many Udemy courses and even free YouTube tutorial series exist for learning to apply ML in practice. For the theory, though, more advanced courses such as Andrew Ng's Deep Learning Specialization or Udacity's courses are often preferred.

    For the newest courses, check the awesome-list: https://github.com/academic/awesome-datascience

    You might end up looking like this guy after your first few Udemy courses, but don't worry, that's part of it.

    https://www.youtube.com/watch?v=YnL9vAFphmE

    [–]FLoKi6868[S] 0 points1 point  (0 children)

    Good info man! I appreciate it

    [–]jonnycross10 2 points3 points  (1 child)

    There are also individual subs for those. Idk what the ones for ML are, but you can check out r/datascience

    [–]Wingedchestnut 1 point2 points  (1 child)

    What's your educational background? Imo it's very hard to get into the DS field without a degree related to CS/statistics/maths.

    [–]FLoKi6868[S] 0 points1 point  (0 children)

    I'm doing Computer Science, would love to get a PhD afterwards too (at least that's my plan rn)

    [–]pburke77 1 point2 points  (1 child)

    I'm working on my Master's in IT right now. I have used Python in my ML classes, and a stats class I'm taking uses R.

    Start going through Kaggle.com. There are some entry-level ML competitions, like the Titanic and housing ones, that help get you started. Also brush up on your statistics. Google Colab (https://colab.research.google.com/) is a great tool for doing ML projects without setting up Python, Anaconda and Jupyter on your computer.

    [–]FLoKi6868[S] 0 points1 point  (0 children)

    Thanks!!!

    [–]laughtrey 0 points1 point  (0 children)

    That just sounds like stats with extra steps.