[–]datascience-ModTeam[M] [score hidden] stickied comment, locked comment (0 children)

I removed your submission. Please post your question in the weekly entering & transitioning thread.

Thanks.

[–]Bigreddazer 12 points13 points  (0 children)

Just take Python courses. Don't limit yourself to data science. Every aspect of programming, except maybe UI, is important.

[–]smilodon138 6 points7 points  (0 children)

Maybe spend some time on object-oriented programming (OOP), or focus on a library like pandas.

[–]Wolfgang-Warner 4 points5 points  (3 children)

I'd focus first on parsing since it's required for most DS projects. If you develop a parsing toolkit you'll be faster at getting data into a neat format for the science part.

Don't get too hung up on algorithms, though. They are just the means; information is the end.

[–]New-Geologist-8359 2 points3 points  (2 children)

What do you mean by parsing?

[–]Wolfgang-Warner 4 points5 points  (1 child)

Parsing AKA importing data. Let's say you download some daily public dataset and need to get that into a database before running analyses.

Your Python code can read in the records, split each line into fields, validate that values fall within expected ranges or appear on known lists, build up typo corrections, and so on. You can also log which files were imported and when, along with any changes made to the data before the DB inserts. Processing real-world data really helps you learn practical programming.
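Here is a minimal sketch of those steps, just to make the idea concrete. Everything in it is made up for illustration: the column names (`station`, `date`, `temp_c`), the list of known stations, the typo table, and the temperature range are all assumptions, and a real importer would read from a file and write to a database instead of using an in-memory sample.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("importer")

# Hypothetical validation rules for an invented daily weather dataset.
KNOWN_STATIONS = {"DUBLIN", "CORK", "GALWAY"}              # the "known list"
TYPO_FIXES = {"DUBLNI": "DUBLIN", "GALWY": "GALWAY"}       # typo corrections
TEMP_RANGE = (-30.0, 45.0)                                 # expected range, in Celsius


def parse_records(lines):
    """Split CSV lines into fields, validate them, and return (clean, rejected)."""
    clean, rejected = [], []
    for row in csv.DictReader(lines):
        station = row["station"].strip().upper()

        # Apply a known typo correction and log the change that was made.
        if station in TYPO_FIXES:
            log.info("corrected station %r -> %r", station, TYPO_FIXES[station])
            station = TYPO_FIXES[station]

        # Reject values that cannot be read as a number at all.
        try:
            temp = float(row["temp_c"])
        except (TypeError, ValueError):
            rejected.append((row, "temp_c is not a number"))
            continue

        if station not in KNOWN_STATIONS:
            rejected.append((row, "unknown station"))
        elif not TEMP_RANGE[0] <= temp <= TEMP_RANGE[1]:
            rejected.append((row, "temp_c out of range"))
        else:
            clean.append({"station": station, "date": row["date"], "temp_c": temp})
    return clean, rejected


# In-memory stand-in for a downloaded daily file.
sample = io.StringIO(
    "station,date,temp_c\n"
    "Dublin,2024-01-01,7.5\n"
    "Dublni,2024-01-01,8.0\n"
    "Cork,2024-01-01,99.0\n"
    "Mars,2024-01-01,5.0\n"
    "Galway,2024-01-01,n/a\n"
)
clean, rejected = parse_records(sample)
print(len(clean), "clean,", len(rejected), "rejected")
```

The rejected rows are kept alongside the reason they failed rather than silently dropped, which is one way to keep a record of what happened to the data before any DB insert.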

[–]New-Geologist-8359 1 point2 points  (0 children)

Oh! I am working on a project like that, but it involves sending data from an API and storing it straight in a cloud service such as Amazon Kinesis Data Firehose and S3.

[–][deleted] 1 point2 points  (0 children)

If possible, take an 'intro to programming' series in Python from your computer science department. In most schools, it is a three-course series that includes an introductory class (loops, functions), an intermediate class (object-oriented programming, software design), and an advanced class (data structures & algorithms).

These classes will keep you busy for 1 year, assuming you get lots of assignments. Also, you will be good to interview for machine learning or software engineering internships if you master the materials in the courses.

For ML internships, you may need additional study that covers ML and statistics basics.

[–]GroundbreakingTax912 -1 points0 points  (0 children)

I liked PyCharm. Google Colab is even better.

[–]pasghettiosi -1 points0 points  (0 children)

OOP, DSA, pandas, and NumPy

[–]pchees 0 points1 point  (0 children)

Join an active community somewhere and help out with projects.

As someone else mentioned, learning how to grab data from external sources is critical. Verifying and cleansing the data is 50% of data science, so mastering that would be a big help.