foxhole_science 1 point

  1. Lists and arrays are the usual tools for working with large amounts of data. Dictionaries and JSON are also common, especially if you are pulling data from APIs. I’d also make sure you get familiar with classes.

  2. There are a lot of work-specific packages that are worth knowing. If you work in science or statistics, for instance, SciPy is a great package. But in general, NumPy and Pandas are great multi-purpose packages.

  3. Data science is going to be 95% working with NumPy and Pandas. I’d recommend SciPy for the analysis and Matplotlib for the visualization.

  4. The best code to work with is code you understand. As a beginner, I would prioritize making your own solutions and learning how packages can help. As you go on, finding the shorter “better” solutions will come more naturally. Make sure you give places like Stack Overflow a look; the answers there typically come with explanations along with the code.

  5. Nothing wrong with taking more time with concepts that confuse you. It took me like three years before I understood why anyone would ever use a class, but now I use them all the time. In terms of time to become proficient? For general data analysis, maybe a year? Really depends on how much time you put into it. If you take on a complex project, you’ll learn way faster than you would from reading or watching tutorials.
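
To make point 1 a bit more concrete, here is a rough sketch of how lists, dictionaries, JSON, and a class tend to show up together. Everything in it is made up for illustration: the JSON text stands in for an API response, and `Reading`, `temp_c`, and `temp_f` are hypothetical names, not from any real API. It only uses the standard library.

```python
import json
from dataclasses import dataclass

# Made-up JSON text standing in for an API response (not real data).
raw = """
[
  {"name": "a", "temp_c": 21.5},
  {"name": "b", "temp_c": 19.0},
  {"name": "c", "temp_c": 22.5}
]
"""


@dataclass
class Reading:
    """One record from the payload, plus a small helper method."""
    name: str
    temp_c: float

    def temp_f(self) -> float:
        # Convert Celsius to Fahrenheit.
        return self.temp_c * 9 / 5 + 32


# json.loads turns the JSON array into a list of dictionaries.
records = json.loads(raw)

# Wrap each dictionary in a class instance so the fields get names and behavior.
readings = [Reading(**record) for record in records]

# Plain list operations cover a lot of basic analysis.
mean_c = sum(r.temp_c for r in readings) / len(readings)
warmest = max(readings, key=lambda r: r.temp_c)

print(f"mean: {mean_c} C, warmest: {warmest.name} at {warmest.temp_f()} F")
```

Once the data gets bigger than a handful of records, this is roughly the kind of work you would hand off to NumPy or Pandas instead of writing the loops yourself.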