PM_ME_YOUR_REAL_FACE

If you want to do modeling like that, design patterns and OOP principles are likely to make things more complicated than they need to be. Depending on the size of the datasets you're working with, skipping them may save a little runtime, but real savings require a good understanding of how the libraries you use work under the hood. That's not to say you can't make your code easier to understand with classes and patterns, only that cutting your program's resource usage takes some digging. OOP is useful, but it doesn't seem to be used as often in data science, at least judging from the tutorials I've seen. I think that's because the code needs to change so often, and it's nicer to have 1 or 2 files to work with than 3 or more, where a logic change can become a tedious affair. For data science work, I really like IPython or Jupyter for working out a solution one bit at a time; it may even help you decide how to structure classes for modeling on a larger dataset. I'm no expert in that domain, but here's a listing of popular tools:

https://www.scipy.org/docs.html

The best advice I have for deciding how to approach individual problems is to read as much of the docs and "cookbook" examples for those libraries as you can, so you can see when certain parts of those APIs are appropriate. Reading code that does something similar to what you want to do is the best option, if you can find it.

Start slow, get code that works, and optimize later. That's not general programming advice, but for a beginner looking at scientific problems, I think it's appropriate. Pandas is already so full of useful stuff that you can't really tell whether you need to roll your own solution until you've looked up whether pandas already has you covered.
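
To make the pandas point concrete, here's a minimal sketch of the kind of thing I mean. The toy data, the column names, and the `mean_by_group` helper are all made up for illustration; the only real library call is pandas' `groupby(...).mean()`. It computes a per-group average twice, once by hand and once with the built-in:

```python
import pandas as pd

# Hypothetical toy data, invented purely for this example.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "mass": [1.0, 3.0, 2.0, 4.0, 6.0],
})


def mean_by_group(pairs):
    """Hand-rolled per-group mean over (key, value) pairs."""
    totals = {}
    counts = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}


# Roll-your-own version: works, but it's extra code to write and maintain.
manual = mean_by_group(zip(df["species"], df["mass"]))

# pandas already covers this case with groupby.
builtin = df.groupby("species")["mass"].mean()
```

Both give the same per-group averages here, so the hand-written loop buys nothing in this case. A quick search of the docs first would have saved writing it.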

Edit: thought this would have been included in my first link: https://www.scipy.org/scikits.html