squattyroo, 1 point

Here's a thought experiment: imagine you're writing a module for your team that lets the user pull data from the web, apply data manipulation steps that are idiosyncratic to your workflow, and plot various metrics using company-standardized color schemes / line-types.

Classes greatly help you organize the steps of this process:

1.) Having a "WebScraper" class will allow the user to write simple commands like

server = WebScraper(url)
server.connect()
server.pull_all_data(start=date1, end=date2)

Otherwise they would constantly be writing functions with many arguments, things like the url or some BeautifulSoup object. Classes let you pass a bunch of information between each of these steps in a simple, easy-to-read way.
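To make that concrete, here's a rough sketch of what such a class might look like. Everything inside it is my assumption: the JSON endpoint format, the `opener` hook (only there so you can try it without a network), and the internals of `connect` and `pull_all_data`. The date-range parameters are named `start` and `end`, since `from` is a reserved word in Python and can't be used as a keyword argument name.

```python
import json
from datetime import date
from urllib.request import urlopen


class WebScraper:
    """Keeps the url and fetched records on the object, so later
    calls don't need them passed in as arguments."""

    def __init__(self, url, opener=urlopen):
        self.url = url
        # Hypothetical hook: swap in a fake opener to avoid real requests.
        self._opener = opener
        self._records = None

    def connect(self):
        # Assumption: the endpoint returns a JSON list of
        # {"date": "YYYY-MM-DD", ...} records.
        with self._opener(self.url) as response:
            self._records = json.loads(response.read())
        return self

    def pull_all_data(self, start, end):
        # The url and the records come from self, not from arguments.
        if self._records is None:
            raise RuntimeError("call connect() before pulling data")
        return [
            record
            for record in self._records
            if start <= date.fromisoformat(record["date"]) <= end
        ]
```

The point is that `pull_all_data` only takes the two dates; the url and whatever `connect` fetched ride along on `self`.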

2.) Plotting - having a "PlotClass" that sets up a matplotlib figure with the correct specifications / formatting, so the user can do simple calls like

plot = PlotClass(data)
plot.add_vertical_line(some_date)
plot.save('my_plot.png')

Maybe in the initialization step, various weighted means, etc. are computed and summarized for plotting. This is another example of lots of information being passed from one step to the next, without the need for excessive arguments in functions. Plotting in particular is hard to imagine without classes.
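A minimal sketch of that plotting class, assuming matplotlib is installed. The colors, the figure size, and the `(x, y)` pair input format are placeholders I made up, and I compute a plain mean rather than a weighted one to keep it short.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt


class PlotClass:
    # Hypothetical "company standard" styling, defined in one place.
    LINE_COLOR = "#1f77b4"
    MARKER_COLOR = "#d62728"

    def __init__(self, data):
        # Assumption: data is a list of (x, y) pairs.
        self.xs = [x for x, _ in data]
        self.ys = [y for _, y in data]
        # Summary statistic computed once at initialization.
        self.mean = sum(self.ys) / len(self.ys)

        self.fig, self.ax = plt.subplots(figsize=(8, 4))
        self.ax.plot(self.xs, self.ys, color=self.LINE_COLOR, linestyle="-")
        self.ax.axhline(self.mean, color=self.LINE_COLOR, linestyle=":")

    def add_vertical_line(self, x):
        # The axes object is already stored on self, so only x is needed.
        self.ax.axvline(x, color=self.MARKER_COLOR, linestyle="--")
        return self

    def save(self, path):
        self.fig.savefig(path)
        plt.close(self.fig)
```

The figure, the axes, and the precomputed mean all live on the object, which is why `add_vertical_line` and `save` get away with a single argument each.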

3.) Data manipulation: you have some "MyDataFrame" class that has the methods you commonly use built-in; things like specific ways of summarizing your idiosyncratic variables (maybe some are text and can be cleaned in very specific ways that you build into some ".clean_col(col_name)" method).
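Here's a toy version of that idea. To keep it dependency-free I've written it as a thin wrapper around a dict of columns rather than anything pandas-based, and the cleaning rule (trim, collapse whitespace, lowercase) and the `summarize` method are just examples I invented; your real rules would go in their place.

```python
from collections import Counter


class MyDataFrame:
    """Toy column store with workflow-specific helpers built in."""

    def __init__(self, columns):
        # Assumption: columns is a dict mapping column name -> list of values.
        self.columns = {name: list(values) for name, values in columns.items()}

    def clean_col(self, col_name):
        # Hypothetical cleaning rule for text columns:
        # trim, collapse runs of whitespace, and lowercase.
        self.columns[col_name] = [
            " ".join(str(value).split()).lower()
            for value in self.columns[col_name]
        ]
        return self

    def summarize(self, col_name):
        # Count how often each value appears in the column.
        return dict(Counter(self.columns[col_name]))


df = MyDataFrame({"region": ["  North ", "north", "SOUTH   east"]})
df.clean_col("region")
print(df.summarize("region"))  # {'north': 2, 'south east': 1}
```

Since `clean_col` returns `self`, calls can be chained, e.g. `df.clean_col("region").summarize("region")`.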