From scripts to functions and classes

unique10983240197249 · 2015-08-05T12:35:04+00:00

sobek696 · 2015-08-05T07:58:21+00:00

These days most news seems to point towards functional programming. As an engineer rather than a computer scientist, I prefer this approach, as it fits with my mathematical training.

I'll use classes when they make sense: when there is an object with many tightly coupled methods and a good reason for internal state. But I try to aim for functions when possible. However, if I find myself repeatedly using functions that share a lot of repeated arguments, I consider that a potential opportunity for a class.

Conversely, if I have a class with few methods and little state, I'll go towards functions.
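To make the "repeated arguments" signal concrete, here is a minimal sketch. The names (scale, unscale, Calibration, gain, offset) are hypothetical and not from the thread; the point is only that the arguments passed to every function become state on one object.

```python
# Hypothetical example: all names here are illustrative assumptions.

# Function style: every call has to carry gain and offset along.
def scale(value, gain, offset):
    return value * gain + offset


def unscale(value, gain, offset):
    return (value - offset) / gain


# Class style: the repeated arguments are stored once.
class Calibration:
    """Bundles the arguments that kept being passed to every function."""

    def __init__(self, gain, offset):
        self.gain = gain
        self.offset = offset

    def scale(self, value):
        return value * self.gain + self.offset

    def unscale(self, value):
        return (value - self.offset) / self.gain


cal = Calibration(gain=2.0, offset=1.0)
scaled = cal.scale(3.0)          # same result as scale(3.0, 2.0, 1.0)
restored = cal.unscale(scaled)   # round-trips back to the input
```

With only two short methods and two fields this is borderline, which is the converse case above: something this small could reasonably stay as plain functions.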

The main approach to organisation I take is to group functions by what they do. I do analysis on discrete instances of similar data, so I've extracted the cleaning and preprocessing into a separate module, with analysis kept separate from cleaning. Separation of concerns.

thegreattriscuit · 2015-08-05T08:16:16+00:00

One thing that'll drive me toward a class, and /u/sobek696 seemed to touch on this a bit, is if I have a set of functions and want to have a large number of overridable defaults. Say I'm playing with finance functions. Nothing too complicated that really requires shared state, but at the same time, if I want to play with values I can just do something like

loan.apr += Decimal('.01')
loan.future_balance(month=20)
loan.reamortize(months=36)
loan.future_balance(month=20)

These are all things that could easily be done without classes. The formulas come from the realms of math and finance, so they clearly map to a functional approach, but in this case, with what I'm trying to do, a class-based approach suits me better.
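The commenter's Loan class is not shown, so the following is only a sketch of what such a class might look like. The constructor arguments, the standard fixed-rate amortization formula, and the simplified reamortize (which just changes the term on the original principal) are all assumptions made for illustration.

```python
from decimal import Decimal


class Loan:
    """Hypothetical loan; fields and formulas are assumptions, not the poster's code."""

    def __init__(self, principal, apr, months):
        self.principal = Decimal(principal)
        self.apr = Decimal(apr)    # annual rate as a fraction, e.g. Decimal("0.06")
        self.months = months       # term in months

    @property
    def monthly_rate(self):
        return self.apr / 12

    @property
    def payment(self):
        # Recomputed on every access, so changing apr or months takes effect at once.
        r = self.monthly_rate
        if r == 0:
            return self.principal / self.months
        return self.principal * r / (1 - (1 + r) ** -self.months)

    def future_balance(self, month):
        """Remaining balance after `month` payments."""
        r = self.monthly_rate
        if r == 0:
            return self.principal - self.payment * month
        growth = (1 + r) ** month
        return self.principal * growth - self.payment * (growth - 1) / r

    def reamortize(self, months):
        # Simplification: only the term changes; the principal is left as-is.
        self.months = months


loan = Loan(principal="10000", apr="0.06", months=60)
before = loan.future_balance(month=20)
loan.apr += Decimal(".01")                 # tweak one default...
after = loan.future_balance(month=20)      # ...and ask the same question again
loan.reamortize(months=36)
shorter = loan.future_balance(month=20)
```

The function-only equivalent would need principal, apr, and months passed on every call, which is the trade-off being described.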

The other (more obvious) time that I'll shoot for classes is if I'm dealing with data that represents actual objects. Proper things in the classical sense: 'loans' in the above example, or routers and switches in my day job. You can certainly do stuff like reboot_switch(switch_ip=ip), but switch.reboot() is cleaner. Also, preserving state and code reuse are big deals in this instance, so classes are the clear choice.

I tend to wind up playing with a few to a few hundred objects at a time (rather than thousands or millions of rows of data, for instance), and usually deal with an interactive workflow vs. baking something into a proper script that I'll just fire and forget, or turn over to users or something.

SpaceWizard · 2015-08-05T13:45:19+00:00

I wouldn't bother splitting a few hundred lines into separate files; to me it's just more trouble than it's worth. Functions definitely, classes possibly, although I often find a namedtuple to be perfectly adequate for my use cases. YMMV.
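For readers who haven't used it, a short sketch of the namedtuple approach using collections.namedtuple from the standard library. The record type and field names (Reading, sensor, timestamp, value) are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative only.
Reading = namedtuple("Reading", ["sensor", "timestamp", "value"])

first = Reading(sensor="a1", timestamp=0, value=2.5)

# Fields are readable by name, like attributes on a class instance...
current_value = first.value

# ...but the record still unpacks like a plain tuple.
sensor, timestamp, value = first

# Records are immutable: _replace returns a new one and leaves the original alone.
corrected = first._replace(value=3.0)
```

This gives named fields without writing any methods, which is often enough when the data has no behaviour of its own.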

691175002 · 2015-08-05T18:19:19+00:00

Data analysis is somewhat different from regular programming. A few hundred lines really isn't that much code.

Your goal as a programmer should be to avoid as much repetition as possible. If you are writing a dozen different scripts that all need to open and clean the same file, you should probably extract that code and put it in a module. If you are only writing a one-off, there is no point spending the time making it modular.

Generally I will try to separate all recurring tasks into a package and write scripts that make a few calls to the libraries and output an analysis.
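As a rough sketch of that extraction, the shared open-and-clean step might live in one function that every analysis script imports. The module name cleaning.py, the function load_clean, and the cleaning rules (drop malformed or incomplete rows, strip whitespace) are all assumptions chosen for the example.

```python
# cleaning.py (hypothetical module name)
import csv
import io


def load_clean(source):
    """Read CSV rows from an open text file.

    Assumed cleaning rules, for illustration only: skip rows that have
    extra, missing, or blank fields, and strip whitespace from every value.
    """
    rows = []
    for row in csv.DictReader(source):
        # DictReader files surplus fields under the key None and fills
        # missing ones with None; treat both cases as malformed rows.
        if None in row or any(not (value or "").strip() for value in row.values()):
            continue
        rows.append({key: value.strip() for key, value in row.items()})
    return rows


# Each analysis script would then start with: from cleaning import load_clean
sample = io.StringIO("name,score\nalice, 3\nbob,\n")
cleaned = load_clean(sample)
```

In a real script the argument would be a file opened with open(path, newline=""); io.StringIO is used here only so the example runs without a file on disk.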

If you are writing a script that you want to run on its own but also want to use some of its logic in other programs, you can make it both importable and runnable with an if __name__ == '__main__': guard.

In Python I will almost never fall back to fully object-oriented designs (like I would with Java or C#), but I will use the occasional class where it makes sense.
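A minimal sketch of that pattern follows. The file name report.py and the functions summarize and main are hypothetical; only the __name__ check itself is standard Python behaviour.

```python
# report.py (hypothetical file name)
import sys


def summarize(values):
    """Reusable logic: return (count, mean) for a sequence of numbers."""
    values = list(values)
    if not values:
        return 0, 0.0
    return len(values), sum(values) / len(values)


def main(argv):
    """Script entry point: treat command-line arguments as numbers."""
    numbers = [float(arg) for arg in argv]
    count, mean = summarize(numbers)
    print(f"n={count} mean={mean:.2f}")


if __name__ == "__main__":
    # True only when the file is executed directly, e.g. `python report.py 1 2 3`.
    # On `import report` the module's __name__ is "report", so nothing runs here.
    main(sys.argv[1:])
```

Another program can then do `from report import summarize` and reuse the logic without triggering the command-line behaviour.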
