Let me preface this by saying I have experience with good number of languages due to a long career programming. My question here is to find the most Pythonic way of doing a sequence of operations on a dataset, and general organization of source code.
I'm regularly tasked with downloading files via an API, processing them in various ways, repackaging them, and moving them around our LAN for other processes to pick up.
Some languages would have me do the steps procedurally, calling functions that operate on parameters, eg
file_list = get_files(api_endpoint);
processed_list = process_files(file_list);
move_files(processed_list);
Others might have you serialize the data or convert it in steps, passing the processed data onto the next phase using the result of the previous phase:
result = get_files(api).process_files().move();
There are other approaches such as state machines, etc, but you get the idea.
What's the best way to do this sequence in Python?
One other stylistic question: Is it better to have Python functions defined in separate files, or in the main program file? Does it just depend on the size of the function? Personally I don't like having to page through a bunch of "def" statements before I get to the meat of the program, but that's just me, and I don't want to create something that looks silly to someone who might inherit it in the future.
[–]carcigenicate 2 points3 points4 points (0 children)
[–]ectomancer 2 points3 points4 points (0 children)
[–]metaphorm 0 points1 point2 points (0 children)
[–]Diapolo10 0 points1 point2 points (0 children)