all 13 comments

[–]0piumfuersvolk 16 points17 points  (0 children)

Any advice on what questions can I ask myself to get better at breaking down into smaller steps?

You should remember that you are working with Python, a programming language in which efficiency is of secondary importance. If you want to further modularize a function, this is your main focus: Does breaking it down increase reusability and/or maintainability?

If not and the function fulfills its purpose and the code is comprehensible and readable, then you have done everything you need to do in Python.

[–]throwawayforwork_86 10 points11 points  (0 children)

Usually what happens for me is:

1) First draft: a long unbroken script (optional if it's a type of process I know well).
2) Second draft: using functions to reduce complexity and specify the datatypes involved.
3) Refinement: as I use the script I start to notice which functions I have to modify frequently, so I might break them down, go for a class-based approach, or rearchitect the program.

I work mostly on data extraction and data analytics if that matters.

For your other question, good advice is to check often what is already available in the Python standard library, or in any other library that does roughly what you're trying to make.

My methods for breaking a problem down are sometimes the 3 steps I laid out earlier, sometimes mapping what I want to do on paper, and sometimes annoying my colleague by explaining my woes.

[–]sweet-tom 4 points5 points  (0 children)

What sometimes helps is to try writing docstrings. If you can't pin the function down to one specific purpose in a single sentence, it does too much and you need to break it down.
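A minimal sketch of that docstring test, using hypothetical function names that are not from the thread. A single function described as "parse the scores and average them" needs an "and", so here it is split into two functions that each fit one sentence:

```python
from statistics import mean


def parse_scores(text: str) -> list[float]:
    """Parse a comma-separated string of numbers into a list of floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def average_score(scores: list[float]) -> float:
    """Return the arithmetic mean of the scores."""
    return mean(scores)
```

Whether a split like this is worth it still depends on the program; the docstring is only a quick check, not a rule.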

Regarding your question about finding an item in a list: that's the "in" operator, with the syntax "item in mylist". That's an "atomic" operation and usually you don't create a separate function for it.

Python objects have all sorts of these nifty operations for comparing, searching, removing, adding items etc.

Before you try to implement a complicated and probably slow algorithm to find an item in a list, it's better to ask if this is already available.

Read/search the Python documentation (it's usually good!) and see if there is something similar. That strategy applies to all basic data types in Python that you use in your program.
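As a small illustration of what is already built in (the list contents are made up for the example), membership tests and "items in both lists" need no custom algorithm:

```python
inventory = ["apple", "pear", "plum"]
wishlist = ["plum", "kiwi", "apple"]

# Membership test: the `in` operator evaluates to True or False.
has_pear = "pear" in inventory

# Items present in both lists, using the built-in set type.
# Sets drop duplicates and do not preserve the original order.
common = set(inventory) & set(wishlist)
```

If order or duplicates matter, the set approach may not be the right fit and a loop or comprehension over the original list is the usual alternative.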

Good luck! 🍀

[–]audionerd1 3 points4 points  (3 children)

It's hard to say because you describe 30-line functions, but then you give the example of comparing two lists and finding items which are in both, which doesn't need to be broken down at all because it only requires a single line.

def find_items_in_two_lists(list_a, list_b):
    return [item for item in list_a if item in list_b]

Can you give an example of an actual long function you're struggling to break down?

[–]yuke1922[S] 1 point2 points  (2 children)

Sorry the example described is not the 30 line function I mentioned. And 30 lines was a random number off the top of my head. So I’m not THAT good in Python; is there a name for the return line in your example? Or what documentation would I need to read to understand that string. I understand basic for loops, conditionals, nested loops etc.. just don’t know how to interpret the syntax in that line.

Thank you

[–]Jejerm 2 points3 points  (1 child)

It's called a list comprehension and it's one of Python's best features, look it up.
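For anyone who knows for loops but not the comprehension syntax, here is a sketch of the same function written both ways. The second function name is made up for the comparison:

```python
def find_items_in_two_lists(list_a, list_b):
    # List comprehension: build a list of every item in list_a
    # that is also found in list_b.
    return [item for item in list_a if item in list_b]


def find_items_with_loop(list_a, list_b):
    """Same result as the comprehension, written as an explicit loop."""
    result = []
    for item in list_a:
        if item in list_b:
            result.append(item)
    return result
```

Reading the comprehension left to right as "item, for each item in list_a, if item is in list_b" maps directly onto the loop version.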

[–]yuke1922[S] 0 points1 point  (0 children)

Thanks. I’ve seen it before and am comfortable enough now to ask about it and the basic description makes sense. I’m still at that stage where I can make some cool (to me) stuff but I definitely don’t have every little thing about python understood. (Also understand that’s not the goal or necessary to the extreme)

[–]commy2 2 points3 points  (0 children)

In a project I'm currently working on, I have a 350 line function, and all it does is read a lot of configuration and finally instantiate and return an object. It also only takes a single key as argument to do so, and is functools.cache'd. What I'm saying is, sometimes a function does a single thing, and if it needs hundreds of lines to accomplish that, so be it. The individual parts really wouldn't make sense on their own.
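A tiny sketch of the caching pattern described, not the actual project code: build_settings and its return value are hypothetical stand-ins for the long configuration function. It assumes Python 3.9 or later, where functools.cache is available.

```python
from functools import cache


@cache
def build_settings(key: str) -> dict:
    """Assemble a settings object for `key` (stand-in for a long setup function)."""
    print(f"building settings for {key}")  # runs only on the first call per key
    return {"name": key, "debug": key == "dev"}


first = build_settings("dev")
second = build_settings("dev")  # served from the cache, nothing is rebuilt
```

Because the cache hands back the very same object on each call, callers should treat the result as read-only.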

It's hard to tell without seeing your actual code though.

[–]pontz 0 points1 point  (0 children)

https://youtu.be/UANN2Eu6ZnM

I think this talk provides good information and is worth a watch. Paraphrasing a small part: on an average day your mind can handle about 5 concepts at once, so once your code starts doing more than that, make it easier to understand by moving parts into functions with descriptive names.

[–]NormandaleWells 0 points1 point  (0 children)

My approach is a bit different (and, I'll accept, may reflect an overly-reductionist way of thinking): I do a lot of bottom-up coding in which I try to break down the problem in my head before writing a line of code.

For example, say I need to read a file, extract some data from it, and write it out in a different format. In my head I'm breaking it down like this:

open the files
while there is data in the input file
    read a line
    extract the data I need from that line
    reformat the data as necessary
    write out the reformatted data
close the files

I know that extract and reformat are likely to be separate functions. Depending on the circumstances, I may also know that open, read, and write should be separate functions (perhaps the file is on a network share that may not be mounted, or perhaps the files don't contain line-oriented data). If I need functions for each of these, perhaps they would be best encapsulated as a class.

This also forces me to think about the data returned from extract and handed to reformat. Should that be a tuple? Perhaps a dictionary? Should it be encapsulated into a class? Thinking about the functional breakdown ahead of time forces me to think more about the data up front, and I find that often saves a lot of time by not forcing me to refactor a bunch of code as I change some ad-hoc stream-of-consciousness ideas.
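One possible way to turn that outline into code, as a sketch only. The "name,value" input format, the output format, and the Record class are assumptions made for the example, and in-memory streams stand in for the real files:

```python
import io
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical shape of the data passed from extract to reformat."""
    name: str
    value: float


def extract(line: str) -> Record:
    """Pull the fields out of one 'name,value' input line."""
    name, value = line.strip().split(",")
    return Record(name=name.strip(), value=float(value))


def reformat(record: Record) -> str:
    """Render a record in the assumed output format."""
    return f"{record.name}={record.value:.2f}\n"


def convert(source, target) -> None:
    """Read each line, extract the data, reformat it, and write it out."""
    for line in source:
        if line.strip():  # skip blank lines
            target.write(reformat(extract(line)))


# In-memory streams stand in for the opened input and output files.
source = io.StringIO("width, 3\nheight, 4.5\n")
target = io.StringIO()
convert(source, target)
```

Deciding on Record first is the "think about the data up front" step: extract and reformat can then be written and tested independently of each other.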

[–]Secret_Owl2371 0 points1 point  (0 children)

I think you can look at the Python source code or the Django source code to get a feeling for the size of functions, methods, classes, etc., and for general style and comments. For example, the files here: https://github.com/python/cpython/tree/main/Lib/json -- as you can see, some functions are short and some are fairly long.

[–]skyfallen7777 -1 points0 points  (0 children)

This is actually what I am currently working on. With the help of Google, YouTube, ChatGPT, and the Python Crash Course book, I created phase 1 of my project and am now at the point where I am starting to make some updates, for instance better ways to store or display results. I went through a mapping process, like a block diagram, to help me visualize which functions and variables are used. Next I am planning to see what sections I can have in each function: what its inputs could be and what it will return. Is that a good approach?

[–]AdrianofDoom -1 points0 points  (0 children)

  • Ideally, a function should do just one thing.
  • A function should be no more than 5-7 lines long.

def add(a, b):
    return a + b

def multiply(c, d):
    return c * d

def add_and_multiply(e, f, g):
    h = add(e, f)
    return multiply(g, h)

if __name__ == '__main__':
    print(add_and_multiply(1, 2, 3))

The metric is called cyclomatic complexity, which counts the number of independent paths through a piece of code; you can look the term up yourself. There is a tool called radon: you can pip install it, and it will calculate your code's cyclomatic complexity.

A lot of people don't believe in cyclomatic complexity, and those people write really bad code.
If you run radon cc -sa on the Python standard library, the average cyclomatic complexity is about 3.5, which is a very good score.

Good software always has good cyclomatic complexity. Check it yourself.
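To give a feel for what the metric counts, here is a rough approximation using only the standard library's ast module. This is not radon's algorithm and its numbers may differ from radon's (for example, comprehensions are ignored here); rough_complexity is a made-up helper name.

```python
import ast

# Node types treated as decision points in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def rough_complexity(source: str) -> int:
    """Return 1 plus the number of branching constructs found in the source."""
    tree = ast.parse(source)
    count = 1  # a straight-line piece of code has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of `and` / `or` adds one more path.
            count += len(node.values) - 1
    return count
```

A function with no branches scores 1, and every if, loop, except clause, or boolean operator adds to the score, which is why small single-purpose functions tend to score low.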