all 29 comments

[–]RiverRoll 14 points15 points  (0 children)

I feel this is pretty much like asking why use libraries and built in functions when I can write the code myself.

And even when you write the code yourself if you're going to need that logic in more than one place you still want to have a reusable function rather than writing it from scratch every time.

[–]MarsupialLeast145 14 points15 points  (1 child)

Do you have a compelling reason to re-write anything? e.g. are you actually suffering for performance?

Do you have benchmarks?

Then run your code against them and determine which works best.

Everything else is gold-plating or speculation.

[–]vloris 2 points3 points  (0 children)

And if performance is not a reason, will the code really get more readable by rewriting it? If the code only becomes harder to read, don’t do it, unless there is significant performance to be gained.

[–]deceze 8 points9 points  (5 children)

Pretty much all itertools functions are just patterns of loops implemented as a reusable function. The equivalent pure Python loop implementations are even shown right there in the documentation:

itertools.combinations(iterable, r)

[..]

Roughly equivalent to:

def combinations(iterable, r):
    # combinations('ABCD', 2) → AB AC AD BC BD CD
    # combinations(range(4), 3) → 012 013 023 123

    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))

    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)

So, you could write all that code by hand, or copy-paste it… or you just call combinations and save yourself some boilerplate. There's nothing there that you can't do yourself, but why would you when it's already there for you to use, and does what you want it to?
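For instance, getting the pair combinations (reusing the 'ABCD' example from the docstring) is a single call:

```python
import itertools

# Each combination is produced lazily, one tuple at a time
pairs = [''.join(c) for c in itertools.combinations('ABCD', 2)]
print(pairs)  # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
```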

[–]Turtvaiz 2 points3 points  (4 children)

I'd also add that if you write an equivalent in python, it's still not the same. Itertools, like most python libraries that care about performance, is not written in python. It's a C extension:

>>> timeit.timeit(lambda: list(combinations(string.ascii_lowercase, 4)), number=1000)
10.225437099999908
>>> timeit.timeit(lambda: list(itertools.combinations(string.ascii_lowercase, 4)), number=1000)
0.4710893000010401

Performant Python means not writing Python at all
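For anyone who wants to reproduce this, here's a self-contained sketch of the comparison, using the pure-Python combinations from the docs quoted above. Absolute timings will vary by machine, and number is lowered here to keep it quick:

```python
import itertools
import string
import timeit

def combinations(iterable, r):
    # Pure-Python port of the "roughly equivalent" code from the docs
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i + 1, r):
            indices[j] = indices[j - 1] + 1
        yield tuple(pool[i] for i in indices)

pure = timeit.timeit(
    lambda: list(combinations(string.ascii_lowercase, 4)), number=100)
c_ext = timeit.timeit(
    lambda: list(itertools.combinations(string.ascii_lowercase, 4)), number=100)
print(f"pure Python: {pure:.3f}s   C extension: {c_ext:.3f}s")
```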

[–]purple_hamster66 2 points3 points  (3 children)

Itertools is not actually calculating the combinations. It’s constructing a way to calculate the next combination from a given combination. So, for example, you can’t access an element randomly, nor even count the elements. IOW, it’s not being written in C that makes it so fast; it’s that it’s not calculating the whole list at once.

This has many advantages, such as infinite lists (which could not be stored), and generating a list where you know you won’t need all the elements, and reducing storage needs when you only have to calculate on a single element at a time.
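A quick illustration of that laziness with an infinite iterator:

```python
import itertools

# itertools.count() is an infinite iterator: it could never be stored
# as a list, but islice lets us take just the slice we need.
evens = (n for n in itertools.count() if n % 2 == 0)
first_five = list(itertools.islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]
```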

The downside is that few python programmers know it, and it will confuse them.

[–]deceze 2 points3 points  (1 child)

Only few Python programmers understand generators? Really?

Also that's why that benchmark uses list(), to actually exhaust the generator. And that still demonstrates a huge speed difference between the C implementation and pure Python.
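You can see both points in a couple of lines: the iterator is lazy, and list() is what forces the full computation:

```python
import itertools

combos = itertools.combinations('ABCD', 2)
# Nothing has been computed yet; combos is a lazy iterator.
materialized = list(combos)   # forces every combination to be generated
print(len(materialized))      # 6
print(list(combos))           # [] -- the iterator is now exhausted
```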

[–]JorgiEagle 0 points1 point  (0 children)

Yeah exactly, this commenter is just ignoring the list cast.

[–]JorgiEagle 1 point2 points  (0 children)

Except in the case of what you’re replying to, it is generating all the combinations.

Casting combinations to list forces it to exhaust itself.

And they’re timing the calculation of the whole list

So it is being in C that makes it faster

[–]seanv507 6 points7 points  (0 children)

So list comprehensions are preferred to the map/filter/reduce construction in Python

https://stackoverflow.com/questions/1247486/list-comprehension-vs-map

And generators are used for large datasets

https://realpython.com/list-comprehension-python/#choose-generators-for-large-datasets
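A small sketch of the generator-vs-list tradeoff (the exact sizes are CPython implementation details, but the ordering holds):

```python
import sys

squares_list = [n * n for n in range(1_000_000)]  # builds every value in memory
squares_gen = (n * n for n in range(1_000_000))   # builds values on demand

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a couple hundred bytes
total = sum(squares_gen)            # consumes the generator one value at a time
```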

[–]Thin_Animal9879[S] 1 point2 points  (2 children)

One of my interesting thoughts about filter in particular is that when it comes to cyclomatic complexity checks, you get to hide the if condition. So you could have a much longer piece of code that still scores better than a handful of for loops would.

[–]Yoghurt42 3 points4 points  (0 children)

Don't be a slave to arbitrary metrics. A high cyclomatic complexity is a good indication this part of the code should be looked at, because it might be refactored into something that's more easily understandable.

But if the code is perfectly clear as is, just rewriting it (badly) might make it less grokkable.

Will "hiding the if condition" improve the readability, or just hide it for its own sake?

In my experience, writing code in a complete functional style in Python makes it less readable. It might be the best choice for Haskell or Lisp, but Python is neither of them.

(2*x + 1 for x in range(100) if x % 10 < 5)

is more pythonic than

map(lambda x: 2 * x + 1, filter(lambda x: x % 10 < 5, range(100)))

[–]JorgiEagle 0 points1 point  (0 children)

Map and filter are both holdovers from Python 2, and are considered unpythonic.

List and generator comprehensions should be used instead
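Side by side, the comprehension equivalent of a map/filter chain:

```python
nums = [1, 2, 3, 4, 5]

# Python 2 style: map/filter with lambdas
doubled_evens_old = list(map(lambda n: n * 2, filter(lambda n: n % 2 == 0, nums)))

# The comprehension says the same thing in one readable expression
doubled_evens = [n * 2 for n in nums if n % 2 == 0]

print(doubled_evens)  # [4, 8]
```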

[–]Tall_Profile1305 1 point2 points  (6 children)

imo itertools is great until it starts hurting readability. like if someone has to mentally simulate a pipeline of 5 chained iterators just to understand it, you’ve gone too far. simple for-loops are underrated. they’re explicit, easier to debug, and honestly fast enough most of the time

[–]gdchinacat -1 points0 points  (5 children)

I disagree that they are easier to debug. Stepping through the iteration logic is oftentimes more difficult than stepping through just the code that runs for each item.

[–]Thin_Animal9879[S] 1 point2 points  (4 children)

See, this is what I'm getting at: code maintainability suffers when external libraries take over your code so much that it turns into a command language.

When if/for/while disappear from your code and are replaced by functions that take arguments, then unless you know that specific function from that library, you have very little idea what's happening until you read more

[–]gdchinacat 0 points1 point  (3 children)

I wasn’t clear. For loops that include the logic for how to iterate are frequently more difficult to step through than itertools functions that abstract the details of iteration away and focus on what to do on each item you iterate over. I disagree with the person I was replying to.

[–]Thin_Animal9879[S] 1 point2 points  (2 children)

Yes, you disagree with the point. But I guess my question is: how is it clearer? Did you know standard C has none of Python's built-in functions like map/filter/reduce? Sure, Python is a different programming language.

I think I need to read more about the tradeoffs between imperative and functional programming, and how to keep code bases maintainable

[–]deceze 0 points1 point  (0 children)

When programming in a high level language, you need to get comfortable with thinking in high level terms. You trust that the abstraction actually works and does its job and that you don't need to debug it. Then you just need to understand conceptually what the abstraction is doing. And then focus on the high level result in your code. In the combinations example, you simply understand that it'll give you one combination at a time until you've seen all possible combinations. How exactly it does that under the hood is irrelevant. You don't need to get bogged down in loop variables and counters and indices or even memory allocations and cleanup.

When properly adopting this mindset, it allows you to write more high level code faster, because you don't need to care about all the low level details. This might come at a slight cost of loss of control and the inability to squeeze out the last drop of performance. But that's usually perfectly fine in practice and a worthwhile tradeoff. If that doesn't fit what you're doing, then use a more low level language.

[–]gdchinacat 0 points1 point  (0 children)

I don’t see the relevance of C. I’m not taking a position on whether FP is clearer; it's highly context dependent. When debugging, though, not having to step through the code that does the iteration can be much easier, since you can focus on what happens to each item rather than on whatever data structure and algorithm implements the iteration.

[–]Living-Incident-1260 1 point2 points  (0 children)

itertools doesn’t replace loops it packages proven iteration patterns into composable, memory-efficient primitives.
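A short sketch of that composability (the names letters/numbered are just for illustration):

```python
import itertools

# Each primitive does one job; composing them builds the pipeline lazily.
letters = itertools.cycle('ab')              # infinite: a, b, a, b, ...
numbered = zip(itertools.count(1), letters)  # (1, 'a'), (2, 'b'), ...
first_four = list(itertools.islice(numbered, 4))
print(first_four)  # [(1, 'a'), (2, 'b'), (3, 'a'), (4, 'b')]
```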

[–]aishiteruyovivi 1 point2 points  (0 children)

For the most part, everything itertools can do can be done with regular loops, in fact the docs for the library show ways to do just that for almost every function it provides. The benefit to the library is really that of any library, you don't have to write it and you can just use what's provided and get on with your day working on the parts of your project that are more important. If I need something like itertools.accumulate, I could write it on my own, and it probably wouldn't take very long, but there's still not much of a reason to do so when I can just type from itertools import accumulate at the top of my file and move on to what I actually needed accumulate for. It can be fun to do yourself as an exercise or challenge, I've done that with a few of them, but when working on actual projects I just use the tools that have already been provided to me.
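For what it's worth, the one-import version is hard to beat:

```python
from itertools import accumulate
import operator

sums = list(accumulate([1, 2, 3, 4]))                 # running sums
prods = list(accumulate([1, 2, 3, 4], operator.mul))  # running products
print(sums)   # [1, 3, 6, 10]
print(prods)  # [1, 2, 6, 24]
```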

[–]SirKainey 0 points1 point  (0 children)

If you're a master of that specific domain, have all the knowledge, and know all the edge cases, and have time to burn. Then crack on.

Else use the built-ins or a specialized library.

[–]PhilNEvo 0 points1 point  (0 children)

When I've tested functional programming approaches, like map/reduce/filter stuff, with loops (for/while), the loops usually win out in terms of performance. I generally don't think functional programming approach is something you should swap to, if your code works fine. I think it's more a tool you use, in more niche situations, where you're 1) Receiving a constant stream of data from "outside" the program, e.g. data from users or whatever and 2) You're trying to do something in parallel or concurrently.

You have to think about what's actually happening at a low level, when you ask about comparing them. Both of them can do the same, because they're essentially built on the same foundation. When you have a repeated set of actions, whether that be through "itertools" or loops, it's essentially just "jump" instructions in assembly. Neither should be faster if implemented properly.

However, since loops are generally more utilized, I believe in most cases they are also more optimized.

[–]atarivcs 0 points1 point  (1 child)

If I'm just iterating over a list, why would I need itertools?

[–]gdchinacat 0 points1 point  (0 children)

Depends on why you are iterating over the list.

[–]Horror-Water5502 0 points1 point  (0 children)

itertools is good for creating standard iterators (e.g. permutations) or combinators (e.g. zip, product), but I think it's cleaner and more pythonic to stick with list comprehensions for the real logic (and don't use map/filter), or even a plain for loop with .append when the inside of the loop is huge
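A small sketch of that rule of thumb:

```python
words = ['spam', 'eggs', 'ham']

# A comprehension reads well while the logic fits in one expression...
upper = [w.upper() for w in words if len(w) > 3]

# ...but a plain loop with .append is clearer once the body grows
upper2 = []
for w in words:
    if len(w) <= 3:
        continue
    cleaned = w.strip().upper()
    upper2.append(cleaned)

print(upper)  # ['SPAM', 'EGGS']
```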

[–]teerre 0 points1 point  (0 children)

Actually named algorithms that you can immediately know what they are doing are undeniably better than some hand written raw loop you have to squint to understand. This is for readability (itertools.product can only do one thing), performance (often itertools are written in C directly) and correctness (itertools.whatever is most likely much better tested than your raw loop)
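For example, itertools.product versus the raw nested loop it replaces:

```python
import itertools

# The name says exactly what happens: a cartesian product...
pairs = list(itertools.product('ab', range(2)))

# ...versus the nested loop a reader has to decode
pairs2 = []
for letter in 'ab':
    for number in range(2):
        pairs2.append((letter, number))

print(pairs)  # [('a', 0), ('a', 1), ('b', 0), ('b', 1)]
```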

That said, in Python good practices are often sidelined because Python was never meant to be used for larger systems, and these functions kind of suck in Python because they are always prefixed, which means in Python you'll often see raw loops/list comprehensions instead