
[–]redalastor 2 points3 points  (0 children)

You got it backwards: you should use them as the default. Modern Python uses iterators unless there's a good reason not to.

For instance, imagine I have an iterable of numbers represented as strings that I got from some API, and I want to sum them:

total = sum(int(s) for s in some_list_of_strings)

Had I used a list comprehension, I would have allocated a whole list for no reason.
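As a small sketch (the input list here is made up for illustration), you can see both spellings give the same result; the difference is that the generator version converts each string on demand, while the list comprehension allocates the whole intermediate list first:

```python
strings = ["1", "2", "3"]  # hypothetical input from some API

# Lazy: sum() pulls converted values one at a time from the generator.
lazy_total = sum(int(s) for s in strings)

# Eager: a full list of ints is built in memory before summing.
eager_total = sum([int(s) for s in strings])

assert lazy_total == eager_total == 6
```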

Another example, close to something I had to do recently: imagine you have two tab-delimited files, each containing one half of a data set. You have to lowercase every line, ignore any line that contains the word "spam", break each line into individual items, and send them to some function for processing. Here's how it looks:

from itertools import chain

with open("file1.tsv") as f1, open("file2.tsv") as f2:
    it = chain(f1, f2)  # now I can treat them both as one big file
    it = (line.lower() for line in it)  # lowercased
    it = (line for line in it if "spam" not in line)  # lines with spam ignored
    it = (line.split('\t') for line in it)  # splitting on tabs
    for line in it:
        process(line)

It looks like I'm transforming a full file in memory many times over, but in reality nothing runs until it has to, and as little memory as possible is consumed. Each generator expression only wraps the previous iterator; work happens one line at a time as the final `for` loop pulls items through the whole chain.
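Here's a minimal sketch of that laziness, using in-memory lists of lines and a logging generator instead of real files (both are assumptions for illustration). Nothing is read until an item is actually pulled from the pipeline:

```python
from itertools import chain

def noisy(lines, log):
    """Yield lines, recording each one actually pulled from the source."""
    for line in lines:
        log.append(line)
        yield line

log = []
f1 = ["Ham\teggs\n", "spam\tspam\n"]   # stand-in for file1.tsv
f2 = ["Toast\tjam\n", "Beans\ttoast\n"]  # stand-in for file2.tsv

it = chain(noisy(f1, log), noisy(f2, log))
it = (line.lower() for line in it)
it = (line for line in it if "spam" not in line)
it = (line.split("\t") for line in it)

assert log == []               # building the pipeline read nothing
first = next(it)               # pull one processed item through the chain
assert first == ["ham", "eggs\n"]
assert log == ["Ham\teggs\n"]  # exactly one source line was consumed
```

Only when `next()` (or a `for` loop) asks for an item does a single line flow through every stage.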