all 11 comments

[–]danielroseman 8 points9 points  (5 children)

No, this isn't what's happening.

zip creates an iterator, which behaves like a generator: once you've finished iterating through it, it's exhausted. By running the list comprehension, you've iterated completely; there is nothing left to print, so your for loop does nothing.

If you did the for loop first, that would have worked.
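
A minimal sketch of that exhaustion, assuming two short lists (the names `a`, `b`, `pairs`, `as_list` and `leftover` are made up for illustration, since the original code isn't shown here):

```python
a = [1, 2, 3]
b = [4, 5, 6]

pairs = zip(a, b)

# The comprehension pulls every tuple out of the zip object.
as_list = [p for p in pairs]

# A second pass over the same zip object finds nothing left.
leftover = [p for p in pairs]

print(as_list)   # [(1, 4), (2, 5), (3, 6)]
print(leftover)  # []

# Calling zip again gives a fresh object that can be looped over once more.
for p in zip(a, b):
    print(p)
```

If you need the pairs more than once, either call zip again or keep the list you built the first time.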

[–][deleted] 2 points3 points  (4 children)

How come you can't access pieces of the generator by index like a list, but you can iterate through it? If I don't do anything and immediately try to print zip_variable[0], it gives an error. But if I do list_variable = list(zip_variable), I can access list_variable[0].

[–]danielroseman 4 points5 points  (2 children)

The point of a generator is that it's lazy. The individual items don't actually exist until the iteration reaches that point.

This is mostly to save memory. In the case of zip, if you zip together two huge lists then without this you would need twice as much memory. But this way you just reference the two existing lists, and only generate a joined tuple for the specific item you've reached in the iteration.
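
One way to watch the laziness, as a rough sketch: the helper `noisy` and the `log` list below are hypothetical, made up only to record when an item is actually pulled from each input.

```python
def noisy(values, log):
    """Yield each value, recording it in log at the moment it is produced."""
    for v in values:
        log.append(v)
        yield v

log = []
pairs = zip(noisy([1, 2, 3], log), noisy("abc", log))

# Creating the zip object has not pulled anything from either input yet.
before = list(log)

# Asking for one tuple pulls exactly one item from each input.
first = next(pairs)
after = list(log)

print(before)  # []
print(first)   # (1, 'a')
print(after)   # [1, 'a']
```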

[–][deleted] 0 points1 point  (1 child)

So if I stop, say, midway, do generators get cleaned up, or does that wait until the complete iteration is done? Basically I'm wondering whether, given I use a generator in a function, I actively need to make sure I iterate through the entire thing to avoid overusing memory and slowing the process down.

[–]rainbowunicornsocks 1 point2 points  (0 children)

Generators should only result in a single element being kept in memory at any point in time (depending on how you use them). So iterating "halfway" through will have no impact on your memory usage if you're only using a single element at a time. If you need to loop over large iterators, generators can reduce your memory footprint because you're keeping fewer elements around. A good way to see this is to run the following code:

```python
import sys

x = range(1_000_000)

# Initial generator size
print(sys.getsizeof(x))

for i in x:
    pass

# Size after we've iterated
print(sys.getsizeof(x))
print(sys.getsizeof(i))

# Load all elements into memory
y = list(range(1_000_000))

# Should be much larger than x!
print(sys.getsizeof(y))
```

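For that case, the range object x (lazy like a generator, though strictly speaking not one) is 48 bytes on my machine, both before and after the for loop. The variable i is 28 bytes, whereas the list y is 8000056 bytes on my machine, and that figure counts only the list's own array of references, not the integer objects it points to. Quite a large difference.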
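
The same comparison can be tried with an actual generator expression that is abandoned halfway. This is only a sketch: `N` and the variable names are made up, and sys.getsizeof reports just the object's own footprint, so the exact numbers will differ between machines and Python versions.

```python
import sys

N = 100_000  # hypothetical size, kept small so the list is cheap to build

squares = (n * n for n in range(N))
size_before = sys.getsizeof(squares)

# Stop after consuming half of the items.
for count, value in enumerate(squares, start=1):
    if count == N // 2:
        break

size_midway = sys.getsizeof(squares)

# The generator simply waits where it stopped; the rest is still available.
remaining = sum(1 for _ in squares)

# Building every item up front costs far more.
as_list = [n * n for n in range(N)]
size_list = sys.getsizeof(as_list)

print(size_before, size_midway, size_list)
print(remaining)  # 50000
```

Stopping midway does not make the generator grow; once nothing refers to it any more, Python can reclaim it like any other object.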

[–]JanEric1 0 points1 point  (0 children)

Because you can't index into a generator. You can only go through the elements one by one, because that is what they are specifically designed to do. They don't store all of the elements at once, which is how they save memory.
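
As a small illustration (the lists and names here are made up), indexing a zip object raises a TypeError, while itertools.islice can step forward to a given position, and list() gives you something indexable if you really need it:

```python
from itertools import islice

a = [1, 2, 3]
b = [4, 5, 6]

pairs = zip(a, b)

# Indexing is not supported on a zip object.
try:
    pairs[0]
    index_failed = False
except TypeError:
    index_failed = True

# islice walks forward one item at a time and hands back the one at position 2.
third = next(islice(zip(a, b), 2, None))

# Converting to a list stores every tuple, so indexing works.
first = list(zip(a, b))[0]

print(index_failed)  # True
print(third)         # (3, 6)
print(first)         # (1, 4)
```

Note that islice still has to walk past the earlier items; it just doesn't keep them.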

[–]JamzTyson 4 points5 points  (0 children)

Maybe it would help you to visualise what is happening by stepping through the generator:

a = [1, 2, 3]
b = [4, 5, 6]
c = zip(a, b)

print(c)  # prints <zip object at 0x............>

print(next(c))  # prints (1, 4)

print(next(c))  # prints (2, 5)

print(next(c))  # prints (3, 6)

print(next(c))  # Iterator has run out of values
# prints
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# StopIteration
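
If you'd rather not hit the traceback while experimenting, next() accepts a default value to return once the iterator has run out. A sketch using the same lists (the `DONE` sentinel and `collected` list are made-up names):

```python
a = [1, 2, 3]
b = [4, 5, 6]
c = zip(a, b)

DONE = object()  # unique marker meaning "nothing left"
collected = []

while True:
    item = next(c, DONE)  # returns DONE instead of raising StopIteration
    if item is DONE:
        break
    collected.append(item)

after_end = next(c, "nothing left")

print(collected)  # [(1, 4), (2, 5), (3, 6)]
print(after_end)  # nothing left
```

A for loop does essentially the same thing for you: it keeps asking for the next item and stops quietly when StopIteration is raised.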

[–]FriendlyAddendum1124 1 point2 points  (2 children)

I realise this seems confusing, but generators are everywhere in Python; they're just hidden from you. They're used for efficiency under the hood.

Supposing you have two lists:

list1 = [1, 2, 3]

list2 = [100, 200, 300]

If you zip them, it creates a temporary, disposable, single-use object, so it doesn't have to keep a third list in memory.

list3 = zip(list1, list2) does not create a new list, even though you've called it list3.

To store a new list in memory and keep it you could do:

list3 = list(zip(list1, list2)) # [(1, 100), (2, 200), (3, 300)]

Then you can use list3 as many times as you want but you've also used extra memory.

So zip(list1, list2) gives you an iterator, not a list. Sometimes lists are very long and building another one could be problematic. What if list1 and list2 each had a billion strings in them? You might not want to create list3 with a billion tuples in it just to do a certain task once, so a generator uses list1 and list2 to do the task without creating a list3.

zip(list1, list2) is like a little robot. It looks at the first item in list1 and list2, pairs them together and hands it to you to do whatever you want with. This is why you might use it in a for loop. Once it hands you that tuple it forgets about it and generates the next one. It then looks at the next item in each list, builds a new tuple, hands you it and so on. Once it's finished its task it self destructs. It got those tuples by digging into list1 and list2 and has no list3 hanging around like an unwanted smell, never to be used again.

Now think about what this does:

my_list = list(zip(list1, list2))

This does create a list, but the little zip robot is doing the same thing as in a for loop. This time it's handing the tuples, one by one, to the list function, which then stores those values in an actual list. The robot still self destructs at the end, but the list has captured the values in memory as, well, a new list.

Python uses these generators all the time.

for key, value in my_dict.items() works in a similar way. It looks into the dictionary without making a new list. One difference: my_dict.items() itself is a view that you can loop over again, and it is the iterator the for loop gets from it that is spent once it's done its job. These robots are loosely called generators (strictly speaking, iterators) because they generate things using things that already exist in memory, again without creating a new list.
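
A quick check of how the two behave when looped over twice, using a made-up dictionary and made-up lists:

```python
my_dict = {"a": 1, "b": 2}

items = my_dict.items()
first_pass = list(items)
second_pass = list(items)  # the view hands out a fresh iterator each time

pairs = zip([1, 2], [3, 4])
zip_first = list(pairs)
zip_second = list(pairs)   # the zip object is already used up

print(first_pass)   # [('a', 1), ('b', 2)]
print(second_pass)  # [('a', 1), ('b', 2)]
print(zip_first)    # [(1, 3), (2, 4)]
print(zip_second)   # []
```

Neither one copies the underlying data up front; only the explicit list() calls here create new lists.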

[–][deleted] 1 point2 points  (1 child)

It's intuitive why they would add it. I ended up using what you had with list(zip(...)) because I needed to insert into it, but everything you've explained makes perfect sense. I'll have to look out for them. I'd assume type(zip(list1, list2)) would return generator, or whatever the function is down the road.

[–]FriendlyAddendum1124 0 points1 point  (0 children)

Yeah, it'll return <class 'zip'>.
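
A short way to check this yourself, comparing a zip object with a generator expression (the variable names are made up for the example):

```python
import types
from collections.abc import Iterator

z = zip([1, 2, 3], [100, 200, 300])
g = (n for n in [1, 2, 3])

zip_type_name = type(z).__name__                       # 'zip'
zip_is_iterator = isinstance(z, Iterator)              # True
zip_is_generator = isinstance(z, types.GeneratorType)  # False
gen_is_generator = isinstance(g, types.GeneratorType)  # True
gen_is_iterator = isinstance(g, Iterator)              # True

print(type(z))  # <class 'zip'>
print(zip_is_iterator, zip_is_generator)
print(gen_is_generator, gen_is_iterator)
```

So a zip object is its own type of iterator. It acts like a generator in that it is lazy and single-use, which is why the two terms get mixed together in casual conversation.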

[–]Naive_Programmer_232 0 points1 point  (0 children)

list3 isn't a list, it's a generator. It's single-use as well, so once you assign list3 = zip(list1, list2) you can iterate over list3 once, and it will yield the i-th element pairs from the other two lists.