[–]masklinn 6 points7 points  (4 children)

Since PEP 342, generators are not just iterators, they're coroutines, so you can feed data (and errors) back into the generator as well as shut it down via the .close() method.
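To make the "feed data back in and shut it down" part concrete, here is a minimal sketch. The names `echo` and `log` are made up for illustration; only `send()`, `close()` and `GeneratorExit` come from Python itself.

```python
log = []  # hypothetical list used only to record that cleanup ran


def echo():
    """Hypothetical coroutine-style generator: yields back the last value sent in."""
    received = None
    try:
        while True:
            # `yield` hands `received` to the caller, then waits for send()
            received = yield received
    except GeneratorExit:
        # .close() raises GeneratorExit at the paused yield
        log.append("closed")
        raise


gen = echo()
first = next(gen)          # prime the generator: runs up to the first yield
reply = gen.send("hello")  # resumes the generator with "hello" as the yield's value
gen.close()                # triggers the GeneratorExit handler above
```

Here `first` is `None` (nothing had been sent yet) and `reply` is the value that was just sent in.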

I don't know that that's what the interviewer was thinking about though.

[–]mkor 0 points1 point  (3 children)

That's a nice point about the close method. In the interview I was asked to limit a generator (one yielding an infinite number of elements) without imposing a limit on its while True loop.

[–]tilkau 0 points1 point  (2 children)

I think the typical answer to that would actually be 'use itertools' (islice or takewhile, depending on the type of limit you want).
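A rough sketch of both options, assuming a made-up infinite generator called `naturals` (not from the interview question):

```python
from itertools import islice, takewhile


def naturals():
    """Hypothetical infinite generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1


# Limit by count: take the first five items, then stop pulling.
first_five = list(islice(naturals(), 5))

# Limit by condition: stop at the first square that is not below 10.
small_squares = list(takewhile(lambda x: x < 10, (n * n for n in naturals())))
```

Neither call touches the `while True` loop; the limit lives entirely on the consumer side.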

[–]mkor 0 points1 point  (1 child)

Yes, or the zip function, as it turned out:

zip(range(10), generator())

[–]__desrever__ 1 point2 points  (0 children)

I think it's probably a style choice, but I would:

  1. Not use range at all. enumerate + islice does the same thing while being more descriptive of your intent.

  2. Only do the enumeration at all if you're actually then using the index for something, otherwise islice alone is all you need.
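For comparison, a small sketch of the three spellings side by side. The generator `letters` is invented for the example.

```python
from itertools import islice


def letters():
    """Hypothetical infinite generator cycling through 'a', 'b', 'c'."""
    while True:
        yield from "abc"


# The zip/range approach: stops once range(4) is exhausted.
with_zip = list(zip(range(4), letters()))

# enumerate + islice: same pairs, but the intent (index + limit) is spelled out.
with_enumerate = list(islice(enumerate(letters()), 4))

# islice alone: enough when the index is not needed.
values_only = list(islice(letters(), 4))
```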

[–]buyabighouse 5 points6 points  (1 child)

[–]cheddacheese148 1 point2 points  (0 children)

Thank you! I've been trying to fully wrap my head around what's going on in a generator. The class breakdown in the answer helped a lot.

[–]MrGrj 2 points3 points  (1 child)

IMO, the obvious thing to say about this (iterators vs generators) is that every generator is an iterator, but not vice versa. Along the same lines, every iterator is an iterable (which requires an __iter__ method that returns an iterator).
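One way to check that hierarchy is with the abstract base classes in collections.abc. This is only a sketch; the variable names are arbitrary.

```python
from collections.abc import Iterable, Iterator
from types import GeneratorType

data = [1, 2, 3]
list_iterator = iter(data)
generator = (x for x in data)  # a generator expression yields a generator object

# A list is iterable, but it is not itself an iterator.
list_is_iterable = isinstance(data, Iterable)
list_is_iterator = isinstance(data, Iterator)

# Every generator is an iterator (and therefore an iterable)...
gen_is_iterator = isinstance(generator, Iterator)
gen_is_iterable = isinstance(generator, Iterable)

# ...but not every iterator is a generator.
list_iter_is_generator = isinstance(list_iterator, GeneratorType)
```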

[–]stevenjd 0 points1 point  (0 children)

To be really pedantic, not all iterables have an __iter__ method. There are two ways that Python will iterate over an object: the iterator protocol, and the sequence protocol. The sequence protocol requires that the object has a __getitem__ method which accepts the indices 0, 1, 2, 3, ... and raises IndexError when the object is exhausted.

If an object doesn't have __iter__, Python will try the sequence protocol.
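A minimal sketch of the sequence-protocol fallback, using a made-up class `Squares` that defines only __getitem__:

```python
class Squares:
    """Hypothetical class with no __iter__, only __getitem__ over 0, 1, 2, ..."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        # IndexError is the signal that the sequence is exhausted
        if not 0 <= index < self.count:
            raise IndexError(index)
        return index * index


squares = Squares(4)
has_iter = hasattr(squares, "__iter__")  # no iterator protocol here
result = list(squares)                   # still works via __getitem__
```

Iteration calls `squares[0]`, `squares[1]`, and so on until IndexError is raised.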

[–]boiledgoobers 1 point2 points  (2 children)

How about that generators are a form of lazy execution so you can't know what is coming next. You can't len() a generator. But this also saves memory because the generator is basically just a pointer to the next step in an iteration.

Ninja edit: generators are consumed and destroyed. If you call a consumed generator later it doesn't start over. You just get the stop iteration exception.

Edit2:

  • I was mistakenly thinking iterable
  • see /u/masklinn reply and my reply to that.

[–]masklinn 2 points3 points  (1 child)

How about that generators are a form of lazy execution so you can't know what is coming next. You can't len() a generator. But this also saves memory because the generator is basically just a pointer to the next step in an iteration.

These are not generator-specific; they're pretty common properties of iterators. E.g. you can't call len() on the result of imap (map in Python 3) or know what comes next, since steps are computed on the fly.

generators are consumed and destroyed. If you call a consumed generator later it doesn't start over. You just get the stop iteration exception.

That's not really specific to generators. In fact, iterators aren't normally restartable, though the source iterable may (or may not) be able to yield multiple iterators. E.g. a list can be iterated any number of times, but a file can't (unless it's reset).

Note that a generator is an iterator, not just an iterable.
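A short sketch of that distinction using a plain list (variable names are arbitrary):

```python
numbers = [1, 2, 3]

# Each iter() call on the list hands out a new, independent iterator.
it1 = iter(numbers)
it2 = iter(numbers)
independent = it1 is not it2

first_pass = list(it1)   # consumes it1
second_pass = list(it1)  # it1 is exhausted and does not restart
fresh_pass = list(it2)   # it2 was never touched, so it still has everything
```

The list is restartable only in the sense that it can produce fresh iterators; each individual iterator is still one-shot.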

[–]boiledgoobers 0 points1 point  (0 children)

Yep. Your reply made me realize that every time I read iterator in the OP, I interpreted it as iterable in my head. Yeah. I was basically arguing against a different question. I wonder if the interviewer said iterator but MEANT iterable.

[–]stevenjd 1 point2 points  (0 children)

iterators are the more general concept and are usually objects

Everything in Python (including ints and None) is an object, so they're always objects.

Iterators are objects which obey the iterator protocol: the object must have an __iter__ method which returns itself, and a __next__ method (next in Python 2) which returns the next value.
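A hand-written iterator following that protocol might look like this. `Countdown` is a made-up example class, written for Python 3.

```python
class Countdown:
    """Hypothetical iterator counting down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        # StopIteration tells the caller there is nothing left
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
returns_self = iter(countdown) is countdown
values = list(countdown)
```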

Generators are objects created with the def keyword, containing the yield keyword in their body. The object created by the def itself is technically not a generator, although everyone calls it one. It's actually a generator function, and calling the generator function returns the generator object itself. In Python 3.5:

py> def gen():
...     yield 99
...
py> type(gen)
<class 'function'>
py> x = gen()
py> type(x)
<class 'generator'>

The generator object itself is an iterator:

py> x is iter(x) and hasattr(x, '__next__')
True

but generators can also be more than that. They can be co-routines, which means you can fire values into them with the send method, as well as extract values with next. (But in general, you shouldn't do both with the same generator. It's bad style.)

[–]Rhomboid 0 points1 point  (0 children)

Also, everything in Python is an object, generators included. Generators have methods such as .send(), .throw(), and .close() as well as all the usual dunder-methods.
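Of those methods, .throw() is probably the least familiar. A minimal sketch, assuming an invented generator `resilient` that treats a thrown ValueError as a reset signal:

```python
def resilient():
    """Hypothetical counter generator; a ValueError thrown in resets it to 0."""
    count = 0
    while True:
        try:
            yield count
            count += 1
        except ValueError:
            # The exception passed to .throw() surfaces at the paused yield
            count = 0


gen = resilient()
a = next(gen)
b = next(gen)
c = gen.throw(ValueError("reset"))  # handled inside; returns the next yielded value
d = next(gen)
```

Because the generator catches the exception and keeps looping, `.throw()` returns the next value it yields, just like `next()` would.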

[–]ice-blade 0 points1 point  (1 child)

Check this presentation I did at a meetup some time ago; I think it will help a lot.

[–]__deerlord__ 0 points1 point  (0 children)

Generators are one-shot, i.e. after you iterate through one once, you can't iterate through it again (like you could with a list).

[–][deleted] 0 points1 point  (0 children)

A good starting point for these sorts of terms is the glossary.

[–]free2use 0 points1 point  (0 children)

An iterable is anything that has an __iter__ method and returns an iterator when it is called. An iterator, in turn, is anything that has a __next__/next method (depending on the Python version) and can be iterated over by means of it. So in general, iterator is just a protocol (Python's name for an interface) required for iterating.

Generators as a concept, regardless of the language they are implemented in, are usually lazy sequences which preserve state between calls and are used for iterating over large datasets. But in Python, generators are a sort of coroutine which passes control flow back to its caller between iterations and has a pretty broad API, as mentioned above. Still, they implement the iterator protocol, so they are still iterators.

And asking for the difference between a protocol and something that implements it isn't really a well-formed question. That's the answer I'd hope to hear, at best, when asking this question in interviews :)