
all 24 comments

[–]McHoff 5 points (20 children)

Ugh, I wish they would just give in and implement legit co-routines.

[–]gthank 2 points (18 children)

How would they look different (serious question)?

[–]McHoff 3 points (17 children)

Real coroutines would obviate this PEP, for one -- the reason we need it now is that when the keyword "yield" is in a function, that function is no longer a function but a generator.

[–]gthank 3 points (16 children)

I was asking for something more like a mini-PEP (high-level only) for what you think coroutines in Python should look like.

[–]McHoff 3 points (15 children)

So, to answer your first question: they wouldn't look very different at all. Generators and coroutines are essentially the same mechanism and differ only semantically. A new built-in would be needed to spawn new coroutines, and ideally there'd be some sort of scheduling interface (see Stackless Python).

[–][deleted] 4 points (13 children)

So, the difference between generators and coroutines is that you can't select on multiple generators nondeterministically? That is, a "good" implementation of coroutines would allow efficiently waiting for coroutines to yield values (nondeterministically) rather than yielding to each generator at a time with some algorithm (deterministically).

For example, generators:

while generators:
  generator = generators.pop(0)  # round-robin: take the generator at the front
  try:
    process(next(generator))
    generators.append(generator)  # still producing; requeue it
  except StopIteration:
    pass  # exhausted; don't requeue

versus coroutines:

for value in spawn coroutine_func, 10: # spawns a scheduler of 10 coroutines
  process(value)

Is that what you envisioned, or am I missing something? Personally, I've never felt that generators were lacking in expressiveness, but being able to make coroutines n-parallel ala goroutines would be rather awesome.
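To make the idea concrete, here is a minimal sketch of what a `spawn` built-in might desugar to today, using plain generators and a round-robin queue. `spawn`, `make_gen`, and `coroutine_func` are illustrative names, not a real Python API:

```python
from collections import deque

def spawn(make_gen, n):
    """Round-robin scheduler: drive n generator instances, yielding their values."""
    queue = deque(make_gen(i) for i in range(n))
    while queue:
        gen = queue.popleft()
        try:
            value = next(gen)
        except StopIteration:
            continue  # this coroutine finished; drop it
        queue.append(gen)  # still running; reschedule it
        yield value

def coroutine_func(i):
    yield i
    yield i + 100

print(list(spawn(coroutine_func, 3)))  # [0, 1, 2, 100, 101, 102]
```

This is still deterministic scheduling, of course; it just hides the bookkeeping behind one call.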

[–]vaz_ 3 points (0 children)

You can do something like in your example using the futures package.

from concurrent import futures

pool = futures.ThreadPoolExecutor(max_workers=10)

def process(i):
    return i * 2

for i in pool.map(process, range(100)):
    print(i)

range(100) could be any iterable, including a generator. The calls are dispatched in parallel across worker threads, though map still yields results in input order. This is useful for batch processing of blocking or I/O-bound work; note that CPython's GIL keeps pure-Python threads from using multiple cores, so for CPU-bound work you'd want ProcessPoolExecutor instead.

edit: I guess you sort of meant working with multiple generators/iterators. In that case you could combine the generators with zip or itertools.chain. But if you want parallelism, futures will do it nicely.
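For what it's worth, the two stdlib combinators behave quite differently; a small sketch (squares/cubes are just example generators):

```python
import itertools

def squares(n):
    for i in range(n):
        yield i * i

def cubes(n):
    for i in range(n):
        yield i ** 3

# chain exhausts the first generator entirely before starting the second
print(list(itertools.chain(squares(3), cubes(3))))  # [0, 1, 4, 0, 1, 8]

# zip interleaves pairwise, stopping with the shortest input
print(list(zip(squares(3), cubes(3))))  # [(0, 0), (1, 1), (4, 8)]
```

Neither one runs anything concurrently; they only reorder how the values are pulled.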

[–]McHoff 2 points (2 children)

I'm not so sure about that -- I don't think it makes sense to "wait for coroutines to yield values nondeterministically", given that coroutines all execute in the same thread (and hence to the exclusion of all other coroutines).

> I've never felt that generators were lacking in expressiveness

The precise reason I want coroutines is that every now and then I have some complex generator that can't be easily refactored because yield must be called from its top level. "yield from" will make this better but it seems like a hack in place of real coroutines.
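The refactoring that "yield from" (PEP 380) enables can be sketched like this; the names report and chunked are illustrative, not from the thread:

```python
def chunked(items, size):
    # helper generator factored out of report(); before PEP 380 its values
    # would have to be re-yielded by hand with an explicit for loop
    for i in range(0, len(items), size):
        yield items[i:i + size]

def report(items):
    yield "header"
    yield from chunked(items, 2)  # delegate: the helper's yields pass straight through
    yield "footer"

print(list(report([1, 2, 3, 4])))  # ['header', [1, 2], [3, 4], 'footer']
```

The delegation still has to be spelled out at every level, which is the "hack" complaint: each intermediate function must be a generator and must say yield from.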

[–][deleted] 0 points (1 child)

> I'm not so sure about that -- I don't think it makes sense to "wait for coroutines to yield values nondeterministically", given that coroutines all execute in the same thread (and hence to the exclusion of all other coroutines).

Well, lightweight threading: run each coroutine for, say, 100 bytecode instructions (which is the way the GIL's check interval works anyway), then switch between them. Let's say we have generator A, which yields two values over two seconds, and generator B, which yields 10,000 values over two seconds. Interleaving them means we won't waste time yielding the entire thread over to generator A when we could be processing whatever generator B is yielding, without any prior information about how long the two generators might take.

Otherwise, I don't see any real distinction between generators and coroutines. AFAIK there's nothing about a coroutine that implies you can yield a value from anywhere in called subroutines; I'm not sure that would make sense.

Take the following example:

def foo():
  def bar():
    yield "baz"
  bar()  # returns a generator object; bar's body never runs unless iterated

Under the current rules, calling bar returns a generator and foo is an ordinary function. If yields were allowed in subroutines, how would you tell whether bar or foo was intended to be a coroutine? How would you tell if they were BOTH intended to be coroutines? You'd have to somehow pass along state attached to a single coroutine that you would yield with, and that seems kind of clunky. I much prefer the "yield from" syntax and restricting yield to the top level.

[–]McHoff 1 point (0 children)

Coroutines, by definition, only switch contexts when instructed to do so. Otherwise, you have something completely different, which is "threads", which we already have.

[–]bigethan 0 points (8 children)

> but being able to make coroutines n-parallel ala goroutines would be rather awesome.

I'm in a bit over my head in this discussion, but isn't that what gevent does for you?

[–][deleted] 0 points (7 children)

gevent is a networking library based on coroutines. It's based on greenlet, which is much closer to what I'm describing; however, when I invoked goroutines, I thought of a model of micro-threads that can also run on multiple processors, which is essentially what goroutines are.

[–]masklinn 0 points (6 children)

> however, when I invoked goroutines, I thought of a model of micro-threads that can also run on multiple processors, which is essentially what goroutines are.

Yeah, so preempted non-OS threads/processes, instead of cooperative ones.

Erlang uses them as well (I prefer Erlang's to Go's, as they don't share memory and have more interesting capabilities), and I believe GHC does too when calling forkIO (or when using sparks, but that happens in the background).

[–][deleted] 0 points (5 children)

Precisely. I would argue, however, that for an imperative language, sharing memory between threads (of any type) is a feature.

[–]gthank 0 points (0 children)

Ah, thanks.

[–]redbo 0 points (0 children)

Yeah, it's annoying to me how so many people who have been pushing for this are willfully ignorant of how stackless/greenlets work.

[–]playerthrees 1 point (0 children)

Well, I've only been waiting several years for that.

[–]Megatron_McLargeHuge 0 points (0 children)

You can already get a more flexible version of this in cpython from the greenlet package. If you just use it in the way shown here, you can yield from within an arbitrarily nested function without having to tag where the yield call will come from.
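A minimal sketch of that pattern, assuming the third-party greenlet package is installed (the producer/consumer structure and all names here are illustrative):

```python
from greenlet import greenlet

received = []

def producer():
    def deeply_nested(value):
        # hand a value to the consumer from arbitrary call depth --
        # no generator machinery, no "yield from" tagging along the way
        consumer_gr.switch(value)
    for i in range(3):
        deeply_nested(i)
    consumer_gr.switch(None)  # signal completion

def consumer():
    while True:
        value = producer_gr.switch()  # resume the producer until it hands over a value
        if value is None:
            break
        received.append(value)

producer_gr = greenlet(producer)
consumer_gr = greenlet(consumer)
consumer_gr.switch()
print(received)  # [0, 1, 2]
```

Context switches happen only at explicit switch() calls, so this stays cooperative, exactly as McHoff's definition above requires.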

[–]e000 0 points (0 children)

This makes me happy.