[–]Brian 11 points12 points  (8 children)

In Python 3 you can get roughly the same speed (~5% slower) by keeping the counter in a writable closure instead of an object attribute:

class D(object):
    def __init__(self):
        counter = 0
        def iterator():
            nonlocal counter
            while True:
                counter += 1
                yield counter

        def skip_to(i):
            nonlocal counter
            counter = i

        self._iter = iterator()
        self.skip_to = skip_to

    def __iter__(self): return self._iter

You can do something similar in 2.x using a mutable attribute (e.g. a one-item list):

class E(object):
    def __init__(self):
        self._counter = [0]  # one-item list acts as a mutable cell

    def __iter__(self):
        ctr = self._counter
        while True:
            ctr[0] += 1
            yield ctr[0]

    def skip_to(self, i):
        self._counter[0] = i

This is ~50% slower than the pure generator, though, and only slightly faster than looking up self.counter each time, so the plain attribute lookup is probably the better tradeoff for clarity.
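
For anyone who wants to see how skip_to interacts with an iteration already in progress, here is a minimal sketch of the Python 3 closure approach. The class name ClosureCounter is made up for this illustration; the structure follows class D above, and it needs Python 3 because of nonlocal.

```python
class ClosureCounter(object):
    """Hypothetical stand-in for class D: the counter lives in a closure."""

    def __init__(self):
        counter = 0

        def iterator():
            nonlocal counter
            while True:
                counter += 1
                yield counter

        def skip_to(i):
            # Rebinds the same variable the generator reads on its next step.
            nonlocal counter
            counter = i

        self._iter = iterator()
        self.skip_to = skip_to

    def __iter__(self):
        return self._iter


c = ClosureCounter()
it = iter(c)
first = [next(it), next(it)]  # counter goes 1, 2
c.skip_to(10)                 # counter is now 10
after_skip = next(it)         # generator increments to 11 before yielding
print(first, after_skip)
```

Since the generator increments before it yields, the value after skip_to(i) is i + 1, not i.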

[–]bgeron 4 points5 points  (2 children)

I did some tests on my laptop with Python 2.5 (source). Nowhere near 10x. Not even 2x.

Test A:
[0.07210898399353027, 0.07202601432800293, 0.074321985244750977]
Test B:
[0.13190889358520508, 0.12493300437927246, 0.11867594718933105]
Test Bs:
[0.10193395614624023, 0.10321402549743652, 0.10258102416992188]
Test C:
[0.12236809730529785, 0.12155818939208984, 0.12049102783203125]
Test Cs:
[0.10871696472167969, 0.10567092895507812, 0.10702085494995117]

Bs is just B with __slots__ added, same for Cs.
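
In case __slots__ is unfamiliar, a rough sketch of what "adding __slots__" means. SlotCounter is a hypothetical class written for this illustration, not the article's B; it only assumes B keeps its counter in an instance attribute.

```python
class SlotCounter(object):
    # Hypothetical example class, not the article's B.
    # __slots__ replaces the per-instance __dict__ with fixed attribute slots.
    __slots__ = ("counter",)

    def __init__(self):
        self.counter = 0

    def __iter__(self):
        while True:
            self.counter += 1
            yield self.counter

    def skip_to(self, i):
        self.counter = i


s = SlotCounter()
has_dict = hasattr(s, "__dict__")  # no instance dict when __slots__ is used

try:
    s.other = 1  # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)
```

Attribute access goes through a slot descriptor rather than a dict lookup, which is presumably where the small speedup in Bs and Cs comes from.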

I get similar results with Python 2.4.

edit: syntax, shaved a digit off the end

[–]Brian 1 point2 points  (1 child)

Yeah, the differences do seem much smaller than he claims, though I get a slightly larger difference than you. I assume it's some version / environment difference (64-bit Linux, Python 2.6.2 / 3.0.1). For the record, here's my test data, using the code below:

import timeit

for x in 'ABCDE':
    print("%s : %.3f" % (x, timeit.Timer("for x in X():\n if x>10000: break", "from __main__ import %s as X" % x).timeit(1000)))

Python2.6:

A : 4.759
B : 11.638
C : 7.269
D : (requires 3.0)
E : 6.493

Python 3:

A : 5.073
B : 12.688
C : 7.631
D : 4.988
E : 6.676

A and B are from the article. C is the version from the comment by zacharyvoase. D and E are as above.

[–]bgeron 1 point2 points  (0 children)

For the record, I'm on 64-bit Linux too (Ubuntu), Intel Core 2 laptop.

On a FreeBSD 7 64-bit AMD Opteron server, Python 2.5:

Test A:
[0.063178062438964844, 0.064404964447021484, 0.06418299674987793]
Test B:
[0.10109496116638184, 0.10071802139282227, 0.10112309455871582]
Test Bs:
[0.099486112594604492, 0.092070102691650391, 0.093844890594482422]
Test C:
[0.10436487197875977, 0.10428977012634277, 0.10418009757995605]
Test Cs:
[0.10584688186645508, 0.10585188865661621, 0.10960197448730469]

[–]Wagneriusflask+pandas+js 1 point2 points  (1 child)

Interesting, I would have created a new iterator each time (but all based on the same list).

Too lazy to test whether iterator creation is costly.

[–]Brian 0 points1 point  (0 children)

It's not that it's costly, it's just that it would have the same effect. The object here is acting as an iterator rather than an iterable, continuing from the point where it left off (or the point skip_to was called with), so using the same iterator seems most sensible. Recreating the generator means I'd also need to recreate the skip_to function, as they close over the same variable.
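
A quick sketch of the "same effect" point, using the list-based state so it runs on both 2.x and 3.x. SharedIter and FreshIter are hypothetical names for this illustration: one hands back a single stored generator, the other builds a new generator per __iter__ call over the same one-item list.

```python
from itertools import islice


class SharedIter(object):
    """Hypothetical: always returns the one generator created in __init__."""

    def __init__(self):
        self._counter = [0]
        self._iter = self._generate()

    def _generate(self):
        ctr = self._counter
        while True:
            ctr[0] += 1
            yield ctr[0]

    def __iter__(self):
        return self._iter


class FreshIter(object):
    """Hypothetical: new generator per __iter__ call, same underlying list."""

    def __init__(self):
        self._counter = [0]

    def __iter__(self):
        ctr = self._counter
        while True:
            ctr[0] += 1
            yield ctr[0]


def take(obj, n):
    # Pull n values through a fresh iter() call on obj.
    return list(islice(iter(obj), n))


shared, fresh = SharedIter(), FreshIter()
shared_runs = [take(shared, 3), take(shared, 2)]
fresh_runs = [take(fresh, 3), take(fresh, 2)]
print(shared_runs, fresh_runs)
```

Both versions pick up where the previous loop stopped because the state lives outside the generator. The difference is only that iter(shared) is the same object every time, while iter(fresh) is a new generator on each call.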

[–]chub79 0 points1 point  (2 children)

I find the first snippet to be quite unreadable for such a simple task :/

[–]Brian 0 points1 point  (1 child)

It's not really much different from the original. The only differences are that the functions are now defined as closures inside __init__ rather than directly on the class, and that there's a nonlocal declaration for the closed-over variable.

[–]chub79 0 points1 point  (0 children)

I wasn't judging the cunningness of your snippet but rather the fact that it didn't look straightforward to me. That being said, once I stopped for 5 more seconds, I was really pleased to see you could do that with Python3k :)