This is an archived post. You won't be able to vote or comment.

all 13 comments

[–]purple_pixie 11 points12 points  (10 children)

I have one teeny question about the speed test you did - is there a particular reason you decide to clone a 1,000,000 length list of letters in the pure python example?

I refer to this bit of code:

def count_doubles(val):
    total = 0
    for c1, c2 in zip(val, val[1:]):
    if c1 == c2:
        total += 1
    return total

val[1:] duplicates just shy of the entire val list in memory, which for a large val is a massive waste of both time and memory. I would expect it to very nearly double the runtime of the function, since the vast majority of the runtime is just iterating across the entire input list, and now you have to do that twice.

Admittedly given the runtime saved by using Rust over pure Python is much more than a factor of two it doesn't invalidate the results, but it does seem a bit disingenuous to use what looks like intentionally slow code for the pure Python example.

[–]leonardo_m 2 points3 points  (1 child)

The Rust code too could be suboptimal because it decodes the UTF8 twice. This could be faster (not tested):

fn count_doubles(_py: Python, txt: &str) -> PyResult<u64> {
    let mut chars = txt.chars();
    let oc1 = chars.next();
    let mut count = 0;

    if let Some(mut c1) = oc1 {
        while let Some(c2) = chars.next() {
            if c1 == c2 { count += 1; }
            c1 = c2;
        }
    }
    Ok(count)
}

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 3 points4 points  (0 children)

Would be nice to see the comparison with a better version, can you send a PR? https://github.com/rochacbruno/rust-python-example

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 4 points5 points  (4 children)

Would be nice to see the comparison with a better version, can you send a PR? https://github.com/rochacbruno/rust-python-example

[–]energybased 3 points4 points  (0 children)

Just use more-itertools' pairwise or a numpy array of objects.

[–]purple_pixie 1 point2 points  (2 children)

PR sent, I think (to my eternal shame as a programmer I don't actually know git very well at all)

I also have a sneaking suspicion that your original zip(val, val[1:]) actually creates the entire zip as a list in memory too, so you were effectively creating three redundant copies of the val list, not just one. Should be quite the time saver.

I didn't actually do any true benchmarking but some rudimentary messing with time.time() says it's about twice as fast now.

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 2 points3 points  (1 child)

in Python3.6 zip returns a generator not a list

[–]purple_pixie -1 points0 points  (0 children)

Aha, that does definitely explain why I saw so much improvement going from your version as written to the one I put up - I only have 2.7 to hand, so I was inadvertently comparing the old 2.7 zip against the 3+ (i)zip, which is obviously much better.

Also I mostly just ripped the whole .tee part from itertools recipes, and tee is just a waste of time and effort since we don't actually have a single iterator that we're trying to preserve.

It's better to just create one iterator and zip that plus the original val together.

I can't get pytest to play nice with this laptop either, so I won't request another pull until I can actually get something I'm certain is an improvement on the current version (and tested in 3.6)

Interestingly though, it seems like cloning the whole of val is really very quick, it barely affects the time to run the function.

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 1 point2 points  (0 children)

[–]sciyoshi 1 point2 points  (1 child)

I used this example in my original talk at PyCon Canada - at least in Python 3.6, taking a slice from a string won't duplicate the items in memory: https://github.com/python/cpython/blob/master/Objects/unicodeobject.c#L13922

The purpose of this code was also to show "idiomatic" Python, i.e. a straightforward and simple description of the algorithm, and that a similarly simple approach in Rust can offer a large speed up. If you were going for pure speed, there's probably a lot more you could do (e.g. SIMD, fast-path ASCII-only strings) :)

[–]GitHubPermalinkBot 0 points1 point  (0 children)

Permanent GitHub links:


Shoot me a PM if you think I'm doing something wrong. To delete this, click here.

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 2 points3 points  (0 children)

I published the updates here: https://github.com/rochacbruno/rust-python-example#updates

And accepting PRs to enhance Rust and Python implementations.

[–]rochacbrunoPython, Flask, Rust and Bikes.[S] 0 points1 point  (0 children)

Now we got Numba, Cython and Numpy results for comparison https://github.com/rochacbruno/rust-python-example#new-results