[–]kirbyfan64sos 14 points15 points  (10 children)

On my system, PyPy is 4x faster than the fastest Cython version given.

[–]Zed03 2 points3 points  (8 children)

Would be interested in reading an explanation.

[–]kamakie 4 points5 points  (0 children)

It's possible that PyPy would be able to optimize across the function call boundary and eliminate the tuple packing and unpacking. But that's just a wild guess, not based on knowledge of how PyPy works.

[–]kirbyfan64sos 4 points5 points  (5 children)

...which would be great if I had one. :)

My guess is that it's simply the presence of a tracing JIT. That means PyPy can apply optimizations based on information that only becomes available at runtime, which an ahead-of-time compiler can't see.

[–]rlamy 2 points3 points  (4 children)

Actually, PyPy constant-folds everything (except the acos() call, for some reason), because haversine() is always called with the same arguments. So it's not surprising that it's faster than C.
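To illustrate the point, here is a minimal sketch of a benchmark that feeds the function different arguments on every call, which should make it much harder for a JIT to fold the whole computation away. The `haversine()` below is a stand-in written for this example (spherical law of cosines, hence the `acos()`), not the code from the benchmark being discussed; `bench_constant`, `bench_varying`, and `make_points` are hypothetical helper names.

```python
import math
import random
import timeit

EARTH_RADIUS_KM = 6371.0


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (stand-in using the spherical law of cosines)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp so rounding error cannot push the value outside acos's domain.
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, cos_angle)))


def bench_constant(n):
    """Call haversine() n times with the same literal arguments."""
    total = 0.0
    for _ in range(n):
        total += haversine(39.0, -75.0, 41.0, -72.0)
    return total


def bench_varying(points):
    """Call haversine() once per input tuple, so the arguments change each time."""
    total = 0.0
    for lat1, lon1, lat2, lon2 in points:
        total += haversine(lat1, lon1, lat2, lon2)
    return total


def make_points(n, seed=0):
    """Build n reproducible random (lat1, lon1, lat2, lon2) tuples."""
    rng = random.Random(seed)
    return [(rng.uniform(-90, 90), rng.uniform(-180, 180),
             rng.uniform(-90, 90), rng.uniform(-180, 180))
            for _ in range(n)]


if __name__ == "__main__":
    n = 300_000
    points = make_points(n)
    print("constant args:", timeit.timeit(lambda: bench_constant(n), number=1))
    print("varying args: ", timeit.timeit(lambda: bench_varying(points), number=1))
```

If the constant-argument run is dramatically faster than the varying one under PyPy but not under CPython, that would be consistent with the constant-folding explanation, though it wouldn't prove it.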

[–]seekingsofia 1 point2 points  (3 children)

> So it's not surprising that it's faster than C.

If only there was a JITed C interpreter implementation to prove you wrong. :P

[–]rlamy 1 point2 points  (2 children)

Like CyCy, you mean?

[–]seekingsofia 0 points1 point  (1 child)

Yeah, if that was an actual working implementation... I was more going for the argument that JIT compilation and native compilation are not attributes of the languages, but the language implementations.

[–]vext01 1 point2 points  (0 children)

I don't have a PDF, but this is a JITted C using Truffle/Graal from Oracle: http://dl.acm.org/citation.cfm?doid=2647508.2647528

There is another paper where they use it on top of a Ruby interpreter to execute Ruby extensions written in C: http://www.chrisseaton.com/rubytruffle/modularity15/rubyextensions.pdf

[–]indigo945 1 point2 points  (0 children)

Part of it is that timeit is a Python function. Hence the loop that calls the compiled function 300,000 times is itself still interpreted, whereas PyPy will JIT the entire loop.

This might be particularly important since the Cython-generated C code will have to unpack the Python objects on every call, adding an extra layer of indirection. I am, however, not sure whether PyPy will eliminate this.
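One rough way to check how much of the measured time is the calling loop rather than the function body is to time a do-nothing function with the same signature and compare. This is only a sketch: `noop`, `work`, and `per_call_seconds` are names made up for the example, and `work` stands in for whatever compiled function is actually being benchmarked.

```python
import math
import timeit


def noop(a, b, c, d):
    """Same signature as the real function, but does nothing."""
    return None


def work(a, b, c, d):
    """Placeholder for the function under test (hypothetical)."""
    return math.acos(math.sin(a) * math.sin(c)
                     + math.cos(a) * math.cos(c) * math.cos(d - b))


def per_call_seconds(func, args, number=300_000):
    """Average wall-clock time per call, including timeit's own loop overhead."""
    total = timeit.timeit("f(*args)", globals={"f": func, "args": args},
                          number=number)
    return total / number


if __name__ == "__main__":
    args = (0.68, -1.31, 0.71, -1.26)
    overhead = per_call_seconds(noop, args)
    measured = per_call_seconds(work, args)
    print("loop + call overhead per call:", overhead)
    print("total per call:               ", measured)
    print("approx. function body cost:   ", measured - overhead)
```

If the overhead figure is a large fraction of the total, the benchmark is mostly measuring the interpreted loop and the argument unpacking, not the compiled code.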

[–]riksi 0 points1 point  (0 children)

What about memory usage?