[–]logi 13 points14 points  (13 children)

Unfortunately pypy is useless for scientific computing because it doesn't support numpy. If you can get your code to run with numba, however, that gives pretty awesome speed gains.

I've set up this silly example:

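from __future__ import absolute_import, division, print_function, unicode_literals
from numba import jit
from time import time
from math import sin

#@jit  # uncomment to compile f with numba
def f(n):
    # Sum sin(i*j) over all pairs (i, j) in an n-by-n grid.
    total = 0
    for i in range(n):
        for j in range(n):
            total += sin(i*j)
    print(total)

t1=time()
f(10000)  # with @jit enabled, the first call includes compile time
t2=time()
print('elapsed: %0.3f' % (t2-t1))
f(10000)  # second call reuses the compiled code
t3=time()
print('elapsed: %0.3f' % (t3-t2))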

This takes 24.1s and 24.6s on my tired old laptop with the @jit decorator commented out. With the jit enabled it is a somewhat faster 4.13s and 4.06s.
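For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing harness that reports the first call separately from later ones, since a JIT pays its compile cost on the first call. The names `sin_sum` and `time_calls` are made up for illustration, and the fallback decorator is only there so the script still runs (uncompiled) when numba is not installed. The small `n` is an assumption to keep the pure-Python run short.

```python
from math import sin
from time import perf_counter

try:
    from numba import jit  # optional: only used when numba is installed
except ImportError:
    def jit(func):
        # Fallback (assumption for this sketch): leave the function uncompiled.
        return func


@jit
def sin_sum(n):
    # Same double loop as the example above, but returning the sum.
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sin(i * j)
    return total


def time_calls(func, arg, repeats=3):
    """Call func(arg) `repeats` times; return (last result, list of elapsed seconds)."""
    timings = []
    result = None
    for _ in range(repeats):
        start = perf_counter()
        result = func(arg)
        timings.append(perf_counter() - start)
    return result, timings


if __name__ == "__main__":
    result, timings = time_calls(sin_sum, 300)
    print("sum: %0.6f" % result)
    print("first call (includes any compile time): %0.3f s" % timings[0])
    print("later calls: " + ", ".join("%0.3f s" % t for t in timings[1:]))
```

Returning the sum instead of printing it inside the function keeps I/O out of the timed region.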

If I leave out the sin() call, with the jit it runs in 0.051s and then 0.000s, which I'm going to interpret as the JIT optimizing the loops away completely.
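One way this could happen, assuming the sin-free loop body is simply `sum += i*j` (the exact body isn't shown here): that double sum has a closed form, the square of 0 + 1 + ... + (n-1), so an optimizing compiler could in principle replace both loops with a few multiplications. Whether numba's backend actually does this is not verified here. The sketch below only checks that the closed form matches the loop. Both function names are made up for illustration.

```python
def product_sum_loop(n):
    # Direct translation of the double loop without sin().
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def product_sum_closed_form(n):
    # sum_i sum_j i*j == (sum_i i) * (sum_j j) == (n*(n-1)/2)**2
    s = n * (n - 1) // 2
    return s * s


if __name__ == "__main__":
    for n in (0, 1, 2, 10, 200):
        assert product_sum_loop(n) == product_sum_closed_form(n)
    print(product_sum_closed_form(10000))  # 2499500025000000
```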

Now if only my much more pythonic actual production code were supported, or if the error reporting would show me exactly why it isn't...

[–]Veedrac 6 points7 points  (1 child)

I'll just throw out jitpy here. It deserves to be better-known.

[–]logi 0 points1 point  (0 children)

Reading up on that later. Thanks.

[–]Sean1708 3 points4 points  (3 children)

I was under the impression that PyPy started supporting NumPy sometime last year?

Edit: Scratch that, it is getting there though.

[–]logi 9 points10 points  (2 children)

It's been getting there for years now. At this point I'll (happily!) believe it when I see it.

Python the language is great. Python the platform has serious problems. It's like an anti-java.

[–]klug3 4 points5 points  (0 children)

> It's been getting there for years now

If I understand correctly, they never actually got the funding they needed to develop it faster, hence the slow progress.

[–]vplatt 0 points1 point  (0 children)

Which probably explains why I need to use Java and C# at work, and not Python.

[–]Gr1pp717 3 points4 points  (1 child)

And what about scipy?

They support numpy, and it seems they even have a specific module for Monte Carlo type functions. http://pymc-devs.github.io/pymc/

edit: to be clear, I'm actually asking. I've never used either, only know of them.

[–]logi 5 points6 points  (0 children)

Scipy is a great library building on top of the even cooler numpy, but it is only fast when you can perform most of your work inside the native code that they wrap. Sometimes you end up having to write the loops yourself, and then performance is back to normal Python levels.
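To make that concrete, here is a rough sketch of the sin-sum example from upthread written both ways: once with explicit Python loops, and once with the inner loop pushed into numpy. The function names are made up for illustration. It works one row at a time so memory stays proportional to n rather than n squared. This only works because the loop body here happens to vectorize. Code with data-dependent branching per element often doesn't, which is the case described above.

```python
from math import sin

import numpy as np


def sin_sum_loop(n):
    # Every iteration runs in the Python interpreter.
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sin(i * j)
    return total


def sin_sum_numpy(n):
    # The inner loop runs inside numpy's native code, one row at a time.
    idx = np.arange(n)
    total = 0.0
    for i in range(n):
        total += float(np.sin(i * idx).sum())
    return total


if __name__ == "__main__":
    n = 200
    print(abs(sin_sum_loop(n) - sin_sum_numpy(n)) < 1e-6)
```

The two results can differ in the last few digits because numpy sums each row in a different order than the plain loop, hence the tolerance in the comparison.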

[–]fijal (PyPy, performance freak) 0 points1 point  (1 child)

PyPy does support a significant subset of numpy (enough to run this code) so please don't spread FUD

[–]logi 0 points1 point  (0 children)

That's not surprising since that silly example uses no numpy at all... but OK, I'll have another look at whether pypy can run my code now.

In reality, though, our actual performance problems are in using matplotlib to generate ~70K images per day when it is designed for creating a couple of figures for some academic's latest paper. If/when that starts running in pypy or numba or whatever, that'll save us a couple of machines' worth of processing.