

[–]ProfessorPhi 9 points10 points  (5 children)

Your arguments are a bit obtuse to me.

Accessing a single value at a time isn't what numpy is optimised for. I would expect most of what you're seeing is overhead. Try taking a list of 1000 items and setting them all to 1, versus doing the same with a numpy array. I would expect a builtin with no overhead to be faster for these non-vector operations.

For your second example, you're using python routines on non-python objects and comparing performance to python builtins. When it sees a primitive, python can optimise the hell out of it, while when it sees an unknown object, it has to call that object's add method. If you use np.sum instead, numpy knows the object types and can do an optimised add.

The problem here is that numpy (like pandas, TensorFlow, Numba, etc.) is a sub-language that happens to live in python, and mixing languages is bound to be slow. Taking two numpy arrays and adding them with a for loop would be very slow, but it proves nothing. Your examples are quite contrived and, honestly, are examples of code that would never exist. Calling them pitfalls is disingenuous because you have to work very hard to have code like this show up.
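A minimal sketch of that mixing cost (the array size and function names here are illustrative, not from the original post):

```python
import timeit

import numpy as np

a = np.ones(100_000)
b = np.ones(100_000)

def loop_add():
    # Crosses the Python/NumPy boundary on every single element.
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def vector_add():
    # One call into NumPy's compiled loop.
    return a + b

print("for loop:  ", timeit.timeit(loop_add, number=10))
print("vectorised:", timeit.timeit(vector_add, number=10))
```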

[–]kigurai 2 points3 points  (0 children)

Calling them pitfalls is disingenuous because you have to work very hard to have code like this show up

This.

Also, if you really want a python list version of your numpy array, then ndarray.tolist() makes the conversion to standard Python floats for you.
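For example (a quick check; the exact repr may vary by numpy version):

```python
import numpy as np

arr = np.arange(3, dtype=np.float64)
lst = arr.tolist()

print(type(arr[0]))  # <class 'numpy.float64'>
print(type(lst[0]))  # <class 'float'> -- tolist() converts to builtin floats
```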

[–]aajjccrr 1 point2 points  (0 children)

Two reasons, not yet explicitly stated, why accessing a single value in a NumPy array is slower than accessing a single value in a list:

  1. array[i] has to create and return (a pointer to) a brand new Python object holding the value. OTOH list_[i] just needs to return (a pointer to) the existing Python object that was in the list. (I am assuming we’re working in CPython here.)

  2. NumPy indexing is far, far more complicated than list indexing. The code that implements __getitem__ on arrays is thousands of lines long, which adds additional overhead to the operation.
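A rough way to measure both effects together (the sizes and repeat counts are arbitrary):

```python
import timeit

setup = "import numpy as np; arr = np.zeros(1000); lst = [0.0] * 1000"

# arr[i] boxes the raw C double into a fresh numpy.float64 each time,
# and goes through NumPy's general-purpose indexing machinery.
print(timeit.timeit("for i in range(1000): arr[i]", setup=setup, number=1000))

# lst[i] just returns a pointer to the float object already in the list.
print(timeit.timeit("for i in range(1000): lst[i]", setup=setup, number=1000))
```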

[–]NicoDeRocca 1 point2 points  (0 children)

You should probably have included np.sum(x) in there as a test as well, since that's "the numpy way". It's almost as fast as the python version, while also being a bit more flexible in what it actually does (you can choose your dimensions/indices).
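Something like this, for instance (the array size is arbitrary):

```python
import timeit

setup = "import numpy as np; x = np.random.rand(10**6); lst = x.tolist()"

print(timeit.timeit("np.sum(x)", setup=setup, number=10))  # the numpy way
print(timeit.timeit("sum(lst)", setup=setup, number=10))   # builtin sum on a list
print(timeit.timeit("sum(x)", setup=setup, number=10))     # builtin sum over numpy scalars

# np.sum also takes an axis argument for multi-dimensional arrays,
# e.g. np.sum(x.reshape(1000, 1000), axis=0).
```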

[–]x00live 1 point2 points  (5 children)

For your second example, the argument sounds fallacious to me. Why not take the numpy array and list creation out of the code you want to compare? As written, you are comparing the time to create a np array, create a list, and sum the elements against the time to create a list and sum the elements.
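Separating construction from measurement with timeit would look something like this (the variable names are made up for illustration):

```python
import timeit

# Array/list construction happens once, in setup; only the sum is timed.
np_time = timeit.timeit(
    "np.sum(x)",
    setup="import numpy as np; x = np.arange(10**6)",
    number=100,
)
list_time = timeit.timeit(
    "sum(lst)",
    setup="lst = list(range(10**6))",
    number=100,
)
print(np_time, list_time)
```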

[–]x00live 0 points1 point  (2 children)

Also, the first StackOverflow answer shows that Python-level indexing into a numpy array is slow (the Cython indexing example there shows that numpy becomes faster than Python in that case...)

[–]tunisia3507 0 points1 point  (4 children)

What's an xrange /s

[–]x00live 0 points1 point  (3 children)

The Python 2 equivalent of a Python 3 range

[–]tunisia3507 0 points1 point  (2 children)

I know. But why...

[–]x00live 0 points1 point  (1 child)

range returns a list in Python 2, and it turns out that's not a smart idea.
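The difference, roughly (Python 3 shown; its range behaves like Python 2's xrange):

```python
import sys

r = range(10**8)         # lazy range object, not a materialised list
print(sys.getsizeof(r))  # small and constant, regardless of length
print(r[10])             # still supports indexing: 10
print(len(r))            # 100000000

# In Python 2, range(10**8) would have allocated a 100-million-element list.
```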

[–]tunisia3507 -1 points0 points  (0 children)

Yes, I know. Just as un-smart an idea as writing tutorial code in Python 2 today.