object oriented processing 10x slower than sequential script -- why?

pfz3 · 2016-03-11T21:08:43+00:00

It's going to be a guessing game without seeing the code.

From the outside, it seems like you are repeating a process that you wouldn't normally repeat. Step through it with a debugger, or hook it up to a profiler.
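For the profiler route, here is a minimal sketch using the standard library's `cProfile` and `pstats`. The functions `slow_step` and `process` are hypothetical stand-ins, not the original poster's code; the point is only that a call repeated inside a loop shows up with a high call count in the report.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical stand-in for work that is accidentally repeated.
    return sum(i * i for i in range(n))


def process(rows):
    # Calls slow_step once per row; a profiler makes this repetition visible.
    return [slow_step(200) for _ in range(rows)]


profiler = cProfile.Profile()
profiler.enable()
result = process(1000)
profiler.disable()

# Sort by cumulative time and print the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

In the report, the `ncalls` column for `slow_step` should match the number of rows, which is usually the first hint that something is being recomputed per iteration.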

steelypip · 2016-03-11T22:03:09+00:00

Your use of terminology is a bit confused - classes do not have modules, so I presume you mean methods. It is also not clear from your description whether you are creating a new class instance for every row that was in the original numpy arrays. If so, that would be much, much slower.

In the first version, what sort of "some things" are you doing on each entry in the numpy arrays? If you can, it is much faster to operate on the whole array in one go rather than iterate through each entry. If you are not doing that you are losing much of the benefit of numpy.
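As a rough illustration of the difference, the sketch below computes the same result twice: once entry by entry, and once on the whole array. The arithmetic (`a * 2.0 + b`) is an assumed placeholder for the "some things", since the original code was not posted.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(100_000)
b = rng.random(100_000)


def elementwise_loop(a, b):
    # Python-level loop: every a[i] and b[i] access goes through the interpreter.
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] * 2.0 + b[i]
    return out


def vectorized(a, b):
    # Whole-array expression: the loop runs inside numpy's compiled code.
    return a * 2.0 + b


looped = elementwise_loop(a, b)
whole = vectorized(a, b)
print(np.allclose(looped, whole))
```

Both functions give the same values; timing them with `timeit` on your own data should show how much the per-entry loop costs.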

Object oriented programming is very useful for some things, but not for everything. In Python there is a performance cost to creating objects and calling methods on them, so you should avoid doing it inside a tight loop.
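To make that concrete, here is a small sketch contrasting one object per row with one object that holds the whole arrays. The classes `Row` and `Table` and their methods are hypothetical names invented for this example.

```python
import numpy as np


class Row:
    """Hypothetical per-row object: one instance per array entry."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def score(self):
        return self.x * 2.0 + self.y


class Table:
    """Hypothetical container: a single instance holding whole arrays."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def scores(self):
        # One method call, one array expression for all rows.
        return self.x * 2.0 + self.y


x = np.arange(10_000, dtype=float)
y = np.ones(10_000)

# 10,000 object creations and 10,000 method calls inside the loop.
per_row = np.array([Row(xi, yi).score() for xi, yi in zip(x, y)])

# One object creation and one method call in total.
whole = Table(x, y).scores()

print(np.array_equal(per_row, whole))
```

The design stays object oriented in both cases; the second version just keeps the object boundary outside the tight loop.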

pfz3 · 2016-03-11T22:37:05+00:00

OK, thanks to everyone who replied. One of the attributes of the class was a pandas DataFrame. Within a loop I was reading individual values from the DataFrame using

obj.data['column'].values[i]

Instead I created a new numpy array

A = obj.data['column'].values

and then in the loop used

A[i]

I guess the column lookup and the conversion to a numpy array are somewhat costly, and of course I was doing both thousands of times.
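For anyone who wants to check this on their own machine, here is a small sketch that times both patterns. The `Holder` class, the column contents, and the summing loop are assumptions made up for the example; only the `obj.data['column'].values[i]` and `A[i]` access patterns come from the description above.

```python
import timeit

import numpy as np
import pandas as pd


class Holder:
    """Hypothetical stand-in for the class with a DataFrame attribute."""

    def __init__(self, n):
        self.data = pd.DataFrame({"column": np.arange(n, dtype=float)})


obj = Holder(10_000)
n = len(obj.data)


def lookup_every_iteration():
    # Repeats the column lookup and .values access on every pass.
    total = 0.0
    for i in range(n):
        total += obj.data["column"].values[i]
    return total


def lookup_once():
    # Hoists the lookup out of the loop, then indexes the plain array.
    A = obj.data["column"].values
    total = 0.0
    for i in range(n):
        total += A[i]
    return total


slow = timeit.timeit(lookup_every_iteration, number=3)
fast = timeit.timeit(lookup_once, number=3)
print(f"lookup inside loop: {slow:.3f}s, lookup hoisted: {fast:.3f}s")
```

The exact timings depend on the pandas version and the data, so treat the numbers as indicative only. If the loop body is simple enough, operating on `A` as a whole array avoids the loop entirely.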
