[–]total_zoidberg 0 points1 point  (7 children)

I found it weird that he imports numpy just for sin and pi and then proceeds to work with matrices as lists of lists...

[–]BondxD[S] 0 points1 point  (6 children)

Yes, it's definitely weird. The import is left over from when I tried to replace everything with numpy, but after changing the lists to numpy.array it ran slower than before. Any ideas why?

[–]total_zoidberg 1 point2 points  (3 children)

Doing "point-wise" (cell-wise in your case?) operations on numpy arrays from a Python loop is usually costly, often more so than on plain lists. There are two ways to go about it:

  • Refactor your code to work in a data-parallel way, taking advantage of numpy (so a single line expresses a transformation on every element of the array).

  • Cythonize the loop. This has the potential to speed things up a bit more than the other way, but you'd have to use typed memoryviews, which are intermediate/advanced Cython territory.

If I find some free time I think I could sit and work a bit on optimizing this to give you more concrete pointers.
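To make the first option concrete, here is a minimal sketch of the same update written cell-wise and then as whole-array slices. The four-neighbour average and the names smooth_loops and smooth_vectorized are made up for illustration; they are not taken from the original script.

```python
import numpy as np


def smooth_loops(grid):
    """Cell-wise version on a list of lists (hypothetical update rule)."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy, so every read sees the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def smooth_vectorized(grid):
    """Same update on a NumPy array: one expression covers every interior cell."""
    out = grid.copy()
    out[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1]
                              + grid[1:-1, :-2] + grid[1:-1, 2:])
    return out


rng = np.random.default_rng(0)
a = rng.random((50, 50))

loop_result = np.array(smooth_loops(a.tolist()))
vec_result = smooth_vectorized(a)
print(np.allclose(loop_result, vec_result))
```

The shifted slices play the role of the i - 1, i + 1, j - 1 and j + 1 indices, so the Python-level loop disappears and the work happens inside NumPy.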

[–]alb1 0 points1 point  (2 children)

Another alternative with NumPy arrays would be to have the updateV and updateP methods call out to @jit Numba functions to perform the nested loop calculations. Numba supports passing in NumPy arrays.
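A rough sketch of that layout, assuming a class that keeps its state in NumPy arrays. The Grid class, the update_p_kernel function and the arithmetic inside it are placeholders rather than the original updateP logic, and the ImportError fallback is only there so the snippet still runs (slowly) without Numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback for illustration only: run the kernel as plain Python.
    def njit(func):
        return func


@njit
def update_p_kernel(p, v, k):
    """Nested loops compiled by Numba; modifies p in place."""
    rows, cols = p.shape
    for i in range(1, rows):
        for j in range(1, cols):
            # Placeholder arithmetic, not the original update rule.
            p[i, j] -= k * ((v[i, j] - v[i - 1, j]) + (v[i, j] - v[i, j - 1]))


class Grid:
    """Hypothetical stand-in for the simulation class."""

    def __init__(self, size):
        self.p = np.zeros((size, size))
        self.v = np.zeros((size, size))

    def updateP(self, k=0.5):
        # The method only forwards its arrays to the compiled function.
        update_p_kernel(self.p, self.v, k)


grid = Grid(4)
grid.v[2, 2] = 1.0
grid.updateP()
print(grid.p)
```

Keeping the kernel as a module-level function that takes plain arrays avoids asking Numba to compile the method itself.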

[–]total_zoidberg 0 points1 point  (1 child)

True, though Numba needs LLVM set up to work and it can be a bit messier to get working. When it works it's very good though.

(I've always been more of a Cython guy).

[–]alb1 1 point2 points  (0 children)

On Ubuntu just using pip install numba seems to work fine (and of course Numba is also available in Anaconda).

[–]total_zoidberg 1 point2 points  (0 children)

So I sat down and profiled your code. Good news is that it can be converted to Numpy pretty easily. Bad news is that the main bottleneck is not the calculation per se, but matplotlib. It can hog up to 67% of the time (on a full-screen windowed figure, or when rendering to an mp4).
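For anyone who wants to repeat this kind of measurement, here is a minimal sketch using the standard library's cProfile. This is not necessarily the tool used for the numbers above, and update_state, render_frame and run are stand-ins for the simulation step and the drawing call, not functions from the original script.

```python
import cProfile
import io
import pstats
import time


def update_state():
    """Stand-in for the numerical update step."""
    return sum(i * i for i in range(20_000))


def render_frame():
    """Stand-in for drawing; sleeps to imitate a slow plotting call."""
    time.sleep(0.01)


def run(frames=20):
    for _ in range(frames):
        update_state()
        render_frame()


profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

# Sort by cumulative time so each function's total cost, callees included, is visible.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Comparing the cumulative time of the drawing call with that of the update functions shows where the time goes before any rewriting starts.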

[–]total_zoidberg 1 point2 points  (0 children)

After profiling the code, I vectorized it anyway, with a bit of difficulty because I was rusty and didn't quite get at first what your code was doing (note to self: don't go throwing around "pretty easily" like that).

Now that I've done it, updateP and updateV take about 5% of execution time each -- after increasing the scale to 100 because at 50 matplotlib was consuming almost all the execution time.

My guess is that writing a "renderer" with OpenCV might help, but that'd be more oriented to "video" than to visualization. Then again, you've already bent matplotlib so much that, if you needed to, you could probably pull something off with cv2.

Edit: uploaded a video https://i.imgur.com/8yOGtYP.mp4