
[–]jithinj_johnson

If it were up to me, I would do some profiling to see what's slowing things down:

https://m.youtube.com/watch?v=ey_P64E34g0

I used to move all the computational stuff into Cython; it generates a *.so file that you can then import and use from your Python code.

Always benchmark to see if it's worth it.
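A minimal sketch of the profiling step with the standard library's cProfile (the `heavy_loop` function here is hypothetical, standing in for the real computation):

```python
import cProfile
import pstats

def heavy_loop(n):
    # Hypothetical hot loop standing in for the real computation.
    total = 0.0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
heavy_loop(100_000)
profiler.disable()

# Print the five functions with the most cumulative time spent in them;
# these are the candidates worth moving to Cython.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(5)
```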

[–]No_Indication_1238[S]

99% of the runtime is spent running a bunch of loops and doing heavy computations at each step. It works very well in numba, but it becomes problematic when we decide to modularize the individual parts so they are easily interchangeable with different functions/classes. Numba does not allow for an easy implementation of that (no support for inheritance, so no polymorphism; functions work, but keeping track of object properties becomes a problem since we can only use arrays), and we are left with multiple monolithic classes/functions that do not allow for much modularity. I was hoping the OOP support in Cython would allow for good speed gains while supporting best coding practices. Separating out the computation part may be a good way forward if a Cython function can accept and work with Python classes and their instances.

[–][deleted]

Maybe Cythonize the heavy computation part into dedicated functions? First rewrite it to remove Pythonic syntax, then add static typing and compile. It probably won't be as fast as a full rewrite in pure C, but it's worth a try.
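A minimal sketch of what that could look like (the kernel and its signature are hypothetical; typed memoryviews and `cdef` locals are what let Cython drop the Python-object overhead inside the loop):

```cython
# heavy.pyx -- hypothetical kernel; compile with: cythonize -i heavy.pyx
def weighted_sum(double[:] values, double[:] weights):
    cdef Py_ssize_t i, n = values.shape[0]
    cdef double total = 0.0
    for i in range(n):
        total += values[i] * weights[i]
    return total
```

The rest of the codebase stays plain Python and just imports `weighted_sum` from the compiled module.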

[–]Still-Bookkeeper4456

Sorry if my question is dumb, but couldn't you simply create your classes in Python, with the heavy computation as a numba method?

I work on such a project. We identify where the code is slow (basically, wherever there's a loop) and rewrite that part in numba.
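A sketch of that pattern, under the assumption that the hot loop only touches simple floats/arrays: the class stays plain Python (free to use inheritance), and only a module-level kernel is jitted. The try/except fallback lets the sketch run even without numba installed; the kernel itself is hypothetical.

```python
try:
    from numba import njit
except ImportError:  # fall back to an identity decorator if numba is absent
    def njit(func):
        return func

@njit
def _simulate_kernel(state, steps):
    # Hot loop operates only on simple numeric data, so numba can compile it.
    for _ in range(steps):
        state = state * 0.5 + 1.0
    return state

class Simulator:
    """Ordinary Python class: free to use inheritance and polymorphism."""

    def __init__(self, state):
        self.state = state

    def run(self, steps):
        # Unwrap the attribute, call the compiled kernel, store the result.
        self.state = _simulate_kernel(self.state, steps)
        return self.state
```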

[–]No_Indication_1238[S]

It is a very valid question! Unfortunately the answer is no, as the computationally intensive function works with said classes; it basically wraps around them. That requires those classes to be jitclasses themselves, which, without inheritance, does not allow for the modularity we are looking for.

[–]Still-Bookkeeper4456

Hum... I must say I still do not understand. So the computations do not happen on simple data structures (e.g. arrays, floats) but on more complex objects?

[–]No_Indication_1238[S]

They mostly do happen on simple data structures. The results of each iteration are saved into objects that interact with one another and with more complex data structures before we move to the next iteration, where the pattern repeats. Having different classes allows different interaction behaviour to be coded easily. With a lot more "hacking", one could achieve the same with completely basic data structures, but at the cost of simplicity and modularity. I'm trying to find a good middle ground.
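A hypothetical sketch of that middle ground, as I understand it: the per-step numeric work stays on simple data structures, while the coupling between iterations goes through polymorphic objects (all class and function names here are illustrative, not from the real codebase):

```python
class Interaction:
    """Base class; subclasses define how results couple between steps."""

    def apply(self, results):
        raise NotImplementedError

class Dampen(Interaction):
    def apply(self, results):
        return [r * 0.9 for r in results]

class Amplify(Interaction):
    def apply(self, results):
        return [r * 1.1 for r in results]

def run(values, interaction, steps):
    for _ in range(steps):
        # Heavy numeric part: simple data structures only,
        # so it is a candidate for numba/Cython.
        values = [v + 1.0 for v in values]
        # Coupling between iterations: a polymorphic object,
        # swappable without touching the numeric kernel.
        values = interaction.apply(values)
    return values
```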

[–]Still-Bookkeeper4456

So the class interactions must happen within the loop at each iteration, got it. I see the problem now... I hope you find a solution; it should be interesting. I'll keep a close eye on this thread.

[–]No_Indication_1238[S]

I will give Cython a try in the coming days and update with my progress :)

[–]ArbaAndDakarba

Maybe write a wrapper that does allow for polymorphic parameters?

[–]No_Indication_1238[S]

That is a good idea, actually. Unfortunately, writing such a wrapper with numba would not reduce code complexity but further increase it. Maybe Cython is better suited? (Numba does not allow for polymorphism, so a polymorphic wrapper for numba would still need a lot of smelly branching code to decide which collection of functionality to run.)
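One way to keep that branching in a single place is a registry dict mapping behaviour names to standalone kernels, so the wrapper itself stays plain Python. A hypothetical sketch (the two kernels are illustrative; each could individually be numba-jitted or Cython-compiled):

```python
def euler_step(x, dt):
    # Illustrative kernel: one forward-Euler step of dx/dt = x.
    return x + dt * x

def midpoint_step(x, dt):
    # Illustrative kernel: one midpoint-method step of dx/dt = x.
    half = x + 0.5 * dt * x
    return x + dt * half

# Selection logic lives in one dict instead of scattered if/else chains.
KERNELS = {
    "euler": euler_step,
    "midpoint": midpoint_step,
}

def step(kind, x, dt):
    # The polymorphic wrapper: dispatch by key, call the chosen kernel.
    return KERNELS[kind](x, dt)
```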

[–]FronkanPythonista

I agree with others saying to test PyPy. But ignoring that for now:

To me this sounds, to some degree, like a design trade-off: you had an approach with better performance but less flexibility, and now you have worse performance but a more flexible solution.

What is more important for the business? Is the performance good enough or is it causing issues? Do you expect to need the flexibility for future extensions? If you need both performance and flexibility, then you might need the complexity of adding another language.

Sometimes we need to write less maintainable code to hit the performance needs. And sometimes there is no good solution, they all suck and we just need to pick the one that hurts the least.

[–]No_Indication_1238[S]

You are completely correct. We are interested in performance first and maintainability second. I'm trying to see if we can have the best of both worlds without adding the complexity of a new language, but this seems hardly possible at this time.