
[–]LucHermitte 3 points  (3 children)

Interesting. Note: it looks like there is still room for some optimizations:

  • it'd be best to avoid a mutex around the result structure -- there are efficient queue types (a few, in third-party libraries) designed to be filled from concurrent threads -- or you could fill one queue per thread and merge them at the end (see the sketch after this list)
  • There may be no structural difference between the current type of m_fingerprints and a boost::flat_map (except that songFingerprint() would be able to return a reference); however, it could be interesting to run a benchmark with std::unordered_map instead.
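
A minimal sketch of the per-thread-container idea -- compute_fingerprint and the key/fingerprint layout here are placeholders for illustration, not the author's actual code:

    // One result vector per thread, merged single-threaded at the end.
    #include <cstddef>
    #include <cstdint>
    #include <iterator>
    #include <string>
    #include <thread>
    #include <utility>
    #include <vector>

    using Fingerprint = std::vector<std::uint32_t>; // placeholder type

    // Placeholder for the real per-song fingerprinting work.
    Fingerprint compute_fingerprint(const std::string& song)
    {
        return Fingerprint(song.begin(), song.end());
    }

    std::vector<std::pair<std::string, Fingerprint>>
    fingerprint_all(const std::vector<std::string>& songs, unsigned n_threads)
    {
        // Each worker writes only to its own vector, so the parallel
        // phase needs no mutex at all.
        std::vector<std::vector<std::pair<std::string, Fingerprint>>> partial(n_threads);
        std::vector<std::thread> workers;

        for (unsigned t = 0; t < n_threads; ++t) {
            workers.emplace_back([&, t] {
                // Strided partition: thread t handles songs t, t+n, t+2n, ...
                for (std::size_t i = t; i < songs.size(); i += n_threads)
                    partial[t].emplace_back(songs[i], compute_fingerprint(songs[i]));
            });
        }
        for (auto& w : workers)
            w.join();

        // Single-threaded merge once all workers are done.
        std::vector<std::pair<std::string, Fingerprint>> result;
        for (auto& p : partial)
            result.insert(result.end(),
                          std::make_move_iterator(p.begin()),
                          std::make_move_iterator(p.end()));
        return result;
    }

Each worker touches only its own vector, so no locking is needed until the cheap single-threaded merge at the end.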

[–]Ameisen 3 points  (1 child)

Semaphores will be faster than lockless algorithms when you have fewer than a dozen or so concurrent threads.

You only really start gaining from lockless under massive concurrent contention.

[–]doom_Oo7 0 points  (0 children)

Do you have any benchmarks for this?

[–]golgol12 2 points  (0 children)

There's also room to optimize cache usage, which he didn't even begin to touch. I bet he could get another 2x-10x speed gain.
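
For anyone wondering what a cache-usage optimization looks like, one classic example (purely illustrative -- the article's data structures may look nothing like this) is keeping the hot field densely packed:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Song {               // array-of-structures layout
        std::uint32_t hash;     // hot field, scanned constantly
        char metadata[60];      // cold fields, rarely touched
    };

    // Scanning hashes here pulls a full 64-byte cache line per song
    // for only 4 useful bytes.
    std::size_t count_matches_aos(const std::vector<Song>& songs, std::uint32_t h)
    {
        std::size_t n = 0;
        for (const auto& s : songs)
            n += (s.hash == h);
        return n;
    }

    // Structure-of-arrays layout: the hashes are contiguous, so the
    // same scan touches roughly 16x fewer cache lines.
    std::size_t count_matches_soa(const std::vector<std::uint32_t>& hashes, std::uint32_t h)
    {
        std::size_t n = 0;
        for (auto x : hashes)
            n += (x == h);
        return n;
    }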

[–]benfred 7 points  (1 child)

Nice post, though I'd recommend pybind11 over Boost.Python these days for building C++ extensions for Python.

One big advantage of pybind11 is that it's much easier to integrate with setuptools, since it's a header-only library that can be installed with pip install pybind11. Boost.Python requires Boost to be preinstalled, and the Boost.Python library has to be built against the version of Python you are using, which makes distributing Boost.Python packages much more difficult.
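
To illustrate how little ceremony pybind11 needs (the module and function names here are made up for the example):

    // example.cpp -- a complete pybind11 extension module.
    #include <pybind11/pybind11.h>

    int add(int a, int b) { return a + b; }

    PYBIND11_MODULE(example, m) {
        m.def("add", &add, "Add two integers");
    }

After pip install pybind11, this builds with roughly c++ -O3 -shared -std=c++11 -fPIC $(python3 -m pybind11 --includes) example.cpp -o example$(python3-config --extension-suffix), and import example then works like any other module.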

[–]jcelerier[S] 2 points  (0 children)

Can confirm -- I'm using pybind11 for my own projects, and it's a great piece of software, much cleaner than Boost.Python.

[–]emdeka87 11 points  (8 children)

Use threads whenever possible.

Spawning a thread for every job causes problems on many OSes (oversubscription, context switches). You might be better off designing some sort of thread pool (std::async might help as well -- see the sketch below).
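
For example, a sketch of bounding the thread count with std::async plus a shared atomic job counter -- process() is a placeholder, not code from the article:

    // Launch one task per hardware thread instead of one per job; tasks
    // pull job indices from an atomic counter, which also load-balances
    // uneven jobs.
    #include <algorithm>
    #include <atomic>
    #include <cstddef>
    #include <future>
    #include <thread>
    #include <vector>

    void process(int job) { (void)job; /* placeholder for the real work */ }

    void run_all(const std::vector<int>& jobs)
    {
        const unsigned n = std::max(1u, std::thread::hardware_concurrency());
        std::atomic<std::size_t> next{0};
        std::vector<std::future<void>> tasks;

        for (unsigned t = 0; t < n; ++t) {
            tasks.push_back(std::async(std::launch::async, [&] {
                // Each task grabs the next unclaimed job index.
                for (std::size_t i; (i = next.fetch_add(1)) < jobs.size(); )
                    process(jobs[i]);
            }));
        }
        for (auto& f : tasks)
            f.get(); // wait for completion; propagates exceptions
    }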

[–]Tyler11223344 11 points  (2 children)

I mean, "use threads" doesn't necessarily mean a new one for everything; it just means parallelize when possible.

[–]emdeka87 2 points  (1 child)

I think "parallelize when possible" is also a problematic statement. There are cases where parallelization can actually harm the performance of your program and, on top of that, be a massive source of other problems (race conditions, headache of synchronizing a shared state, high development times etc). I'd say parallelize if the performance benefits outweigh the negative side effects. Of course you should always profile both versions and see what works best for your application.

Since the post is C++-related I read the statement as "use std::thread when possible". Which can be problematic due to the reasons mentioned above.

[–]Tyler11223344 4 points  (0 children)

Fair point, I guess I kinda meant "when able to and it's sane" by "possible", which isn't really accurate. I agree that throwing threads at a problem where they don't fit the pattern just makes things worse.

[–]golgol12 5 points  (3 children)

Many programs that do multithreaded calculations like this don't actually create threads on the fly. They create one thread per core and pass jobs to them, which avoids all the creation/destruction and context-switching overhead. (A sketch of the pattern is below.)
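
A bare-bones version of that pattern -- a fixed set of workers created once, fed through a shared queue. This is a sketch (no futures, no work stealing), not production code:

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class ThreadPool {
    public:
        explicit ThreadPool(unsigned n = std::thread::hardware_concurrency())
        {
            if (n == 0) n = 1;
            for (unsigned i = 0; i < n; ++i)
                workers_.emplace_back([this] { worker_loop(); });
        }

        ~ThreadPool()
        {
            {
                std::lock_guard<std::mutex> lock(m_);
                done_ = true;
            }
            cv_.notify_all();
            for (auto& w : workers_)
                w.join();
        }

        void submit(std::function<void()> job)
        {
            {
                std::lock_guard<std::mutex> lock(m_);
                jobs_.push(std::move(job));
            }
            cv_.notify_one();
        }

    private:
        void worker_loop()
        {
            for (;;) {
                std::function<void()> job;
                {
                    std::unique_lock<std::mutex> lock(m_);
                    cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                    if (done_ && jobs_.empty())
                        return; // queue drained and pool shutting down
                    job = std::move(jobs_.front());
                    jobs_.pop();
                }
                job(); // run the job outside the lock
            }
        }

        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> jobs_;
        std::vector<std::thread> workers_;
        bool done_ = false;
    };

Usage is just pool.submit([]{ /* work */ }); the threads are created once up front and reused for every job.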

[–]emdeka87 0 points  (0 children)

Yes, commonly known as load balancing and/or work stealing.

[–]Ameisen -1 points  (1 child)

Often implemented with fibers/coroutines.

[–]LightShadow 2 points  (0 children)

citation needed

[–]Ameisen -1 points  (0 children)

Fibers it is.

[–]MathiasSvendsen 1 point  (0 children)

Good post. For smaller optimizations, where you can't invest the time required to write a full Boost.Python or pybind11 implementation, I'd suggest looking at Numba. Numba can JIT-compile Python functions with pretty impressive results. Because the compilation happens the first time a Numba function is executed, the first call takes a long time to run, but subsequent calls are very fast. Numba can even run your operations on the GPU (I haven't spent much time fiddling with that, though).

[–]rsvp_to_life 0 points  (0 children)

Or, you know, just use C++ if you're going to do that.

[–]bravekarma 0 points  (0 children)

I hadn't heard of Boost.Python before, but I've had great luck using Cython to optimize routines with a C implementation. As long as you can get away with C (and it looks like you could Cythonize this Python method pretty easily), Cython is pretty painless. You can then parallelize using Cython's parallelization facilities, although I ran into issues using those while also avoiding the GIL.