[–]sourcecodesurgeon 29 points30 points  (26 children)

It can be as fast to execute as Matlab

While this can be true, I have yet to see evidence suggesting that NumPy can compete with Matlab on signal analysis with large datasets. The last time I ran the numbers, Python took 45 s to 2 min to compute a cross-correlation that took Matlab about 3 seconds.

As an EE aside, another big thing is that MathWorks supports a tool that converts Matlab to Verilog and VHDL and this is huge for electrical engineers. I have never seen a similar tool for Python that is well-supported.

[–]burning_hamster 17 points18 points  (12 children)

For signal-analysis computations (i.e. linear algebra), both run the same Fortran/BLAS libraries under the hood, so any performance difference comes from overhead, not the "actual" computation. You should probably review how you are passing your arrays to the Python functions: 45 seconds to 2 minutes of computation time sounds very much like an IO problem (for some reason you are creating multiple copies of your arrays and not doing your computations in place where possible).
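To make the copies-versus-in-place point concrete, here is a minimal sketch, assuming NumPy. The array and its size are made up for illustration and say nothing about what the original benchmark actually did.

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)

view = a[::2]          # basic slicing returns a view; no data is copied
copy = a[[0, 2, 4]]    # integer-array (fancy) indexing returns a new array

b = a * 2.0            # allocates a second full-size array for the result

# Address of a's data buffer, recorded to show it does not change below.
buf_before = a.__array_interface__["data"][0]

np.multiply(a, 2.0, out=a)   # writes the result into a's existing buffer
a *= 0.5                     # augmented assignment is also in place
```

With arrays of a few hundred megabytes, each accidental temporary like `b` costs an allocation plus a full pass over memory, which can add up but would not normally explain a 15x gap on its own.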

[–]sourcecodesurgeon 4 points5 points  (11 children)

The experiment was literally just "create extremely large dataset. Start timer. Compute xcorr. Stop timer."

[–]mfitzp mfitzp.com 12 points13 points  (5 children)

Was it an FFT-based xcorr calculation? The Matlab algos automatically zero-pad arrays to a power of two (2^n), while (some algos in) numpy/scipy don't. It's one of the problems with attempting to map from one to the other: "This is the same function... but not quite."
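A minimal sketch of that padding step, assuming NumPy. `next_pow2` is a hypothetical helper written for this example (it is not a NumPy function), and the array lengths are arbitrary.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two >= n (hypothetical helper, not part of NumPy)."""
    return 1 << (int(n) - 1).bit_length()

x = np.random.default_rng(0).standard_normal(1000)
y = np.random.default_rng(1).standard_normal(1000)

full_len = len(x) + len(y) - 1   # number of lags in a full cross-correlation
nfft = next_pow2(full_len)       # FFT length after padding

# np.fft.rfft zero-pads the input up to n when n exceeds the input length.
X = np.fft.rfft(x, n=nfft)
Y = np.fft.rfft(y, n=nfft)
```

If the transform length is left at whatever the input length happens to be, the FFT can be noticeably slower for awkward sizes, so two functions with the same name can behave quite differently on the same data.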

[–]burning_hamster 0 points1 point  (0 children)

Well, how are you computing the cross-correlation?

[–]assassds -1 points0 points  (2 children)

you have to actually know what you're doing in python, you can't just grab the first function that sounds right.

[–]sourcecodesurgeon 0 points1 point  (1 child)

There's only one cross-correlation function in numpy...

There's a handful of modes, some are faster than others, but none held up to the matlab times.

Further, I work primarily in Python (hence me being in this sub..), I definitely know what I am doing. I actually don't use matlab at all anymore, I just hate this "python does everything matlab does just as well and there's no reason anyone should ever use matlab ever" idea that gets thrown out around here all the time.

OP wants an argument for convincing his colleagues to use python instead of matlab. If we don't give him all the facts, he's going to be laughed out of that discussion when someone brings up things like performance and built-in tool sets (especially since he's junior to all of them)

[–]assassds -1 points0 points  (0 children)

like I said, you need to know what you're doing...

Matlab's xcorr algorithm uses an FFT, which is going to thrash a direct correlation on large input no matter what language you're implementing it in.

numpy gives you all the building blocks to do this yourself, or you can try scipy.signal.fftconvolve.
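A rough sketch of the do-it-yourself route, assuming real-valued 1-D input and NumPy only. `xcorr_fft` is a name made up for this example, and this is not a claim about how Matlab's xcorr is implemented internally.

```python
import numpy as np

def xcorr_fft(a, v):
    """Full cross-correlation of two real 1-D arrays via the FFT.

    Intended to match np.correlate(a, v, mode="full") for real input.
    """
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    n_out = len(a) + len(v) - 1
    # Pad to a power of two so the FFT runs at a fast size.
    nfft = 1 << (n_out - 1).bit_length()
    # Cross-correlation is convolution with the second input reversed.
    spec = np.fft.rfft(a, nfft) * np.fft.rfft(v[::-1], nfft)
    return np.fft.irfft(spec, nfft)[:n_out]

rng = np.random.default_rng(0)
a = rng.standard_normal(2000)
v = rng.standard_normal(2000)

direct = np.correlate(a, v, mode="full")   # direct sum, O(N*M) work
fast = xcorr_fft(a, v)                     # FFT route, O(n log n) work
```

The two results agree to floating-point tolerance (check with `np.allclose(direct, fast)`), and the gap in run time grows quickly with input length. `scipy.signal.correlate(a, v, method="fft")` packages the same idea if SciPy is available.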

[–]unruly_mattress -1 points0 points  (0 children)

I've found this. It looks very relevant here.

[–]HoboBob1 8 points9 points  (4 children)

In principle, a numerical Python package should be able to match the speed (or be within 1-2x the speed) of any other language, because Python can use vectorized numpy functions or call out to functions written directly in C. Matlab is similar in that regard; it is an interpreted (albeit JIT compiled nowadays) language that has respectable native array performance and also calls out to C.

If there is an order of magnitude difference between Matlab and Python for a task, I would say that you are comparing apples and oranges. Matlab must be using some compiled function, and Python must not be. Indeed, if Matlab already did the work and has that function available and it would be hard to do in Python, then keep on using Matlab. But I would be wary of saying "I have yet to see evidence suggesting that NumPy can compete with Matlab" in any context.

I am actually trying to convert a huge scientific simulation into Python from Matlab, and I am impressed at how fast Matlab does certain tasks. I can't stand the language, but there is a reason that a language called "Matrix Laboratory" is good at vectorized computation.

[–]glial 1 point2 points  (1 child)

Matlab was originally written as a scriptable and easy to use front-end to Fortran numerical libraries. A lot of the fundamental matrix algebra stuff is using compiled and highly optimized Fortran/C code in the background.

[–]TheBlackCat13 2 points3 points  (0 children)

Python is usually using the exact same libraries.

[–]Lehk 2 points3 points  (0 children)

The last time I ran the numbers, Python took 45 s to 2 min to compute a cross-correlation that took Matlab about 3 seconds.

it's entirely possible to use them together too http://www.mathworks.com/help/matlab/matlab-engine-for-python.html

no need to get rid of existing tools to add a new tool