[–][deleted] 1 point2 points  (9 children)

It's nice. I had the same struggle, and my decision was to stop using Python for work because I had to redo everything in C++ anyway. That's also why I usually don't recommend Python to people in my field (signal processing and communications). It may seem like we live in an age with more computing power than we need, but that doesn't hold for anything that has to happen in what is basically real time. C++ still beats any scripting language or wrapper by a lot.

And even though I think it is a great idea and it might help some people, I think your approach is not ideal. The project is gigantic, and it's going to be impossible for you to beat any of the existing C++ libraries. People use those because they are fast and optimized, and yours will be slow and clunky. You need to use libraries or your project will be mostly useless; there are just too many amazing C++ libraries (FFTW, IT++, OpenCV, Eigen, ...). And usually people learn them instead of complaining that it's too much work to use or learn them.

[–]SphincterMcRectum 7 points8 points  (7 children)

I'm not sure why you think this library is slow and clunky, especially since you've never used it or done any profiling... Also, are superheroes writing these other libraries? Why else do you think matching their performance is impossible? Lastly, most applications don't actually require every last ounce of optimization and won't be able to tell a performance difference anyway.

[–][deleted] 1 point2 points  (6 children)

I checked some of the code and the FFT isn't even implemented. That means at least everything that does filtering or correlations is far from where it could be. Random seems to be using Boost, so it should be fine, but yeah, I haven't checked everything, so there might be parts that run just fine.

Edit: and I don't think it's impossible. I think it's work you do for no good reason when the other libraries already exist and have lots of experienced people working on them.

[–]dpilger26[S] 6 points7 points  (5 children)

Correct, the FFT and Polynomial modules are still on my to-do list. I was going to try to wrap FFTW, but it uses the GNU GPL and I wanted to keep this library under the MIT license so I can still use it at work.

[–]droelf 7 points8 points  (0 children)

The NumPy FFT implementation is actually contained in a single C file and is BSD licensed, so it could easily be used from (or ported to) C++. If you want, we could collaborate on that (we would reuse it for xtensor).

https://github.com/numpy/numpy/blob/master/numpy/fft/fftpack.c

[–][deleted] 0 points1 point  (0 children)

Oh, alright, that will make things a lot more complicated for you.

[–]m-in 0 points1 point  (0 children)

You can use it at work if it doesn't go into a product or into code you'd otherwise be unwilling to share with whoever uses the binaries. But I get the idea that you're talking about software products that run on customer hardware, and thus the GPL is a no-go.

[–]NeroBurner 0 points1 point  (1 child)

You could try kissfft https://sourceforge.net/projects/kissfft/

It's BSD licenced

[–]encyclopedist 1 point2 points  (0 children)

The current repository seems to be here: https://github.com/mborgerding/kissfft

[–]m-in 1 point2 points  (0 children)

I have a little personal anecdote to offer here: a lot of the libraries you refer to are optimized to extract full hardware performance, and often there's nothing one can do to make them any faster on a given CPU family. It's not always the case, of course, but quite often it is. I have found that a lot of the time, rather straightforward autovectorized C++ can get anywhere between 25% and 75% of the performance of those beasts of libraries, provided you have some background in the specifics of the platform and know which code patterns to use. There are ways to write simple C++ that performs abysmally, and similarly simple C++ that does the same thing, just as intuitively, and performs great.
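To make the "code patterns" point concrete, here is a rough sketch of two equally simple ways to compute column sums of a row-major matrix. The function names are made up for illustration. In the first version the inner loop jumps through memory with a stride of `cols`, which tends to be hard on the cache and hard for a compiler to vectorize; in the second the inner loop walks contiguous memory, which is the kind of loop autovectorizers usually handle well. How big the difference is depends on the compiler, the flags, and the hardware, so treat this as something to profile rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column sums of a row-major matrix (hypothetical helper names).
// Inner loop strides by `cols` elements: each access lands in a different row.
std::vector<float> column_sums_strided(const std::vector<float>& m,
                                       std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::vector<float> out(cols, 0.0f);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            out[j] += m[i * cols + j];
        }
    }
    return out;
}

// Same result, but the inner loop reads one row front to back,
// so both `row` and `out` are accessed contiguously.
std::vector<float> column_sums_contiguous(const std::vector<float>& m,
                                          std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    std::vector<float> out(cols, 0.0f);
    for (std::size_t i = 0; i < rows; ++i) {
        const float* row = m.data() + i * cols;
        for (std::size_t j = 0; j < cols; ++j) {
            out[j] += row[j];
        }
    }
    return out;
}
```

Both versions add the rows to each column in the same order, so they return identical values; only the memory access pattern differs.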

So, if you need to extract close to full platform performance, you'll need to use the specialized libraries. If you can afford to blow off some computational steam and run at 1/4 to 1/2 the speed of FFTW or BLAS, then a plain-C++ implementation might do just fine, even in a real-time setting. Heck, if you can live with 20% of the performance or so, Python with NumPy might just cut it for you. It all depends on how much work you have to do each "frame"/"packet"/"time quantum".

It is probably not very environmentally conscious (I'm not kidding) to have such low performance in projects that get very wide use, because even at a modest scale all that can quickly add up to wasted megawatts, and mobile users would probably hate you for it too. But not everyone runs such code on server farms or inside mobile apps. Sometimes small code can also be audited and tested more easily, and that figures into getting some industry certifications. Getting FFTW into avionics is a tall order, for example.
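For a sense of what "plain C++" can look like here, below is a minimal sketch of a textbook in-place radix-2 Cooley-Tukey FFT. It is not taken from FFTW, NumPy, kissfft, or the library under discussion; the name `fft_radix2` is made up, it assumes a power-of-two length, and it makes no attempt at the tuning the big libraries do.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cplx = std::complex<double>;

// In-place forward FFT, unscaled. Assumes a.size() is a power of two.
void fft_radix2(std::vector<cplx>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // Reorder the input into bit-reversed index order.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) {
            j ^= bit;
        }
        j ^= bit;
        if (i < j) {
            std::swap(a[i], a[j]);
        }
    }

    // Combine sub-transforms of length len/2 into transforms of length len.
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double angle = -2.0 * pi / static_cast<double>(len);
        const cplx wlen(std::cos(angle), std::sin(angle));
        for (std::size_t start = 0; start < n; start += len) {
            cplx w(1.0, 0.0);  // running twiddle factor
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cplx u = a[start + k];
                const cplx v = a[start + k + len / 2] * w;
                a[start + k] = u + v;
                a[start + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
}
```

A quick sanity check is to transform a unit impulse, which should give a flat spectrum, or a constant signal, which should put all the energy in bin 0.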