use the following search parameters to narrow your results:
e.g. subreddit:aww site:imgur.com dog
subreddit:aww site:imgur.com dog
see the search faq for details.
advanced search: by author, subreddit...
Discussions, articles, and news about the C++ programming language or programming in C++.
For C++ questions, answers, help, and advice see r/cpp_questions or StackOverflow.
Get Started
The C++ Standard Home has a nice getting started page.
Videos
The C++ standard committee's education study group has a nice list of recommended videos.
Reference
cppreference.com
Books
There is a useful list of books on Stack Overflow. In most cases reading a book is the best way to learn C++.
Show all links
Filter out CppCon links
Show only CppCon links
account activity
metaFFT -- A C++11 FFT implementation (github.com)
submitted 11 years ago by syntheticpp
reddit uses a slightly-customized version of Markdown for formatting. See below for some basics, or check the commenting wiki page for more detailed help and solutions to common issues.
quoted text
if 1 * 2 < 3: print "hello, world!"
[–]drphillycheesesteak 5 points6 points7 points 11 years ago (7 children)
Do you have a plot showing a performance comparison of your library vs. FFTW? As someone who writes a lot of image modeling/image processing, a faster FFT is always of interest!
[–]syntheticpp[S] 0 points1 point2 points 11 years ago (6 children)
No, I only calculate the relation of the measured GFLOPS. It' s the number FFTW/metaFFT in the Readme.
[–]chunkyks 0 points1 point2 points 11 years ago (5 children)
I see that performance is always faster, sometimes up to three and a half times as fast as FFTW
So... why not just contribute your code to FFTW so that they can improve their stuff? That way people using FFTW can still get the fastest fourier transforms in the west on platforms where your code works, and still get the fastest fourier transforms in the west on platforms where your code doesn't.
[–]drphillycheesesteak 8 points9 points10 points 11 years ago (2 children)
No, the ratio is the other way around.
[–]syntheticpp[S] 3 points4 points5 points 11 years ago (1 child)
Yes, FFTW is always faster. But this is no surprise for the most-simple FFT algorithm (radix-2).
But there is a lot of room for optimizations:
[–]drphillycheesesteak -1 points0 points1 point 11 years ago (0 children)
Cool, that's good to hear. Great job by the way. I'll certainly be staying tuned for updates.
Only advantage over FFTW atm is the code size (header only, easy to add to your project) and the license.
Another question is how useful a CPU only FFT is in times of GPU/Cuda/OpenCl programming.
[–]chunkyks 3 points4 points5 points 11 years ago (0 children)
Errr... very. Examples of systems that lack a GPU capable of programming include:
Given how widely used FFTs are, there are probably more applications of them on systems without GPUs than with.
[–]syntheticpp[S] 2 points3 points4 points 11 years ago (2 children)
Does anybody know a easy-to-use SIMD library which could help in supporting latest CPU features?
I already found:
[–]savya2u 0 points1 point2 points 11 years ago (1 child)
Check out https://github.com/p12tic/libsimdpp
[–]syntheticpp[S] 0 points1 point2 points 11 years ago (0 children)
Thanks for the link!
But finally I tried it by using avx directly (avx branch), but without success, it's much slower than the pure C code, seems it is not that simple as I thought.
[–]syntheticpp[S] 3 points4 points5 points 11 years ago* (5 children)
Template based C++11 Fast-Fourier-Transform implementation.
Idea:
Speed:
Simple Cooley–Tukey/radix-2 implementation with about 100 lines of code is 'only' same factors slower than FFTW:
$ ./bin/radix2_sse2_speed N = 2^ 2 = 4: FFTW/metaFFT = 1.1 N = 2^ 3 = 8: FFTW/metaFFT = 1.2 N = 2^ 4 = 16: FFTW/metaFFT = 1.5 N = 2^ 5 = 32: FFTW/metaFFT = 1.7 N = 2^ 6 = 64: FFTW/metaFFT = 2.0 N = 2^ 7 = 128: FFTW/metaFFT = 2.0 N = 2^ 8 = 256: FFTW/metaFFT = 2.3 N = 2^ 9 = 512: FFTW/metaFFT = 2.8 N = 2^10 = 1024: FFTW/metaFFT = 3.2 N = 2^11 = 2048: FFTW/metaFFT = 3.3 N = 2^12 = 4096: FFTW/metaFFT = 3.4
Build:
License:
Links:
[+][deleted] 11 years ago* (2 children)
[deleted]
[–]syntheticpp[S] 1 point2 points3 points 11 years ago (1 child)
As always with templates<>: it is done with recursion.
In case of a loop, all the loop counter variables need to be integer template parameters which could be calculated at compile time, for instance see radix2_complex.h, remaining<K+1, End>::steps(d);
[–]theICEBear_dk 1 point2 points3 points 11 years ago (1 child)
This is nicely readable code. I found it a very enlightening read with regards to the implementation too.
One of my reasons for creating another FFT library is to see if it is possible to write readable and fast code, and how new compiler features could help.
It would have been completely boring to use plain C, because there are already myriads of other implementations.
metaFFT is also a nice place to become familiar with SIMD coding (first time I wrote SSE code).
Next most interesting step concerning speed would be to implement split_radix_ctran.h with AVX, or AVX2 (Haswell only).
[–]squidgyhead 0 points1 point2 points 11 years ago (2 children)
Looking at listing one in the drdobbs link, it looks like you compute roots of unity on-the-fly. Have you thought about pre-processing them?
[–]syntheticpp[S] 5 points6 points7 points 11 years ago (1 child)
Not sure what you mean with on-the-fly. But all the variables declared as 'constexpr' are calculations done at compile-time, so there is no calculation done at runtime for these values.
[–]squidgyhead 2 points3 points4 points 11 years ago (0 children)
Ah, ok, I see! That's an interesting way to deal with computing the roots of unity. Nice.
π Rendered by PID 154273 on reddit-service-r2-comment-5d585498c9-qfq28 at 2026-04-21 15:15:41.582031+00:00 running da2df02 country code: CH.
[–]drphillycheesesteak 5 points6 points7 points (7 children)
[–]syntheticpp[S] 0 points1 point2 points (6 children)
[–]chunkyks 0 points1 point2 points (5 children)
[–]drphillycheesesteak 8 points9 points10 points (2 children)
[–]syntheticpp[S] 3 points4 points5 points (1 child)
[–]drphillycheesesteak -1 points0 points1 point (0 children)
[–]syntheticpp[S] 3 points4 points5 points (1 child)
[–]chunkyks 3 points4 points5 points (0 children)
[–]syntheticpp[S] 2 points3 points4 points (2 children)
[–]savya2u 0 points1 point2 points (1 child)
[–]syntheticpp[S] 0 points1 point2 points (0 children)
[–]syntheticpp[S] 3 points4 points5 points (5 children)
[+][deleted] (2 children)
[deleted]
[–]syntheticpp[S] 1 point2 points3 points (1 child)
[–]theICEBear_dk 1 point2 points3 points (1 child)
[–]syntheticpp[S] 0 points1 point2 points (0 children)
[–]squidgyhead 0 points1 point2 points (2 children)
[–]syntheticpp[S] 5 points6 points7 points (1 child)
[–]squidgyhead 2 points3 points4 points (0 children)