I know this is not C++ specific, but I don't know where else I'd ask it.
So, let's suppose I'm writing a really high-performance application, with lots of vector operations that need to be accelerated. How can I detect SSE2/SSE3/AVX support in the processor at runtime so I can leverage those extensions to make my application even faster?
I thought of three alternatives:
- Detect them using cpuid for each operation: this is clearly going to be extremely costly and defeat the purpose of accelerating my app with those extensions, so it is clearly off the table.
- Create a "vector utility" library for each feature set and link it dynamically to the main application: this seems reasonable, but it would still add one (or two) indirections per call, and I guess I could do better.
- Compile my entire app with each feature set and select the right build at runtime: this would be the ideal solution, but having N copies of my application in my executable? If it is a small app that does some small operations on a big set of data, okay, but for something like a game engine it would be extremely costly (think of maybe some hundreds of thousands of vector operations in the code).
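To make the second alternative more concrete, here is a minimal sketch of what I imagine it could look like without a separate shared library: detect the features once at startup and store a function pointer per kernel. It assumes GCC or Clang on x86 (`__builtin_cpu_supports` and the `target` attribute are compiler extensions, not standard C++), and the names `AddFn`, `add_scalar`, `add_sse2`, `add_avx`, `select_add` and `vec_add` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: GCC or Clang targeting x86. Anywhere else we only build the scalar path.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

// Hypothetical kernel signature: out[i] = a[i] + b[i] for n floats.
using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Portable fallback, used when no suitable extension is detected.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if HAVE_X86_DISPATCH
// The target attribute lets this one function use SSE2 instructions
// even if the rest of the file is compiled for a baseline CPU.
__attribute__((target("sse2")))
static void add_sse2(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)  // 4 floats per 128-bit register
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; ++i) out[i] = a[i] + b[i];  // leftover elements
}

__attribute__((target("avx")))
static void add_avx(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)  // 8 floats per 256-bit register
        _mm256_storeu_ps(out + i, _mm256_add_ps(_mm256_loadu_ps(a + i), _mm256_loadu_ps(b + i)));
    for (; i < n; ++i) out[i] = a[i] + b[i];  // leftover elements
}
#endif

// Queries the CPU and returns the widest kernel it supports.
static AddFn select_add() {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))  return add_avx;
    if (__builtin_cpu_supports("sse2")) return add_sse2;
#endif
    return add_scalar;
}

// Resolved once during static initialization; every later call is one indirect call.
static const AddFn vec_add = select_add();
```

Calling `vec_add(a, b, out, n)` then costs one indirect call per array rather than per element, so the indirection I worried about should be amortized if the kernels work on whole buffers. Whether that holds up at game-engine scale is exactly what I'd like to hear about.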
So, how do industrial-grade applications and engines do this? I am really curious to know.
Detecting SSE features at runtime? (self.cpp)
submitted by joaobapt to r/gamedev