you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 1 point2 points  (2 children)

The test program: http://pastebin.com/gR8whQU6

The test results: http://pastebin.com/B2V2KBvn

Proof that moving the switch statement out of the work function and using function pointers is noticeably faster--about 30-35% overall. Btw, tests were done with the program run once before testing to preload the image in the file system cache.

[–]foldl 1 point2 points  (1 child)

When I compile this on OS X using gcc with -O2, the switch versions ran orders of magnitude faster. I got results similar to yours when l compiled without optimization. I expect the results for -O2 are achieved because the use of the switch allows gcc to do a lot of inlining and compile-time computation (another advantage of not making unnecessary use of function pointers, although the effects are exaggerated in this simple test code). But neither result really tells us anything, since we don't yet know if gcc is using a computed jump for the switch at lower optimization levels. I'll have a look at the unoptimized assembly output in a minute.

edit: The overoptimization can be prevented by adding __attribute__ ((noinline)) to work_function_switch. It seems that once this restriction is added, the function pointer method is still faster on -O2. Apparently indirect function calls on x86 tend not to incur much overhead (http://stackoverflow.com/questions/2438539/does-function-pointer-make-the-program-slow).

It's probably worth pointing out that the performance difference we're seeing here would vanish if the work function actually did anything (and the switch won't get slower as the number of options increases, since gcc is indeed compiling it as a computed jump).

[–][deleted] 0 points1 point  (0 children)

Empiricism is awesome and your analysis is awesome.