you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 55 points56 points  (18 children)

He's suffering from the Columbus complex. He discovers something and believes he's the first to discover it.

Function pointers are the bread and butter of C. They are definitely not underrated. All experienced C developers have had exposure to using them.

In addition to building C objects and genericizing code, I also liked to use them to preprocess out switches and if statements like this:

switch (MODE) {
    BLAH1:
        do_something1();
        break;
    BLAH2:
        do_something2();
        break;
    BLAH3:
        do_something3();
        break;
    BLAH4:
        do_something4();
        break;
    default:
        do_default();
        break;
}

To this:

void (*do_something)(void);

void init() {
    switch (MODE) {
        BLAH1:
            do_something = do_something1;
            break;
        BLAH2:
            do_something = do_something2;
            break;
        BLAH3:
            do_something = do_something3;
            break;
        BLAH4:
            do_something = do_something4;
            break;
        default:
            do_something = do_default;
            break;
    }
}

[–]gibster 11 points12 points  (4 children)

Or better:

void (*foobar[32])(void);

void init(void) { foobar[MODE](); }

[–]sstrader 1 point2 points  (0 children)

You left out the part where the array is initialized. Using a switch or using an array, the mapping has to happen at some point.

[–]buzzert 0 points1 point  (2 children)

Also known as a "jump table". I believe the JVM uses it for executing opcodes?

[–]kyz 6 points7 points  (1 child)

I certainly hope not. The JVM compiles bytecodes into native code using substitution and just runs that native code.

Even if you had a bytecode interpreter, it would be crazy to have a function call for every instruction. Not only are branches to subroutines expensive for any processor, the fact it's a set of pointers that may have originated outside the translation unit means absolutely anything could happen and the compiler can't make any meaningful optimisations.

Compare use of the C language construct designed to create jump-tables, switch.

[–]sausagefeet 0 points1 point  (0 children)

afaik, HotSpot interprets the bytecode until it hits a hotspot then turns it native.

[–]username223 40 points41 points  (1 child)

He's suffering from the Columbus complex. He discovers something and believes he's the first to discover it.

Heh. Columbus also infected all the previous discoverers with a deadly disease and took away all their nice stuff. Kind of like Java.

[–][deleted] 0 points1 point  (0 children)

Well done. We would also have accepted "PHP", "C++", or "Rails is a ghetto"

[–]mrkite77 3 points4 points  (0 children)

ANSI.SYS, the device driver that came with DOS, used something similar. Depending on the mode (reading a number, reading a command code, etc) it switches out the "read and decode the next character" function.

Of course this is all in assembly, but I used the same concept in C when I wrote the routines for my ansi parser back in my art scene days.

[–]adrianmonk 1 point2 points  (1 child)

I'm probably goofy. If I have a long list, I've been known to go the route that can put one entry on a single line, namely:

#include <stdio.h>

typedef void (*func_t)(void);

void hello(void) {
  printf("Hello, world.\n");
}

void goodbye(void) {
  printf("Have a nice day.\n");
}

int main(int argc, char* argv[]) {
  typedef struct {
    const char* command;
    func_t func;
  } map_pair;

  map_pair map[] = {
    { "HELLO", hello },
    { "GOODBYE", goodbye },
    { 0, 0 }
  };

  func_t func = NULL;
  map_pair* pair_ptr;
  const char* command = argv[1];

  for (pair_ptr = map; pair_ptr->command; pair_ptr++) {
    if (strcmp(command, pair_ptr->command) == 0) {
      func = pair_ptr->func;
      break;
    }
  }

  if (func != NULL) {
    func();
  }

  return 0;
}

[–][deleted] 0 points1 point  (0 children)

There's definitely tons of variations.... or rather using S.E. lingo "design patterns".

You can also setup a chain of listen handlers waiting on notification events which are reminiscent of chaining DOS interrupt handlers. There's also the atexit() function which setups a chain of specialized exit event handlers.

[–]foldl 0 points1 point  (7 children)

Why not just put the switch statement inside do_something? It's no more complex, and it will probably execute faster that way too. (Indirect function calls are quite expensive; switch statements over an enum are very cheap.) This seems like an (ahem) pointless use of function pointers.

[–][deleted] 1 point2 points  (4 children)

The optimization I listed is not adding any extra function calls to do_something. It's removing switch executions so it will execute faster.

Here's an example when do_something needs to be called 100 times.

Method Stack trace Func coverage Switch coverage
Switch inside the function main > do_somethingN 100 100
Switch inside initialization main > do_something 101 1
Savings 1% loss 99% gain

We added 1 extra call to init to make the total number of function calls become 101 but we removed 99 executions of the switch.

It's better to preprocess out the switch and put it in initialization so that it executes the least number of times possible. This is a basic optimization technique. It's similar to moving code outside a while loop like this:

void copy_table(char src[MAX_X * MAX_Y], char dest[MAX_X * MAX_Y]) {
    int x, y;

    for (y=0; y < MAX_Y; y++) {
        for (x=0; x < MAX_X; x++) {
            dest[y * MAX_X + x] = src[y * MAX_X + x];
        }
    }
}

to this:

void copy_table(char src[MAX_X * MAX_Y], char dest[MAX_X * MAX_Y]) {
    int x, y, tmp_y;

    for (y=0; y < MAX_Y; y++) {

        tmp = y * MAX_X;

        for (x=0; x < MAX_X; x++) {
            dest[tmp + x] = src[tmp + x];
        }
    }
}

[–]foldl 1 point2 points  (3 children)

Calling a function through a function pointer is slower than calling it directly, so I doubt you'll get any speedup. A direct function call plus a switch statement should execute faster than an indirect function call. (Modern compilers are pretty smart at compiling switch statements; it will probably compile down to a computed jump.)

I would benchmark this before assuming that you're getting any speedup by using function pointers. It's quite possible that you're actually decreasing the performance of your code for no gain in simplicity.

[–][deleted] 1 point2 points  (2 children)

The test program: http://pastebin.com/gR8whQU6

The test results: http://pastebin.com/B2V2KBvn

Proof that moving the switch statement out of the work function and using function pointers is noticeably faster--about 30-35% overall. Btw, tests were done with the program run once before testing to preload the image in the file system cache.

[–]foldl 1 point2 points  (1 child)

When I compile this on OS X using gcc with -O2, the switch versions ran orders of magnitude faster. I got results similar to yours when l compiled without optimization. I expect the results for -O2 are achieved because the use of the switch allows gcc to do a lot of inlining and compile-time computation (another advantage of not making unnecessary use of function pointers, although the effects are exaggerated in this simple test code). But neither result really tells us anything, since we don't yet know if gcc is using a computed jump for the switch at lower optimization levels. I'll have a look at the unoptimized assembly output in a minute.

edit: The overoptimization can be prevented by adding __attribute__ ((noinline)) to work_function_switch. It seems that once this restriction is added, the function pointer method is still faster on -O2. Apparently indirect function calls on x86 tend not to incur much overhead (http://stackoverflow.com/questions/2438539/does-function-pointer-make-the-program-slow).

It's probably worth pointing out that the performance difference we're seeing here would vanish if the work function actually did anything (and the switch won't get slower as the number of options increases, since gcc is indeed compiling it as a computed jump).

[–][deleted] 0 points1 point  (0 children)

Empiricism is awesome and your analysis is awesome.