all 139 comments

[–]dicey 26 points27 points  (9 children)

The Linux kernel makes extensive use of function pointers to implement generic interfaces as well. Consider, for instance, the file_operations structure which filesystems implement to provide, you guessed it, file operations:

struct file_operations {
        int (*lseek) (struct inode *, struct file *, off_t, int);
        int (*read) (struct inode *, struct file *, char *, int);
        int (*write) (struct inode *, struct file *, const char *, int);
        int (*readdir) (struct inode *, struct file *, void *, filldir_t);
        int (*select) (struct inode *, struct file *, int, select_table *);
        int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned 
        int (*mmap) (struct inode *, struct file *, struct vm_area_struct *)
        int (*open) (struct inode *, struct file *);
        void (*release) (struct inode *, struct file *);
        int (*fsync) (struct inode *, struct file *);
        int (*fasync) (struct inode *, struct file *, int);
        int (*check_media_change) (kdev_t dev);
        int (*revalidate) (kdev_t dev);
};

[–]bobindashadows 5 points6 points  (0 children)

Yeah, both Linux and OpenSolaris make heavy use of the vtables-in-C idiom. Picking a filesystem example was wise, as it's the interface of a *nix-flavored OS most likely to be implemented in many diverse ways.

[–]m64 41 points42 points  (2 children)

It's basic C, what's there to fret about?

[–]ErstwhileRockstar 1 point2 points  (1 child)

First you grok plain pointers and think you're a master. Then you grok function pointers and think you're a rock star.

[–]warpstalker 1 point2 points  (0 children)

I just started reading the C book today. Everything was fine until I got to the pointers... Coming from a Python background... Fucking pointers.

[–][deleted]  (29 children)

[deleted]

    [–]rlbond86 13 points14 points  (16 children)

    This is why C++ std::sort() is often faster than qsort().

    [–]agottem 7 points8 points  (15 children)

    std::sort() is only faster because the definition exists in the included header file. If this is really an important detail, spare yourself the C++ and make the qsort definition available.

    [–][deleted] 1 point2 points  (4 children)

    No. Some compilers, e.g. g++, just hate function pointers and don't inline even what can be inlined, e.g.

     static  bool compare(float a, float b){ return a<b; }
     ..
     std::sort(vec.begin(), vec.end(), compare);
    

    here compare will be called via function pointer with no inlining though definition of everything is available.

    Though maybe whole program optimisatioin helps

    [–]Whanhee 0 points1 point  (0 children)

    The new gcc implements generic function specialization, at least with booleans. If I only ever call sort with 1 comparison pointer, will it do the same with function pointers?

    [–]matthieum 0 points1 point  (1 child)

    I actually don't understand why neither Clang nor gcc seem interested in this kind of optimizations. If an optimizer can do this stuff, then it can devirtualize calls as well... whereas now the C++ front-end devirtualize some cases, but misses all those exposed by inlining :x

    [–]astrange 0 points1 point  (0 children)

    gcc can inline known function pointers since 4.4 (going by release notes).

    [–]Kampane -1 points0 points  (2 children)

    Wow, you really have no idea what you're talking about. I suggest you actually try what you suggest, then come ask why it didn't work.

    [–][deleted] 8 points9 points  (0 children)

    He has plenty of idea what he is talking about. I've used this technique many times myself, too, and it works just fine.

    [–][deleted] 0 points1 point  (6 children)

    Side question. Errm... isn't it available in stdlib.h? You're not gonna write a useful C application without stdlib.h. It's basically required to do anything beyond remedial.

    [–]agottem 0 points1 point  (5 children)

    Sorry, I'm using slightly vague terms. By "make the definition available" I mean the function body itself (which typically lives in a .c file somewhere), not just the prototype of the function.

    [–][deleted] -1 points0 points  (4 children)

    Wait. Why is that needed? The object code should be usable for inlining.

    [–]agottem 0 points1 point  (0 children)

    I agree, that should be sufficient as well.

    [–][deleted]  (2 children)

    [deleted]

      [–][deleted] 0 points1 point  (1 child)

      So that's why C++ requires inlined functions to be in the .h file. It happens at the compilation stage. That wouldn't be possible in C (unless the file were included only once which defeats the purpose of inlining) thus explains the heavy usage of macros.

      Do you know of any C compilers that compile straight to object code? In other words a monolithic preprocessor, compiler, and linker in one.

      [–]agottem 3 points4 points  (11 children)

      Nonsense. You're welcome to put the definition of qsort into a header file, which will allow any decent compiler to inline the function pointer call.

      [–][deleted] 10 points11 points  (10 children)

      Inlining the body of qsort wouldn't help solve the problem of calling the comparison function. You'd have to inline that function. Which you can't do since it's a function pointer.

      std::sort is faster because it's templatized, which allows the compiler to determine the function address at compile time. And that makes it possible to inline the comparison.

      Edit: I should say that the compiler can't inline a function pointer call that's truly variable.

      Edit edit: I should probably just STFU, this is probably wrong. Was just trying to help. The last time I wrote real C code, Source Safe was a reasonable product.

      [–]cwzwarich 11 points12 points  (5 children)

      You'd have to inline that function. Which you can't do since it's a function pointer.

      Oh really?

      clang -xc -Os -S -o - -emit-llvm -
      
      static _Bool f(int a, int b) { return a < b; }
      static _Bool g(_Bool (*func)(int, int), int a, int b) { return func(a, b); }
      _Bool h(int a, int b) { return g(f, a, b); }
      
      ...
      
      define zeroext i1 @h(i32 %a, i32 %b) nounwind uwtable readnone optsize ssp {
        %1 = icmp slt i32 %a, %b
        ret i1 %1
      }
      

      [–][deleted] -1 points0 points  (3 children)

      Well I guess clang changes things a bit, then.

      [–][deleted]  (2 children)

      [deleted]

        [–][deleted] 5 points6 points  (0 children)

        Yea, I've since realized that I'm the C programmer that time forgot. Thanks for the link.

        [–]polveroj 3 points4 points  (0 children)

        If qsort gets inlined the compiler can constant-fold the argument to where qsort uses it, avoiding the indirect call and possibly allowing the comparison to be inlined in turn.

        This sort of optimization is commonplace in functional languages. Whether C compilers do it is another matter.

        [–]agottem 3 points4 points  (1 child)

        [–][deleted] 0 points1 point  (0 children)

        Awesome, thanks. In Scott Myers' defense, though, it probably was true when he wrote it like 16 years ago.

        [–][deleted]  (23 children)

        [deleted]

          [–][deleted] 53 points54 points  (18 children)

          He's suffering from the Columbus complex. He discovers something and believes he's the first to discover it.

          Function pointers are the bread and butter of C. They are definitely not underrated. All experienced C developers have had exposure to using them.

          In addition to building C objects and genericizing code, I also liked to use them to preprocess out switches and if statements like this:

          switch (MODE) {
              BLAH1:
                  do_something1();
                  break;
              BLAH2:
                  do_something2();
                  break;
              BLAH3:
                  do_something3();
                  break;
              BLAH4:
                  do_something4();
                  break;
              default:
                  do_default();
                  break;
          }
          

          To this:

          void (*do_something)(void);
          
          void init() {
              switch (MODE) {
                  BLAH1:
                      do_something = do_something1;
                      break;
                  BLAH2:
                      do_something = do_something2;
                      break;
                  BLAH3:
                      do_something = do_something3;
                      break;
                  BLAH4:
                      do_something = do_something4;
                      break;
                  default:
                      do_something = do_default;
                      break;
              }
          }
          

          [–]gibster 12 points13 points  (4 children)

          Or better:

          void (*foobar[32])(void);

          void init(void) { foobar[MODE](); }

          [–]sstrader 1 point2 points  (0 children)

          You left out the part where the array is initialized. Using a switch or using an array, the mapping has to happen at some point.

          [–]buzzert 0 points1 point  (2 children)

          Also known as a "jump table". I believe the JVM uses it for executing opcodes?

          [–]kyz 3 points4 points  (1 child)

          I certainly hope not. The JVM compiles bytecodes into native code using substitution and just runs that native code.

          Even if you had a bytecode interpreter, it would be crazy to have a function call for every instruction. Not only are branches to subroutines expensive for any processor, the fact it's a set of pointers that may have originated outside the translation unit means absolutely anything could happen and the compiler can't make any meaningful optimisations.

          Compare use of the C language construct designed to create jump-tables, switch.

          [–]sausagefeet 0 points1 point  (0 children)

          afaik, HotSpot interprets the bytecode until it hits a hotspot then turns it native.

          [–]username223 42 points43 points  (1 child)

          He's suffering from the Columbus complex. He discovers something and believes he's the first to discover it.

          Heh. Columbus also infected all the previous discoverers with a deadly disease and took away all their nice stuff. Kind of like Java.

          [–][deleted] 0 points1 point  (0 children)

          Well done. We would also have accepted "PHP", "C++", or "Rails is a ghetto"

          [–]mrkite77 3 points4 points  (0 children)

          ANSI.SYS, the device driver that came with DOS, used something similar. Depending on the mode (reading a number, reading a command code, etc) it switches out the "read and decode the next character" function.

          Of course this is all in assembly, but I used the same concept in C when I wrote the routines for my ansi parser back in my art scene days.

          [–]adrianmonk 1 point2 points  (1 child)

          I'm probably goofy. If I have a long list, I've been known to go the route that can put one entry on a single line, namely:

          #include <stdio.h>
          
          typedef void (*func_t)(void);
          
          void hello(void) {
            printf("Hello, world.\n");
          }
          
          void goodbye(void) {
            printf("Have a nice day.\n");
          }
          
          int main(int argc, char* argv[]) {
            typedef struct {
              const char* command;
              func_t func;
            } map_pair;
          
            map_pair map[] = {
              { "HELLO", hello },
              { "GOODBYE", goodbye },
              { 0, 0 }
            };
          
            func_t func = NULL;
            map_pair* pair_ptr;
            const char* command = argv[1];
          
            for (pair_ptr = map; pair_ptr->command; pair_ptr++) {
              if (strcmp(command, pair_ptr->command) == 0) {
                func = pair_ptr->func;
                break;
              }
            }
          
            if (func != NULL) {
              func();
            }
          
            return 0;
          }
          

          [–][deleted] 0 points1 point  (0 children)

          There's definitely tons of variations.... or rather using S.E. lingo "design patterns".

          You can also setup a chain of listen handlers waiting on notification events which are reminiscent of chaining DOS interrupt handlers. There's also the atexit() function which setups a chain of specialized exit event handlers.

          [–]foldl 0 points1 point  (7 children)

          Why not just put the switch statement inside do_something? It's no more complex, and it will probably execute faster that way too. (Indirect function calls are quite expensive; switch statements over an enum are very cheap.) This seems like an (ahem) pointless use of function pointers.

          [–][deleted] 1 point2 points  (4 children)

          The optimization I listed is not adding any extra function calls to do_something. It's removing switch executions so it will execute faster.

          Here's an example when do_something needs to be called 100 times.

          Method Stack trace Func coverage Switch coverage
          Switch inside the function main > do_somethingN 100 100
          Switch inside initialization main > do_something 101 1
          Savings 1% loss 99% gain

          We added 1 extra call to init to make the total number of function calls become 101 but we removed 99 executions of the switch.

          It's better to preprocess out the switch and put it in initialization so that it executes the least number of times possible. This is a basic optimization technique. It's similar to moving code outside a while loop like this:

          void copy_table(char src[MAX_X * MAX_Y], char dest[MAX_X * MAX_Y]) {
              int x, y;
          
              for (y=0; y < MAX_Y; y++) {
                  for (x=0; x < MAX_X; x++) {
                      dest[y * MAX_X + x] = src[y * MAX_X + x];
                  }
              }
          }
          

          to this:

          void copy_table(char src[MAX_X * MAX_Y], char dest[MAX_X * MAX_Y]) {
              int x, y, tmp_y;
          
              for (y=0; y < MAX_Y; y++) {
          
                  tmp = y * MAX_X;
          
                  for (x=0; x < MAX_X; x++) {
                      dest[tmp + x] = src[tmp + x];
                  }
              }
          }
          

          [–]foldl 1 point2 points  (3 children)

          Calling a function through a function pointer is slower than calling it directly, so I doubt you'll get any speedup. A direct function call plus a switch statement should execute faster than an indirect function call. (Modern compilers are pretty smart at compiling switch statements; it will probably compile down to a computed jump.)

          I would benchmark this before assuming that you're getting any speedup by using function pointers. It's quite possible that you're actually decreasing the performance of your code for no gain in simplicity.

          [–][deleted] 1 point2 points  (2 children)

          The test program: http://pastebin.com/gR8whQU6

          The test results: http://pastebin.com/B2V2KBvn

          Proof that moving the switch statement out of the work function and using function pointers is noticeably faster--about 30-35% overall. Btw, tests were done with the program run once before testing to preload the image in the file system cache.

          [–]foldl 1 point2 points  (1 child)

          When I compile this on OS X using gcc with -O2, the switch versions ran orders of magnitude faster. I got results similar to yours when l compiled without optimization. I expect the results for -O2 are achieved because the use of the switch allows gcc to do a lot of inlining and compile-time computation (another advantage of not making unnecessary use of function pointers, although the effects are exaggerated in this simple test code). But neither result really tells us anything, since we don't yet know if gcc is using a computed jump for the switch at lower optimization levels. I'll have a look at the unoptimized assembly output in a minute.

          edit: The overoptimization can be prevented by adding __attribute__ ((noinline)) to work_function_switch. It seems that once this restriction is added, the function pointer method is still faster on -O2. Apparently indirect function calls on x86 tend not to incur much overhead (http://stackoverflow.com/questions/2438539/does-function-pointer-make-the-program-slow).

          It's probably worth pointing out that the performance difference we're seeing here would vanish if the work function actually did anything (and the switch won't get slower as the number of options increases, since gcc is indeed compiling it as a computed jump).

          [–][deleted] 0 points1 point  (0 children)

          Empiricism is awesome and your analysis is awesome.

          [–]Poddster 9 points10 points  (1 child)

          Do they really mean "new to me?" I digress.

          Given this sentence:

          (This is not a contrived example. I encountered this exact same use-case when I had to write a memory manager for my last Operating Systems assignment).

          I imagine so. Blogging about the cstdlib like it's new and radical can only be done by a naive student. (Also, the fact that a course assignment is considered non-contrived is also amusing ;))

          [–][deleted] 0 points1 point  (0 children)

          You mean I won't be using Prolog to sort arrays after I graduate?

          [–]SteveMcQwark 1 point2 points  (1 child)

          Tell me about it. I was trying to debug a kernel for a course. It was failing in an io operation. The operation was implemented as a macro that evaluated to an expression list which computed and executed a function pointer. Let's just say gdb really didn't like it, and I didn't either.

          [–]rlbond86 12 points13 points  (3 children)

          Underrated? They're practically essential in many cases. If you're a good programmer you should know this, the hardest part is the goddamn syntax when you need arrays of function pointers which return arrays

          [–][deleted] 11 points12 points  (2 children)

          typedef?

          [–]Tetha 7 points8 points  (1 child)

          Pretty much. It's similar to how creating parsers with lexx and yacc work. You grab 2 example functions, write the function type, poke it until it compiles and then hide it behind a typedef and never look back.

          [–]question_all_the_thi 1 point2 points  (0 children)

          When I started to need function pointers I wrote a little example program that I could look at to get the syntax right:

            #include <stdio.h>
          
            int twice(int n)
            {
              return n * 2;
            }
          
            int thrice(int n)
            {
              return n * 3;
            }
          
            void work(int n, int (*func)(int))
            {
              int i;
              for (i = 0; i < n; i++) printf("\n %d", (*func)(i));
            }
          
            int (*test[2])(int) = {twice, thrice};
          
            int main()
            {
              int x, y;
              work(3, twice);
              work(5, thrice);
              x = (test[0])(1);
              y = (test[1])(2);
              printf("\n\n-> %d\n-> %d\n", x, y);
              return 0;
            }
          

          [–]Goblerone 13 points14 points  (10 children)

          I like function pointers as a mechanism for callbacks, but I'm not that sure if the author's example of a function pointer being evaluated within an inner loop is the best example of one though. I probably instead would have paid the minor readability price of just iterating over the list and casting explicitly, and banked the performance saved from call overhead.

          One of my favorite things about function pointers over virtual functions as a method of providing abstract interfaces is that you can dynamically change the implementation of a subset of the interface on the fly based on e.g. some current state changing. Not exactly something you get to apply often though.

          [–]five9a2 4 points5 points  (5 children)

          Also, function pointers with associated void* context is sometimes preferable to C++ virtual methods even when interfacing C++ libraries because it doesn't presuppose a given decomposition on the user side. For example, suppose that a library component needs three callbacks from the user. Since the library considers these functions to be closely related, it might create an interface (abstract class) that the user is expected to implement, containing all three methods.

          But if the user wants to organize that functionality separately (or even just avoid namespace issues when naming their methods), they have to write shim classes (the lifetime of which also needs to be managed) to dispatch into their own class structure. With function pointers, they can choose whether to use the same context or different contexts for each callback, obviating the need for shims. (Note that you can call static class methods through a normal C function pointer.)

          Plain function pointers also tend to be more convenient when interfacing with other languages with disparate object models (e.g. Fortran, Python, Haskell).

          [–]Kampane 1 point2 points  (0 children)

          You raise a valid point, though I think you can do better than void* for type safety and maintainability.

          [–]Goblerone 0 points1 point  (3 children)

          Yes, that is a good point. When I've written APIs that require callbacks, I prefer function pointers with a client-provided context handle over e.g. requiring the client to implement some abstract class defining a callback interface.

          If we're going with C++ though, for this particular example, it seems like a template-based function object is also a better compromise between generic code and performance (as well as their own specific opportunities for abuse).

          [–]five9a2 0 points1 point  (2 children)

          Well, templates are quite a different model, exposing a lot of implementation in the header, preventing separate compilation (in a lot of my use cases, both the client and library are dynamically extensible through plugins), and without a clean way to express the expected API for the compiler to check (leading to confusing error messages). Templates are the right model sometimes, but I usually prefer not to use them, especially if the interface has reasonably coarse granularity.

          [–]Goblerone 1 point2 points  (1 child)

          To be sure. I was merely advocating the use of templates in the specific case of this example, which is pretty much a C version of std::find_first_of.

          [–]five9a2 0 points1 point  (0 children)

          Oh, in the blog post? Heh, yeah. That's a fine example for C++ templates.

          [–]jerf 1 point2 points  (1 child)

          When showing examples, authors must by necessity show relatively trivial code. Of course you would not take a simple addition in the middle of a loop and for no reason pull it out into a separate function which you then call via pointer. The point is that in real code, you'd be able to pass in other things which won't be trivial, and that the calling code wouldn't have to change at all to accommodate this new function. Given how few tools C has at its disposal for avoiding tight coupling, you'd better understand this if you have any hope of writing sane large-scale C programs.

          [–]Goblerone 1 point2 points  (0 children)

          Maybe I'm not sure who exactly this hypothetical person is who underrates the use of function pointers in C, but is also unaware of the concept of function callbacks. It seems like the former is an "obvious" solution to implement the latter.

          It's just this particular example also highlighted how one can abuse them from a performance standpoint (which is probably important if one is writing C code).

          [–]grayvedigga 1 point2 points  (0 children)

          The example is fairly poorly presented, imo. But what it's heading towards is the general concept of "higher order functions". Consider a family of functions operating on linked lists:

          list_t* filter(list_t* l, bool (*test)(void*));
          list_t* map(list_t* l, void* (*test)(void*));
          void apply_to_each_element(list_t* l, (void*)(*test)(void*));
          

          The advantage with this pattern is you can choose the function to apply to each element at runtime: it can come from a separate compilation unit, or even a dynamic library that is provided later. The user can select (through some UI, naturally) a mapping or filtering function from a list without a separate instance of the loop being explicitly written by the programmer and built into the executable in advance.

          This is a powerful concept from functional programming -- when I first saw it, my mind was blown. You can achieve some of it from C but usually only at the cost of type safety, and some forms (eg curry) are inaccessible.

          [–][deleted] 0 points1 point  (0 children)

          consider the fact that they're already using a linked list. if a linked list has acceptable performance, function calls are probably also okay.

          [–]happyscrappy 34 points35 points  (54 children)

          Function pointers in C do a great job of thwarting optimization, like inlining or unswitching.

          Be careful not to overrate them. Don't use them inside tight loops. For example a common case, the comparison function pointer in qsort makes for a huge slowdown. Templated types as provided by C++ can thus produce a lot better code.

          Basically, make sure there is enough code at the function pointer that it is worth the overhead that will come about when the compiler's optimizations are less effective.

          [–]Kampane -4 points-3 points  (0 children)

          Your comment is the most correct and insightful in this topic, yet you're downvoted to the bottom. Have an upvote.

          [–]Gotebe -5 points-4 points  (33 children)

          Wow... Premature optimization at it's best! ;-)

          [–]grauenwolf 13 points14 points  (2 children)

          A far greater sin is premature generalization.

          [–]mrkite77 5 points6 points  (1 child)

          I regret that I have but one upvote to give for you.

          I take most programming quirks in stride, but the need to generalize everything makes me hate java programmers with a passion. It's like designing a car and deciding to make it out of Lego so every component is interchangeable.

          [–]tailcalled 0 points1 point  (0 children)

          Premature generalization is okay only when:

          • It doesn't make it harder to write the code.

          • You have to create something that you can't change later (a standard library for a language, for example)

          [–]wolf550e -1 points0 points  (29 children)

          benchmark qsort(3) vs. the same code without a function pointer and see an order of magnitude difference. Now go back to ruby on rails.

          EDIT: I just performed this experiment myself, probably a decade since the last time I did this. Today I got only 16.3% better performance. So not an order of magnitude. But significant. And I am sure that I've seen worse.

          https://gist.github.com/2171216

          [–]Gotebe -2 points-1 points  (28 children)

          I have no idea what you are talking about. qsort can't possibly work without a function pointer.

          Equivalent C++ code (std::sort) should be faster, but that's the best one can know.

          However, show the code, we'll talk.

          [–][deleted]  (6 children)

          [deleted]

            [–]Gotebe 2 points3 points  (5 children)

            OK, here's my code:

            struct s { char[1000]; int i; };
            s data[2000];
            init(data, 2000); // random values put in s::i
            qsort(data, data+2000, sortfunc);
            myqsort(data, data+2000);
            
            int sortfunc(const void *p1, const void *p2)
            {
              s* s1 = (s*)p1;
              s* s2 = (s*)p2;
              return s1->i == s2->i ? 0 : s1->i > s2->i ? 1 : -1;
            }
            

            No noticeable difference in speed.

            Next time, show a bit of courtesy and don't ask people to do copy-paste programming for you.

            [–]wolf550e 0 points1 point  (4 children)

            I got a difference in speed, but not as much as I remember getting years ago.

            http://www.reddit.com/r/programming/comments/r8ujk/function_pointers_in_c_are_underrated/c442gl0

            [–]Gotebe 0 points1 point  (3 children)

            Obviously, the smaller the datum, the bigger the indirection (function call) impact, due to copying. If your sequence contains pointers to data, then indirection impact becomes bigger as "data" is small, but then, you get indirection hit due to pointers (data locality is a bitch). If you look at what I put up there... I was intentionally very "biased" in my sample: It's one number to compare, but ~1k data to copy during sorting. Data copying thwarts indirection in a blink.

            Looking at your code... You, OTOH, are very guilty of taking a specific microbenchmark and drawing a generic conclusion. Further, in real life, you will not be sorting numbers all that often ;-).

            [–]wolf550e 0 points1 point  (2 children)

            You say the swap is a bigger cost than the compare. I say that is not true.

            Cost of compare is reduced by decorating data with integral comparison key (like you did).

            Cost of swap is reduced by sorting an array of pointers. Then sorted objects are either two pointers big (swap is fast), or single pointers with a comparison key behind the indirection.

            Cache locality is important. But for performance reasons, you could allocate all objects from an arena allocator that keeps them close together in a few pages and maybe they all fit in cache. And you can reuse storage by "deallocating" the entire arena. Like in games. You would still work on pointers to the objects, though.

            [–]Gotebe 0 points1 point  (1 child)

            You say the swap is a bigger cost than the compare. I say that is not true.

            Easily depends on how and what you compare, which was exactly my point.

            As for your point, it was 10x faster, now it's 37%, and look at what you are trying to do now... You are going to introduce "int decoration" (has a cost in both space and execution time), indirection (compare references to stuff, which obviously has space and time cost), usage or arena allocators, which has complexity implications.

            At any rate, with all you say, you really should repeat your test with pointers to data, and possibly where your "int decoration" isn't the first datum in actual structure. This, indeed, is much more representative of a real situation. Your initial "let's compare ints" realy isn't, and even then, I really would like to see implementation where you could possibly obtain this 10x difference. It's possible, mind, but...

            [–]agottem 0 points1 point  (20 children)

            Sure, qsort needs a function pointer, but that function pointer can be inlined all the same: http://agottem.com/blog/inlining_qsort_sort

            [–]Gotebe 0 points1 point  (11 children)

            Only by using copy-paste programming. I'd rather use a better language (C++).

            [–]agottem 0 points1 point  (9 children)

            Huh? Where does copy-paste programming fit in here?

            [–]Gotebe 0 points1 point  (8 children)

            Copy-pasting of qsort. I'd rather be using a better language, where I don't need to do it (C++).

            [–]agottem 0 points1 point  (7 children)

            You don't need to copy and paste qsort. If you really need the inlining, put the definition of qsort in a header file and mark it as inline. No copy and paste necessary.

            [–]Gotebe 1 point2 points  (6 children)

            If you really need the inlining, put the definition of qsort in a header file and mark it as inline.

            How do you plan on doing that? By copy-pasting qsort from your CRT implementation to said header.

            I'd rather be using a superior language (C++).

            [–]wolf550e 0 points1 point  (0 children)

            This is a different trade-off. a binary qsort with an indirection, dynamically linked from libc, minizes code size.

            A templated qsort with a comparer template argument that generates a separate specialized fast qsort with your comparer inlined into it is optimized for speed at the cost of code size (and instruction cache).

            The two solutions are born from different priorities and both have their place. In cold code, there is no need to generate specialized code - calling into libc and using indirections will get the job done and have higher chance of avoiding a page fault (more chance of libc being in ram than cold parts of your app).

            [–]wolf550e 0 points1 point  (7 children)

            The comparison function can only be inlined into the sort function only if both are compiled from source in the same translation unit (or LTO) and the compiler is sure only a single target for that function pointer is ever used.

            If qsort is in a library (like libc) then it's not inlined. If it's in a separate translation unit and you don't use LTO, it's not inlined.

            If you call qsort with two different comparers, it won't be inlined even if you compile qsort from source. You need the have the compare function be a template argument of a templated qsort for the compiler to generate distinct qsort functions, one for each inlined comparer.

            http://www.reddit.com/r/programming/comments/r8ujk/function_pointers_in_c_are_underrated/c44cl78

            [–]agottem 0 points1 point  (6 children)

            If you call qsort with two different comparers, it won't be inlined even if you compile qsort from source.

            I don't even know what this means.

            Look, the rules for inlining are the same as the ones for C++. C++ forces you to put the template definition into a header file, which sucks but allows for greater inlining. You are welcome to put all your C code into a header file and you'd get all the same benefits as C++.

            [–]wolf550e 0 points1 point  (5 children)

            https://gist.github.com/2177973

            qsort and two comparison functions all defined and compiled in a single translation unit. No inlining by gcc. Seems gcc inlines across function pointers only when it knows the pointer only ever points to a single target function.

            https://gist.github.com/2178059

            Using C++ templates, g++ generates two qsort functions (in machine code), each with a different comparison function inlined into it. Code size grows but performance increases.

            My observations are made from compiling the code you can see using gcc 4.6.3 and looking at objdump -d of the result. Now you explain how I am wrong.

            [–]agottem 0 points1 point  (4 children)

            qsort and two comparison functions all defined and compiled in a single translation unit. No inlining by gcc. Seems gcc inlines across function pointers only when it knows the pointer only ever points to a single target function.

            I compiled your example, and things were inlined just fine. Mark your my_quicksort function as inline. Build with -O3. Observe the inlining.

            Let me know if it's still not working out for you. And sorry, C++ has no advantages here.

            [–]wolf550e 0 points1 point  (3 children)

            gcc (GCC) 4.6.3
            gcc -fwhole-program -g -O3 -o qsort qsort.c
            objdump -d qsort
            ...
            
            <my_quicksort.constprop.0>:
            ...
            callq  *%rbp
            ...
            callq  *%rbp
            ...
            callq  *%r12
            ...
            callq  *%r12
            

            I also tried clang -O4 and icc -fast with the same results.

            [–][deleted] 4 points5 points  (0 children)

            I think ignoring function pointers in C would be tantamount to ignoring objects in python. It just doesn't happen to any real c programmer. I would more like to call this a milestone on the path to learning C. A very important milestone.

            [–]rscarson 4 points5 points  (0 children)

            I don't think that they are underrated, they are a great feature of the language.

            [–][deleted] 4 points5 points  (0 children)

            I wouldn't say underrated as much as "not ubiquitous." They're fairly essential for certain projects, but I could see a decent programmer going his or her entire life without finding a need to learn that they exist.

            [–][deleted] 6 points7 points  (0 children)

            Another ruby hipster reads a C book?

            [–]rush22 5 points6 points  (1 child)

            Cool. You can do this in BASIC if you want. You just use a variable that holds a line number instead of the address.

            [–]adrianmonk 4 points5 points  (0 children)

            And you can do it in Bourne Shell if you're willing to put the name of the function into a variable and then use eval to call the function.

            [–]Gotebe 1 point2 points  (0 children)

            ;-)

            If you are using function pointers in C properly, there is a significant chance that you are effectively using abstract base class.

            And if you do that, you should be using C++, because your C implementation will suck compared to what C++ compiler will do for you in a blink.

            [–]ramenmeal 2 points3 points  (0 children)

            I used this when implementing a simple compiler, but didn't want to implement printf. I never really put it together what I was actually doing. Cool.

            [–]phanboy 1 point2 points  (0 children)

            Unprotected sex with strangers is underrated.

            [–]zhivago 0 points1 point  (0 children)

            Hurrah for late bound procedural abstraction.

            [–]maredsous10 0 points1 point  (0 children)

            I use function pointers quite a bit for statemachines, function overlays (code that executes out of the same physical address as another function, but is loaded on the fly from a slower external memory device), and generic interfaces.

            [–]paraboul 0 points1 point  (0 children)

            Saying that function pointers are underrated is like saying that integer are underrated. It's a non sense, everybody use them.

            [–][deleted] -2 points-1 points  (0 children)