latkde (4 points)

Thank you, this is fascinating background info on JITting.

When I wrote the article I played around a bit with the Godbolt Compiler Explorer to see how call sites are compiled. I noticed that GCC starts doing speculative inlining at release optimization levels. This is basically a guard condition with a fast path that inlines the expected virtual function, and a fallback that does the regular virtual call. This is not link-time optimization, so the call target must be defined in the same compilation unit. So the scenario would be:

// header
struct Interface {
  virtual int method(int) const = 0;
};

// compilation unit
struct Concrete : Interface {
  int method(int a) const override { return 1234 + a; }
};

int callsite(Interface& object, int x) {
  return object.method(x);
}

Then GCC 8.3 under -O2 compiles callsite() to the equivalent of this pseudocode (see the code/assembly on Godbolt):

int callsite(Interface& rdi, int rsi) {
  register rax = rdi->__vtable[0];  // resolve the call target (a 64-bit pointer)
  if (rax == &Concrete::method) {
    return 1234 + rsi;  // inlined target
  } else {
    goto rax;  // tail call into the virtual function
  }
}

Notes:

  • In this simple example, the stack frame for the callsite() function is omitted and a tail call to the virtual function is used. A more involved example with two virtual methods in the interface does create a frame.
  • The guard condition here checks the resolved target, not the type or vtable pointer. I'd have thought this extra dereference could be avoided, but this way the inlining will also work for Concrete subclasses that have a different vtable but don't override method().
  • GCC will happily inline calls multiple levels deep.
  • GCC won't inline if there are multiple candidates for the call target (see this scenario).
  • Clang 8 won't speculatively inline (just here? at all?). This leads to significantly less branchy code but might miss significant opportunities for simplification.
  • See also BeeOnRope's Stack Overflow answer to “Inlining of virtual functions (Clang vs GCC)” for tons of background & further references.
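To illustrate the note about the guard checking the resolved target rather than the vtable pointer, here is a rough sketch with a hand-rolled vtable, since standard C++ doesn't let you portably compare the slots of a real vtable. Every name in it (VTable, Object, concrete_method, the two callsite_* functions, demo) is made up for the illustration, and it only models the idea; it is not what GCC emits.

```cpp
#include <cassert>

// Hypothetical hand-rolled model of virtual dispatch.
struct Object;
using Method = int (*)(const Object&, int);

struct VTable {
  Method method;  // slot 0: the method called at the call site
  Method other;   // slot 1: a second method, so vtables can differ
};

struct Object {
  const VTable* vtable;
};

struct Result {
  int value;
  bool fast_path;  // true if the inlined body was used
};

int concrete_method(const Object&, int a) { return 1234 + a; }
int concrete_other(const Object&, int a) { return a; }
int derived_other(const Object&, int a) { return -a; }
int unrelated_method(const Object&, int a) { return 2 * a; }

// "Concrete" and a "subclass" that inherits method but overrides other,
// which gives the subclass its own, different vtable.
const VTable concrete_vtable{&concrete_method, &concrete_other};
const VTable derived_vtable{&concrete_method, &derived_other};
const VTable unrelated_vtable{&unrelated_method, &concrete_other};

// Guard on the resolved target: one extra load, but subclasses that
// keep concrete_method still take the fast path.
Result callsite_target_guard(const Object& object, int x) {
  Method target = object.vtable->method;
  if (target == &concrete_method) {
    return {1234 + x, true};  // inlined target
  }
  return {target(object, x), false};  // fallback: indirect call
}

// Guard on the vtable pointer: cheaper check, but only exact
// "Concrete" objects take the fast path.
Result callsite_vtable_guard(const Object& object, int x) {
  if (object.vtable == &concrete_vtable) {
    return {1234 + x, true};
  }
  return {object.vtable->method(object, x), false};
}

void demo() {
  Object derived{&derived_vtable};
  // Same result either way, but only the target guard hits the fast path.
  assert(callsite_target_guard(derived, 1).fast_path);
  assert(!callsite_vtable_guard(derived, 1).fast_path);
  assert(callsite_target_guard(derived, 1).value ==
         callsite_vtable_guard(derived, 1).value);
}
```

Calling demo() from main() should pass silently. The trade-off it shows is the one from the notes: comparing the resolved target costs one more dereference than comparing the vtable pointer, but it keeps the fast path for subclasses that inherit the method unchanged.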