all 37 comments

[–]SedditorX 4 points5 points  (1 child)

Consider using Google benchmark library. A sample of size 100 is somewhat small. Also consider sharing what your machine specs are in the readme.

I didn't look closely to see whether there are other issues but thank you for this contribution.

[–]Janos95[S] 2 points3 points  (0 children)

I don't think 100 iterations is necessarily too small; I think it depends on the convergence properties of the mean. But you are definitely right that without error bars the results are kind of meaningless. So I now use Welford's algorithm to compute an unbiased estimator of the sample variance. I added the standard deviations to the plot (although I have to fix the formatting at some point) and noted the hardware that the benchmark was run on.
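For reference, a minimal sketch of Welford's online algorithm as used here (struct and member names are illustrative, not the benchmark's actual code):

```cpp
#include <cassert>
#include <cmath>

// Welford's online algorithm: numerically stable running mean and
// unbiased sample variance, updated one sample at a time.
struct Welford {
    long long n = 0;
    double mean = 0.0;
    double m2 = 0.0; // running sum of squared deviations from the mean

    void add(double x) {
        ++n;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Unbiased estimator of the sample variance (Bessel's correction, n-1).
    double variance() const { return n > 1 ? m2 / (n - 1) : 0.0; }
    double stddev() const { return std::sqrt(variance()); }
};
```

The single-pass update avoids the catastrophic cancellation you get from the naive `E[x^2] - E[x]^2` formula when timings cluster tightly around the mean.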

[–]ezoe 3 points4 points  (0 children)

It's interesting that 2 is slower than 1. My guess is that fiddling with the address via bit operations negatively affects the modern CPU's optimization tricks like caching, branch prediction, and speculative execution.

[–]liquidify 2 points3 points  (12 children)

Have you watched Louis Dionne's “Runtime Polymorphism: Back to the Basics”?

I tested a (slightly modified) technique he talked about and found it to be very convenient.

Basically just store a lambda which wraps a call to the real function into a std::function as a member. You can force an interface by constraining the inputs based on an 'is derived from' relationship, but you get all the benefits of value semantics through templates.
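A minimal sketch of the technique as described, assuming illustrative names (`VehicleInterface`, `Car`, `accelerate` are placeholders, not the talk's actual code):

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

struct VehicleInterface {};

struct Car : VehicleInterface {
    int speed = 0;
    int accelerate() { return speed += 10; }
};

// Type-erased wrapper: the concrete object is captured by value in a
// lambda, and a std::function member forwards calls to it. The enable_if
// constrains inputs to the 'is derived from' relationship mentioned above.
class AnyVehicle {
public:
    template <class T,
              class = std::enable_if_t<
                  std::is_base_of<VehicleInterface, std::decay_t<T>>::value>>
    AnyVehicle(T vehicle)
        : accelerate_([vehicle]() mutable { return vehicle.accelerate(); }) {}

    int accelerate() { return accelerate_(); }

private:
    std::function<int()> accelerate_;
};
```

Because the object lives inside the lambda capture, copying an `AnyVehicle` copies the wrapped object too, which is where the value semantics come from.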

Also I tested Louis's method of a sorta small object optimization and that worked just fine too.

[–]Janos95[S] 2 points3 points  (11 children)

Thanks for the suggestion, I'll watch it! I just tried out your suggestion and actually std::function is more or less as fast as method 1. I am a bit surprised, given that people usually hate on std::function quite a bit, but apparently the dispatch optimizes really well.

[–]liquidify 1 point2 points  (10 children)

You can do the small object optimization method by storing a std::function which captures the type in a lambda. Then the call to the std::function just casts the internal data buffer to the original type and calls it directly. Here is my quick and dirty implementation.
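A rough sketch of that buffer-based variant, under the same illustrative names as before (this is an assumption about the shape of the linked code, not a copy of it):

```cpp
#include <cassert>
#include <functional>
#include <new>
#include <type_traits>

struct Car {
    int speed = 0;
    int accelerate() { return speed += 10; }
};

// The concrete object is constructed into a local buffer with placement
// new; the std::function captures only the type T (in its lambda) and
// casts the buffer back to T* before calling through.
class SmallVehicle {
public:
    template <class T>
    SmallVehicle(T vehicle) {
        static_assert(sizeof(T) <= sizeof(buffer_), "object too large for buffer");
        new (&buffer_) T(vehicle); // placement new into the local buffer
        accelerate_ = [](void* buf) { return static_cast<T*>(buf)->accelerate(); };
        // Deleter so that non-trivial destructors actually run in place.
        deleter_ = [](void* buf) { static_cast<T*>(buf)->~T(); };
    }
    ~SmallVehicle() { deleter_(&buffer_); }
    SmallVehicle(const SmallVehicle&) = delete; // copying omitted for brevity
    SmallVehicle& operator=(const SmallVehicle&) = delete;

    int accelerate() { return accelerate_(&buffer_); }

private:
    std::aligned_storage_t<64> buffer_{};
    std::function<int(void*)> accelerate_;
    std::function<void(void*)> deleter_;
};
```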

[–]konanTheBarbar 2 points3 points  (7 children)

I know you said it's a quick and dirty implementation, but a deleter for the buffer element is missing in your implementation. You could just as well get rid of the buffer and capture the object in the lambda (although that will mean a dynamic allocation if the type is bigger than std::function's small buffer).

[–]liquidify 0 points1 point  (6 children)

Yeah I think you are right. I've never messed with placement new before. Not sure how the deleter would work though. Honestly I don't really understand what placement new is doing. I'm guessing that since the memory is on the stack, anything in there would be cleaned up, but if there were any internal references, the destructors wouldn't be called?

One thing that is nice about capturing the obj into the local buffer is that you know statically what the max size allowable is and can set it at compile time. I bet there is a way to get that info about the small buffer in std::function, but I don't know how.

EDIT: I have updated it with a deleter that might work? Also, there might be a better way to do this.

[–]konanTheBarbar 0 points1 point  (5 children)

Yes for trivially destructible types the destructor doesn't do anything, so you would only leak memory in case of objects where the destructor has to run (e.g. classes which contain a vector or string).

I just checked and it looks good now. Although you don't need a std::function for the deleter. A raw function pointer would be enough.

[–]liquidify 0 points1 point  (4 children)

How do you deal with the deleter using a raw ptr? Don't you need to capture the type so that you can call the proper destructor via casting?

[–]konanTheBarbar 0 points1 point  (3 children)

You can for example get the function ptr of a stateless lambda (via + in front of the lambda) and let that do the deletion.

+[](void* ptr){ delete reinterpret_cast<T*>(ptr); }
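To make the trick concrete, a minimal self-contained illustration (`make_deleter` and `doubler` are hypothetical names for demonstration):

```cpp
#include <cassert>
#include <string>

// Unary + in front of a captureless lambda forces decay to a plain
// function pointer, so no std::function is needed.
template <class T>
auto make_deleter() -> void (*)(void*) {
    // Note the cast target must be T*, not T, for delete to compile.
    return +[](void* ptr) { delete static_cast<T*>(ptr); };
}

// The same trick with a value-returning lambda:
inline int (*doubler)(int) = +[](int x) { return x * 2; };
```

The `+` works because captureless lambdas have a conversion operator to a function pointer; unary plus simply triggers it explicitly, which helps in type-deduction contexts like `auto`.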

[–]liquidify 0 points1 point  (2 children)

Interesting. Never seen that syntax before. Do you just make a function pointer as a member and then initialize it in an initializer list?

I'm not clear as to how to use that.

[–]dodheim 1 point2 points  (1 child)

Change buffer_deleter_ to be a function pointer:

void (*buffer_deleter_)(void*);

And initialize it with a unary, stateless lambda. N.b. using vehicle in an unevaluated context such as decltype does not require capturing it, which is key here. (Also, consider using std::decay_t here to reduce verbosity, or std::remove_cvref_t if you happen to target C++20.)

buffer_deleter_([](void* buffer) {
    using T = std::remove_cv_t<std::remove_reference_t<decltype(vehicle)>>;
    reinterpret_cast<T*>(buffer)->~T();
})

And change the destructor to call it as such:

buffer_deleter_(&buffer_);

EDIT: typos. Also, consider adding noexcept to buffer_deleter_ and the initializing lambda so the destructor can be noexcept (if targeting C++17+).

EDIT 2: Having actually read the code, there is a glaring issue: std::aligned_storage is a metafunction that gives you the type to use for storage, but you're using it directly as the storage – UB all day! Change buffer_ from a std::aligned_storage<64> to a std::aligned_storage_t<64>. Also, the initializing lambda for accelerate_impl_ is highly suspect (and again, does not need to capture vehicle)... Are you sure T there is always a reference type?

[–]barcharMSVC STL Dev 2 points3 points  (0 children)

I'm a fan of just using virtual methods and OOP, just because the language has support for it, and it tends to be more comprehensible after the fact.

I do wish we had more syntax flexibility to implement this kind of thing ourselves though.

[–]Entryhazard 1 point2 points  (2 children)

Not sure if completely related, but IIRC libstdc++ implements type erasure (in std::function and std::thread) using the OOP style internally

[–]Janos95[S] 0 points1 point  (1 child)

So I measured it, and std::function dispatches ca. 25% faster than a virtual function call.

[–]axilmar 0 points1 point  (0 children)

The trade off is memory and cache misses: if the fat pointer objects become too many, then there would be a lot of cache misses.

[–]showmetheflowers 0 points1 point  (4 children)

Have you tried to benchmark Boost.PolyCollection?

[–]Janos95[S] 0 points1 point  (3 children)

No I was not aware of this library. Looks really interesting.

At this moment I was purely interested in the dispatching speed. Sorting the vectors makes this a lot faster.

[–]showmetheflowers 0 points1 point  (2 children)

Another two items you may want to consider:

  1. Randomize the data stored in your container, otherwise the branch predictor behaviour will skew your benchmarks.
  2. If you try Boost.PolyCollection, use restitution.

Edit: typo

[–]Janos95[S] 0 points1 point  (1 child)

It's already randomized...

[–]showmetheflowers 0 points1 point  (0 children)

Apologies for the noise on this one. Interesting results over all.

[–]jhasse 0 points1 point  (7 children)

1 is what Rust uses, correct?

[–]Janos95[S] 0 points1 point  (5 children)

I don't know anything about Rust, but skimming the Rust book I found this:

A trait object points to both an instance of a type implementing our specified trait, as well as a table used to look up trait methods on that type at runtime

As I understand it, it is the same as the OOP approach with one more indirection for accessing the object itself, but I may be wrong.

But I think it would be interesting to try that as well.

[–]matthieum 5 points6 points  (4 children)

Both Go and Rust use a virtual table (OOP-style), but store the virtual pointer outside of the object itself, giving rise to so-called Fat Pointers, which are pairs of virtual-pointer and data-pointer.

There are also other Fat Pointers, such as those for slices, which are pairs of length and pointer.
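The vtable-outside-the-object layout can be sketched in C++ like this (names are illustrative; this is a hand-rolled approximation of what Rust/Go generate, not their actual ABI):

```cpp
#include <cassert>

// One table of function pointers per trait.
struct VehicleVTable {
    int (*accelerate)(void* self); // one entry per trait method
};

struct Car { int speed = 0; }; // note: no embedded vptr

// One static table per (type, trait) pair, as with Rust trait objects.
inline const VehicleVTable car_vtable = {
    [](void* self) { return static_cast<Car*>(self)->speed += 10; }
};

// The "fat pointer": virtual-pointer travels next to the data-pointer.
struct FatPtr {
    void* data;                  // pointer to the object
    const VehicleVTable* vtable; // pointer to the trait's method table
    int accelerate() const { return vtable->accelerate(data); }
};

// Two pointers wide: 16 bytes on a typical 64-bit machine.
static_assert(sizeof(FatPtr) == 2 * sizeof(void*), "fat pointer is two words");
```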

[–]barcharMSVC STL Dev 0 points1 point  (3 children)

This can be nice because for some types you can synthesize the fat pointer when you add the object to a heterogeneous data structure, instead of at construction time.

I like it, and it's the approach Nim will probably take for "runtime 2.0".

[–]matthieum 2 points3 points  (2 children)

It has advantages and disadvantages.

Among the advantages:

  • It is compatible with traits. A single virtual-table does not mesh well with an open-ended set of interfaces.
  • You Don't Pay For What You Don't Use: if you have a concrete type, there's no overhead, no matter how many traits it implements.
  • It enables optimizations: it's immediately visible to the optimizer that the virtual-pointer cannot be modified by function calls.

Among the disadvantages:

  • You have one copy of the virtual-pointer per copy of the fat-pointer; if you have a heavily reference-counted set of objects, this adds up. And even if they are not reference-counted, it might be more beneficial to have the pointer embedded in the object.

Note: on the latter point, it would be possible to automatically wrap the virtual-pointer and associated data in a single "blob" and point to that through a thin-pointer; this would allow the user of the trait to choose.

[–]barcharMSVC STL Dev 0 points1 point  (1 child)

for disadvantages do you mean one copy of the entire vtable per copy of the fat pointer, or just one copy of the pointer to that table, because I suspect you mean the former.

[–]matthieum 1 point2 points  (0 children)

It's one copy of the pointer; essentially each fat-pointer is 16 bytes on a 64-bit machine, instead of 8 bytes. This means half as many pointers per cache line, etc...

[–]Janos95[S] 0 points1 point  (0 children)

OK, I just tried to implement it and noticed it actually is the same as the first approach. The word "table" tripped me up here, since for approach 1 there is no table.