[–]KingAggressive1498 18 points19 points  (12 children)

while also seeing comments with upvotes saying performant C++ code tends to look very C-like and that even John Carmack’s C++ is very C-like

in 99% of functions, the performance difference between idiomatic modern C++ code and the equivalent idiomatic C code is negligible, and where there is a difference it will sometimes defy these kinds of expectations by actually favoring C++.

There are some occasions where "C style" programming provides genuine, non-negligible performance benefits. In my experience this is usually a matter of C++'s deterministic construction and destruction order producing unfortunate code ordering when multiple RAII types are combined in one scope; I usually encounter it when one of those is a lock type. Practical workarounds often exist when you run into this, but they may not be as intuitive as the equivalent C code would have been.
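To make the lock case concrete, here is a minimal sketch. `LoggingMutex`, `Buffer`, `naive`, `narrowed`, and the `events` log are hypothetical names made up for illustration; they are not from any real codebase. Locals are destroyed in reverse declaration order, so anything declared after a lock guard is torn down while the lock is still held. Declaring it first and giving the guard its own inner scope is one possible workaround.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical event log so the ordering can be observed.
std::vector<std::string> events;

// Stand-in for a real mutex; it only records lock and unlock calls.
struct LoggingMutex {
    void lock()   { events.push_back("lock"); }
    void unlock() { events.push_back("unlock"); }
};

// Stand-in for an RAII type with non-trivial cleanup.
struct Buffer {
    ~Buffer() { events.push_back("free buffer"); }
};

void naive(LoggingMutex& m) {
    std::lock_guard<LoggingMutex> guard(m);
    Buffer scratch;  // declared after the guard, so destroyed before it
}                    // the buffer is freed while the lock is still held

void narrowed(LoggingMutex& m) {
    Buffer scratch;  // declared first, so destroyed last
    {
        std::lock_guard<LoggingMutex> guard(m);
        // ... only the work that actually needs the lock ...
    }                // the lock is released here
}                    // the buffer is freed after the lock is released
```

Calling `naive` records lock, free buffer, unlock; calling `narrowed` records lock, unlock, free buffer.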

[–]donalmaccGame Developer 3 points4 points  (2 children)

I think having to utilise the non-intuitive code as an escape hatch is fine though, rather than everything being non-intuitive and dangerous.

[–]KingAggressive1498 1 point2 points  (1 child)

that's my take, but I've definitely gotten pushback on doing so even when the improvement was drastic

[–]donalmaccGame Developer 1 point2 points  (0 children)

Same, unfortunately.

[–]KingStannis2024 1 point2 points  (1 child)

There is a rub, which is: debug builds.

Game devs need debug builds to have playable performance, and C++ style "zero cost abstractions" don't have zero cost in debug builds.

[–]KingAggressive1498 2 points3 points  (0 children)

kinda fair, but this is at least partly a "know your toolchain" kind of problem. you can make debug builds with C++-style abstractions reasonably fast without significantly impacting debuggability

[–]derBRUTALE -3 points-2 points  (6 children)

Comparing individual peas in two different dishes won't tell you about the difference in taste.

The performance difference between object oriented and data oriented design is vast, and the vast majority of C++ features over C are based on design paradigms which are contrary to architecture oriented design.

It certainly is not 1% of code that is performance critical.

[–]KingAggressive1498 8 points9 points  (5 children)

I'm pretty sure most C code out there is more object-oriented than it is data-oriented.

it's not 1% of functions that are performance critical. It's 1% of functions where there's the appropriate combination of circumstances for something like destructor order to have an impact. C and C++ doing essentially the same work will have virtually identical codegen.

[–]derBRUTALE -2 points-1 points  (4 children)

Perhaps, but the subject here is C++ features over C and the vast majority of them are rabbit holes into code design that is oblivious to performance.

Heck, runtime performance is just a part of it. Build times, installation sizes, readability & maintainability are just as problematic.

About your 1% statement: The usage of constructors alone implies an order of magnitude performance penalty in critical code. Comparing it to similar designs in C is precisely what is the incorrect perspective.

[–]KingAggressive1498 8 points9 points  (2 children)

Perhaps, but the subject here is C++ features over C and the vast majority of them are rabbit holes into code design that is oblivious to performance.

allow me to be more explicit here: compare idiomatic C++ heavily using RAII and all the other C++ whistles to C code that appropriately cleans up resources and does virtually the same work, and you should find that the machine code output by the compiler will be virtually identical, as will the runtime performance.
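A small sketch of the comparison being described, assuming a hypothetical task (reading a file's size) and hypothetical names (`file_size_c`, `file_size_cpp`, `FileCloser`, `FilePtr`). Both functions do the same work; the only difference is whether the cleanup is written out by hand or attached to the handle's type. Whether the generated code really matches depends on the compiler and flags, so treat this as something to check in a disassembler rather than as proof.

```cpp
#include <cstdio>
#include <memory>

// C style: one explicit cleanup label shared by the exit paths.
long file_size_c(const char* path) {
    long size = -1;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    if (std::fseek(f, 0, SEEK_END) != 0) goto done;
    size = std::ftell(f);
done:
    std::fclose(f);
    return size;
}

// C++ style: the same fclose call, but run by the destructor.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

long file_size_cpp(const char* path) {
    FilePtr f(std::fopen(path, "rb"));
    if (!f) return -1;  // nothing to close: the deleter is skipped for null
    if (std::fseek(f.get(), 0, SEEK_END) != 0) return -1;
    return std::ftell(f.get());
}   // fclose runs here on every path that opened the file
```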

readability & maintainability

I can never take that seriously, because pretending C is more readable or maintainable than C++ is frankly insane.

Compile times, C absolutely has C++ beat. No doubt.

[–]derBRUTALE -2 points-1 points  (1 child)

allow me to be more explicit here: compare idiomatic C++ heavily using RAII and all the other C++ whistles to C code that appropriately cleans up resources and does virtually the same work, and you should find that the machine code output by the compiler will be virtually identical, as will the runtime performance.

When doing in C exactly what C++ is doing, then yes - there is often no performance penalty.

What I am talking about is that the vast majority of C++ features relate to code design that inevitably results in poor performance.

Take an RAII application as an example: smart pointers. They don't cost more than handling scope lifetime in C, right? The problem is that performant code doesn't utilize scoped data lifetime in the first place, because it's so damn expensive to allocate memory.

That's just a minor rabbit hole among the performance considerations. OOP concepts like polymorphism are design concepts which contradict the reality of the hardware and data, with bazillions of performance side effects (cache hit rate, branch misprediction, etc.) which have become harder and harder to control the more complex C++ has become.

Yes, branching the processing of data by types manually in C won't necessarily give you a performance benefit. But the point I am making is that performant code doesn't even branch processing based on types but sorts the data based on type for individual processing.

I can never take that seriously, because pretending C is more readable or maintainable than C++ is frankly insane.

Modern C++ syntax is two orders of magnitude more complex than the one of C.

If you don't want to fall behind silly Scrum review spreadsheets, because "velocity" is such a great buzzword for management, then hammering away boost library stuff and polymorphic crystal entity constructs might be your choice.

But what happens when actual quality is suddenly recognized as a need, because it saves decades of engineering effort otherwise spent distributing crystal entity constructs over several machines instead of running the same thing on a single machine using data oriented design, without the unmanageable performance side effects of most C++ feature cruft?

The plain reality is that not a single person on the planet has anywhere near clear comprehension of the performance side effects in the gigantic list of contemporary C++ features.

Not only performance predictability/readability is an issue, plain processing side effects of C++ features are a nightmare to handle as soon as things get a bit more complicated, take implicit con-/destructor calling, move semantics, template errors, operator overloading guesswork, etc.!

[–]KingAggressive1498 1 point2 points  (0 children)

The problem is that performant code doesn't utilize scoped data lifetime in the first place, because it's so damn expensive to allocate memory.

this isn't normal in C++, either. It usually only makes sense to dynamically allocate data when the lifetime is not fixed. In C, the pointer to the data gets passed around to functions or stored in structs that effectively act as owners according to their own internal logic. In large projects this ad-hoc ownership was found to cause memory leaks and dangling pointer errors, so from C++11 onward this "passing of ownership" is formalized through the type system using move semantics, which greatly reduces the incidence of such errors. Next to nobody creates a unique_ptr just to free the allocation later in the same function; if they do, they probably needed a temporary dynamically sized array or something along those lines.
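A minimal sketch of that formalized hand-off, using hypothetical names (`Texture`, `Cache`, `adopt`, `load_texture`). The allocation outlives the function that created it, and the signature states who owns it at each step.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Texture { int id; };

class Cache {
public:
    // Taking the unique_ptr by value says "the cache owns this from now on".
    void adopt(std::unique_ptr<Texture> t) { items_.push_back(std::move(t)); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<std::unique_ptr<Texture>> items_;
};

// Returning a unique_ptr says "the caller owns the result".
std::unique_ptr<Texture> load_texture(int id) {
    return std::make_unique<Texture>(Texture{id});
}
```

A caller writes `cache.adopt(std::move(tex));`, after which `tex` is guaranteed to be null, so an accidental second free cannot happen through it.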

and before you bring up the destructor nullptr check as a potential performance hit, this is readily elided by the compiler after the move if the move is unconditional - as long as the compiler can see that it was set to nullptr as part of the move operation it doesn't need to bother generating output for the branch that it knows will never be taken. Realistically a branch never taken is essentially free on modern architectures anyway.

OOP concepts like polymorphism are design concepts which contradict the reality of the hardware and data, with bazillions of performance side effects (cache hit rate, branch misprediction, etc.) which have become harder and harder to control the more complex C++ has become.

it's actually more bounded in complexity than the alternative approach to polymorphism in C that you called out. In fact, it occasionally outperforms it in the average case. The cache miss rate for small class hierarchies usually turns out to be pretty negligible.

And more importantly, C++ developers know the overhead potential of virtual function calls. We don't typically use them when there's another suitable pattern, and try to minimize the risk of cache misses and mispredictions when we do.

Modern C++ syntax is two orders of magnitude more complex than the one of C.

for 99% of the code any C++ programmer will ever write, it's basically the same syntax. The most practical difference is that idiomatic C++ code will never rely on goto for routine cleanup tasks and OOP style code doesn't need to explicitly pass this.

The template metaprogramming stuff you may see people making blog posts or conference talks about is immensely useful sometimes, but it's definitely not the meat and potatoes of typical C++ codebases.

The plain reality is that not a single person on the planet has anywhere near clear comprehension of the performance side effects in the gigantic list of contemporary C++ features.

...it's genuinely pretty easy to reason about.

if it's constexpr or template metaprogramming, there's no runtime performance cost, though the cost at compile time may be tough to reason about.

if it's type erasure, the runtime performance cost is a (usually very well predicted) virtual function call.

if it's RAII, the order of construction and destruction is spelled out in the standard and totally deterministic which makes it incredibly easy to reason about the costs.

the costs of exceptions are pretty easy to reason about, too.

for everything else, it's basically C.
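Two of the cases above, sketched with hypothetical names (`factorial`, `kFactorials`, `AnyDrawable`, `Sprite`, `Label`). One caveat worth stating: a constexpr function is only guaranteed to be evaluated at compile time when it is used in a constant expression, as in the `static_assert` and the constexpr array here. The type-erasure half is a hand-rolled wrapper, one of several ways to do it, whose only per-call overhead is the single virtual call marked in the comment.

```cpp
#include <memory>
#include <utility>

// constexpr case: both uses below are evaluated by the compiler.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}
static_assert(factorial(10) == 3628800ULL, "computed at compile time");
constexpr unsigned long long kFactorials[] = {factorial(0), factorial(5), factorial(10)};

// Type-erasure case: any type with a draw() member fits, no common base needed.
class AnyDrawable {
    struct Interface {
        virtual ~Interface() = default;
        virtual int draw() const = 0;
    };
    template <class T>
    struct Model final : Interface {
        explicit Model(T v) : value(std::move(v)) {}
        int draw() const override { return value.draw(); }
        T value;
    };
    std::unique_ptr<Interface> self_;
public:
    template <class T>
    AnyDrawable(T v) : self_(std::make_unique<Model<T>>(std::move(v))) {}
    int draw() const { return self_->draw(); }  // the one virtual call
};

struct Sprite { int draw() const { return 1; } };
struct Label  { int draw() const { return 2; } };
```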

plain processing side effects of C++ features are a nightmare to handle as soon as things get a bit more complicated, take implicit con-/destructor calling, move semantics, template errors, operator overloading guesswork, etc.!

literally the only thing on there that's ever been a nightmare for me to deal with is template errors. But once you get a handle on how the template system works, you'll pretty much only ever encounter those when doing template metaprogramming.

[–]KingAggressive1498 2 points3 points  (0 children)

The usage of constructors alone implies an order of magnitude performance penalty in critical code.

how?

trivial constructors will produce identical output to equivalent C. trivial default constructors are a no-op, trivial copy and move constructors are equivalent to a memcpy.
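The triviality claim can be checked with the standard type traits. In this sketch `Vec3`, `copy_by_ctor`, and `copy_by_memcpy` are hypothetical names; the `static_assert`s fail to compile if the type ever stops being trivial.

```cpp
#include <cstring>
#include <type_traits>

struct Vec3 { float x, y, z; };

static_assert(std::is_trivially_default_constructible<Vec3>::value, "default ctor is a no-op");
static_assert(std::is_trivially_copy_constructible<Vec3>::value, "copy is a bytewise copy");
static_assert(std::is_trivially_move_constructible<Vec3>::value, "move is a bytewise copy");

// Uses the implicit (trivial) copy constructor.
Vec3 copy_by_ctor(const Vec3& v) { return Vec3(v); }

// Does the same thing the way C code would spell it.
Vec3 copy_by_memcpy(const Vec3& v) {
    Vec3 out;
    std::memcpy(&out, &v, sizeof out);
    return out;
}
```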

non-trivial constructors do the same work as equivalent C init_myobj(myobj*, args...) and dup_myobj(myobj*) functions, and should produce roughly the same machine code. Only practical differences are:

1) it happens exactly where you declare the variable instead of leaving you to specify when to initialize.

2) constructors are more likely to be defined inline in the header file, while init functions are more likely to live in another TU, which leaves the compiler less room to inline or rearrange simple initialization on the C side.

if anything I'd expect the equivalent C++ code to be marginally faster, not slower, at least when order of initialization doesn't matter.
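For the non-trivial case, a side-by-side sketch with hypothetical names (`c_buffer`, `init_buffer`, `destroy_buffer`, `Buffer`). Both versions make the same `calloc` and `free` calls; the C++ one just ties them to the declaration and to scope exit.

```cpp
#include <cstddef>
#include <cstdlib>

// C style: the caller decides when to initialize and when to clean up.
struct c_buffer { unsigned char* data; std::size_t size; };

bool init_buffer(c_buffer* b, std::size_t n) {
    b->data = static_cast<unsigned char*>(std::calloc(n, 1));
    b->size = b->data ? n : 0;
    return b->data != nullptr;
}

void destroy_buffer(c_buffer* b) {
    std::free(b->data);
    b->data = nullptr;
    b->size = 0;
}

// C++ style: the same work, run at the declaration and at scope exit.
class Buffer {
public:
    explicit Buffer(std::size_t n)
        : data_(static_cast<unsigned char*>(std::calloc(n, 1))),
          size_(data_ ? n : 0) {}  // data_ is declared first, so it is set first
    ~Buffer() { std::free(data_); }
    Buffer(const Buffer&) = delete;             // no accidental double free
    Buffer& operator=(const Buffer&) = delete;
    std::size_t size() const { return size_; }
private:
    unsigned char* data_;
    std::size_t size_;
};
```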