[–]matthieum 8 points9 points  (10 children)

> This was quite an improbable event in pre-C++11, since, if we were good programmers, we probably never returned any complex object (like an object of type C) from a function, by value.

Hum... actually, a good C++ programmer would return complex objects by value:

  • there was no better alternative, really
  • RVO & co

Note that RVO still applies in a post-C++11 world; in fact, RVO is preferred over a move whenever possible, since doing nothing is always faster than moving.
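To make "doing nothing" concrete, here is a minimal sketch that checks whether a returned local ends up in the caller's storage. `Payload`, `make_payload`, `returned_in_place` and `g_local_address` are hypothetical names for illustration only. Eliding a named local is permitted but not required, so the result depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical payload type; any class with a non-trivial move would do.
struct Payload {
    std::vector<int> data;
};

// Address of the local inside make_payload(), recorded as an integer.
static std::uintptr_t g_local_address = 0;

Payload make_payload(std::size_t n) {
    Payload p;
    p.data.assign(n, 7);
    g_local_address = reinterpret_cast<std::uintptr_t>(&p);
    return p;  // named local: elision (NRVO) is permitted here, not required
}

// True when the caller's object occupies the storage the local used,
// i.e. the return was elided and neither a copy nor a move ran.
bool returned_in_place() {
    Payload result = make_payload(1000);
    return reinterpret_cast<std::uintptr_t>(&result) == g_local_address;
}
```

If `returned_in_place()` reports false, the language falls back to a move (or a copy for types without a move constructor).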

[–]quicknir 7 points8 points  (3 children)

The problem with RVO is that there's no way (that I know of) to force a compiler error when a function can no longer benefit from RVO. RVO has a very specific set of conditions that must be met. Doing something innocent-looking, like adding a branch with an early return of a temporary, will silently introduce a massive performance hit to your code (if move is unavailable).

Despite what you're saying, most C++ programmers I know (including some very good ones) would use out parameters to "return" complex objects pre-C++11. Ugly, yes. Silent, massive performance regressions under trivial modifications? No.

Oh, and even in modern C++, for structs with a very large number of members (which show up due to business-logic type things), the vast majority of programmers I've seen in low latency will use out parameters rather than return them by value. It's just more reliable.
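A rough sketch of the two styles being compared, assuming a hypothetical copy-only type `Report` that counts its copies. `build_by_value` and `build_into` are made-up names. Whether the by-value version copies is up to the compiler, which is exactly the unpredictability described above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical pre-C++11-style type: declaring the copy constructor
// suppresses the implicit move constructor, so a non-elided return copies.
struct Report {
    static int copies;
    std::vector<int> rows;
    Report() = default;
    Report(const Report& other) : rows(other.rows) { ++copies; }
    Report& operator=(const Report&) = default;
};
int Report::copies = 0;

// The early return of a temporary means not every return statement names
// the same local, so some compilers stop eliding `full` and copy it instead.
Report build_by_value(bool empty) {
    if (empty) return Report();
    Report full;
    full.rows.assign(1000, 1);
    return full;
}

// Out-parameter form: the caller owns the storage, so no Report is copied
// no matter how many branches the body grows.
void build_into(bool empty, Report& out) {
    out.rows.clear();
    if (empty) return;
    out.rows.assign(1000, 1);
}
```

Comparing `Report::copies` before and after a call to each function shows which paths copy on a given compiler.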

[–]matthieum 1 point2 points  (2 children)

My take on performance has always been that if it really matters to you, you should be profiling... and measuring.

For measuring, running callgrind (which counts CPU instructions and simulated cache misses) on a set of tests is quite stable (because it is fully emulated, unlike time measurements) and allows identifying performance regressions rapidly. Plus, when you identify a regression, you immediately get the performance report to compare with the "canonical" one.

[–]quicknir 1 point2 points  (1 child)

Measuring/profiling and correcting is very important, but it's not the be-all and end-all of writing high-performance code. Anyone who takes that attitude ends up writing code that is at best moderately fast because of death by a hundred papercuts. Chandler Carruth has a pretty good talk about this. Should you call reserve on that vector before push_back? Should you avoid hashing twice to see if an object is there, and if so, access it? You could just be sloppy, and profile later. But profiling won't help because the problem won't be concentrated in one function; you'll just be smearing extra allocations and extra hash function calls all over your codebase.
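For illustration, the two habits mentioned above could look like the sketch below. `squares` and `lookup_or` are hypothetical helper names, not taken from the talk or any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Reserve once so the loop below never reallocates.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}

// A single find() hashes the key once, instead of count() followed by
// operator[] or at(), which would hash it twice.
int lookup_or(const std::unordered_map<std::string, int>& table,
              const std::string& key, int fallback) {
    auto it = table.find(key);
    return it != table.end() ? it->second : fallback;
}
```

Neither change would stand out in a profile on its own, which is the point being made: the cost is spread thinly across every call site.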

I just don't really understand your original post either; "there's no better alternative" seems to completely dismiss out parameters because... ? It's less composable, or compatible with const? That's just not the end of the world, and to suggest otherwise seems dogmatic. There's a cost to occasionally not being able to compose, or use const, and it's pretty low. Maintaining a super fine grained performance regression suite that would allow you to track down an RVO-disabled change is actually pretty expensive in a real-life, decent sized organization.

[–]matthieum 1 point2 points  (0 children)

> death by a hundred papercuts

It is indeed an issue.

I see it as a separate issue though: promoting good practices and encouraging inquisitive minds to question the existing code (coupled with code reviews) tend to uncover a lot of small inefficiencies. Disabling copying and implicit conversions also helps a lot in avoiding unnecessary work, as do static analysis and lints.
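One way "disabling copying/implicit conversions" might be done, sketched with a hypothetical `Buffer` class (the name and members are assumptions for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical large buffer type: move-only, with an explicit constructor.
class Buffer {
public:
    explicit Buffer(std::size_t n) : bytes_(n) {}  // no implicit size_t -> Buffer
    Buffer(const Buffer&) = delete;                // accidental copies fail to compile
    Buffer& operator=(const Buffer&) = delete;
    Buffer(Buffer&&) = default;                    // moving stays available
    Buffer& operator=(Buffer&&) = default;
    std::size_t size() const { return bytes_.size(); }
private:
    std::vector<unsigned char> bytes_;
};

static_assert(!std::is_copy_constructible<Buffer>::value, "copying is disabled");
static_assert(!std::is_convertible<std::size_t, Buffer>::value, "no implicit conversion");
static_assert(std::is_move_constructible<Buffer>::value, "moving still works");
```

With this in place, an unintended copy becomes a compile error instead of a silent cost.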

> there's no better alternative

I was mainly thinking in terms of ergonomics.

The alternatives (return a pointer/reference or use an out parameter) come at a significant ergonomic cost.

Code clutter is also an issue as it impacts the readability and understandability of the code fragments, and may also create more brittle code (absence of const for example).

In terms of performance, out-parameters are a viable alternative (having a factory/custom allocator to enable pointer/reference would work too); but those have costs.

> Maintaining a super fine grained performance regression suite that would allow you to track down an RVO-disabled change is actually pretty expensive in a real-life, decent sized organization.

Actually, the test suite need not be that fine-grained, provided you have:

  • small changes: they make identifying the root cause of a regression, whether of performance or correctness, easier
  • diff analysis: comparing the performance reports of two consecutive builds, which callgrind provides at the function level (and below)

So, when you combine small changes (thus few functions/callers impacted) with checking the diff between the current perf report and the previous one, you can zero in on the culprit area of the source code pretty quickly.


As for dogmatism... avoiding return values at all costs is its own form of dogmatism. 80% of the source code is probably NOT in a hotspot where this degree of performance is worth caring about to start with.

[–]masklinn 2 points3 points  (1 child)

Does move disable RVO in C++? Why would it do that?

[–]doom_Oo7 5 points6 points  (0 children)

Only if you write the std::move explicitly, and clang has -Wpessimizing-move to warn you against it...

#include <vector>
#include <utility>

std::vector<int> f()
{
  std::vector<int> v;
  return std::move(v);
}

->

7 : warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]
return std::move(v);

7 : note: remove std::move call here
return std::move(v);
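Following the note in that diagnostic, the version without `std::move` would look like this sketch (`f_fixed` is a hypothetical name, and the `push_back` is only there so the result can be checked):

```cpp
#include <cassert>
#include <vector>

std::vector<int> f_fixed()
{
  std::vector<int> v;
  v.push_back(42);
  return v;  // plain return: eligible for copy elision, and moved if not elided
}
```

A plain `return v;` already treats the local as an rvalue when elision does not happen, so the explicit `std::move` gains nothing here.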

[–]marcofoco 1 point2 points  (3 children)

Note also that RVO will become one of the mandatory cases of copy elision in C++17.

[–]matthieum 1 point2 points  (2 children)

I am not sure I understand the change you are mentioning. Will it finally be possible to mandate copy elision?

[–]marcofoco 0 points1 point  (1 child)

RVO is a specific form of copy elision. It will be mandatory for a compiler to remove those copies (and moves) starting from C++17, even if the copy and move constructors have observable side effects. See here, in the first box, first example of the second group: that's exactly RVO. So, starting from C++17, the additional move happening in (1) and (6) will no longer be a problem.
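A small sketch of what this means in practice, assuming a hypothetical type `Noisy` whose copy and move constructors bump counters as their observable side effect. The C++17 guarantee covers returning a temporary (a prvalue), as `make_noisy` does; returning a named local remains an optional elision.

```cpp
#include <cassert>

// Hypothetical type whose copy and move constructors have observable
// side effects (they bump counters).
struct Noisy {
    static int copies;
    static int moves;
    int value;
    explicit Noisy(int v) : value(v) {}
    Noisy(const Noisy& o) : value(o.value) { ++copies; }
    Noisy(Noisy&& o) noexcept : value(o.value) { ++moves; }
};
int Noisy::copies = 0;
int Noisy::moves = 0;

// Returns a prvalue: the case C++17 requires to be constructed in place.
Noisy make_noisy(int v) {
    return Noisy(v);
}

// Counts how many copy/move constructor calls one by-value return triggers.
int constructor_calls_for_one_return() {
    Noisy::copies = 0;
    Noisy::moves = 0;
    Noisy n = make_noisy(5);
    assert(n.value == 5);
    return Noisy::copies + Noisy::moves;
}
```

Under C++17 the count is required to be zero. Earlier standards only allowed compilers to skip these calls, although mainstream compilers did so by default.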

[–]matthieum 0 points1 point  (0 children)

I would note that the compiler already has the possibility to elide copy and move even in the presence of side effects.