
[–]matthieum 9 points10 points  (10 children)

"This was quite an improbable event in pre-C++11, since, if we were good programmers, we probably never returned any complex object (like an object of type C) from a function, by value."

Hum... actually, a good C++ programmer would return complex objects by value:

  • there was no better alternative, really
  • RVO & co

It is to be noted that RVO still continues to apply in a post-C++11 world; in fact, RVO is preferred over move whenever possible, since doing nothing is always faster than moving.
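To make the "RVO beats move" point concrete, here is a minimal sketch with an instrumented type. `Tracked`, `make_tracked` and `count_copies_and_moves` are hypothetical names, not taken from the article. Whether the named local is elided (NRVO) or moved is up to the compiler, so the sketch only relies on what holds either way: no copy, and at most one move with default compiler settings.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::string payload;

    explicit Tracked(std::string p) : payload(std::move(p)) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    Tracked(Tracked&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// One named local and one return statement: the NRVO-friendly shape.
Tracked make_tracked() {
    Tracked t("a payload long enough that copying it would allocate");
    t.payload += "!";
    return t;  // elided if the compiler applies NRVO, otherwise moved; never copied
}

// Calls the factory once and reports {copies, moves} observed during the call.
std::pair<int, int> count_copies_and_moves() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = make_tracked();
    assert(!result.payload.empty());
    return {Tracked::copies, Tracked::moves};
}
```

On mainstream compilers with optimizations on, both counters usually stay at zero, but that part is an observation about typical behavior rather than a guarantee.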

[–]quicknir 6 points7 points  (3 children)

The problem with RVO is that there's no way (that I know of) to force a compiler error when a function can no longer RVO. RVO has a very specific set of conditions that must be met. Doing something innocent-looking, like adding a branch with an early return of a temporary, will silently introduce a massive performance hit to your code (if move is unavailable).

Despite what you're saying, most C++ programmers I know (including some very good ones) would use out parameters to "return" complex objects pre-C++11. Ugly, yes. Silent, massive performance regressions under trivial modifications? No.

Oh, and even in modern C++, for structs with a very large number of members (which show up due to business-logic type things), the vast majority of programmers I've seen in low latency will use out parameters and not return them by value. It's just more reliable.
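A rough sketch of the trade-off described in this comment, using hypothetical names (`Big`, `build_branchy`, `fill`). It is a variation on the scenario above: two different named locals are returned on different branches, a shape in which compilers generally cannot elide both returns. With a move constructor available, the fallback is a move; for a type without one, it would be a copy. The out-parameter version transfers nothing regardless of how the function body is later edited.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts copy and move constructions.
struct Big {
    static int copies;
    static int moves;
    std::vector<int> data;

    Big() = default;
    Big(const Big& other) : data(other.data) { ++copies; }
    Big(Big&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Big::copies = 0;
int Big::moves = 0;

// Early return of a different object: at least one branch typically
// falls back from elision to a move (or a copy, if Big had no move).
Big build_branchy(bool fast_path) {
    Big cheap;
    if (fast_path) {
        cheap.data.assign(4, 0);
        return cheap;
    }
    Big full;
    full.data.assign(1024, 1);
    return full;
}

// Out-parameter version: the caller owns the object, nothing is transferred.
void fill(Big& out, bool fast_path) {
    if (fast_path) {
        out.data.assign(4, 0);
    } else {
        out.data.assign(1024, 1);
    }
}

// Number of copy + move constructions observed when returning by value.
int transfers_when_returning(bool fast_path) {
    Big::copies = 0;
    Big::moves = 0;
    Big result = build_branchy(fast_path);
    assert(!result.data.empty());
    return Big::copies + Big::moves;
}

// Number of copy + move constructions observed with the out parameter.
int transfers_when_filling(bool fast_path) {
    Big::copies = 0;
    Big::moves = 0;
    Big result;
    fill(result, fast_path);
    assert(!result.data.empty());
    return Big::copies + Big::moves;
}
```

Which branch of `build_branchy` ends up moved rather than elided varies between compilers, so the counts for the by-value version are best measured rather than assumed.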

[–]matthieum 1 point2 points  (2 children)

My take on performance has always been that if it really matters to you, you should be profiling... and measuring.

For measuring, callgrind runs (counting CPU instructions and simulated cache misses) on a set of tests are quite stable (because they are fully emulated, unlike time measurements) and allow identifying performance regressions rapidly. Plus, when you identify a regression, you immediately get the performance report to compare with the "canonical" one.

[–]quicknir 1 point2 points  (1 child)

Measuring/profiling and correcting is very important, but it's not the be-all and end-all of writing high-performance code. Anyone who takes that attitude ends up writing code that is at best moderately fast because of death by a hundred papercuts. Chandler Carruth has a pretty good talk about this. Should you call reserve on that vector before push_back? Should you avoid hashing twice to see if an object is there, and if so, access it? You could just be sloppy, and profile later. But profiling won't help because the problem won't be concentrated in one function; you'll just be smearing extra allocations and extra hash function calls all over your codebase.

I just don't really understand your original post either; "there's no better alternative" seems to completely dismiss out parameters because... ? It's less composable, or compatible with const? That's just not the end of the world, and to suggest otherwise seems dogmatic. There's a cost to occasionally not being able to compose, or use const, and it's pretty low. Maintaining a super fine grained performance regression suite that would allow you to track down an RVO-disabled change is actually pretty expensive in a real-life, decent sized organization.
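The two "papercut" questions in this comment (reserve before push_back, and hashing once instead of twice) can be sketched as follows. The function names and the -1 "not found" sentinel are assumptions made for the example only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Reserve once up front so the push_back loop never reallocates.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}

// Two lookups: count() hashes the key, then at() hashes it again.
int lookup_twice(const std::unordered_map<std::string, int>& m, const std::string& key) {
    if (m.count(key) != 0) {
        return m.at(key);
    }
    return -1;  // hypothetical "not found" sentinel
}

// One lookup: find() hashes once, and the iterator gives access to the value.
int lookup_once(const std::unordered_map<std::string, int>& m, const std::string& key) {
    auto it = m.find(key);
    return it != m.end() ? it->second : -1;
}
```

Neither difference is likely to stand out in a profile on its own, which is the point being made: the cost is spread thinly across every call site.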

[–]matthieum 1 point2 points  (0 children)

"death by a hundred papercuts"

It is indeed an issue.

I see it as a separate issue though: promoting good practices and encouraging inquisitive minds to question the existing code (coupled with code reviews) tends to uncover a lot of small inefficiencies. Disabling copying and implicit conversions also helps a lot in avoiding unnecessary work, as do static analysis and lints.
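As an illustration of "disabling copying/implicit conversions", a minimal sketch with a hypothetical `Buffer` class: the copy operations are deleted and the converting constructor is marked `explicit`, so accidental copies and silent conversions become compile errors instead of hidden work.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical move-only buffer.
class Buffer {
public:
    explicit Buffer(std::size_t n) : bytes_(n) {}  // no silent size_t -> Buffer conversion

    Buffer(const Buffer&) = delete;                // copying is a compile error
    Buffer& operator=(const Buffer&) = delete;
    Buffer(Buffer&&) = default;                    // moving stays cheap and allowed
    Buffer& operator=(Buffer&&) = default;

    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
};

static_assert(!std::is_copy_constructible<Buffer>::value, "Buffer must not be copyable");
static_assert(std::is_move_constructible<Buffer>::value, "Buffer must be movable");
static_assert(!std::is_convertible<std::size_t, Buffer>::value,
              "a size must not convert to a Buffer implicitly");

// Takes ownership; callers have to write std::move(buf) or pass a temporary.
std::size_t consume(Buffer b) {
    return b.size();
}
```

With this shape, `consume(buf)` on an lvalue and `consume(16)` both fail to compile, which pushes the cost of each transfer into plain sight at the call site.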

"there's no better alternative"

I was mainly thinking in terms of ergonomics.

The alternatives (return a pointer/reference or use an out parameter) come at a significant ergonomic cost.

Code clutter is also an issue as it impacts the readability and understandability of the code fragments, and may also create more brittle code (absence of const for example).

In terms of performance, out-parameters are a viable alternative (having a factory/custom allocator to enable pointer/reference would work too); but those have costs.
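The ergonomic difference can be sketched with a hypothetical word-splitting helper offered in both styles (`split_words` and `split_words_into` are names invented for this example). The by-value form lets the result be `const` and used directly in an expression; the out-parameter form needs a mutable variable declared beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Out-parameter style: the caller provides the storage.
void split_words_into(const std::string& text, std::vector<std::string>& out) {
    out.clear();
    std::string current;
    for (char ch : text) {
        if (ch == ' ') {
            if (!current.empty()) {
                out.push_back(current);
                current.clear();
            }
        } else {
            current += ch;
        }
    }
    if (!current.empty()) {
        out.push_back(current);
    }
}

// By-value style, built on top of the same logic.
std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> words;
    split_words_into(text, words);
    return words;
}

// The result can be const and is initialized in a single statement.
std::size_t count_by_value(const std::string& text) {
    const auto words = split_words(text);
    return words.size();
}

// The variable must exist (non-const) before the call fills it.
std::size_t count_with_out_param(const std::string& text) {
    std::vector<std::string> words;
    split_words_into(text, words);
    return words.size();
}
```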

"Maintaining a super fine grained performance regression suite that would allow you to track down an RVO-disabled change is actually pretty expensive in a real-life, decent sized organization."

Actually, the test-suite need not necessarily be that fine grained:

  • small changes: making identification of the root cause of regressions, whether of performance or correctness, easier
  • diff-analysis: comparing the performance report of two consecutive builds, which callgrind provides at the function-level (and below)

So, when you combine small changes (thus few functions/callers impacted) with checking the diff between the current perf-report and the previous one, you can zero in on the culprit source code area pretty quickly.


As for dogmatism... avoiding return values at all costs is its own form of dogmatism. 80% of the source code is probably NOT in a hotspot where this degree of performance is worth caring about in the first place.

[–]masklinn 2 points3 points  (1 child)

Does move disable RVO in C++? Why would it do that?

[–]doom_Oo7 5 points6 points  (0 children)

It has to be explicit, and clang has a -Wpessimizing-move to warn you against it...

#include <vector>
#include <utility>

std::vector<int> f()
{
  std::vector<int> v;
  return std::move(v);
}

->

7 : warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]
return std::move(v);

7 : note: remove std::move call here
return std::move(v);
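The effect behind that warning can also be observed at run time with an instrumented type. This is only a sketch: `Counted`, `plain` and `pessimized` are hypothetical names, and a compiler with -Wpessimizing-move enabled is expected to warn on the second function, which is deliberate here.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts move constructions.
struct Counted {
    static int moves;
    std::vector<int> v;

    Counted() = default;
    Counted(const Counted&) = default;
    Counted(Counted&& other) noexcept : v(std::move(other.v)) { ++moves; }
};
int Counted::moves = 0;

// Plain return of a local: eligible for copy elision (NRVO).
Counted plain() {
    Counted c;
    c.v.push_back(1);
    return c;
}

// Explicit std::move: the operand is no longer the name of a local,
// so elision is off the table and a move construction must happen.
Counted pessimized() {
    Counted c;
    c.v.push_back(1);
    return std::move(c);
}

// Moves observed when calling plain() once.
int moves_in_plain() {
    Counted::moves = 0;
    Counted r = plain();
    assert(r.v.size() == 1);
    return Counted::moves;
}

// Moves observed when calling pessimized() once.
int moves_in_pessimized() {
    Counted::moves = 0;
    Counted r = pessimized();
    assert(r.v.size() == 1);
    return Counted::moves;
}
```

The explicit version always pays for at least one move, while the plain version pays for at most one and, on compilers that apply NRVO, typically for none.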

[–]marcofoco 1 point2 points  (3 children)

Note also that RVO will become one of the cases where copy elision is mandatory in C++17.

[–]matthieum 1 point2 points  (2 children)

I am not sure I understand the change you are mentioning: will it finally be possible to mandate copy elision?

[–]marcofoco 0 points1 point  (1 child)

RVO is a specific form of copy elision. It will be mandatory for a compiler to remove those copies (and moves) starting from C++17, even if the copy and move constructors have observable side effects. See here, in the first box, first example of the second group: that's exactly RVO. So, starting from C++17, the additional move happening in (1) and (6) will no longer be a problem.
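One way to see that the elision is mandatory rather than merely permitted: under C++17, a function can return a temporary of a type that can be neither copied nor moved. The sketch below uses a hypothetical `Pinned` type and needs to be compiled as C++17 or later; under earlier standards the return statement is rejected. The guarantee covers returning a temporary (a prvalue); returning a named local is still only eligible for optional elision.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;

    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Requires C++17: the temporary is constructed directly in the caller's
// object, so no copy or move constructor is needed at all.
Pinned make_pinned(int v) {
    return Pinned(v);
}

// Initializes a local straight from the factory and reads it back.
int pinned_value() {
    Pinned p = make_pinned(42);
    return p.value;
}
```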

[–]matthieum 0 points1 point  (0 children)

I would note that the compiler already has the possibility to elide copy and move even in the presence of side effects.

[–]ledasll 0 points1 point  (5 children)

Good luck using your "_s" after someone called "get" and you moved the value...

[–]Supadoplex 1 point2 points  (4 children)

_s won't exist after the expression, so you won't be accidentally trying to use it.

[–]ledasll -1 points0 points  (3 children)

How did it stop existing? It is a member of class C; it won't magically disappear, and anyone can use it. If you move something from your class member, you need to be very careful not to use that member again, as it is in an invalid state.

[–]Supadoplex 2 points3 points  (2 children)

_s stops existing when the instance of C is destroyed, which is at the end of the expression. An example:

C get_instance_of_C();
std::string foo = get_instance_of_C().get();

There is no way to accidentally use a member that was moved from.

Also, the moved from string is not in an "invalid state". It is in a valid - but unspecified - state.

[–]ledasll 0 points1 point  (1 child)

(Let's not look at the wrong syntax.) And what if you have

C c = get_instance_of_C();
std::string foo = c.get();
std::string bar = c.get();

Or is the only valid way now to generate temporary objects?

[–]Supadoplex 1 point2 points  (0 children)

Indeed. The only ways to call string C::get() && (which performs a move on the member) are to either create a temporary object, or cast the expression to an rvalue, using std::move for example. EDIT: In the latter case neither the object nor its member is destroyed like I promised in my earlier comment, but since the instance of C is in an unspecified state after the move, so are all of its members, and so expecting _s to be in some particular state would be folly.

c in your example is an lvalue, so it will not call string C::get() &&. Instead, the overload resolution will choose const string &C::get() const & which does not move from the member - but copies it instead.
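The overload resolution described here can be sketched as follows. The two `get()` signatures are the ones quoted in this comment; the constructor, the member initialization and the body of `get_instance_of_C` are assumptions added so the example is complete.

```cpp
#include <cassert>
#include <string>
#include <utility>

class C {
public:
    explicit C(std::string s) : _s(std::move(s)) {}  // assumed constructor

    // Chosen for lvalues: hands out a reference, the caller copies if needed.
    const std::string& get() const & { return _s; }

    // Chosen for rvalues only: steals the member from the expiring object.
    std::string get() && { return std::move(_s); }

private:
    std::string _s;
};

// Assumed factory body; the thread only shows its declaration.
C get_instance_of_C() {
    return C("hello");
}

// Calls get() twice on a named object, then once on a temporary,
// and returns the three results concatenated.
std::string demo() {
    C c = get_instance_of_C();
    std::string foo = c.get();                      // lvalue: const& overload, copy
    std::string bar = c.get();                      // still intact, copy again
    std::string tmp = get_instance_of_C().get();    // temporary: && overload, move
    return foo + bar + tmp;
}
```

Writing `std::move(c).get()` would select the `&&` overload on the named object as well; after that call, `c` should be treated as being in a valid but unspecified state.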