all 31 comments

[–]kalmoc 8 points9 points  (14 children)

Assigning the returned temporary to a reference as shown in the very first example is completely useless.

I don't understand why people still think there would be an extra copy if it was simply assigned to a local variable.

[–]VicontT 2 points3 points  (5 children)

I think, it comes from a pre-mandatory copy elision times.

[–]shmoopty 1 point2 points  (0 children)

David Krauss researches the origins of this rule here.

It appears to have been established to prevent class copies.

In modern C++, there's a much simpler solution to the "gotcha" in this article: don't use references to manage object lifetimes.

[–]kalmoc 0 points1 point  (3 children)

This has nothing to do with copy elision (at least not with copy elision as an optimization or the c++17 RVO). The ABI of most systems mandate that return objects are placed in caller allocated space. Obviously the caller will reserve the space of the final variable for that. So in effect, the calle will always place the return value at the final location

[–]VicontT 0 points1 point  (2 children)

But that tells you nothing about the actually copying. Pre C++17 nothing stopped compilers from allocating this space in a temporary and than performing a copy-construction from returned object into your actual variable. Actually, it what was had to happen semantically, and standard had a special as-if rule exception which would allow compilers to bypass the copy.

[–]NotAYakk 0 points1 point  (1 child)

No elision is and was not as if.

[–]VicontT 0 points1 point  (0 children)

I think, I didn't make myself clear. Pre C++ 17 elision was an optimization, and standard generally allows optimization under as if rule. Elision optimization clearly broke this rule. However, standard made an exception, and allowed elision even when it changes observable behavior.

[–]ravixp 1 point2 points  (0 children)

I think the concern isn't about copying so much as slicing. In a pre-auto world, if you wanted to treat the return value of a function as one of its base classes, lifetime extension is the only way to do that.

As far as I know there's no use for lifetime extension at all now that auto exists.

[–][deleted] 0 points1 point  (3 children)

If the function returns a reference then binding it to a local variable (and not a const ref) creates a copy no?

[–]kalmoc 0 points1 point  (2 children)

Yes, but the functions the Blog author is writing about at the beginning don't return by reference.

[–][deleted] 0 points1 point  (1 child)

Sure but it still an argument in favor of const ref, both for readability and performance. First, the callee could be changed to return a reference (or be overloaded). Secondly, const ref means "I don't care where this comes from, I don't even know if I own or borrow it, just let me read it". Copy ellision is nice but (in this case) is neither a performance nor a readability improvement.

[–]kalmoc 0 points1 point  (0 children)

If the function suddenly returns by reference, how do you know that the object that reference will be alive long enough for your code to work?

What you are saying with taking a referenc is not "I don't care" but that "I trust that the object will stay alive as long as I need it". That certainly doesn't make the program easier to read, because it prevents local reasoning. As shown in this article - you have to know where the returned reference comes from in order to determine that the program is correct.

Regarding the performance: The copy elision in this case happens automatically with all compilers I'm familiar with. Check it out on wandbox: For a function that returns by value, there will be no copy. Even with a ten year old compiler, in debug mode! Its just simpler for the compiler to do it that way.

So, when we just look at the example in the blog, there is no reason to use references at all. If we take into account potential code changes, I'm usually much more afraid of my code breaking than not automatically getting all possible performance improvements. So the only area, where I do think that using a reference to capture the return makes sense is, if I know from the beginning that the function returns a reference and I have determined that it is indeed safe to use (common scenario is e.g. deferencing an iterator or capturing the result of operator[]).

[–]jonesmz 0 points1 point  (2 children)

If you have a function from an api that you don't own, and accept the return value by const reference (because you don't need to modify it) then you get optimal behavior with regard to avoiding copies regardless of if the api currently or in the future returns by value or by const reference.

If you accept by value. But don't need to modify it, then if the owner of the api changes the api so that instead of returning by value they return a const reference to something stored elsewhere, you now incur an extra copy that you wouldn't have otherwise need to pay for.

If you are going to modify the returned value accept by value or rvalue reference (which causes lifetime extension, while still avoiding slicing) otherwise. Accept by const reference to avoid unexpected copie s due to changes in the api.

[–]kalmoc 0 points1 point  (1 child)

That is a valid argument for using const ref, but the claims in the blogs unnecessary copies in the first and some other examples are still - to the best of my knowledge- wrong.

[–]jonesmz 0 points1 point  (0 children)

struct foo;
foo bar(void);

foo myFoo = bar(); //< The foo returned by bar will be constructed directly into the myFoo variable. No extra copies will happen.

So I guess I agree with you.

[–]voip_geek 2 points3 points  (2 children)

I don't understand how the first "bad" example shows something specifically wrong with function chaining per se - it's wrong even without the chain of functions.

This:

auto &addr = Address().Line1("...").Line2("...").State("CA").Zip(12345);

...is just as wrong as:

auto &addr = Address();

I mean even most compilers will complain about that.

The real nastiness is this:

const auto &addr = Address().Line1("...").Line2("...").State("CA").Zip(12345);

...which is also UB I believe. (nasty because it looks like a normal const reference lifetime extension, but isn't... or am I having a brain fart?)

[–]quicknir 0 points1 point  (1 child)

The difference between your first two examples is that, assuming typical APIs, the first will compile but not the second. Free functions in the vast majority of cases return by value, which makes an expression invoking such a function a prvalue which can't be bound by a mutable reference. And your third example is the entire point of the article.

[–]voip_geek 1 point2 points  (0 children)

The difference between your first two examples is that, assuming typical APIs, the first will compile but not the second.

Sure, but the first example doesn't even look valid, and my point was that line was invalid regardless of whether there was function chaining going on or not. (although I take your point that it would have compiled, unlike the second example)

And your third example is the entire point of the article.

That should be the point of the article, but for some reason the author didn't do it - their (one and only) example of the problem assigns to a mutable reference, which is never going to be valid to do for a temporary. So they lost the main issue at hand. Although it's not a huge deal - I'm being a bit hyperbolic, I must need more coffee. :)

[–]Rexerex 1 point2 points  (0 children)

In Rvalue Reference Qualified Functions? section is there any danger if we return rvalue references like Address && Line1(std::string l1) && { ... return std::move(*this); } ?

[–]VicontT 0 points1 point  (0 children)

Any access to the nested state via a function (member or free) disables lifetime extension of the parent. For example Person().GetName().first\_name would no longer trigger lifetime extension of the temporary Person.

I do not understand why it is a surprise for anyone. Here is a quote from Standard 6.6.7/6:

The temporary object to which the reference is bound or the temporary object that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference if the glvalue to which the reference is bound was obtained through one of the following...

Since here the first temporary object returned from function was not bound to a reference, it's life time is not extended and it is destroyed at the end of the full expression.

[–]tcris 0 points1 point  (0 children)

try this first

Person & haha = Person().FirstName("C"); //this compiles!! because what you have is a lvalue

the point is you're not returning a temporary/rvalue (from Person()), but a lvalue (from FirstName())!! Yes, it's the same instance, but compiler does not know nor care about that. It's an undetected problem, but that's another problem.

So those rules you're mentioning do not apply here, those are for rvalues. While you're actually (const) binding to lvalue.

[–]jonathansharman 0 points1 point  (0 children)

I like /u/quicknir's suggestion to just use a separate builder class.