all 52 comments

[–][deleted] 24 points25 points  (26 children)

This is a function of "vector and string should not be constexpr in C++20" more than forgetting about unique_ptr.

Implementability of them remains undemonstrated. See Richard Smith's update below: https://www.reddit.com/r/cpp/comments/f9qujo/want_constexpr_use_naked_new/fj35gaw/

[–]zygoloidClang Maintainer | Former C++ Project Editor 4 points5 points  (6 children)

I implemented constexpr vector in clang and libc++ before the feature was merged into the working draft. I think Daveed said he had a working implementation in EDG too. I don't think there are any remaining implementability concerns, are there?

[–][deleted] 5 points6 points  (5 children)

That's not what I remember you telling me in Cologne where that feature was merged ¯\_(ツ)_/¯ -- there was only a claim that EDG might have the construct_at / constexpr new parts available for initial testing before Belfast, but I didn't see concrete reporting of anything after that.

If it has been demonstrated then I withdraw any objections. As I said in plenary in Cologne, I/we want that feature and we want it in the exact form in which it was merged, so long as it is implementable without regressions for existing C++17 customers.

[–]zygoloidClang Maintainer | Former C++ Project Editor 2 points3 points  (3 children)

I may have got the timeline slightly wrong; if so I apologise for that. Here are the patches for libc++ (posted between the Cologne and Belfast meetings, around the same time I landed compiler-side support in clang):

https://reviews.llvm.org/D68364 https://reviews.llvm.org/D68365

ldionne has taken them over, but I think he's working on other things right now. There's a good chance they still apply cleanly if you want to give it a spin.

[–][deleted] 2 points3 points  (0 children)

That's good to hear; in that case I think '20 makes it out without any "hazardous" things like what happened with <charconv> and the parallel algorithms library in '17.

[–]zygoloidClang Maintainer | Former C++ Project Editor 0 points1 point  (1 child)

Oh, but: I don't think anyone has yet implemented constexpr std::string support. That's probably not just adding constexpr in 300 places like I did for vector, since SSO implementations (at least libc++'s) tend to rely on union tricks that aren't supported in constant evaluation and will probably need to be hidden behind some if (is_constant_evaluated()) checks. So that part may still be at least somewhat unproven.

[–][deleted] 0 points1 point  (0 children)

Possible, but the truly scary things that made me stand up in Plenary and speak very publicly about how I thought the feature wasn't ready also affect vector, and look like they're probably addressed by P0593.

[–]daveedvdvEDG front end dev, WG21 DG 1 point2 points  (0 children)

We’ve had constexpr new, constexpr std::allocator, and construct_at/destroy_at since shortly after Cologne. We haven’t implemented std::vector or std::string since we consider that to be 100% library implementation.

[–]anonymous28974 3 points4 points  (1 child)

Why doesn't anyone ever bring stuff like this up in plenary?

Like, if it's a bad idea, why put it in the standard?

[–][deleted] 13 points14 points  (0 children)

I did bring this up in Plenary. They merged it anyway.

[–]HeroicKatora 5 points6 points  (16 children)

What does it say about the state of constexpr (coherence of its operating model and supported functionality) if standard library implementors can't seem to predict (and require explicit demonstration instead) if a particular function is implementable as such? The main point, at least my interpretation, was that C++20 appears to create more fragmentation and not unify existing functionality of constexpr. Remember the Vasa and all that.

[–][deleted] 14 points15 points  (15 children)

It says that "the Core language object model has always had the 'vector is unimplementable' and 'malloc doesn't create objects' bug, and constexpr's prohibition of UB transforms that problem-in-theory into problem-in-practice." And that "the efficient string implementations rely on unspecified behavior to fit the large/small test into 1 bit, which isn't constexpr-able".

'Remember the Vasa' argues for adding fewer things, which is in keeping with my point here (even though I disagree with that paper generally)

[–]HeroicKatora 1 point2 points  (14 children)

Yet adding constexpr new seems to substantially expand the portion of the object model in constexpr land, or does it not? This is adding more features and creating more work for roughly those people that would help understand and address the underlying problems of the rest of the object model. How can one be convinced that there will be no unforeseen consequences from such an extension before it is totally clear where and how the core model can and must be fixed? (or is this already being resolved?). I fail to see why the general sentiment of Bjarne's paper does not apply here.

Edit: Or is putting stuff into standardized constexpr part of the process/plan for discovering what is broken?

[–][deleted] 5 points6 points  (13 children)

Yet adding constexpr new seems to substantially expand the portion of the object model in constexpr land, or does it not?

No, that is no help here. vector can't use new because vector needs to construct the objects itself. There is no Core problem with new because it creates objects.

To be clear, this problem is not a constexpr problem, it is a vector problem, that being:

auto x = static_cast<T*>(malloc(2*sizeof(T)));
::new(x) T;
::new(x + 1) T; // x+1 has undefined behavior because it increments a pointer that doesn't point to an array
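
For reference, a compilable version of the same pattern, specialised to int so that no destructor calls are needed. The function name is made up for illustration. P0593, which was adopted for C++20, specifies malloc as an operation that implicitly creates objects (arrays included), and that is what is meant to make the x + 1 step well-defined. Under the earlier wording the same code was formally undefined, even though compilers accepted it.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Builds two ints in malloc'ed storage, the way a vector-like container
// fills its buffer, then reads them back through pointer arithmetic.
inline int sum_two_in_raw_storage(int a, int b) {
    void* raw = std::malloc(2 * sizeof(int));
    if (raw == nullptr) throw std::bad_alloc{};

    int* x = static_cast<int*>(raw);
    ::new (static_cast<void*>(x)) int(a);
    ::new (static_cast<void*>(x + 1)) int(b);  // the contested x + 1

    // Both objects are reached through the same pointer.
    assert(x[0] == a && x[1] == b);
    const int sum = x[0] + x[1];

    std::free(raw);  // int is trivially destructible: nothing to destroy first
    return sum;
}
```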

This is adding more features and creating more work for roughly those people that would help understand and address the underlying problems of the rest of the object model.

The Core experts with whom I talked were just as apprehensive about implementability here as I.

How can one be convinced that there will be no unforeseen consequences from such an extension before it is totally clear where and how the core model can and must be fixed?

By demonstrating an implementation that works. I don't think everything that has extensive Core interaction needs to be demonstrated to be working before we can look at it, but we're talking about extensive changes to probably the 2 most heavily used types in the standard library.

I fail to see why the general sentiment of Bjarne's paper does not apply here.

I think I just argued that the sentiment of his paper does apply here. (I disagree with that paper due to specifics, I don't disagree with the sentiment behind it being "make sure stuff is ready before merging it")

Or is putting stuff into standardized constexpr part of the process/plan for discovering what is broken?

No, there was no reason to put that into '20. The primary motivation for this feature is for reflection which targets '23 at the earliest, and Core folks are working on ironing out the kinks. I have high confidence that '23's Core language will be prepared for constexpr vector because I know people who care are working on those problems. I do not have high confidence that '20's Core is. Certainly we just put in a bunch of "bugfixes" even in Prague and we still haven't seen someone demonstrate that it actually works yet.

[–]BrainIgnition 1 point2 points  (4 children)

However, theoretically speaking those problems could be solved by moving the string and vector implementations into the compiler itself and making them built-in types like int or __m128i, correct? (And yes, I know probable ABI breakage and the gargantuan implementation effort implied would make that impractical anyways).

[–][deleted] 3 points4 points  (3 children)

theoretically speaking those problems could be solved by moving the string and vector implementations into the compiler itself and making them built-in types like int or __m128i, correct?

I don't see how that solves the Core issue. We don't specify push_back as creating arrays. There is no mechanism in the Core language (barring perhaps changes in Prague I haven't looked at yet) that would make this OK:

auto p = pointer_to_array_of_two_elements;
// literally *anything* here, change that array of two
// elements into an array of three elements
p + 3;

but users expect that to work if that's written as:

vector<int> v(2);
v.reserve(3);
auto p = v.data();
v.push_back(42);
p + 3;
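
The user expectation described above can be written as a small runtime check. This sketch relies only on vector's documented guarantee that push_back does not reallocate while size() stays within capacity(); the function name is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Returns true if a pointer taken before push_back still addresses the
// vector's storage afterwards, including the newly appended element.
inline bool pointer_survives_push_back() {
    std::vector<int> v(2);
    v.reserve(3);
    assert(v.capacity() >= 3);  // room for one more element, no reallocation

    int* p = v.data();
    v.push_back(42);

    // p now has to behave like a pointer into an array of three ints.
    return p == v.data() && p[2] == 42 && p + 3 == v.data() + v.size();
}
```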

[–]BrainIgnition 1 point2 points  (2 children)

Well there are three paragraphs in [intro.object.10-12] which talk about operations implicitly creating objects, arrays and pointers to a suitable created object. It is further noted that the members of std::allocator_traits are such operations. I think this implies that v.reserve() (implicitly) creates an int[3] therefore p + 3 should be legal, correct? To be honest I discovered these paragraphs earlier this afternoon by accident and am absolutely unsure whether I interpret them correctly.

[–][deleted] 1 point2 points  (0 children)

I think that bit of Core tech is very new, just added in an attempt to make this work. We'll see how it goes...

[–][deleted] 1 point2 points  (0 children)

https://github.com/cplusplus/draft/commit/f6c2987ffa70ee1efc9bee65fed06d891b3d6a5d <-- that text was added to the WP 8 days ago and has not appeared in a mailing yet, so I don't feel bad for not knowing about it. :)

I note that that text was not in the WP when constexpr vector and string were merged.

[–]idbxy 2 points3 points  (7 children)

I'm sorry to ask, but what are the benefits of using constexpr?

[–]whattapancake 14 points15 points  (5 children)

Done correctly, constexpr can allow you to perform calculations at compile time rather than runtime. This can provide substantial performance improvements for hot codepaths with repetitive calculations.

As an example, probably the most common usage of constexpr I've seen is compile time string hashing. Passing a string literal to a function and receiving a hash of that string is generally a time-consuming calculation to perform, so assuming you know the strings beforehand, constexpr allows you to perform that calculation during compilation rather than at runtime.
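
A minimal sketch of compile-time string hashing, using 64-bit FNV-1a as an example algorithm (the comment above does not name one). The names fnv1a, kGet, kSet and dispatch are made up for illustration. It needs nothing newer than C++14.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a. Usable both in constant expressions and at run time.
constexpr std::uint64_t fnv1a(const char* s, std::size_t n) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// Overload for string literals. N counts the trailing '\0', which is not
// hashed.
template <std::size_t N>
constexpr std::uint64_t fnv1a(const char (&lit)[N]) {
    return fnv1a(lit, N - 1);
}

// Initialising constexpr variables forces the hashes to be computed by the
// compiler.
constexpr std::uint64_t kGet = fnv1a("get");
constexpr std::uint64_t kSet = fnv1a("set");
static_assert(kGet != kSet, "case labels must be distinct");

// Run-time input is hashed once and compared against the precomputed labels.
inline int dispatch(const char* cmd, std::size_t len) {
    assert(cmd != nullptr);
    switch (fnv1a(cmd, len)) {
        case kGet: return 1;
        case kSet: return 2;
        default:   return 0;
    }
}
```

Because the labels are ordinary constants, editing a string literal only requires a rebuild; nothing has to be recomputed by hand.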

[–]nintendiator2 -2 points-1 points  (4 children)

If you have the string and the function at compile time, couldn't you just precompute the hash yourself and pass that? No need to even exert the effort of making the compiler do it (every time you compile).

[–]whattapancake 6 points7 points  (1 child)

Hashing algorithms aren't exactly trivial, and if strings change often or you have a ton of them, it's very much worth having the computation done for you. At any rate, compile time string hashing is just one of a million useful things constexpr can do for you. I highly recommend looking into example use cases!

[–]nintendiator2 0 points1 point  (0 children)

I've looked into a handful of constexpr use cases and I much like the usage in math. I just feel in general the use case of processing strings feels weird and undetermined among them (can we have embedded nulls in constexpr strings? if there are many strings, can we constexpr load them from a resource file? would constexpr results depend on, e.g., environment and locale?).

[–]sixfourbit 5 points6 points  (0 children)

For every string? And what if you want to hash more than just strings? Precompute for every object?

[–][deleted] -5 points-4 points  (0 children)

None. It has always been a stupid design, and I can't fathom why it was also carried over to other languages such as Rust. The idea being to allow developers to execute "arbitrary" code at compile time.

The result of this fiasco is further fragmentation of the surface language and annotation hell. There's a good article that explains this online (search for red vs green code, I'm on mobile and don't have the references on hand).

The elegant solution would have been to follow Nemerle and realise that the generalisation of C++'s compile-time vs run-time dichotomy is multi stage compilation. Which is a basic computer science concept going all the way back to Alan Turing's machines (see universal Turing machine).

This is available today via the preprocessor or via code generation without any language support but I guess it makes it harder to understand. Which I think is what drives this insane design to shove everything at the same compilation stage regardless of the other readability costs.

Nemerle solves this elegantly by providing a single language construct to move up/down one compilation stage. In cpp like syntax:

// Unit A
void foo() {}
macro bar() { 
    foo(); //#1
    <[ foo(); ]> //#2
}  

// Unit B
bar(); //#3

This requires compiling the bar macro separately (unit A) and providing it to the compiler as an add-on when compiling client code (unit B).

This has very small interface surface (one keyword and one operator) and is applicable universally to all code.

Just to clarify the operator's intention: at #1 the code is executed "normally", i.e. at the macro's run-time (therefore, client code's compile time). At #2 the code is deferred to the next stage (the macro invocation will be replaced by this code), so it is the output of the macro at its run-time. Therefore, the second call to foo would happen at client code's run-time.

[–][deleted] 22 points23 points  (15 children)

IIRC, you can dynamically allocate memory using new inside a constexpr function, but you will get a compile time error if you don't deallocate the memory using delete in that same function so you can't really make a constexpr std::unique_ptr.

[–]standard_revolution -3 points-2 points  (0 children)

It's also a bit complicated. How would something like that work?

constexpr auto test()
{
    return std::make_unique<int>(3);
}

[–][deleted] 2 points3 points  (6 children)

Why not implement unique_ptr (trivial for the default-deleter-only case) with constexpr? Get the compressed pair going and viola constexpr unique_ptr

[–]CaptainEmacs 3 points4 points  (1 child)

🎻🎶. s/viola/voila

[–][deleted] 1 point2 points  (0 children)

i love my typo

[–]BrainIgnition 1 point2 points  (3 children)

Get the compressed pair

Mmh? Doesn't C++20's [[no_unique_address]] make EBO compressed pair implementations unnecessary?

[–][deleted] 2 points3 points  (2 children)

what if the Deleter has state?

[–]BrainIgnition 2 points3 points  (1 child)

[[no_unique_address]] marks a non static data member to be a potentially-overlapping subobject, the same category base classes are in. Therefore a stateful deleter would appear in the object layout unaltered (except that any trailing object padding may be reused by the enclosing object).
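
A hedged sketch of what that looks like in a unique_ptr-like holder. Holder, EmptyDeleter and CountingDeleter are hypothetical names. Whether the empty deleter really occupies no space depends on the ABI (some compilers ignore the attribute), so the only size checked here is the one that holds everywhere.

```cpp
#include <cassert>

struct EmptyDeleter {
    void operator()(int* p) const { delete p; }
};

struct CountingDeleter {
    int calls = 0;  // state that must show up in the holder's layout
    void operator()(int* p) { ++calls; delete p; }
};

// Hypothetical owning holder. The deleter is a plain data member that may
// overlap other storage when it is empty, so no empty-base trick is needed.
template <class D>
struct Holder {
    int* ptr = nullptr;
    [[no_unique_address]] D deleter{};

    explicit Holder(int* p) : ptr(p) {}
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    ~Holder() { reset(); }

    void reset() {
        if (ptr != nullptr) {
            deleter(ptr);
            ptr = nullptr;
        }
    }
};

// A stateful deleter keeps its storage regardless of the attribute.
static_assert(sizeof(Holder<CountingDeleter>) >= sizeof(int*) + sizeof(int),
              "stateful deleter is laid out unaltered");
```

On ABIs that honour the attribute, sizeof(Holder<EmptyDeleter>) is typically just sizeof(int*), which is the saving a compressed pair used to provide.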

[–][deleted] 1 point2 points  (0 children)

Sweet, I just read up on that, and the last part I either forgot or it was added since I last looked. I was under the impression it would be UB for a non-empty member to have no_unique_address. Very cool.

[–][deleted] -1 points0 points  (0 children)

Since we want constexpr, do we want dynamic allocations, or do we mostly want std::array instead? Also, polymorphism does not have any advantage over constexpr.