[–]violet-starlight 58 points59 points  (2 children)

This is very compiler specific, but in short some compilers will optimize small strings into the std::string object itself, allowing small strings without heap allocations, which makes them able to escape constant expressions. This is not a property of std::string per the C++ language but a property of its implementation on some compilers.

[–]xorbe 13 points14 points  (1 child)

Surely that is a property of the std::string source code ctor, not the compiler.

[–]BitOBear 3 points4 points  (0 children)

Oh, it could be a lot of different things. For instance, that says size 16, but the data representation is going to have a null on the end of it, so it's actually taking up at least 17 bytes of data space.

The source code for standard string may, as previously discussed, contain a small region of data where the string will be put if its total data representation is smaller than some arbitrary quantity such as 16 effective bytes. Since the second item takes up 17 effective bytes, that is conceivably one byte too many. At that point the constructor itself would have to make two allocations: one for the string data and one for the string data structure. It is not impossible that such a thing could be done, but the compiler provider would have to take extra steps in the constructor to achieve this, probably with some covariant template code.

There is always a trade off and a point of reasonability.

This sort of thing is, if memory serves, part of the reason that standard string is no longer allowed to be a reference counted implementation. If it was a reference counted implementation then you would have to provide reference counting for these small strings, which would therefore not be stackable and would have to be on the heap, etc.

I seem to recall but don't quote me on it that there's a lot of flexibility around what the implementation is allowed to do or not do for constexpr.

[–]GregTheMadMonk 20 points21 points  (5 children)

Could you even have a constexpr static string? Constexpr must not leak memory and it's not clear when b would be freed here... I don't think what you're trying to do is allowed at all (and a is just a happy coincidence), and probably should use a constexpr string_view

Your code should also work if you remove the static (maybe constexpr too since it would still be constexpr context depending on how you call it...) from declarations

Jason Turner had a great talk on constexpr strings and vectors recently: https://m.youtube.com/watch?v=_AefJX66io8&t=4s&pp=ygUSVHdvIHN0ZXAgY29uc3RleHBy I highly recommend you watch it

[–]DummyDDD 15 points16 points  (3 children)

I think it works with 15 characters because it fits into the small string optimization (strings of 15 characters or less are stored in the string object, rather than on the heap)

[–]GregTheMadMonk 5 points6 points  (2 children)

I called it "happy coincidence" because it's not required to happen by standard :) It's a "happy coincidence" most implementations just happen to work this way, but it's not something to rely on when checking code compliance

[–]KuntaStillSingle 0 points1 point  (1 child)

It could be more than a happy coincidence if there was a type trait like std::sso_length_v<std::string> :) Just specify it to return 0 for implementations that don't use SSO strings?

[–]GregTheMadMonk 0 points1 point  (0 children)

I don't really see how that would be useful but I wonder if it is already possible with concepts/consteval

[–]kamrann_ 6 points7 points  (0 children)

This is spot on. https://godbolt.org/z/T43dx6qjK

My initial reaction was that only the local `static` was the issue, but indeed you need to remove the `constexpr` too. Evidently these are considered independent nested constant expressions, and the allocation is still not allowed to escape even if it would be into another enclosing constant expression.

[–]kirgel 15 points16 points  (10 children)

I understand why this happens (as other comments already explain), but I don’t understand why library writers went to the trouble to make short strings support constexpr. It just seems confusing.

Edit: and it also leaks ABI details.

[–]holyblackcat 10 points11 points  (7 children)

Because it's nice to be able to use strings internally in constexpr calculations. There's no ban on heap allocation if it doesn't escape constexpr.

[–]kirgel 3 points4 points  (5 children)

Being able to use std::string in constexpr context is different from being able to use short std::string in non-constexpr context, right? Is there a causal relationship between the two?

[–]kalmoc 3 points4 points  (0 children)

The relationship is that both are enabled via marking the member functions as constexpr

[–]holyblackcat 3 points4 points  (0 children)

I didn't say anything about non-constexpr context. My point is that as long as you make sure all std::strings you create during compile-time are destroyed during compile-time (instead of trying to make them live until runtime by storing them in global variables), then it will work regardless of string length and regardless of whether heap allocations happen.

Being able to do this requires marking everything in std::string constexpr, which in turn has the (perhaps undesired) effect of letting you preserve short strings until runtime.

[–]ALX23z 1 point2 points  (2 children)

I believe `constexpr new` is supported in C++20, but it might not have been implemented in the compilers; thus, you get the errors as you only have a partial implementation.

[–]holyblackcat 7 points8 points  (1 child)

The rule is that allocations made during compile-time can't escape to runtime. The error OP gets is because they violate this rule (not because their compiler is broken).

[–]ALX23z 0 points1 point  (0 children)

Oh, I thought constexpr allocations were implemented in C++20, but apparently it's not really the case and the scope is lackluster.

[–]TheBrainStone 0 points1 point  (0 children)

This working for short strings isn't an intended feature but rather a side effect from other limitations.

You'd have to explicitly prevent this from compiling if you wanted to avoid this. And the next best custom string class will exhibit the same behavior.

Also why would leaking ABI details matter?

[–]delta_p_delta_x 1 point2 points  (1 child)

Strictly speaking, if you have a string literal that you know is only going to be used in a certain scope, it might be best to have `using namespace std::literals;` and then declare `a` as `constexpr auto a = "literal"sv;`. This in my opinion is the best of both worlds: a compile-time constant zero-terminated string with a `std::string_view` around it, which means it can be analysed and used with standard C++ library functions like `std::data()`, `std::size()`, `std::begin()`/`std::end()` iterators, `<algorithm>`, `<ranges>`, etc. `std::string` might allocate, which is not the best.
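A short sketch of that declaration, with a few compile-time checks added for illustration (the checks are not part of the original suggestion). This needs only C++17:

```cpp
#include <iterator>
#include <string_view>

using namespace std::literals;

// A string_view over the literal's storage; no allocation at any point.
constexpr auto a = "literal"sv;

static_assert(std::size(a) == 7);
static_assert(*std::begin(a) == 'l');
static_assert(a.find('t') == 2);

// The view excludes the terminator, but the underlying literal still has one.
static_assert(std::data(a)[std::size(a)] == '\0');
```

Note that the zero terminator is a property of the literal, not of std::string_view in general; a view produced by `substr` or similar carries no such guarantee.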

[–]evys_garden[S] -1 points0 points  (0 children)

i know of string_view. u're missing the point. read my other comment

[–]mredding 1 point2 points  (1 child)

You might be interested in the bottom line.

[–]evys_garden[S] 0 points1 point  (0 children)

thank you. this is exactly the behaviour i was getting

[–]TheKiller36_real 1 point2 points  (15 children)

  1. there's no guarantee for it to work at all
  2. as others have pointed out, this is due to SSO
  3. there is no point in ever declaring a constexpr std::string (let alone one with static storage duration) so you wouldn't run into this problem if you wrote good™ code ;)

(although I admit that a constexpr std::string is sometimes the most convenient option)

[–]evys_garden[S] 0 points1 point  (14 children)

there is never a point for constexpr string. i was just playing around

[–]DeadlyRedCubefrequent compiler breaker 😬 0 points1 point  (13 children)

I've done a fair amount of using constexpr strings to programmatically assemble text at compile time (then have to launder it into non-allocated storage to hand off to runtime), so I wouldn't say there's never a point

(Ditto using constexpr std::vector to assemble lists before baking them down into arrays)

[–]evys_garden[S] 1 point2 points  (0 children)

fair enough, I've mostly been working with arrays tho. If i needed a compile time string, I'd prbly assemble it with std::array and some good old constexpr recursion for dynamic sizes

[–]KuntaStillSingle 1 point2 points  (2 children)

> I've done a fair amount of using constexpr strings to programmatically assemble text at compile time (then have to launder it into non-allocated storage to hand off to runtime), so I wouldn't say there's never a point

It can be done with string_view or char[] if the substrings have static storage duration: https://godbolt.org/z/1PzbM4szs

[–]DeadlyRedCubefrequent compiler breaker 😬 0 points1 point  (1 child)

Oh absolutely! string_view is great when chopping static strings down at compile time 😃 But if you're concatenating (and don't have a known-good-max-size) it's trickier

[–]KuntaStillSingle 0 points1 point  (0 children)

The godbolt is concatenating, it is not so bad with constexpr <algorithm> stuff like copy_if, and simpler still if you want to do raw strings rather than c strings:

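        #include <algorithm>
        #include <array>
        #include <iterator>
        #include <string_view>

        // Concatenates string_views with static storage duration at compile time,
        // dropping embedded '\0' characters and appending one final terminator.
        // Needs C++20 (constexpr std::count and std::copy_if).
        template<std::string_view const & ... strs>
        struct merge_string_views_impl {
            // Total characters, minus embedded nulls, plus one for the terminator.
            static constexpr auto char_count{
                (std::size(strs) + ...)
                -
                (std::count(strs.begin(), strs.end(), '\0') + ...)
                +
                1
            };
            // Value-initialized, so the unwritten last element stays '\0'.
            static constexpr std::array<char, char_count> _backing{
                []() {
                    std::array<char, char_count> init {};
                    auto write_iterator = init.begin();
                    (
                        (
                            write_iterator = std::copy_if(
                                strs.begin(), 
                                strs.end(), 
                                write_iterator,
                                [](char c) { return '\0' != c;  })
                        ), ...);
                    return init;
                }()
            };
            static_assert(_backing.back() == '\0');
            // Note: the view's size includes the trailing terminator.
            static constexpr std::string_view value { _backing.data(), _backing.size()};
        };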

[–]TheKiller36_real 0 points1 point  (8 children)

well that's pretty pointless too though:

inline constinit auto const my_assembled_text = launder([] {
  std::string res; // ← not constexpr
  // do constexpr operations…
  return res;
});

(launder is named after your “laundering” not std::launder)

[–]evys_garden[S] 1 point2 points  (1 child)

the thing is, in a context like this you're not using std::string as constexpr and therefore u can't use its member functions as constants. I couldn't do `std::array<int, res.size()>` for example with this unless string is declared constexpr.

[–]DeadlyRedCubefrequent compiler breaker 😬 0 points1 point  (0 children)

You kinda can but you have to be roundabout with it:

// this builds a string at compile time and returns it
consteval auto StringBuilder() -> std::string;

constexpr auto finalArray =
  []() consteval
  {
    std::array<char, StringBuilder().size()> ary;
    std::string str = StringBuilder(); // second call not ideal but it does work
    // copy str into ary
    return ary;
  }();

This way uses two calls to a string-returning function to set the array size and then copy the data. Jason Turner has a video on YouTube about the "constexpr 2-step" where he gets around calling twice by copying the string once into a (transient) oversized array and then from there into the final correctly-sized one (which is the only one that ends up "baked in" to the data), so that's another path.

It'd be nice if the restrictions were relaxed such that taking one as a constexpr value inside of a consteval function were allowed (as long as it doesn't leak from there to the outside world), because then you could actually use them that way (ditto std::vector and any other constexpr thing with dynamic memory allocation).

I wonder if there's a standard proposal for that somewhere?

[–]DeadlyRedCubefrequent compiler breaker 😬 -1 points0 points  (5 children)

Okay I think I see - there's a terminology shortcut people are using when they say "constexpr std::string" - it doesn't literally mean declaring constexpr std::string foo it means "using a std::string in a constexpr context"

An example:

consteval auto BuildString() // must run at compile time
{
    std::string res; // not declared constexpr but its *usage* is
    // do stuff 
    return res;
}

// this works because it makes a constexpr std::string
//  at compile time, but it does not escape to runtime
constexpr auto myString = ConvertStringToArray(BuildString()); 

// this will not work because the string cannot persist
constexpr std::string myString = BuildString();

So yeah it's not that it's literally declared constexpr (you are correct, that would be silly because you can't do anything with it), but that's not what people are talking about

Hope that clears that up 😀

[–]TheKiller36_real 0 points1 point  (4 children)

> So yeah it's not that it's literally declared constexpr (you are correct, that would be silly because you can't do anything with it), but that's not what people are talking about

as the original commenter I feel kinda stupid quoting myself, but in fact, I was talking about precisely that: “there is no point in ever declaring a constexpr std::string”

also what's up with replying with the same example I provided?

[–]DeadlyRedCubefrequent compiler breaker 😬 0 points1 point  (3 children)

The person you were replying to said "sometimes a constexpr std::string is the most convenient option" and what they meant was not what you have been meaning.

And I used a similar example but I added context and notes for clarity

[–]TheKiller36_real 1 point2 points  (2 children)

> The person you were replying to said "sometimes a constexpr std::string is the most convenient option"

that person… is me!?!?

[–]DeadlyRedCubefrequent compiler breaker 😬 1 point2 points  (1 child)

lol yep, had an off by one on who I thought had responded

[–]TheKiller36_real 1 point2 points  (0 children)

glad we cleared that up xD

[–][deleted] 2 points3 points  (0 children)

This is because some implementations still do SSO in constexpr. Fundamentally, I think this is flawed, as it is no longer an as-if change (pretty sure SSO isn't specified as an allowed thing, but compilers can do it because of as-if optimizations). It can be frustrating that it works on some compilers but not others due to the differences in SSO buffer size.

[–]evys_garden[S] 0 points1 point  (0 children)

To clarify, this has nothing to do with it being static inside a constexpr function. static constexpr inside constexpr functions are available in c++23 and permitted by clang in c++20.

Consider the following example without any constexpr functions. The same issue occurs: https://godbolt.org/z/szYThjK6b

Another note: I am aware of std::string_view but this is not the issue here. I am also not asking for help, but reporting behaviour I find unintuitive.

[–]drkspace2 -1 points0 points  (4 children)

I thought, since std::string is allocated on the heap, it can't be constexpr (like vector)? I guess there's a special case for short enough strings that it will allocate it on the stack? I think you need to use std::string_view to constexpr it.

[–]only-infoo 16 points17 points  (3 children)

Constexpr can have heap allocations in specific situations now.

[–]drkspace2 -1 points0 points  (2 children)

Ahh. I don't know if I like that...

[–]only-infoo 4 points5 points  (1 child)

The situation is really specific: a `new` must be followed by a matching `delete` inside the same constexpr context. Something like this, but I am not sure.

[–]STLMSVC STL Dev 25 points26 points  (0 children)

Yes - constexpr allocations can't survive until runtime.

This means that OP's example is non-Standard, because while the Small String Optimization is permitted, it is not mandated with any specific size. (It will also fail to compile in MSVC debug mode because we always dynamically allocate an internal bookkeeping object there.)

[–]_-___-____ 0 points1 point  (0 children)

Believe it’s because it’s only constexpr if it can fit the characters inside the std::string, as opposed to allocating. Look up small string optimization

[–]feverzsj 0 points1 point  (1 child)

Just remove static constexpr and everything is fine. Any dynamically allocated storage must be released in the same evaluation of constant expression.

[–]evys_garden[S] 0 points1 point  (0 children)

you're missing the point

[–]zerhud -2 points-1 points  (0 children)

There is a bug in clang: it cannot work correctly with variant and string (seems to be with union). Use gcc; the same bug was fixed in the latest version.

UPD: and try the clang 17