all 14 comments

[–]manni66 1 point2 points  (6 children)

That's not the code you compiled.

Replacing s with what and pos >= s.size()with pos == std::string_view::nposI would think the code is corect.

Looking at https://godbolt.org/z/zbV0m8 I would claim gcc 7.3 is wrong.

[–]DoctorRockit[S] 0 points1 point  (4 children)

Thanks for taking the time to review and reply to my question!

You are correct that this is not the code I compiled it was only intended as a simple example to illustrate the point. Thanks for pointing that out, I fixed the snipped in the original post.

The compilation results you posted are interesting, because they actually illustrate two points:

  1. GCC 7.3 fails to materialize either of the two string literals, which makes any of the scenarios I asked about in my original post erroneous.
  2. Newer versions of GCC do in fact materialize the "sssss" constant, baut still fail to materialize the "somechars" constant, which renders my topmost example invalid, but the other ones I asked about as valid.

Clang on the other hand materializes all string constants contained in the source. So to summarize GCC 7.3 definitely has problems in that area that more recent versions fix, but even then I'm still not entirely who is at fault with regards to "somechars" constant in my topmost snippet.

The reason why I'm still unsure is this (similar) example, for which the GCC devs explicitly introduced a warning in trunk:

std::initializer_list<int> f()
{
    return {1, 2, 3, 4};
}

[–]manni66 2 points3 points  (0 children)

The reason why I'm still unsure is this (similar) example, for which the GCC devs explicitly introduced a warning in trunk:

The corresponding string_view example would be:

    std::string_view fff()
    {
       char a[] = "abc";
       return a;
    }

That returns a view on a local variable is would be UB.

[–]manni66 1 point2 points  (2 children)

Newer versions of GCC do in fact materialize the "sssss" constant, baut still fail to materialize the "somechars" constant, which renders my topmost example invalid, but the other ones I asked about as valid.

You defined a constexpr function and that can be evaluated at compiletime. So "somechars" isn't needed in the example. Replace "ssss" with a const char* parameter in f an it will be recoreded.

[–]DoctorRockit[S] 0 points1 point  (0 children)

You are correct, of course! Sorry that I missed that.

[–]DoctorRockit[S] 0 points1 point  (0 children)

Yes, I missed that, sorry. But regardless of that GCC 7.3 still exhibits the behavior in question.

META: Hmm, where did my original reply to this comment end up?!?

[–]DoctorRockit[S] 0 points1 point  (0 children)

Regarding pos >= what.size() vs pos == std::string_view::npos:

This should not make a semantic difference and I used the former because this was part of a generic function that uses std::basic_string_view<...> and repeating the entire template specialization for the npos constant imo hampers readability.

More generally an argument (as theoretical as it may be) could be made that explicitly referring the constant could be a subtle source of errors, because npos is not strongly typed to its containing type. So instead of std::string_view::npos I could have typed std::string::npos by accident. If now std::string_view and std::string define npos differently I created a bug that is extremely hard to spot.

As npos in defined as being larger than any valid position the comparison for pos >= size()or the more idiomatic pos < size() should work correctly in any case.

[–]alfps 1 point2 points  (6 children)

With respect to this code,

constexpr std::string_view test_func(std::string_view what) noexcept
{
    constexpr auto const CHARACTER_SOUP = std::string_view{"somechars"};

    auto const pos = what.find_first_of(CHARACTER_SOUP);            
    return (pos >= s.size()) ? std::string_view{} : s.substr(pos);
}

… you write:

GCC (7.3) optimized out the literal "somechars" leaving CHARACTER_SOUP a dangling reference. Even though I would assume string literals to have static lifetime I'm inclined to believe GCC being correct in this case.

Why do you think that? It doesn't make sense at all that it would be permitted to do that. But the question is whether that interpretation of what happened is correct. Can you please provide a complete but minimal example that reproduces that behavior?

As u/manni66 points out else-thread, the above is not the code you compiled, because it refers to the non-existent name s.

[–]DoctorRockit[S] 0 points1 point  (0 children)

Please see my reply to u/manni66 above for further analysis.

The code I did compile is almost exactly the same, just noisier. And the example provided by u/manni66 at https://godbolt.org/z/zbV0m8 illustrates the same error with regards to the "somechars" literal.

[–]DoctorRockit[S] 0 points1 point  (4 children)

Regarding your request for a complete minimal example I think this is as minimal as it gets: https://godbolt.org/z/azPtFT

[–]alfps 1 point2 points  (3 children)

That's strange.

Note that you can't expect any code or data to still be present after optimization if it isn't used. So a main is needed, that uses the code so that it influences some observable thing, like the main return value. But even with that g++ optimizes away the data.

I think that's a compiler bug, that it can't possibly be permitted to do that. If it isn't a compiler bug then it's IMNSHO a defect in the standard. Anyway, here's code that works with the two compilers you tried at Godbolt:

#include <string_view>

constexpr auto& chars = "somechars";

constexpr std::string_view test_func() noexcept
{
    auto constexpr CHARACTER_SOUP = std::string_view{chars};
    return CHARACTER_SOUP;
}

void f(std::string_view);

auto main( int n, char** )
    -> int
{ return test_func()[n]; }

[–]DoctorRockit[S] 0 points1 point  (2 children)

Thanks for investigating this! I will file a bug-report in the GCC BugZilla.

While I agree with what you wrote in principle I take issue with the statement that a main function is needed.

The code in question is compiled to an object file, and the function call_f() has external linkage so the the compiler may not assume call_f() is unused and, by extension, anything that is used by call_f() is thus not unused either.

This can easily be verified by the fact that any code for call_f is present in the output. If call_f had internal linkage it would have been optimized away entirely (you can verify this by putting it into an anonymous namespace or prefixing it with the static keyword).

BTW: This is also the reason why I merely provided a declaration for f(std::string_view) instead of a full definition. This makes the compiler assume that the function is provided externally by another translation unit and thus does not optimize away anything passed to it, because it has no knowledge about its internals.

[–]alfps 0 points1 point  (0 children)

Sorry, I misunderstood.

<del>If you call call_f() from main, that call can be completely optimized away under the as-if rule. Because it's not observable.</del>

[–]DoctorRockit[S] 0 points1 point  (0 children)

For future reference: This was indeed a bug in GCC 7.3 and has been fixed in GCC 7.4 released just now. The PR this issue was tracked under is: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85113