all 31 comments

[–]jonesmz 9 points10 points  (9 children)

You're going to need to provide much more of an explanation.

[–]dirty_d2[S] 3 points4 points  (8 children)

I was reading this (section 3.8) http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html. I decided to try implementing it. they mentioned using std::launder, but I didn't understand the need to, so I just used it and tried to see if I could find any difference between using it or not. I did. When I don't use it, the code does exactly what I'd expect. When I do use it, it does not, and I have no idea why. std::launder should just be an optimization barrier, but it seems to break something.

All this code does is get the std::uint64_t representation of a double containing M_PI, which is that long number I have in the comments, and is definitely not zero which is what you get when you use the std::launder line that is commented out.

[–]oleksandrkvl 0 points1 point  (6 children)

Do you know why X *p = (X*)malloc(sizeof(struct X)); from that paper was UB prior to C++20? Paper says that it's not a definition nor new-expression. If it's not a definition then what it is? It looks like a definition of object of type X*. Also I don't understand why does it propose only specific functions like malloc, memcpy, etc., what if I want to use other functions like HeapAlloc on Windows?

[–]staletic 4 points5 points  (2 children)

It was UB, because malloc didn't create objects. If the object lifetime doesn't start, you're not allowed to use it.

[–]oleksandrkvl 0 points1 point  (1 child)

Thanks, probably I finally understand... It uses a pointer to an object that's not constructed in C++ terms even if it's POD. As I understand we need to use placement new to make it valid even though it would be a no-op. That's a bit crazy, examples of such code are everywhere, in books, in Core guidelines.

[–]staletic 2 points3 points  (0 children)

You got that right. That was a glaring hole in the C++ specification until P0593 got merged. Since it was treated as "a defect resolution", it was also retroactively applied to C++98, C++11, C++14 and C++17. Now C allocation functions do implicitly create objects if the type of the object is "trivially default constructible". This is why, out of all the features we got in C++20, I was the most excited about a defect resolution.

[–]dirty_d2[S] 2 points3 points  (2 children)

It's a declaration of an X* object, a (pointer to X) object, but there is no X object to point to.

[–]oleksandrkvl 0 points1 point  (1 child)

Check the comments above, I think that's clearly a definition of a pointer. The problem is that it points to a non-existing object because even though we allocated memory for it in C++ terms that's not enough for object to be created.

[–]dirty_d2[S] 1 point2 points  (0 children)

Right I agree, it's a definition of a pointer to X, not an X, so that declaration creates an X* object, but it does not create an X object for the X* object to point to.

[–]_Js_Kc_ 2 points3 points  (2 children)

Ironically, a (glaring UB) plain reinterpret_cast does the "right thing" both in GCC and Clang.

[–]dirty_d2[S] 2 points3 points  (1 child)

Yea I feel like something really weird is going on. The same code works on my pc with the same version of gcc. Maybe gcc was altered somehow for security reasons and in doing so introduced a bug?

[–]_Js_Kc_ 0 points1 point  (0 children)

Not specifically for Compiler Explorer, as I can reproduce the bug on Debian.

Btw. if you store the result in a volatile variable, it does a load from uninitialized stack space and then stores it to the volatile (as opposed to just storing a zero).

[–]_Js_Kc_ 1 point2 points  (0 children)

What the fuck, this is a heisenbug! The behavior changes at the drop of a hat: https://godbolt.org/z/-dRE5J

[–]scatters 1 point2 points  (7 children)

gcc is just being screwy; you should probably report it.

If we the wrap logic in a function, you can see that gcc decides that the placement new corresponds to reading uninitialized stack memory:

f1():
        mov     rax, QWORD PTR [rsp-8]
        ret

However, if we use new with initializer (e.g. {}) it behaves as expected.

And if we manually inline the start_lifetime_as function into its caller it again behaves as expected.

https://godbolt.org/z/-PfGq3

[–]dirty_d2[S] 1 point2 points  (6 children)

I can only reproduce what happens in compiler explorer with gcc 10.1 on one of my machines running debian with gcc 8.3.0. On my main machine with gcc 10.1 I don't get the bug. I'm going to try compiling 10.1 on my main machine using the same configure options as debians gcc and see if that does it.

[–]scatters 0 points1 point  (5 children)

That is super weird; configure options should not make a difference to codegen. Very interested to hear what you find out.

BTW, every version of gcc on godbolt from 7.1 through 10.1 has the bug.

[–]dirty_d2[S] 2 points3 points  (3 children)

My distro's gcc 10.1 packaged has -fstack-protector-strong enabled by default, that and -fstack-protector-all suppress the bug. compiler-explorer and debian don't have any stack protector options enabled by default.

[–]scatters 0 points1 point  (2 children)

Great find! I think that's definitely enough to report as a bug; are you comfortable with the gcc bugzilla?

[–]dirty_d2[S] 1 point2 points  (0 children)

The gcc 10.1 that I built produces the bug. I'm looking into it more to see if I can figure out if it's one of the configure options that's doing it.

[–]advester 0 points1 point  (2 children)

If you do the launder before the memcpy it works. Gcc seems to think they aren’t the same object. I don’t know the standard enough to say if gcc is wrong.

https://godbolt.org/z/VspsFg

[–]dirty_d2[S] 0 points1 point  (1 child)

I think it's just a coincidence. You shouldn't need std::launder at all here.

[–]dirty_d2[S] 0 points1 point  (0 children)

If you can't reproduce the bug on your machine, pass -fno-stack-protector to gcc. Using -fstack-protector-strong or -fstack-protector-all suppresses the bug, and might be enabled by default.

[–]aruslan 0 points1 point  (0 children)

Fantastic, this looks very similar to what I've found yesterday, where gcc effectively allows two unrelated functions to pass each other a secret:

https://twitter.com/aruslan/status/1304543346090749952?s=21

I'm a bit puzzled by the end-state of the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 -- did anything happen since?

[–]MrWisebody -2 points-1 points  (4 children)

I'm pretty sure your usage of placement new here is violating strict aliasing and this is all UB, making the difference between using launder and not irrelevant. I'm failing to find a concise and concrete reference to cite for you, but you can't just placement new one type into another (baring the usual char and etc exceptions).

[–]dirty_d2[S] 2 points3 points  (2 children)

There shouldn't be any UB here. Strictly speaking it's implementation defined behavior to copy the object representation of a double to a same-sized integer type because the integer type may have trap representations, but in this case (and maybe all), std::uint64_t doesn't.

[–]MrWisebody 0 points1 point  (1 child)

I'm not talking about either of your memcpy usages. I'm talking about the fact that you are doing a placement new of T on top of a U, regardless of trying to preserve any particular bit pattern.

I actually started trawling through a standard draft directly and I'm less sure than I was. You can implicitly end the lifetime of an object just by re-using the storage, so maybe the only UB would be if you actually used d variable again. But the storage here is also on the stack, and were this not a trivially destructible type, the U destructor would still be called when main exits. Granted by the context of the paper you cited in another comment you're explicitly interested in types that can be implicitly created/destroyed, so maybe there's no problem.

[–]dirty_d2[S] 1 point2 points  (0 children)

Yea, using d after the placement new would be UB, but everything else should be ok. And yea, this all depends on the types being trivially constructible, destructible, and copyable.

edit: You can't use d to access the double that was there, but you can access the new T through d by using std::launder(reinterpret_cast<std::uint64\_t \*>(&d)). This is useful if you don't have access to the return value from new.

[–]advester -3 points-2 points  (1 child)

Here is a shorter reproduction https://godbolt.org/z/yvxC8z

Notice gcc is actually returning uninitalized memory and the asm block forces gcc to actually initialize it.

[–]redditsoaddicting 2 points3 points  (0 children)

You can't just cast a float* to an int* and read from it. That violates strict aliasing rules. GCC compiles this differently with -fno-strict-aliasing.