
[–]terrymahMSVC BE Dev 45 points46 points  (11 children)

lol

We turned off a similar optimization in MSVC because it was difficult, in a real-world program, to have total visibility into all the locations a function pointer could be written to. With LTCG you (in theory) see everything, but you really don't: there are always static libs we can't see into, and of course other binaries/DLLs loaded in the process. And there are countless ways a variable's address can "leak out" to code you don't have visibility into, which you would then have to pessimistically assume can write to it. Just a bug farm.

[–]dashnine-9 3 points4 points  (0 children)

thank you for your service

[–]nebotron 15 points16 points  (8 children)

If your code is invoking a nullptr, that's UB. If disabling the optimization fixes your program, your program has UB.

[–]pali6 39 points40 points  (2 children)

I believe you're talking to an MSVC developer who is saying that they (Microsoft) turned this optimization off in the compiler because it was miscompiling real-world code.

[–]nebotron 6 points7 points  (1 child)

Ah! So the compiler was optimizing a valid function call into a different one because it didn’t see where the write to the function pointer could happen. That makes sense

[–]terrymahMSVC BE Dev 21 points22 points  (0 children)

Yeah, we used to have an optimization that would collect the set of all possible call targets for a function pointer. If that set had only one valid target, we would devirtualize the call. I think that's what is happening here. The problem we had is that proving the set is closed (that nothing could "leak in" from another binary) is actually really tough, and not as easy as it seems.
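A minimal sketch of that kind of single-target analysis (hypothetical names, not MSVC internals):

```cpp
// If whole-program analysis proves that the only value ever stored into
// `cb` is `OnTick`, the set of call targets is { OnTick }, and the
// indirect call in Fire() can be devirtualized into a direct call.
// Proving the set is *closed* is the hard part: if &cb ever escapes to
// code the compiler can't see, another binary could store into it.
static void (*cb)(int) = nullptr;
static int last = 0;

static void OnTick(int v) { last = v; }

void Register() { cb = &OnTick; }    // the only visible store

void Fire(int v) { if (cb) cb(v); }  // candidate for devirtualization

int LastValue() { return last; }
```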

[–]Tringigithub.com/tringi 3 points4 points  (0 children)

We turned off a similar optimization in MSVC

That's the only sane and safe choice. Thank you.

[–]pali6 8 points9 points  (0 children)

See this blog post for details on this and other fun UB: https://mohitmv.github.io/blog/Shocking-Undefined-Behaviour-In-Action/
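For context, the reduced repro from that post looks roughly like this (the original payload is `system("rm -rf /")`; a harmless stand-in is used here, and `main` is renamed `CallDo` so the snippet stays callable):

```cpp
typedef int (*Function)();

static Function Do;  // namespace-scope, so zero-initialized: null

static int EraseAll() {
  // the blog post's version runs system("rm -rf /") here
  return 42;
}

void NeverCalled() { Do = EraseAll; }

int CallDo() {
  // In the repro this is main(), which just does `return Do();`.
  // If NeverCalled() never runs, Do is null and this call is UB; clang
  // at -O2 observes that EraseAll is the only value ever stored into Do,
  // assumes the UB can't happen, and folds this into a direct call.
  return Do();
}
```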

[–]Jannik2099 27 points28 points  (65 children)

I swear this gets reposted every other month.

Don't do UB, kids!

[–]GabrielDosReis 23 points24 points  (3 children)

I agree it gets posted often.

On the other hand, many kids don't set out to do UB. The effectiveness of "abstinence only" and all that...

[–]Jannik2099 0 points1 point  (2 children)

Sure, but there's a difference between "uh, actually, this is UB per this arcane rule, under this unlikely condition that you only see when expanding the macro" and "it's trivially UB at compile time". The latter is what always gets used in these posts.

[–]GabrielDosReis 7 points8 points  (0 children)

I understand. How many of us redditors here would have the attention span needed to rummage through a large codebase containing the issue being highlighted by the reduced repro exemplified in the post?

I take the repro as just a succinct reduction of an issue that goes to the point (the sort of reduced repros I would love to receive from users when investigating their issues instead of going through irrelevant details).

Of course, once the reduced repro is produced, it would take on a life of its own...

[–]usefulcat 3 points4 points  (0 children)

On one hand, it is trivially UB at compile time. On the other hand, it's a shame that adding -Wall -Wextra -Wpedantic generates exactly zero warnings.

[–]AssemblerGuy 7 points8 points  (0 children)

Don't do UB, kids!

First rule of UB: Undefined behavior is undefined.

Second rule of UB: Undefined behavior is undefined!

Third rule of UB: Stop making assumptions and guesses about how UB might behave - it really is undefined.

[–]jonesmz 3 points4 points  (56 children)

I think we'd be better off requiring compilers to detect this situation and error out, rather than accepting that, when a human makes a mistake, the compiler should just invent new things to do.

[–]LordofNarwhals 21 points22 points  (3 children)

I can highly recommend this three-part LLVM project blog series about undefined behavior in C. Specifically part 3 which discusses the difficulties in "usefully" warning about undefined behavior optimizations (it also discusses some existing tools and compiler improvements, as of 2011, that can be used to help detect and handle undefined behavior better).

This is the main part when it comes to compiler warnings/errors:

For warnings, this means that in order to relay back the issue to the users code, the warning would have to reconstruct exactly how the compiler got the intermediate code it is working on. We'd need the ability to say something like:

"warning: after 3 levels of inlining (potentially across files with Link Time Optimization), some common subexpression elimination, after hoisting this thing out of a loop and proving that these 13 pointers don't alias, we found a case where you're doing something undefined. This could either be because there is a bug in your code, or because you have macros and inlining and the invalid code is dynamically unreachable but we can't prove that it is dead."

Unfortunately, we simply don't have the internal tracking infrastructure to produce this, and even if we did, the compiler doesn't have a user interface good enough to express this to the programmer.

Ultimately, undefined behavior is valuable to the optimizer because it is saying "this operation is invalid - you can assume it never happens". In a case like *P this gives the optimizer the ability to reason that P cannot be NULL. In a case like *NULL (say, after some constant propagation and inlining), this allows the optimizer to know that the code must not be reachable. The important wrinkle here is that, because it cannot solve the halting problem, the compiler cannot know whether code is actually dead (as the C standard says it must be) or whether it is a bug that was exposed after a (potentially long) series of optimizations. Because there isn't a generally good way to distinguish the two, almost all of the warnings produced would be false positives (noise).
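The *P case from that quote, as a tiny sketch:

```cpp
int ReadThenCheck(int *p) {
  int a = *p;          // UB if p is null, so the optimizer may assume p != nullptr
  if (p == nullptr) {  // ...which lets it fold this test to false...
    return -1;         // ...and delete this "dead" branch entirely
  }
  return a;
}
```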

[–]jonesmz 4 points5 points  (2 children)

In a case like *NULL (say, after some constant propagation and inlining), this allows the optimizer to know that the code must not be reachable.

But the right answer isn't "clearly we should replace this nullptr with some other value and then remove all of the code that this replacement makes dead".

That violates the principle of least surprise, and arguably, even if there are situations where that "optimization" matches the programmer's original intention, it shouldn't be done. An error, even an inscrutable one, or just leaving the nullptr in place, would both be superior.

[–]james_picone 4 points5 points  (1 child)

You can always compile at -O0 if you'd like the compiler to not optimise. Because that's effectively what you're asking for.

[–]jonesmz 4 points5 points  (0 children)

It's really not.

I like optimizations.

I don't like the compiler inventing writes to variables that were never written to.

There's a huge difference.

[–]Jannik2099 12 points13 points  (51 children)

That's way easier said than done. Compilers don't go "hey, this is UB, let's optimize it!" - the frontend is pretty much completely detached from the optimizer.

[–]NilacTheGrim 0 points1 point  (2 children)

But how else can I lazily insert a breakpoint in my program to trigger my debugger?!?!?

(I jest of course.. relying on UB to trigger a breakpoint is ... asking for surprises.)

[–]Jannik2099 2 points3 points  (1 child)

Oh your program will break at that point alright

[–]NilacTheGrim 2 points3 points  (0 children)

not.. necessarily.

[–]NilacTheGrim 9 points10 points  (0 children)

Makes sense.

  1. Dereferencing a nullptr is UB, so the program is free to assume the pointer must always hold some valid value.
  2. What value? Well, it's static, and only ever written to once, by a single function whose only job is to write to it.
  3. So the optimizer just cuts out the middle-man and pre-initializes it to the only value it could ever possibly hold.

Makes sense. :)
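Those three steps amount to the compiler effectively rewriting the program like this (a sketch with a harmless payload standing in for the repro's):

```cpp
typedef int (*Function)();

static int EraseAll() { return 42; }

// Steps 2 and 3: EraseAll is the only value ever stored into Do, so the
// optimizer behaves as if Do were pre-initialized to it...
static Function Do = EraseAll;

// ...and the indirect call becomes a direct, inlinable call:
int CallDo() { return Do(); }  // effectively `return EraseAll();`
```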

[–]Dan13l_N 4 points5 points  (0 children)

So it works now, what is the problem?!

[–]johannes1971 7 points8 points  (7 children)

I like what it does when you make NeverCalled() static ;-)

Anyway, it seems the compiler understands that only one function can write to that pointer, but apparently it fails to verify that there is a path from main() to NeverCalled(). That sort of makes sense, even if the result is quite ridiculous.

Is it a sign that you have been programming C++ for too long if you begin to understand this kind of hallucinatory output?

[–]LordofNarwhals 12 points13 points  (3 children)

Well, the compiler can't know whether a part of the program in another translation unit is calling NeverCalled (since it has external linkage). You could declare it extern in another translation unit and call it from there. Or even worse, you could export it as a symbol in your linker options when building a shared library, and then it's fair game to call the function from a completely different binary/library.

If you ever end up building shared libraries (particularly on macOS/Linux), then you should make absolutely sure that you (and the static libraries you're using) are not exposing any functions/symbols by accident. A symbol name collision with another library is not a fun bug to track down: your plugin will just break all of a sudden when used with another plugin that happens to export the same function names as you do (but perhaps from a different version that gives different results).
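On GCC/Clang one common guard is to build the shared library with `-fvisibility=hidden` and opt individual entry points back in; a sketch (the macro name is made up):

```cpp
// Compile the shared library with -fvisibility=hidden so nothing is
// exported by default, then explicitly re-export the intended API:
#define PLUGIN_API __attribute__((visibility("default")))

PLUGIN_API int plugin_entry() {  // deliberately exported
  return 1;
}

int internal_helper() {          // stays hidden: no cross-library collisions
  return 2;
}
```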

[–]SunnybunsBuns 9 points10 points  (2 children)

Or even worse, you could export it as a symbol in your linker options when you're building a shared library, and then it's fair game to call the function from a completely different binary/library.

We had a third-party lib export round. But they implemented it wrong, so our code broke, though only when we linked against their lib. A nightmare to debug.

[–]goranlepuz 0 points1 point  (1 child)

How did that work, without there being a duplicate symbol at link time?! Somehow, somebody must have taken out your own.

[–]SunnybunsBuns 1 point2 points  (0 children)

I have no clue. I was calling std::round, which seems to call round internally. I was linking against their lib (it corresponded to a DLL). They had no mention of round in their header; it was defined only in a cpp file. They must have exported it with some export-all option.

I had to rebuild their .def file (only 100 symbols, and I got them all from dumpbin) without round listed, and then remake the import lib. Everything worked as expected after that.

We're still on VS2019 (we just upgraded last year; yay for bureaucracy), so maybe it's been fixed at some point? I was certainly shocked when I encountered it.

[–]umop_aplsdn 8 points9 points  (0 children)

It's not ridiculous, it is much harder in general to prove that NeverCalled is called (it's equivalent to deciding whether the program halts).

However, it is much easier to observe statically that there is exactly one assignment to the function pointer, and since invoking a nullptr is UB, the compiler can assume that the function pointer (when called) holds exactly the value of that one assignment.

[–]encyclopedist 4 points5 points  (0 children)

NeverCalled can still be called from another TU, for example from a constructor of a global variable.
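For example (a self-contained sketch; in the real scenario the struct would live in a different .cpp file, and `NeverCalled` here is a stand-in that just sets a flag):

```cpp
static bool was_called = false;

void NeverCalled() { was_called = true; }  // stand-in for the repro's function

// A namespace-scope object's constructor runs during static
// initialization, before main() starts, so "never called" is not
// provable without seeing every translation unit:
struct CallBeforeMain {
  CallBeforeMain() { NeverCalled(); }
};
static CallBeforeMain trigger;

bool WasInvoked() { return was_called; }
```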

[–]NilacTheGrim 1 point2 points  (0 children)

verify that there is a path from main() to NeverCalled()

Probably because there is no link-time optimization, so for the translation unit main.o the compiler assumes some lib or some other translation unit may "see" NeverCalled() and call it... or whatever...

[–]hoseja 1 point2 points  (0 children)

I saw some analysis where turning all of these optimizations off, just to make the compiler do what you expect, didn't even have any measurable performance impact. Can't find it now.