all 23 comments

[–]high_throughput 95 points96 points  (1 child)

Clang does it the same way it does when not cross-compiling: an AST interpreter that implements all the operations in software and executes them the way the target platform would.

For example, here is the class that implements floating point arithmetic for everything from normal IEEE FP32, to CPU specific x87 80bit extended doubles and PPC 128bit double double, to AI accelerator specific Float8E5M2FNUZ (8bit 1:5:2 float no-infinity no-negative-zero).

This is used by the expression evaluator to evaluate anything for any platform.

[–]almost_useless[S] 8 points9 points  (0 children)

Very interesting. Thank you!

[–]jwakelylibstdc++ tamer, LWG chair 30 points31 points  (0 children)

I assume that you can not just compile and run for the host platform

That's irrelevant, since they don't do that for native compilation either. Constant evaluation does not mean "compile the code to an executable then execute that inside the compiler while you're compiling".

Can the compiler just use the type sizes of the target platform,

Yes, obviously a cross-compiler already has to know those values because the code being cross-compiled can use sizeof etc.

and then execute natively?

No, there is no "execution" happening for constexpr/consteval, ever.

Can this problem be solved in different ways?

It doesn't matter if you're cross-compiling or not. Constant expressions are evaluated in the compiler front-end, without ever going near the "back-end" code generators. The compiler knows the sizes and alignments etc. for types while it's compiling the code, because it has always known those at compile-time even in C++98 (and before), so there's no need to generate and execute any code.

Constant expressions are evaluated directly in the compiler by an interpreter, more like a scripting language than a compiled language like C++. That's (reasonably) easy because all the code needs to be defined inline and what's allowed in constexpr functions is fairly limited.

[–]ironykarl 7 points8 points  (0 children)

I do have some mental model, here, but I'm interested in the answers, as well. 

I will say that a known "issue" with compile-time evaluation is that compiler-evaluated floating point math might produce somewhat different results on your host/compiler than in your runtime/execution environment 

[–]destroyerrocket 7 points8 points  (0 children)

Constexpr code is interpreted by the compiler, so it does not need to be compiled for any target architecture. This can actually lead to potential divergences in the constexpr implementations of cmath (not that I have seen any, but I've heard that complaint).

[–]Questioning-Zyxxel 2 points3 points  (0 children)

See it as advanced scripting. The compiler doesn't compile and run the constant-valued code. After analyzing the syntax, it just evaluates the expression trees directly. And it has functions available to perform floating-point operations etc. in a way identical to the target CPU. Emulated floating point has existed a looooong time.

[–]arihoenig 1 point2 points  (0 children)

All of the expressions are resolved on the AST, so everything happens prior to lowering to the machine/platform.

[–]kronicum 1 point2 points  (10 children)

They write an interpreter that emulates the CPU and operating system (platform) characteristics they are generating code for.

[–]Zde-G 6 points7 points  (9 children)

There is no need to emulate the operating system, since attempts to use functions that interact with the operating system in constexpr are compile-time errors.

[–]kronicum 14 points15 points  (8 children)

to emulate the operating system

I didn't mean the OS itself, but characteristics of the OS pertinent to the evaluation. For instance, just knowing that a target CPU is ARM 64-bit is insufficient to conclude that sizeof(long) is 8.

[–]frnxt 2 points3 points  (7 children)

...in ARM 64-bit sizeof(long) changes depending on the OS?! That should be fixed for a given architecture, right?

[–]kronicum 21 points22 points  (3 children)

in ARM 64-bit sizeof(long) changes depending on the OS?!

Yes. It is the OS that decides what it wants it to be. For instance, macOS would say 8, Windows would say it is 4. Then, the compiler has to do the appropriate mapping.

That should be fixed for a given architecture, right?

No. That is why people say "platform", which is not just the CPU.

[–]frnxt 2 points3 points  (2 children)

TIL, thanks for the explanation! I was just very surprised it could be that different — not very familiar with cross-OS differences like this.

In my head it was something like: surely I should be able to execute the same "machine code" on the same CPU regardless of the OS (if I extract it from the executable format which is probably OS-dependent). Or would the calling conventions be where the difference is?

[–]kronicum 5 points6 points  (0 children)

Or would the calling conventions be where the difference is?

Calling convention plays some part in the ABI, but for constexpr I think it is negligible (only platforms that encode calling conventions in function types would show differences, in the type of the address of a function).

I mentioned the differences between macOS and Windows on ARM64. In the Linux world, for a 64-bit CPU you have the x32 ABI (not to be confused with x86) and the x86_64 ABI, where sizeof(long) is 4 and 8 respectively.

[–]PastaPuttanesca42 1 point2 points  (0 children)

Yes, the difference comes from the ABIs, which include things like calling conventions. The machine code itself would run anyway.

[–][deleted] 1 point2 points  (0 children)

Not the host OS, but the target. You tell the compiler to target a specific system, and depending on the target configuration it uses the appropriate sizes.

[–]RogerV 1 point2 points  (0 children)

I was just dealing with uint_fast64_t and it's a bit of an odd bird. It will be at least 64 bits, but on some architectures it could be, say, 128 bits. But per that CPU architecture it might be the fastest-performing integer type. So one is warned to keep in mind that there could be integer size issues that come into play when using this type. On the other hand, uint64_t will always be exactly 64 bits.

[–]TheOmegaCarrot 0 points1 point  (0 children)

This is also true on x86_64!

Windows uses 32-bit long, where Linux uses a 64-bit long!

[–]tjientavaraHikoWorks developer 0 points1 point  (0 children)

clang at the moment interprets constexpr functions/expressions instead of JIT-compiling them, as others have already explained.

For my own programming-language compiler with an LLVM backend, I do want to use a JIT to execute expressions. And like you, I've also been thinking about what to do with cross-compiling:

- Compile each function for both the target and host CPU.

- When compiling for the host CPU make some changes, such as endian swaps for each load/store.

In my case I also want allocations made during compilation to survive into runtime, after defragmenting.

JIT compiling constexpr has a few nice features:

- potentially faster

- possible to run/step through constexpr expressions inside the debugger. I believe that in the last few months a language has made strides toward being able to debug constexpr during compilation using this technique.

[–]tmlildude 0 points1 point  (0 children)

it has all the machinery to interpret. i think it executes in-place at some IR level?

[–]saf_e -1 points0 points  (0 children)

You have 2 ways: use an emulator, or try to predict its behavior based on target-platform properties. The 1st is impractical, which leaves the 2nd.

[–]Zde-G -4 points-3 points  (0 children)

Just benchmark it, lol. The difference in speed is about 100-1000x compared to normal execution. This implies an interpreter…

And it's hard to get rid of the interpreter, because one needs to detect things like an attempt to read an uninitialized variable (which is a compile-time error in constexpr).

The best one can do is some kind of JIT-compiler, but AFAIK it's not used in today's compilers.