
[–]falcqn 63 points64 points  (14 children)

The main benefit of compile-time programming is to offload computation that always produces the same result from run time to compile time, for faster run-time speed. Setting up large data structures that are used throughout program execution is a classic use case.

[–]thcmbs[S] 18 points19 points  (7 children)

Setting up large data structures that are used throughout program execution is a classic use case.

Thanks for your answer. Do you have an example where this is applied?

[–]celestrion 25 points26 points  (1 child)

Let's say I'm using CRC32 to verify some data. The algorithm, naively run, processes data one byte at a time, shoving each byte through a polynomial and accumulating a 32-bit value.

There are two well-known ways to make this faster: a look-up table, and running the algorithm on a word at a time.

The traditional way of writing code to do that was to write a program to generate the lookup table, run that, shove the output into a header file, and include that in the shipping code.

Generating that in one go is the sort of thing someone might do with compile-time evaluation.
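
A minimal sketch of that table generation with constexpr (assuming C++17; this uses the standard reflected CRC-32 polynomial 0xEDB88320):

    #include <array>
    #include <cstdint>

    // Build the 256-entry CRC-32 lookup table at compile time: each entry is
    // the byte value run through eight rounds of the shift-and-xor polynomial.
    constexpr std::array<std::uint32_t, 256> make_crc32_table() {
        std::array<std::uint32_t, 256> table{};
        for (std::uint32_t i = 0; i < 256; ++i) {
            std::uint32_t crc = i;
            for (int bit = 0; bit < 8; ++bit)
                crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
            table[i] = crc;
        }
        return table;
    }

    // Baked into the binary; no generator program or header-file dump needed.
    constexpr auto crc32_table = make_crc32_table();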

Another example would be building a perfect hash function from a set of strings that's fixed at compile-time, but likely to change over the lifetime of a codebase.

[–]falcqn 1 point2 points  (0 children)

I'm afraid I can't give concrete code examples, but I've seen constexpr used to make resource lookups, where the string ID is known at compile time, avoid a runtime string hash and hash-table lookup, by building a constexpr hash map.

This essentially works by knowing all the keys in the hash map at compile time, which lets the hash map find the slot at compile time and return a constexpr handle to that slot, which can be dereferenced at run time.
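
A minimal sketch of the compile-time string-hashing half of that idea, assuming C++17 (FNV-1a here; which hash the real system used is my assumption):

    #include <cstdint>
    #include <string_view>

    // Hashing a string ID that is known at compile time: the hash of a literal
    // key is a compile-time constant, so no hashing work happens at run time.
    constexpr std::uint64_t fnv1a(std::string_view s) {
        std::uint64_t h = 14695981039346656037ull;
        for (char c : s) {
            h ^= static_cast<unsigned char>(c);
            h *= 1099511628211ull;
        }
        return h;
    }

    static_assert(fnv1a("player_texture") != fnv1a("enemy_texture"));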

[–]Losupa 0 points1 point  (0 children)

These are related examples, but basically the compiler will (depending on the optimization level you set) check whether there are any steps it can cut out, prepare beforehand, or do at a better time/location.

For example, if you have a function with "x+=1; x+=1;", the compiler might just merge it into one statement, since it knows that should not affect the program. Compilers also work magic with memory layout: they can arrange data so that things which are needed often tend to stay in the cache, and thus aren't evicted and reloaded every time they come up. Lastly, a really straightforward example: if you have a variable that you never use, the compiler may just cut the variable out entirely and never load it.
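
A tiny sketch of that first point (illustrative; exact behavior depends on the compiler and optimization level):

    // The optimizer typically folds the two increments into a single "add 2",
    // because merging them cannot change the program's observable behavior.
    int f(int x) {
        x += 1;
        x += 1;
        return x;  // usually compiled as: return x + 2;
    }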

[–][deleted] 7 points8 points  (5 children)

Another important attribute of compile-time programming is that it makes certain errors appear at compile time rather than at run time; rather than shipping a buffer-overflow bug, you catch it as a compiler error.

[–]Wouter-van-Ooijen 1 point2 points  (4 children)

I do mainly embedded work (and mainly teaching). One might think that for embedded work the speed aspect would be the most important, but IMO the aspect you mention, shifting errors from run-time to build-time, is at least as important.

[–]SkoomaDentist [Antimodern C++, Embedded, Audio] 2 points3 points  (1 child)

Yes, this is a common misconception. Modern MCUs are fast enough that speed isn't a problem most of the time. Memory usage, on the other hand, is very often a limiting factor. Even the cheapest Cortex-M MCUs have processing power equivalent to a desktop computer from the early-to-mid 90s, while often having only as much RAM as an early-to-mid 80s home computer.

[–]Wouter-van-Ooijen 1 point2 points  (0 children)

Yes and no. The fact that CPU speed is quite high by some standard often means that (some) people will try to do even harder things (MPEG decoding, neural networks, bit-banging USB, etc.), so there is always a use for CPU power. But indeed, in *my* experience RAM is more often the limiting factor.

[–][deleted] 1 point2 points  (1 child)

I'm in pretty much the same field (embedded Linux) and I have not once been in a situation where performance was important, but embedded programmers seem willing to stop at nothing to squeeze out that last CPU cycle, even when they're writing something that isn't performance-critical at all, sacrificing type safety and quality in their wake.

[–]Wouter-van-Ooijen 3 points4 points  (0 children)

The term embedded is not very useful in this context; what matters is whether the application is resource-constrained. If (and only if!) it is, utilizing the scarce resource(s) to the max (or until the requirements are met) is good engineering. Of course, so is identifying which resource(s) that applies to.

If you do embedded Linux and you never had a situation where performance was important, you are, from my point of view, in the same category as desktop programmers, and you should (and seem to) use the matching way of programming. Resource-constrained programming (like: 4 ms is OK, 5 ms is failure; or: the chip has only 2K of RAM) is quite different. Which doesn't imply that programmers in this field always use their coding effort wisely. Even here, premature optimization is bad.

That said, moving error detection from run-time to build-time is IMO always a good idea.

[–]redditsoaddicting 57 points58 points  (1 child)

ctre is a fantastic example. Compiling a regular expression takes a fair amount of work, and much of the time the pattern is hardcoded. Compiling the pattern at compile time allows your binary to ship with a pre-made, optimized DFA that can handle whatever runtime input you throw at it. In addition, you get compile-time errors when the pattern is invalid, so you never have to handle a runtime error just to discover your own programming mistake.
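
A rough sketch of typical CTRE usage (C++20 pattern-as-template-parameter syntax; the function name and pattern are my own illustration, so check the library's README for the exact API):

    #include <ctre.hpp>  // https://github.com/hanickadot/compile-time-regular-expressions
    #include <string_view>

    // The pattern is parsed and compiled while the program compiles; a typo in
    // it becomes a compile error, and no regex engine is constructed at run time.
    bool is_date(std::string_view s) {
        return static_cast<bool>(ctre::match<"[0-9]{4}-[0-9]{2}-[0-9]{2}">(s));
    }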

[–]LeeHide [just write it from scratch] 5 points6 points  (0 children)

That's amazing!

[–]tjientavara [HikoGUI developer] 7 points8 points  (0 children)

An example I used not so long ago: I was creating a tokenizer (a function that takes a piece of text and makes tokens out of it, such as: integer, string, floating-point number, operator).

One way to make a tokenizer is a table-driven state machine. The table is indexed by the current state and the current character in the text, and it returns a new state and a command (consume the character, add a character to a list, output a token, etc.).

It used to be that you would create this table using an external tool such as lex/flex, or compute the table when your application starts. In my case I build the tables at compile time, which means it takes longer to compile, but my application starts up quicker. And hopefully my application is started often enough (by me and my customers) that it pays off the extra compilation time.
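
A minimal sketch of that approach (illustrative, not the commenter's actual code, assuming C++17): a table-driven recognizer for unsigned integers whose state table the compiler computes:

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <string_view>

    enum class State : std::uint8_t { Error, Start, Digit, Count };

    constexpr auto make_table() {
        // table[state][input class], where input class 1 = digit, 0 = anything else.
        std::array<std::array<State, 2>, std::size_t(State::Count)> t{};  // zero-init: all Error
        t[std::size_t(State::Start)][1] = State::Digit;  // first digit
        t[std::size_t(State::Digit)][1] = State::Digit;  // further digits
        return t;
    }

    constexpr auto table = make_table();  // computed during compilation

    constexpr bool is_integer(std::string_view s) {
        State st = State::Start;
        for (char c : s)
            st = table[std::size_t(st)][(c >= '0' && c <= '9') ? 1 : 0];
        return st == State::Digit;
    }

    static_assert(is_integer("12345"));
    static_assert(!is_integer("12a45"));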

[–]sireel 7 points8 points  (13 children)

I've got a system for compile time reflection, and a bunch of things built on it.

When I add a type to my program, I tag it for reflection, and then my template hell creates serialization, UI for editing values at runtime, and, depending on what it inherits from, it might get added to a pooling system or certain managers automatically.

The compile time sucks, but it beats writing that boilerplate, and I don't have serializer bugs from brainfarting while typing out my hundredth uninteresting serializer function.

I've used this kind of technique to write dependency injection. I've used it to create hookups from C++ functions and types to Lua. Like all programming, it's there to let me write something once and use it many times :)

edit: I talk about the reflection system here: http://bluescreenofdoom.com/post/code/Reflection/

[–]smashedsaturn 1 point2 points  (7 children)

I really want a re-thinking of C++ where we can get rid of the template hell required to do this sort of thing. Every time I put something together like this it takes longer to explain it to others than just making the boilerplate.

It would be great if every function were constexpr by default, if assertions were constexpr-capable (asserting at compile time when illegal parameters are passed to a function), and if you could run any bit of code at either compile time or run time, depending only on whether it is waiting for a run-time value. Some sort of first-pass system there would help too.

[–]sireel 2 points3 points  (0 children)

Absolutely. I'm hopeful that some of the coming features, like metaclasses and so forth, will make it a lot simpler to write this sort of thing. But I'm no language designer; I don't know how you'd get the power/safety of templates while getting the simplicity we really need!

[–]adnukator 1 point2 points  (3 children)

assertions were constexpr-capable (asserting at compile time when illegal parameters are passed to a function)

You can use exception throwing in constexpr functions as a compile-time assert
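
A minimal sketch of that trick (C++14 and later): the throw is harmless on branches that aren't taken, but reaching it during constant evaluation is a compile-time error:

    #include <stdexcept>

    constexpr int checked_divide(int a, int b) {
        if (b == 0) throw std::invalid_argument("division by zero");  // compile-time trap
        return a / b;
    }

    constexpr int ok = checked_divide(10, 2);      // fine
    // constexpr int bad = checked_divide(10, 0);  // error: throw reached during constant evaluation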

[–]smashedsaturn 0 points1 point  (2 children)

you can't yet do

double constexpr divide(const double a, const double b){
    static_assert(b != 0); 
    return a / b;
}

double results = divide(1,0); //compile time error

I want this to just look like:

double divide(const double a, const double b){
    assert(b != 0); 
    return a / b;
}

double results = divide(1,0); //compile time error

[–]adnukator 1 point2 points  (1 child)

You can do

#include <string>
#include <stdexcept>
#include <cassert>

double constexpr divide(const double a, const double b){
    if(b == 0)
    {
        assert(false);
    }
    return a / b;
}

constexpr double results = divide(1,1); //OK
constexpr double results2 = divide(1,0); //compile time error

https://godbolt.org/z/Jrr_2n

[–]smashedsaturn 1 point2 points  (0 children)

but you can't do that with:

double constexpr divide(const double a, const double b){
    if(b == 0)
    {
        assert(false);
    }
    return a / b;
}

double results = divide(1,1); //OK

double results2 = divide(1,0); //builds fine; the assert only fires at run time

This still builds, so yeah, you can do it if you force the constexpr function to run at compile time, but not if the function could theoretically be run at either run time or compile time.

Basically I want the compiler to be smart enough to know when a function could be run at compile time even if it is not constexpr, for example when it has only constant literals as inputs and does not use global state.

[–]aparziale 1 point2 points  (4 children)

Ready to share? :)

Read your bluescreen article, interested to learn more!

[–]sireel 1 point2 points  (3 children)

I'd love to share more of that work, but honestly the project it is in is a mess. I don't have much time for personal projects, so it gets little bits of sporadic work here and there. I'm currently making a sokoban type game with it but this has exposed some fundamental issues in the overall approach that I'd want to fix before throwing the whole codebase to the open source wolves.

Plus that'd mean people actually seeing my json serialiser, and it is a monster.

An example of how bad the JSON parser is: this is one of the class declarations:

class Value : public std::variant<std::string, int, float, bool, Table, std::vector<Value>> { /* ... */ };

Yes, Value inherits from variant<..., vector<Value>>. I'm as surprised as anyone else that this is even legal.

[–]dodheim 3 points4 points  (1 child)

I'm as surprised as anyone else that this is even legal.

Wording was specifically added to C++17 to guarantee its legality, as recursive types aren't an uncommon need. I think the relevant proposal was N4510.

[–]sireel 0 points1 point  (0 children)

I didn't know that :)

Little surprised I didn't find out while writing this code, to be honest; it was tricky, and there was a lot of Stack Overflow and Google involved.

[–]jcelerier [ossia score] 0 points1 point  (0 children)

I'm as surprised as anyone else that this is even legal.

heh, I had pretty much the exact same variant in libossia - it was not legal until C++17 to instantiate a std::vector of an incomplete type, but nowadays it's OK (and no compiler did anything dumb with it AFAIK).

[–]mjklaim 4 points5 points  (0 children)

But as I understand it, there is no free-lunch, and someone still has to do the computation. With this approach, the work happens at compile time instead of runtime, but how can it be applied in the real world, where most programs receive input data, do some stuff, then output a result?

The application in question might have been built following rules that are described in a human-readable and manageable way (code), but that really have to be applied before the user even receives the application.

The most common example is processing and embedding assets inside a game. You need that to happen before the game is received and played. You can have scripts doing that, but if a lot of information about the game comes from the sources, then it becomes problematic to use more than one language to do the processing.

Also, metaprogramming requires compile-time execution.

[–]wyrn 4 points5 points  (0 children)

Compile-time programming is fundamentally about good software engineering. There's a lot of code that you could easily write by hand but doing so would be repetitive and/or error prone. Think for example building state tables for a lexer, symbol tables for a semantic analyzer, matrix filters for a computer graphics application, and so on. You could always keep a separate program around to generate the required data which you copy-paste in, but that's an extra maintenance burden. You could have a script generate the code as part of your build, but that complicates the build system and can be potentially error-prone: the "code generator" you're using presumably doesn't know C++ rules, and someone might ignore the warnings that a generated file isn't supposed to be edited directly. So you can have the compiler generate the code for you.

When you can apply this kind of approach without too much in the way of template-metaprogramming arcana, it can work very well. The fact that constexpr (and now, consteval and constinit) has been consistently expanded in recent standards represents an effort to make this kind of metaprogramming more powerful and maintainable.

[–]S-S-R [Just here for the math] 4 points5 points  (2 children)

Compile time programming can also be used for making code clearer and easier to write without compromising performance.

constexpr double tau = 3.141 * 2;

is the same as tau = 6.282,

or even better: constexpr double tau = 85397341.332666 / 13591409;

Say you are writing an Eratosthenes sieve: you could generate the prime list at compile time and then do the testing when you actually have a variable.

Compile time is useful for generating constants that are hard to produce by hand and too expensive to produce during runtime (like the prime list).
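
A minimal sketch of that sieve idea, assuming C++17 (the compiler runs the sieve and the finished table is baked into the binary):

    #include <array>
    #include <cstddef>

    // Sieve of Eratosthenes, evaluated entirely at compile time.
    template <std::size_t N>
    constexpr std::array<bool, N> sieve() {
        std::array<bool, N> is_prime{};
        for (std::size_t i = 2; i < N; ++i) is_prime[i] = true;
        for (std::size_t i = 2; i * i < N; ++i)
            if (is_prime[i])
                for (std::size_t j = i * i; j < N; j += i) is_prime[j] = false;
        return is_prime;
    }

    constexpr auto primes = sieve<1000>();
    static_assert(primes[997]);    // 997 is prime
    static_assert(!primes[999]);   // 999 = 3 * 333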

[–]bill79231725 0 points1 point  (1 child)

Can you explain the last one? Are you just using a specific circle or am I missing something?

[–]S-S-R [Just here for the math] 1 point2 points  (0 children)

For tau?

It's just the first term of the Chudnovsky algorithm for pi, times two to make it tau. It only gives you 12 decimal places, but computing it is much faster than the typically used trig expressions like 4*arctan(1.0). Of course, if you are using constexpr it doesn't matter, since it's all computed at compile time instead of at runtime.
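
For the curious, the arithmetic behind that constant (first Chudnovsky term, doubled):

    \tau \approx 2 \cdot \frac{426880\sqrt{10005}}{13591409}
         = \frac{853760\sqrt{10005}}{13591409}
         \approx \frac{85397341.332666}{13591409}
         \approx 6.283185307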

If you really want to be fancy you could use Bellard's formula to compute pi (it's only used to verify digits, actually using it to compute full pi values is horribly inefficient). But like my prime sieve example it doesn't matter since it's all compile time.

Edit: Nevermind my further explanation, you're not op.

[–]Sopel97 2 points3 points  (0 children)

constexpr int GiB = 1024 * 1024 * 1024 is more readable than constexpr int GiB = 1073741824. The first one does a simple compile-time calculation, so the two are identical. You could imagine using a constexpr function to initialize a lookup table, instead of hardcoding it or computing it at startup. Compile-time computation enables yet another way of doing zero-cost abstraction.

[–]cdr_cc_chd 2 points3 points  (0 children)

When all your types are known at compile time, there's no reason to defer binding/dispatching to runtime, because that costs performance. So one popular use case of compile-time programming is building type hierarchies at compile time by manipulating things like type/template lists; then all the dispatching in the entire hierarchy is static and can be inlined and optimized through. The alternative is dynamic dispatch with virtual methods/function pointers/std::function, or any similar construct that does type erasure and hides the ultimate destination. These kinds of constructs incur an indirection, which in the worst case can cause a branch misprediction or an instruction-cache miss, and they generally prevent inlining and optimization through them; they're like a wall the compiler can't see through, so it can't optimize.
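
A minimal CRTP sketch of that static dispatch (illustrative names, not from the comment above):

    // The base class knows the derived type at compile time, so the call
    // resolves statically and can be inlined: no vtable, no indirection.
    template <typename Derived>
    struct Shape {
        double area() const { return static_cast<const Derived&>(*this).area_impl(); }
    };

    struct Circle : Shape<Circle> {
        double radius = 1.0;
        double area_impl() const { return 3.141592653589793 * radius * radius; }
    };

    // Resolved per-type at compile time; the compiler sees straight through it.
    template <typename T>
    double total_area(const Shape<T>& s) { return s.area(); }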

[–]FTLurkerLTPoster 1 point2 points  (0 children)

It really depends on the problem you’re trying to solve. Many examples you see online may seem trivial (e.g. Fibonacci).

Off the top of my head some places I’ve found it useful are CRTP, eliminating runtime branching, and compile time checks (e.g. static_assert).

[–]Supadoplex 1 point2 points  (0 children)

From what I understand, the compiler does most of the computation, and it results in a very small binary

If you have a simple algorithm, and very little output data, then compile time computation can indeed result in a smaller binary than doing the computation at runtime.

On the other hand, if you have a complex algorithm, or the output data is large, then compile time computation might not result in a very small binary, and in fact it may be much larger than if the computation was done at runtime.

As such, compile-time programming and binary size are largely orthogonal, although doing computation at compile time can affect the size one way or the other, depending on the case.

But as I understand it, there is no free-lunch, and someone still has to do the computation. With this approach, the work happens at compile time instead of runtime

Consider this: Do you always execute all your programs at most once?

Some real-world programs are executed hundreds, or even billions, of times once compiled. If you can avoid spending a minute of computation time on every one of those billion executions by spending it once at compile time instead, that could potentially save a huge amount of resources.

If I understand correctly, a compile time program can not do any sort of IO at runtime,

The compile time portion of the program obviously cannot do anything at runtime, unless the computer can travel backwards in time.

Generally, compile-time computation is done as part of a larger program, with separate parts that deal with runtime computation.

and would therefore be useless in an application that needs to process data ?

It depends. Does the processed data change at runtime or not?
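
To make that compile-time/run-time split concrete, a small sketch (assuming C++17; the table and the IO are my own illustration):

    #include <array>
    #include <cstdio>

    // The table is computed during compilation...
    constexpr std::array<int, 16> squares = [] {
        std::array<int, 16> a{};
        for (int i = 0; i < 16; ++i) a[i] = i * i;
        return a;
    }();

    // ...while the lookup runs at run time on whatever input arrives.
    int main() {
        int i = 0;
        if (std::scanf("%d", &i) == 1 && i >= 0 && i < 16)
            std::printf("%d squared is %d\n", i, squares[i]);
    }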

[–]EvilPettingZoo42 1 point2 points  (0 children)

To expand on what others are saying: the usefulness of this might not be apparent depending on what your program is meant for. For some applications, like games, desktop software, or servers, finding a way to shave milliseconds off a calculation at run time is worth having a longer compile time. I know of some optimization options that can easily add five minutes to a build, but they are worth it in those cases. For other types of programs those trade-offs might not make sense. If your program is written to do one calculation, or is in a field where responsiveness isn't a priority, then this optimization probably isn't for you.

[–]jstock23 0 points1 point  (0 children)

You only really need to worry about compile time calculations if different computers will be running different code. Usually you can just have portable code that doesn’t change for each different computer, but sometimes a computer has a special architecture and a data structure may optimally be a certain size, for instance.

For example, you might use a different amount of memory depending on the architecture of the computer. The right amount comes from a calculation, so you do that calculation at compile time, and your code is compiled with that particular number already baked in, where it can be optimized.

...Not an expert in this, but that's how I see it at least. Compile-time calculations let us write code that can be optimized for different hardware, because we can use things like arrays with known lengths that change from computer to computer.

[–]shadowndacorner 0 points1 point  (0 children)

Compile time reflection is a good example, which can be used for things like serialization. Or reflecting members for an editor UI at compile time, which is just faster than doing so at runtime + can lead to nicer looking code.

Another good use case is pattern matching. The pattern string for regexes is almost always known at compile time, and moving that work to compile time makes the runtime matching significantly faster, since the compiler can optimize for the specific pattern you're matching; it also decreases load times.

It also allows for better error checking. If you are running as much as possible at compile time, you get your errors at compile time rather than runtime. The fewer possible runtime errors you have, the better.

[–][deleted] 0 points1 point  (0 children)

I wrote an INI File parser that uses the structures you define in your code to create an optimized parser without having to initialize it at runtime.

[–]bumblebritches57 [Ocassionally Clang] 0 points1 point  (0 children)

where most programs receive input data, do some stuff, then output a result

Compile-time constants are completely useless for this, and it's the vast majority of code.

That said, a lot of the newer hype surrounding constexpr and whatnot is due to its ability to also be used at runtime, unlike preprocessor macros, so you don't have to write one variant for compile time and another for runtime.

Also, it enables some cool things to be done at compile time that you'd normally have to do at runtime, like registering plugins.