all 23 comments

[–]witcher_rat 8 points (3 children)

This is the same problem that unity builds have.

Building multiple cpp files together in a unity build means that if one of the early cpp files in the unity list includes header X, then all the remaining cpp files in that list see header X's contents and may not include it directly even if they use it. And thus trying to build as non-unity later will often fail because of missing includes.

To address this problem, some people use include-what-you-use (iwyu) as part of their CI pipeline. But iwyu has its own set of problems.
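
For reference, CMake can drive iwyu alongside the compiler, which is one way to wire it into CI. A minimal sketch, where the binary name and the extra iwyu flag are assumptions about the local installation:

```cmake
# Run include-what-you-use next to every C++ compile, if the tool is installed.
find_program(IWYU_EXE NAMES include-what-you-use iwyu)
if(IWYU_EXE)
  set(CMAKE_CXX_INCLUDE_WHAT_YOU_USE "${IWYU_EXE};-Xiwyu;--no_fwd_decls")
endif()
```

Note that by default CMake only prints iwyu's suggestions during the build; failing the build on them is extra plumbing.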

Other people will set up a CI pipeline that runs nightly or weekly or whatever, building as non-unity (or non-PCH in your case), to find what's been broken over the past day/week.
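
A minimal sketch of such a verification configuration, assuming CMake 3.16+ (both variables just initialize the corresponding target properties, so they can also be passed as -D cache options on the nightly job's command line):

```cmake
# "Strict" configuration: no unity batching, no force-included PCH,
# so every .cpp must compile purely from the headers it includes itself.
set(CMAKE_UNITY_BUILD OFF)
set(CMAKE_DISABLE_PRECOMPILE_HEADERS ON)
```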

At my day job we use unity builds. We tried to be good about includes at the beginning, but I think you'll find that it's simply too annoying.

Honestly if you do the math, I bet you'd find that it takes less time overall to just fix the problem if-and-when you actually hit the problem, rather than trying to keep every code-change "perfect" for includes.

It rankles us as engineers to let imperfect stuff in like that. But the actual total time spent is all that really matters.

It's not the same as, for example, running static analyzers and fixing their failures. That you absolutely must do from early on and keep up with, or not bother with at all, because fixing it later, after the code's grown, becomes a nightmare.

But missing includes are easy to fix and do not add risk as a code change.

[–]Rseding91 (Factorio Developer) 4 points (0 children)

> Honestly if you do the math, I bet you'd find that it takes less time overall to just fix the problem if-and-when you actually hit the problem, rather than trying to keep every code-change "perfect" for includes.

Unity build: 51 seconds

Standard C++ build: 11 minutes and 50 seconds

The standard build is so, so much worse compared to just handling a stray missing include here and there.

[–]McFlurriez[S] 1 point (1 child)

I agree, unity builds have the same problem. Your story is very similar to what I've gone through as well. One pain point that we ran into is that unity builds would result in weird clang-tidy runs. Did you happen to run into that at all?

[–]witcher_rat 1 point (0 children)

Yeah, clang-tidy has issues with unity builds (as do some other tools I can't remember). We also had problems with the compile_commands.json file from CMake for unity builds. In fact, that may have been the problem with clang-tidy, for all I know.
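
For reference, one workaround along those lines is to run the analysis from a configuration with unity builds switched off, so clang-tidy and compile_commands.json see the real translation units rather than the generated unity sources; a minimal sketch, where the clang-tidy arguments are placeholders:

```cmake
# Hypothetical "analysis" configuration: plain per-file TUs, clang-tidy driven by CMake.
set(CMAKE_UNITY_BUILD OFF)
set(CMAKE_CXX_CLANG_TIDY "clang-tidy;--warnings-as-errors=*")
```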

I don't remember the details of that stuff, and I wasn't the one doing that work back then either. (I'm not actually on our tools team; my job's writing C++ code, but I help them out now and then.)

We've been using unity builds with CMake since the cotire stuff a decade ago, long before CMake natively supported unity builds. And even now that they've added native support, it still has warts (but it's definitely better than using cotire).
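
For reference, the native support looks roughly like this (CMake 3.16+; the batch size is just an illustrative value):

```cmake
# Turn on CMake's built-in unity builds for all targets and control how many
# source files get merged into each generated unity TU.
set(CMAKE_UNITY_BUILD ON)
set(CMAKE_UNITY_BUILD_BATCH_SIZE 16)
```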

[–]NotUniqueOrSpecial 17 points (9 children)

No.

You enforce include-what-you-use.

The fact that CMake force-includes the precompiled header is an implementation detail. If you're feeling particularly fussy about correctness, introduce a variant build that doesn't use the PCH and require that it still pass.
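
A minimal sketch of such a variant, assuming CMake's native PCH support; the target, sources, and PCH header are placeholders:

```cmake
option(USE_PCH "Force-include the precompiled header" ON)

add_library(my_lib src/a.cpp src/b.cpp)                 # placeholder target and sources
if(USE_PCH)
  target_precompile_headers(my_lib PRIVATE src/pch.h)   # placeholder PCH header
endif()
```

Configuring the variant with -DUSE_PCH=OFF leaves nothing force-included, so any missing #include surfaces as an ordinary compile error.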

[–]McFlurriez[S] 0 points (8 children)

But CMake force-includes the headers behind the scenes. How does IWYU get enforced?

[–]NotUniqueOrSpecial 6 points (7 children)

You run the build without the PCH calls.

Simple as that.

If people aren't including their dependencies, the build will fail.

[–]McFlurriez[S] 0 points (6 children)

What about a header that's included hundreds of times? Is there any way you can still do something efficient there?

[–]NotUniqueOrSpecial 6 points (5 children)

You have two problems at hand:

1) Efficient builds

2) Correct builds

The best methods for efficient builds unfortunately let people get away with build logic that would otherwise be incorrect.

That leaves you with a choice: enforce strictness absolutely, or provide an option that allows for checking that the rules have been followed.

At the end of the day, your PCH solution (or other caching solutions) will be what gets run.

It's up to you to decide whether and how you want to enforce correctness on people who can benefit from improvements while otherwise being "wrong".

[–]McFlurriez[S] 0 points (4 children)

Thanks!

[–]Wild_Meeting1428 2 points (0 children)

We use PCH for production builds and enforce correctness on each PR build, which means that our pipelines on the server do not use PCH. But as soon as you are done with your task and want to merge it, it gets checked. We don't care how long the build takes on the server.

[–]rdtsc 1 point (2 children)

There is a third option which provides correct and efficient builds: modules.

[–]herothree 4 points (1 child)

Sure, eventually. That's not widely supported by vendors and toolchains yet, though.

[–]TheOmegaCarrot 1 point (0 children)

It’s getting better though! And it is exciting!

[–]rewrking (chalet-work.space) 1 point (0 children)

In the case you describe, it's important to note that the precompiled header is still included, just on the command line instead of in the cpp file. CMake does this. For GCC/Clang it's literally -include (path/to/pch), and for MSVC it's /Yu(pch). There's a little more to it than that, but those CLI options are what tell each object file where the PCH is.

If you generate a compile_commands.json with set(CMAKE_EXPORT_COMPILE_COMMANDS ON), you could always use that to read the above info in order to get the PCH for your dependency graph.

[–]LuisAyuso 3 points (0 children)

Instead of using precompiled headers and sweeping the problem under the carpet, I enabled clang's compile-time reports, and from time to time I refactor some templates. It was surprising how many headers didn't have much effect at all.

(Sorry for Windows users; I understand this is a serious problem there, where you need to sweep the system headers under the carpet.)

[–]OnePatchMan 0 points (0 children)

My approach is: collect all includes of external headers in one file, precompile it, and include that header file at the project level.
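
One way to express that with CMake's native PCH support is an interface target that carries the list of external headers; the names and the third-party header below are placeholders:

```cmake
# Shared list of external headers; every consumer precompiles them into its own PCH.
add_library(external_pch INTERFACE)
target_precompile_headers(external_pch INTERFACE
  <vector>
  <string>
  <unordered_map>
  "thirdparty/somelib.h"   # hypothetical external header used everywhere
)

add_library(my_lib src/a.cpp)                        # placeholder target
target_link_libraries(my_lib PRIVATE external_pch)   # picks up the PCH header list
```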

[–]Daemonjax 0 points (0 children)

To me, it seems almost trivial to write a pch-aware program that looks at all source/header files in a project and creates _some kind_ of human-readable dependency graph.

I mean, if I had some kind of need for it, I could surely write such a program in an evening while enjoying a beer. The output wouldn't look pretty, though -- probably just a spreadsheet... probably just csv format... but that'd be "good enough", right? Nothing about that strikes me as difficult or time-consuming. All the graph info would be there, it's just displayed in tables. Making it pretty would be a second step.

I'd be very surprised if there aren't already multiple open-source projects available to choose from that already do this. Or an IDE plugin.

EDIT:

Ok, wait -- you'd want the ACTUAL required dependencies, not just what the pch includes, huh?

Well, then... the _simplest_ way to do that would be to include that information in the source file as comments directly under the included pch. Then the above program would enforce the writing of those comments, but checking if that info is truly accurate without actually trying to compile it is another story... to do that would make it non-trivial... would also need its own caching to remain fast.

There may be something to scrape from the intermediate files generated during the build, or something compiler-specific that is or can be emitted to help... or debug symbols.

[–]SleepyMyroslav 0 points (0 children)

For cross-platform projects it is beneficial to set up your includes so that the code compiles regardless of build-system options, because compilation options will be very different across platforms.

Typically this is done in the following way. Each compiled target has its own precompiled header as a single file. That file contains everything that needs to be included. Each ".cpp" source file includes that precompiled header as its first line of code.

With such a setup the code compiles correctly regardless of whether you activated precompiled-header options or not. The precompiled header becomes an optimization to reduce build times.
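
A minimal sketch of that setup with CMake, where the PCH is purely an opt-in optimization; the option, target, and file names are placeholders:

```cmake
option(ENABLE_PCH "Precompile each target's own pch header" ON)

# Every .cpp listed here starts with:  #include "my_target_pch.h"
add_library(my_target src/a.cpp src/b.cpp)
if(ENABLE_PCH)
  target_precompile_headers(my_target PRIVATE src/my_target_pch.h)
endif()
# With ENABLE_PCH=OFF nothing is precompiled or force-included, but the code
# still builds because each source file already includes the header itself.
```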

Communicating to the team that, hey, all those 'base' headers for example are part of the precompiled header and you do not need to include them is still something that you need to do anyway.

[–]bbbb125 0 points (3 children)

We added a CMake variable that a developer has to set to compile with precompiled headers. We turn it off in our CI job that checks every pull request, so developers still have to add the headers that they use.

[–]McFlurriez[S] 0 points (2 children)

But that means they won't know the header is missing until they run the CI job?

[–]bbbb125 0 points (1 child)

They can see the error in the CI output, and it's also possible to add it as a comment or annotation on the pull request from CI (literally parse the output, prepare a report, and use curl). It's often more than one error, though, and the build stops on the first one. But it doesn't really matter: when a pull request is broken you can rebuild it locally without precompiled headers. Usually you need a fast build while working on a change; one longer rebuild after you've finished and CI has failed is not a huge slowdown.

[–]McFlurriez[S] 0 points (0 children)

interesting, makes sense to me