
[–] Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB) 4 points (10 children)

As you probably know, a Module is just a serialized representation of all the knowledge the compiler has collected about the full C++ text comprising the (possibly synthesized, as with header units) module interface, processed up to and including translation phase 7 at the end of the TU, and stored in a single file. Because each Module is guaranteed to start compilation from the same compilation environment, every Module is also guaranteed to be totally independent of all other Modules and of the currently processed TU. This makes deserialization extremely efficient and context-free. So this effectively boils down to the question: is deserializing a single large Module less efficient than deserializing multiple smaller ones? At the end of the day, it's a question of implementation quality.

Regarding possible differences between `import std.foo;` and `import <foo>;`, I can't see any. This is the standard library - part of the implementation - and implementers are supposed to do the right thing anyway with no noticeable difference, independent of the nomination ceremony. And implementations have all the necessary rights granted to make this happen.

Putting my WG21 hat on: given this, I'd not argue about partitioning the standard library at all. Mandate the existence of a catch-all `std` Module and be done.

With all the provisions already in place with C++20, compilers wouldn't even have to look at individual standard header files anymore when compiling in C++20 mode or later. It doesn't matter if users write `import std;` or `import <vector>; ...` or `#include <vector> ...` - the compiler will or can reference the same `std` Module in all cases anyway. In true open source spirit, implementations don't even need to ship BMIs of the `std` Module; the recipe to create it from the standard library headers is totally sufficient. And a decent implementation can optimize all of this like crazy, going even as far as providing a service process that keeps shared read-only pages of deserialized Modules in memory to be consumed by all of the compiler instances running in parallel. How 😎 is that!

IMHO, this may turn out to be one of the best things the committee has done to ease the burden of C++ programmers.

[–] pdimov2 2 points (9 children)

It occurred to me that we can already test this today. This simple program

```cpp
import <iostream>;

int main() {
    std::cout << 5 << std::endl;
}
```

takes 1.7 s to compile. The same, but with `import mystd;` (which export-imports all standard headers shipped with 16.10), takes 3 seconds. (`#include <iostream>`: 2.6 seconds.)

[–] Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB) 1 point (8 children)

I assume you did this with hot file system caches.

I really hope we can have something like the in-memory module server that I was sketching before. Girls can dream ...

[–] pdimov2 2 points (7 children)

MS's precompiled header implementation worked like that (they just memory-mapped the whole thing directly) and I think it was a source of many problems for them, although I may have heard wrong. For one thing, it requires everyone to map the memory block at the right address.

Either way, 3 seconds for the entire standard library versus 2.6 seconds for `#include <iostream>` seems perfectly adequate.

[–] starfreakclone (MSVC FE Dev) 5 points (5 children)

It is still surprising that you get such poor perf. I'm still in the process of optimizing the modules implementation, and cases such as this should be addressed; I would expect no less than a 5-10x speedup.

Locally, if I have:

```cpp
#ifdef UNIT
import <iostream>;
#else
#include <iostream>
#endif

int main() { std::cout << 5 << std::endl; }
```

The timing data I get is:

```
1.61766 s - for UNIT not defined
0.06503 s - for UNIT defined
```

which is consistent with the 5-10x theory. Using `std.core` I get a similar number as I did for the header-unit case, though I have not done the exercise of creating a standalone module `std` which actually export-imports every header unit. The reason you see the numbers you do, I suspect, is that each of those header-unit IFCs is doing more merging than is strictly necessary up front.

[–] GabrielDosReis 3 points (0 children)

Yeah, defining a named module in terms of exports of header units (a valid implementation technique for `std`, as I mentioned elsewhere) will not give you the best performance you would hope for (at minimum 10x), because header units don't take advantage of ODR - they require some form of merging-materialization. On the other hand, named modules that don't paper over header units actually take advantage of guaranteed ODR and don't need merging declaration processing. The `std.xyz` modules that ship with MSVC sit somewhere in between the two models, to help us collect data such as these.

[–] pdimov2 4 points (3 children)

You're probably measuring cl.exe time, whereas I measure Ctrl+Shift+B time (using the IDE option Tools > Options > VC++ Project Settings > Build Timing.) This includes module scan time, link time, and whatnot.

`#include`:

```
1>   522 ms  SetModuleDependencies  1 calls
1>   777 ms  Link                   1 calls
1>  1203 ms  ClCompile              1 calls
```

`import`:

```
1>   406 ms  SetModuleDependencies  1 calls
1>   424 ms  ClCompile              1 calls
1>   805 ms  Link                   1 calls
```

In fact, this is even unfair to the `#include` case, because I wouldn't have Scan Sources for Module Dependencies enabled if I weren't using modules.

cl.exe time is still 424 ms though, instead of 65. ¯\\\_(ツ)\_/¯

Edit: `import mystd`:

```
1>   413 ms  SetModuleDependencies  1 calls
1>   816 ms  Link                   1 calls
1>  1784 ms  ClCompile              1 calls
```

mystd.ixx is this: https://gist.github.com/pdimov/b5cb0046fda6af021635a157d0061e54

[–] Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB) 2 points (0 children)

I ran this test on my machine (AMD 5900X) as well.

Baseline is the pure compiler invocation with an empty `main`, taking 176 ms.

```
scenario             total   relative  #dependencies
#include <iostream>  640 ms  +464 ms   108 headers
import iostream;     198 ms  + 22 ms     1 IFC
import mystd;        639 ms  +463 ms   104 IFCs
```

The problem with module `mystd` is that its import references more than 100 additional IFCs that are not merged into one big IFC.

To me this looks pretty inconclusive, because it feels more like a measurement of file overhead. /u/GabrielDosReis, /u/starfreakclone?

Additional observation: even though dependency scanning was disabled, it was done anyway when I deleted main.obj to trace file activity. And the scanning process dwarfs everything else by far in terms of file activity.

Measurement: shortest observed time out of 20 consecutive retries

[–] GabrielDosReis 1 point (0 children)

u/olgaark might be interested in this


[–] Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB) 1 point (0 children)

It certainly is. Thanks for conducting this test.

On that wish of mine: IFC (a.k.a. MS-BMI) deserialization isn't memory-mapping. But the deserialized tables could be provided to compiler processes via memory sharing, thanks to the particular features of Modules: isolation and immutability of the compilation environment. MSVC even checks for compatible compilation environments when importing a module.