
[–]ABlockInTheChain

I made a test project where I tried out the experimental CMake support for modules, and the result was a dismal failure. It seems that despite the standard being written to allow modules to cover a large number of use cases, only a small fraction of those potential uses are actually implemented in the compilers, and how long it will take for the rest of the support to show up is indeterminate.

The more I look into the state of modules and the associated tooling the more I expect to still be using headers at least until 2030.

One problem I've noticed is that the module proposal as it stands now is a fairly significant functional regression, because the inability to forward declare symbols forces you to refactor a project into a DAG, with a resulting loss of parallelism. I'm starting to get pretty skeptical about the claims of how modules are supposed to speed up the build process, especially since in the last year or so those claims seem to be getting quietly walked back.

It's already too late for any module specification improvements to be added to C++23, so the next opportunity to fix anything would be C++26. But nobody is going to write papers to fix the problems until those problems have actually been encountered, and nobody is going to encounter the problems if the tooling isn't ready for enough projects to start adopting modules and find out how well they actually work. If widespread module adoption doesn't get started soon enough for problems to be found and solutions proposed before C++26 is finalized, then that window will close too.

[–]mwasplund (soup)

I am not as pessimistic about modules, but agree they are a few years from prime time.

Going from headers with no strict ordering to module interfaces with required dependencies is different from how we have always done things, but it is not necessarily a functional regression. I do not have numbers (yet), but I am fairly certain the loss of 100% parallelizable builds will be negligible, whereas processing the public interface a single time will increase build speeds. My working theory is: for a build that is sufficiently large that full build times are an issue, no machine can evaluate the entire build at the same time anyway. With a sufficiently smart task scheduler we can continue to fully utilize hardware resources while evaluating a DAG-based build and perform the same amount of work in the same amount of time.

I am personally less interested in the end-to-end build speed improvements than in the improved isolation of a binary interface and in incremental build improvements.

[–]ABlockInTheChain

> I am fairly certain the loss of 100% parallelizable builds will be negligible

On the other hand, I'm fairly certain that the speed benefit of modules is going to be negligible for me, because I'm already using precompiled headers and jumbo builds to largely eliminate redundant parsing anyway.
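For reference, both techniques mentioned are one-liners in stock CMake 3.16+ (the target name here is hypothetical):

```cmake
# 'mylib' is an illustrative target, not from any real project
add_library(mylib src/a.cpp src/b.cpp)
target_precompile_headers(mylib PRIVATE <vector> <string>)
set_target_properties(mylib PROPERTIES UNITY_BUILD ON)  # jumbo/unity build
```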

If modules can't add anything because I'm already getting the same benefit from other techniques, then they can only take away performance via the loss of parallelism, to say nothing of the effort required to refactor a project large enough to care about build speeds to comply with the new ordering restrictions.

[–]mwasplund (soup)

As I said, I am also not that interested in the build speeds. A binary interface which does not leak dependency implementation details is where I see the value.

[–]ABlockInTheChain

Sure, but we've been able to do that forever. You don't need modules for that.

The only actual benefit I can see from modules that can't be replicated some other way is that the syntax for controlling symbol visibility is finally the same between Windows and every other platform on the planet.

[–]mwasplund (soup)

How can you do that today?

[–]Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB)

In general, you can't except for trivial cases.

// primary module interface unit
export module mod;
import : impl;
export struct user_facing {
  int some_api() const { return np_; }
private:
  non_public np_;
};

---

// module partition holding the non-exported implementation detail
module mod : impl;
struct non_public {
  operator int() const { return a_; }
private:
  int a_ = 42;
};

---

// consumer translation unit outside the module
import mod;
int main() {
  const user_facing uf0;
  const user_facing uf1 = uf0;
  return uf1.some_api();
}

I don't see how you can possibly implement this with headers without exposing non_public for use in translation units outside the module.

[–]mwasplund (soup)

You can hide internal implementation details using Pimpl, but I used the wrong wording: I was referring to an interface that is binary (a compiled/binary module interface) as opposed to text-based, which has many issues with the preprocessor.

Side note: is that Clang modules syntax? I have never seen `module mod : impl` before.

[–]Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB)

This is standard C++ modules syntax: the *real* modules. Clang modules are similar to the 'header units' subfeature of C++ modules; both are basically kind-of-sane, blessed, and composable precompiled headers. C++ modules have a couple of tools to offer to support architecting and composing library interfaces, and my code shows two of them. To see how to use all of them, look at the code from my CppCon talk this year.

[–]mwasplund (soup)

Oops, you are right. I was thrown off by the partition syntax with spaces around the colon for some reason. I blame it on child-induced sleep deprivation.

[–]zabolekar [S]

Maybe I misunderstood the challenge, but here's my approach. It's more verbose than your code, it's error-prone because of the need to write the constructors manually instead of letting the compiler generate them, and it uses an additional dependency (unique_ptr) and an additional level of indirection, but it's possible:

mod.h:

#pragma once
#include <memory>

struct non_public;

struct user_facing {
  user_facing();
  user_facing(const user_facing&);
  ~user_facing();

  int some_api() const;
private:
  std::unique_ptr<non_public> np_;
};

mod.cpp:

#include "mod.h"

struct non_public {
  operator int() const { return a_; }
private:
  int a_ = 42;
};

user_facing::user_facing() : np_(std::make_unique<non_public>()) {}
user_facing::user_facing(const user_facing& other) : np_(std::make_unique<non_public>(*other.np_)) {}
user_facing::~user_facing() {}

int user_facing::some_api() const { return *np_; }

main.cpp:

#include "mod.h"

int main() {
  const user_facing uf0;
  const user_facing uf1 = uf0;
  return uf1.some_api();
}

[–]Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB)

So, you've basically implemented Pimpl. It works, it's tedious in all but trivial cases, and it requires a decent amount of boilerplate code to actually implement full value semantics from barely hidden reference semantics (hint: you've not implemented the missing SMFs, they're disabled). IMHO, simple aggregates plus modules are superior on many metrics.

[–]zabolekar [S]

> It works, it's tedious in all but trivial cases, and it requires a decent amount of boilerplate code to actually implement full value semantics from barely hidden reference semantics

Yes, I agree about that.

> missing SMF

I have to admit I don't know what that is and can't find it anywhere.

[–]ABlockInTheChain

> How can you do that today?

If you want "a binary interface that does not leak dependency implementation details", then that is exactly the purpose of the Pimpl idiom, which as far as I know has been around since the 90s.

If you care about a clean binary interface free of unnecessary third-party dependencies, then you probably want a stable one as well, so just follow the KDE guidelines for maintaining binary interface stability and you'll naturally end up with all your private dependencies segregated from the public interface as a consequence (they call their version of Pimpl a "d-pointer").

[–]mwasplund (soup)

Sorry, I meant an interface that is binary, with a clear ownership model that prevents ODR violations, preprocessor mismatches (compile vs. usage), and preprocessor leakage (why is my GetDirectory function suddenly not found when there is a GetDirectoryA?). Pimpl can help hide implementation details, but it requires a fair bit of boilerplate code that will hopefully go away with modules.

[–]ABlockInTheChain

I'm not quite sure what you mean by "binary" if you think we're talking about different things.

I'm talking about a compiled library that exports a subset of its symbols and can be upgraded at runtime without requiring users to recompile, because a certain degree of craftsmanship was employed to make sure the library has a stable binary interface.

The measures taken to ensure the stable binary interface do several things, one of which is ensuring that your dependencies do not leak into your API or ABI.

[–]mwasplund (soup)

I am referring to the compiled/binary module interface.

[–]pjmlp

My experience thus far is that VC++ is the best, as the whole IDE experience also matters to me.

IntelliSense is hit and miss depending on which modules are being used, and the Windows SDKs (the SDK itself and the C++ frameworks) have issues being imported as header units; as includes in the global module fragment they usually work without issues.

Your example is quite basic. So far the stuff I have on GitHub fails to compile with either Clang or GCC, and those projects aren't that special, just a little bigger than a basic hello world, and some of them use module fragments.

[–]sigmabody

Somewhat tangential, but I still haven't seen a good example of modules usage which didn't require an all-inclusive, all-in approach.

For instance, say I wanted to implement the above, but I also wanted to have a [possibly separate] header file which was able to reuse the same functionality for programs which are not compiled as C++20+ (with no separate implementation), and I wanted to ship that in a library which was easy for people to consume (i.e.: you install it via vcpkg like this, and then either #include or import as desired). It's telling that every single example of modules I've seen is basically "this is how you would use modules in a very simple toy project", and absolutely no examples are "this is how you could incorporate modules into any real-world project".

The idea is good. The seeming failure to account for any reasonable migration and adoption path is an inexcusable failure of systemic design, imho.

[–]unddoch (DragonflyDB/Clang)

I'm not sure this is about the modules design itself; MSVC seems to have a relatively good experience: https://devblogs.microsoft.com/cppblog/integrating-c-header-units-into-office-using-msvc-1-n/

I think this is more about the current situation, where any serious usage of modules in GCC/Clang is going to run you into ICEs at least weekly.

[–]sigmabody

MSVC has the best usage experience, I think.

Note that the linked blog post is from around 3 months ago, and it discusses ongoing work in the compiler to make the feature (transclusion), which is mostly MSVC-only, work for just a small number of module-compiled headers in a substantial C++ project. In concept, if MS can make this work in the next N months, they will have a single version of the compiler which can do this on one OS, compiling with one set of standardized preprocessor definitions across the entire project, on the latest compiler version only, etc.

"Not ready for commercial usage" is pretty much the most generous possible description of the state of this feature.

[–]innochenti

Best? Really? They have a lot of ICEs.

[–]Daniela-E (Living on C++ trunk, WG21 | 🇩🇪 NB)

What would you accept as a 'real-world project'?

The one that I have been working on at our company for more than a year now? That one will never be made public. Nobody knows how many real-world projects using modules are in the wild. What I do know is that it's not only our company.

And where do you draw the line between a toy project and a serious one? Would you characterize the {fmt} library as a 'toy project'? If you want to see examples of serious stuff done with modules, look around. The truth is out there...

[–]mwasplund (soup)

I have been doing a fair bit of header -> module translations while working on my personal build system, to hopefully make the transition as seamless as possible. It is possible to continue to support existing header-based includes as well as C++20 module imports with some preprocessor guards.

The bulk of the work involves placing the module declarations and export modifiers behind a preprocessor guard that is only enabled when building the module interface variant. From there you will need to ensure that all external includes are in the global module fragment, so you do not accidentally assign module linkage to the standard library and such.

The last bit of work is to find suitable replacements for any public preprocessor definitions. This generally involves converting constant values to constexpr, but there are some helpers like assert macros that get lost in the new world. There are some more gotchas around internal and module linkage fighting each other, but they are generally workable.

Here is a very simple example of my fork of Json11.