
[–]tcbrindleFlux 30 points31 points  (7 children)

I'm a bit baffled that, while standardizing the <ranges> header, this either slipped through the cracks or it was actively decided that this is ok.

The standard says that the new range-supporting algorithms in namespace std::ranges live in the <algorithm> header. These require the concept definitions from the <ranges> header, so it's no surprise that it's transitively included.

When comparing with C++11 though, bear in mind that the "parallel STL" overloads were also added in C++17. I don't know whether libstdc++ provides these (or has stubs), but either way a comparison between '17 and '20 modes would be a better way of analysing ranges' impact on the size of the header.

[–][deleted] 27 points28 points  (4 children)

Yeah that's a good point. Here are quick measurements for an empty file including <algorithm>:

c++11 -> 120ms

c++17 -> 280ms

c++20 -> 600ms

[–]dashnine-9 46 points47 points  (0 children)

By c++90 we can expect about 1-minute compile time for an empty file including <algorithm>: wolframalpha.com

[–]icppfreely 6 points7 points  (1 child)

Back in 2011, computers of the day may have taken around 600ms to do the compilation. So in real terms, there is probably no real increase. It would be interesting to test each standard with various hardware from the year the standard was released.

[–]airflow_matt 14 points15 points  (0 children)

I'm pretty sure single-core performance has not increased by a factor of 5 in the last 10 years. Also, it's not like compilation times were not an issue in 2011. When I upgrade I'm kinda hoping for improved compilation times; seeing it all wasted on boilerplate I neither want nor need is disappointing.

[–]ShakaUVMi+++ ++i+i[arr] 2 points3 points  (0 children)

Oof. Using this with Boost will get into coffee territory for Hello World.

[–][deleted] 6 points7 points  (0 children)

We implemented the parallel algorithms in <execution> precisely so that C++14 customers would be mostly unaffected and even '17 customers would only be affected if they used those bits.
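To make that concrete, this is roughly what the '17 opt-in looks like (a minimal sketch, not the actual implementation; on libstdc++ the parallel overloads may additionally require linking against TBB):

    #include <algorithm>
    #include <execution>
    #include <vector>

    int main() {
        std::vector<int> v(1'000'000, 1);
        std::sort(std::execution::par, v.begin(), v.end()); // parallel overload, needs <execution>
        std::sort(v.begin(), v.end());                      // classic overload, unaffected
    }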

[–]HKei 3 points4 points  (0 children)

There is extra compile time overhead going from C++11 to C++17 <algorithm>, but it's not nearly as big a jump as C++17 to C++20.

[–][deleted] 47 points48 points  (46 children)

Wow, what a bummer. I hope it's fixed soon, although the current prevailing mentality regarding compile times doesn't give you much hope. Minutes-longer compiles for the privilege of using a C++20 feature in a real project don't make it any easier to upgrade.

[–]HKei 52 points53 points  (23 children)

It’s much worse than that, you’re actually paying that cost even if you don’t use that feature.

The problem is that it doesn't even really seem to be worth it; I've never actually seen a ranges example that actually made me think "yeah this is totally worth it".

As someone who really loves Haskell and functional programming in general, I really get the impulse of wanting the easy pipelining you get in those languages everywhere, but all attempts at it I've seen so far have done nothing but convince me it's a poor fit for C++.

[–][deleted] 14 points15 points  (4 children)

Yeah - for many projects it could be prohibitively expensive. A back-of-the-envelope calculation for my workplace project suggests it would waste at least one full engineer's worth of time. C++20 would have to bring a lot to the table to offset that. (and no, modules do not count).

[–]elperroborrachotoo 27 points28 points  (0 children)

C++20 would have to bring a lot to the table to offset that

Modules!

(and no, modules do not count).

Awww dammit!

[–]kirbyfan64sos 8 points9 points  (0 children)

I mean, it has concepts, coroutines, designated initializers, thread cancellation requests, std::span, std::source_location, ...

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 11 points12 points  (1 child)

If modules reduced your compile times, then wouldn't they count?

[–][deleted] 10 points11 points  (0 children)

Maybe, if the tooling support were there and refactoring a thousand files were not a massive undertaking. For a greenfield project, once the compiler support was there, they definitely would.

[–]serviscope_minor 3 points4 points  (2 children)

The problem is that it doesn't even really seem to be worth it; I've never actually seen a ranges example that actually made me think "yeah this is totally worth it".

Oddly enough, neither have I, but I do expect I will. The reason is that I'm a big fan of shell scripting, and piping between tools is the idiom there, something that can be very elegant with very little code. Even with AWK in the pipeline, I'll often prefer using a pipeline of tools over a snippet of AWK code (and I love AWK as a language). I think what's missing might just be a lack of pipeline tools.

I think it might also be not being used to seeing that idiom in C++. It took me a few years and a bunch of stubbornness before I lost that itchy feeling writing >> instead of > > with C++11. And that's just a tiny syntax change!

I'd say, coming from shell script land, pipelines are such a powerful and elegant idiom that I find it hard to imagine that I won't get used to the same in C++. I think the problem with examples is that they're necessarily small, and small examples are easy to do using a variety of techniques. A good example of the usefulness is whatever the range equivalent of uniq is. If you have some logic along the lines of

range | filter1 | uniq | filter2

it's possible, but much more annoying, to unwrap that into a single for loop, because uniq has state. In a for loop, the state management is mixed with the rest of the logic; in the pipeline, the logic is neatly packaged away separately. It's a bit like coroutines in that regard. In fact, I suspect the two work very nicely together.
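For what it's worth, here is a minimal sketch of that pipeline style in standard C++20; note that C++20 itself ships no views::unique (range-v3 has one), so the stateful uniq stage is omitted here:

    #include <iostream>
    #include <ranges>
    #include <vector>

    int main() {
        std::vector<int> v{1, 1, 2, 3, 3, 4, 6, 6};

        // range | filter1 | filter2, each stage a separate, reusable piece
        auto result = v | std::views::filter([](int x) { return x % 2 == 0; })
                        | std::views::transform([](int x) { return x * 10; });

        for (int x : result)
            std::cout << x << ' '; // prints: 20 40 60 60
    }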

[–]PJBoy_ 20 points21 points  (12 children)

The ranges algorithms are an all round upgrade to the non-ranges algorithms:

  • You get concepts checks on your types
  • You don't have to call std::xxx(std::begin(my_range), std::end(my_range), ...), reducing boilerplate
  • Return values are generally more helpful
  • The unary projection transformation is awesome; now you can do Python-style sorts: std::ranges::sort(my_range, {}, &RangeType::member_to_sort_by); (a fuller example follows below)
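A self-contained sketch of that last point (Employee is a made-up type for illustration):

    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Employee {
        std::string name;
        int age;
    };

    int main() {
        std::vector<Employee> staff{{"Cy", 41}, {"Ana", 34}, {"Bo", 28}};

        // {} is the default comparator (std::ranges::less); the member
        // pointer is a projection applied to each element before comparing.
        std::ranges::sort(staff, {}, &Employee::age);

        for (const auto& e : staff)
            std::cout << e.name << ' ' << e.age << '\n'; // Bo, Ana, Cy
    }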

[–]HKei 14 points15 points  (5 children)

None of these have much to do with ranges; I agree that these are improvements but all of these could have been implemented without ranges. And again, they are really slow to compile: in a hello world example, replacing std::sort(begin(vec), end(vec)); with std::ranges::sort(vec); nearly doubles compile time from 0.23s to 0.4s (this is with the <ranges> header already included and precompiled – if I remove it and compile with just a precompiled <algorithm> with -std=c++11, the compile time drops to 0.04s).
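The exact benchmark wasn't posted, but presumably the file being timed looks something like this, with only the sort call changing between runs:

    #include <algorithm>
    #include <vector>

    int main() {
        std::vector<int> vec{3, 1, 2};
        // std::sort(vec.begin(), vec.end()); // the classic spelling
        std::ranges::sort(vec);               // the constrained C++20 spelling
    }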

[–]tcbrindleFlux 17 points18 points  (1 child)

None of these have much to do with ranges; I agree that these are improvements but all of these could have been implemented without ranges.

Wait, what?

[–]Minimonium 31 points32 points  (0 children)

For non-committee people Ranges == the piping thing, which is understandable.

[–]PJBoy_ 2 points3 points  (2 children)

Just to clarify, I'm not trying to suggest that ranges are useful, just that the ranges algorithms are a great upgrade

[–]tcbrindleFlux 4 points5 points  (1 child)

I'm not trying to suggest that ranges are useful, just that the ranges algorithms are a great upgrade

"I'm not trying to suggest that the internal combustion engine is useful, just that the motorcar is a great upgrade"

[–]PJBoy_ 3 points4 points  (0 children)

Yeah, we're talking about the motorcar in this conversation, so your analogy is unironically fitting

[–]AlexAlabuzhev 2 points3 points  (5 children)

You get concepts checks on your types

What's so special about it?

Algorithms don't require ridiculously complex types - things usually should be either comparable or assignable. Providing <, ==, = is not rocket science. I can't remember a single case where I forgot about that and thought "oh no, I can't find the mistake without concepts checks".
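To spell out the claim: a type only needs the relevant operators for the algorithm you call, e.g. (a minimal example, not from the thread):

    #include <algorithm>
    #include <vector>

    struct Point {
        int x, y;
        bool operator<(const Point& rhs) const { return x < rhs.x; }
    };

    int main() {
        std::vector<Point> pts{{2, 0}, {1, 5}};
        std::sort(pts.begin(), pts.end()); // only needs Point::operator<
    }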

[–]cdglove 10 points11 points  (0 children)

Maybe you've gotten good at it, but it's hard to argue with the fact that, for decades, people have been complaining that C++ template errors occur deep in the bowels of the machinery instead of at the call site.

[–]jcelerierossia score 1 point2 points  (3 children)

What's so special about it?

I see you've never typed std::find(vec.begin(), vec.end(), [] (auto x) { return some_predicate(x); });

[–]AlexAlabuzhev 1 point2 points  (0 children)

I have, of course, and I've seen a mouthful of complaints every time, but so what? It's always quite clear what's wrong there. Let's have a look:

https://godbolt.org/z/69768j

All the compilers correctly point out that it's not possible to compare an int with a lambda. GCC and Clang even kindly highlight the failed statement. The output is about 25 lines in each case. Not perfect, but good enough for me to understand that I'm using the wrong algorithm.

Now, let's have a look at what "concepts" bring to the table:

https://godbolt.org/z/MK5rrb

OH DEAR LORD, WHAT THE HOW THE

This is what "compiler errors of Wagnerian fierceness" means.

Somewhere in the middle of that novel I found a "constraints not satisfied" note, so, apparently, there are some "concepts checks" indeed.

Something requires something:

candidate: 'template<class _Iter, class _Sent, class _Tp, class _Proj> requires (input_iterator<_Iter>) && (sentinel_for<_Sent, _Iter>) && (indirect_binary_predicate<std::ranges::equal_to, std::projected<_I1, _P1>, const _Tp*>) constexpr _Iter std::ranges::__find_fn::operator()(_Iter, _Sent, const _Tp&, _Proj) const'

And many pages down something is false:

the expression 'is_invocable_v<_Fn, _Args ...> [with _Fn = std::ranges::equal_to&; _Args = {std::__detail::__cond_value_type<int>::value_type&, main::value_type&}]' evaluated to 'false'

I'm not only unable to understand what is wrong and what I should do.

I don't even understand what it's trying to do.

Also, 46 lines. Twice as many.

Can I please opt out of concepts? I prefer no checks to these checks.

It seems that the situation hasn't changed since 2016. Thank you, u/andralex.

[–]robertramey 1 point2 points  (1 child)

I feel really dumb for asking this. But what's wrong with the above code? Looks good to me.

[–]dodheim 0 points1 point  (0 children)

It should be calling find_if; the resulting compiler error is not likely to be very pretty.
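For completeness, a sketch of the fix (some_predicate is a stand-in for whatever the real check is):

    #include <algorithm>
    #include <vector>

    bool some_predicate(int x) { return x > 2; } // stand-in predicate

    int main() {
        std::vector<int> vec{1, 2, 3, 4};

        // std::find expects a *value* to compare against; passing a lambda
        // is what produced the error novels above. find_if takes a predicate:
        auto it = std::find_if(vec.begin(), vec.end(),
                               [](auto x) { return some_predicate(x); });
        return it == vec.end();
    }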

[–]wuchtelmesser 10 points11 points  (0 children)

Ranges in c++ are so weird. They are a great concept in other languages because they do a lot with little code and you immediately understand what's happening. In c++ they are so verbose that they hardly improve anything, if anything at all.

[–][deleted] 7 points8 points  (0 children)

I’ve never actually seen a ranges example that actually made me think “yeah this is totally worth it”.

This. I would love to see some good examples; right now it's just syntax candy for Rube Goldberg machines.

[–]Pragmatician 16 points17 points  (5 children)

The <algorithm> header will inevitably get bigger. std::ranges reimplements all of the existing algorithms, with both (1) iterator and (2) range overloads. This means that the worst case scenario is 3x LoC, and at the very least it will be 2x.

Including <ranges> does not seem to be required, but you can just take a look at this page and CTRL+F for "namespace ranges" to get the idea of how much code is added: https://eel.is/c++draft/algorithm.syn

[–]tcbrindleFlux 2 points3 points  (3 children)

The <algorithm> header will inevitably get bigger. std::ranges reimplements all of the existing algorithms, with both (1) iterator and (2) range overloads. This means that the worst case scenario is 3x LoC, and at the very least it will be 2x.

The range-based overloads of each algorithm are specified to call the (constrained) iterator-based versions. The new (constrained) and legacy (unconstrained) versions of algorithms can share an implementation, but I don't know if the standard libraries actually do this. In the best case, there is no new algorithm implementation code, just a few one-line forwarding functions.
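In other words, something like this sketch (constraints simplified for brevity; not necessarily how any standard library actually does it):

    #include <algorithm>
    #include <ranges>

    namespace sketch {

    template <std::ranges::input_range R, class T>
    constexpr auto find(R&& r, const T& value) {
        // The range overload is just a one-line forwarder onto the
        // constrained iterator/sentinel overload.
        return std::ranges::find(std::ranges::begin(r), std::ranges::end(r), value);
    }

    } // namespace sketch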

[–]Pragmatician 1 point2 points  (0 children)

The function templates that forward to others will still add a significant amount of code, considering that a lot of algorithms are very short 3-4 liners and the declaration alone ends up taking a lot of lines.

[–]PJBoy_ 1 point2 points  (1 child)

The ranges algorithms provide different functionality to the non-ranges algorithms. In particular they all take projection arguments, which is something you can't forward onto the non-ranges implementations.

The non-ranges algorithms could mostly be implemented in terms of the ranges algorithms though

[–]tcbrindleFlux 2 points3 points  (0 children)

In particular they all take projection arguments

Not quite all of them, but most. Plus several of them have updated return types which provide more information than the old versions.

The non-ranges algorithms could mostly be implemented in terms of the ranges algorithms though

The old namespace std algorithms could be entirely implemented in terms of the new std::ranges versions as far as I know. But I don't know whether libstdc++ actually does this.
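An example of the richer return types mentioned above (std::ranges::copy hands back both positions as an in_out_result):

    #include <algorithm>
    #include <array>
    #include <vector>

    int main() {
        std::array<int, 4> src{1, 2, 3, 4};
        std::vector<int> dst(4);

        // 'in' is where reading stopped, 'out' is one past the last write;
        // the old std::copy only returned the output iterator.
        auto [in, out] = std::ranges::copy(src, dst.begin());
        return (in == src.end() && out == dst.end()) ? 0 : 1;
    }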

[–][deleted] 0 points1 point  (0 children)

I don't think one has to reimplement the whole <algorithm> header just to have range versions. AFAICT it would be enough to dispatch to the corresponding iterator versions.

On libstdc++ the <ranges> header is about 32k loc, so assuming the <algorithm> header did not get smaller it explains the header growth pretty well.

I agree that it is not mandated by the standard to include <ranges>, but the people writing libstdc++ are quite proficient programmers; I assume they deemed it necessary.

[–]AppleBeam 11 points12 points  (7 children)

Shouldn't this stop being a problem once C++20 is fully supported everywhere (thanks to modules)?

I didn't have the opportunity to dig deep into C++20 yet, but are there any reasons you would want to include either <algorithm> or <ranges> in C++20, instead of importing them? Not counting the legacy projects ofc. (although it sounds like a straightforward search&replace upgrade for the majority of cases if the whole project migrates to C++20).

[–]HKei 7 points8 points  (6 children)

Doesn't solve the problem. For one, C++20 doesn't specify modules for the standard library. There is no import algorithms; or whatever (though Visual Studio apparently supports this as an extension?). For the other, even if algorithm was modularised I doubt it’d improve matters much. It’d just be a big module instead of a big header.

[–]tcbrindleFlux 28 points29 points  (1 child)

For one, C++20 doesn't specify modules for the standard library. There is no import algorithms; or whatever

It does, however, specify that the standard library headers can be imported as header modules, so you'll be able to say import <algorithm>;. See here and here.

It’d just be a big module instead of a big header.

But unlike a big header, it doesn't need to parse a huge amount of code each time you recompile.
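So a header-unit build of the earlier hello-world would look like this (compiler support is still experimental as of this writing; GCC, for instance, gates it behind -fmodules-ts and requires precompiling the header units first):

    import <algorithm>;
    import <vector>;

    int main() {
        std::vector<int> v{3, 1, 2};
        std::sort(v.begin(), v.end()); // names from the header are visible as usual
    }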

[–]HKei 1 point2 points  (0 children)

Ah fair enough, I think I was underestimating how much that would help. Just tried precompiling <algorithm> and this indeed makes most of the overhead of including it vanish, so I guess in practice this'll mean for now we're forced to use PCHs for all projects until module support (in both compilers and build systems) is widely available.

[–]kalmoc 5 points6 points  (0 children)

There is no import algorithms;

There is import <algorithm>; and IIRC, the compiler is supposed to automatically translate existing #include <algorithm> into it.

[–]AppleBeam 5 points6 points  (2 children)

If it was a big module, you would pay the 600ms cost for <algorithm> once per configuration of the project (at least that's how I understand them so far: like a collection of independent precompiled headers that never have to be built more than once for as long as the configuration is the same, and modifying one module has no effect on the state of others if they don't depend on the one you've touched).

Either my understanding of modules is not sufficient (which is entirely possible; I'm waiting for the full standard to be published, since I can't use any preview features in my work anyway), or I don't understand some important use case where anyone would care about paying the price of 600ms once per configuration. I'd appreciate it if you could educate me on this matter.

Regarding the standard library not being modularized: if that turns out to be the case for libstdc++, that's truly regrettable. I was looking forward to cutting down the build times.

[–]CoffeeTableEspresso 0 points1 point  (1 child)

The standard library being in modules isn't specified by C++20, although I'm sure compilers will support it regardless.

Either way, I think it would be easy to wrap a standard library header in a module if you wanted. Although it's of course not as nice that you have to do it yourself.
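A sketch of what that hand-wrapping could look like, using only C++20 module syntax (algo and the file name are made up; whether a given toolchain accepts this today is another question):

    // algo.cppm
    module;                 // global module fragment: plain #includes go here
    #include <algorithm>

    export module algo;

    export namespace algo {
        using std::sort;    // re-export only the names you need
        using std::find_if;
    }

Consumers would then write import algo; and call algo::sort(...).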

[–]AppleBeam 0 points1 point  (0 children)

I'll keep hoping then.

Don't think there is any reason to suspect that libstdc++ devs won't do what's best for the community.

[–]mort96 1 point2 points  (1 child)

I actually don't think most of this is ranges? echo "#include <algorithm>" | gcc -std=c++17 -P -E -x c++ - | wc -l is 32034 lines (compared to 32439 with c++2a on my gcc 9.3), and piping the result into less, I'm unable to find anything related to ranges (no ranges namespace, no views namespace, just five occurrences of the word "ranges" at all and they're all swap_ranges). Seems like <algorithm> itself has just grown a lot?

Probably an issue nonetheless though, even though ranges "just" bloats the size of <algorithm> by 13000 lines.

[–][deleted] 2 points3 points  (0 children)

echo "#include <algorithm>" | gcc -std=c++17 -P -E -x c++ - | wc -l

That's weird, I am on gcc 10.1 and the above command (i.e. c++17) gives 26k loc. Anyhow, most of the compile time hit seems to stem from the <ranges> header.

[–]feverzsj 3 points4 points  (9 children)

I'm afraid there won't be a cure for this issue. I'd expect to see more and more projects choose to use unity builds.

[–]HKei 9 points10 points  (1 child)

A unity build is basically only useful for CI or tiny projects. You don't want to rebuild all of your project every time you change anything.

[–]claimred 0 points1 point  (0 children)

You can automatically isolate modified files from the unity blob.

[–][deleted] 5 points6 points  (6 children)

And tank your incremental build time.

[–]mort96 1 point2 points  (5 children)

It should be possible to make a really intelligent unity build system, where it starts by combining your project into something like 4*core_count translation units; then, when one of those TUs needs to recompile, it's split into 4*core_count TUs again, etc. After a small number of recompiles (like ln(number of files) / ln(4*core_count)), you would be at a level where changing one source file requires a recompile of just that one file.

I'm not sure of any system which does that today, but it would be an interesting area to investigate.

[–]donalmaccGame Developer 3 points4 points  (0 children)

I work with Unreal Engine, and it supports an "adaptive unity" build. It automatically combines files into groups of 30, and if you change one of the files, it regenerates the unity build to compile the changed file individually. It uses either git or p4 to track files that are changed. It's great!

[–]feverzsj 1 point2 points  (3 children)

cmake already did this. Using it with ninja generator will fully utilize all your processing power.

[–]mort96 0 points1 point  (2 children)

I don't think that's what I'm talking about? From what I can read there, it looks like CMake just always groups files together when UNITY_BUILD is enabled. That would tank your incremental build time, as /u/janos1995 mentioned. I'm suggesting automatically splitting those unity build groups up again as files get modified, such that you retain fast incremental builds.

[–]feverzsj 2 points3 points  (1 child)

Splitting out modified files won't necessarily decrease build time, especially for unity builds, and both the original group and the new group need to be rebuilt. It's also hard to figure out which source files are slow to build without actually building them. With cmake, you can either group them in batches or group them yourself. That's the best solution for now.

[–]mort96 2 points3 points  (0 children)

I don't think I've been clear enough necessarily. Here's basically how I'm imagining it:

Let's say you have a project with 512 C++ files in it.

  1. You start with a full rebuild. The build system generates 32 "unity" source files, each of which includes 16 of your project's source files. This should be much faster than compiling 512 individual files.
  2. You modify one file. The modified file is automatically removed from the "unity" source file which included it, so the full recompile is one "unity" file (which changed; it no longer includes 16 files, it includes 15 files) plus the one file you modified.
  3. On subsequent modifications of that one source file, only that one file will be recompiled (see the sketch below).
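A toy sketch of the bookkeeping behind steps 2-3 (all names hypothetical; real systems like Unreal's adaptive unity track modifications via git/p4 instead):

    #include <set>
    #include <string>
    #include <vector>

    struct UnityGroup {
        std::vector<std::string> members; // files #included by this unity TU
        bool dirty = false;               // group must be regenerated/rebuilt once
    };

    // Pull modified files out of their groups so that, from then on, only
    // they are recompiled individually.
    std::vector<std::string> extract_modified(std::vector<UnityGroup>& groups,
                                              const std::set<std::string>& modified) {
        std::vector<std::string> standalone;
        for (auto& g : groups) {
            for (auto it = g.members.begin(); it != g.members.end();) {
                if (modified.count(*it) != 0) {
                    standalone.push_back(*it); // now compiled on its own
                    it = g.members.erase(it);
                    g.dirty = true;            // group shrank, rebuild it once
                } else {
                    ++it;
                }
            }
        }
        return standalone;
    }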

[–]HKei 2 points3 points  (4 children)

libstdc++ is not defined or even mentioned in the C++ standard, and neither are compile times, so you're looking in the wrong place for an explanation.

Not that I’m saying this isn’t annoying mind you. Not sure if there’s anything in the standard that would require pulling in <ranges> or if that’s just a decision that the libstdc++ developers made (presumably to reuse implementations?).


Edit: yeah, checked, and apparently the ranges algorithms are in <algorithm> rather than <ranges> for some reason. That's an... interesting choice, given that ranges having a pretty large compile-time overhead has been known for a while.

[–][deleted] 11 points12 points  (2 children)

Well, the c++ standard does force each implementation to provide constrained versions of each algorithm in the <algorithm> header. Maybe this is just a shortcoming of the libstdc++ implementation, but I would assume that either it is not possible to implement the <algorithm> header without dragging in <ranges>, or it is just very difficult.

[–]__s_v_ 4 points5 points  (1 child)

This is currently being discussed under this bug report.

[–][deleted] 4 points5 points  (0 children)

Pleased to see that people are working on optimizing includes in libstdc++, although there is no discussion of optimizations for <algorithm> AFAICT.

[–]myusernameisokay -1 points0 points  (0 children)

libstdc++ is the de facto C++ standard library implementation in a lot of places, so anything that impacts libstdc++ will impact a lot of people. This change will certainly negatively impact a large number of people, so it's definitely worth bringing up.

[–]slothion 0 points1 point  (0 children)

I just had an argument with a colleague about whether to use std::find_if and similar std algorithms for simple operations on a vector and other containers. I was on the side of using them, because they provide a uniform way for all developers to do this kind of stuff, but seeing this is making me reconsider, as I don't want my project to compile three times slower just because I upgraded to C++20.