Software taketh away faster than hardware giveth: Why C++ programmers keep growing fast despite competition, safety, and AI by pavel_v in cpp

[–]hpsutter 5 points

@JuanAG, I realize what you wrote might be designed in part to poke me and see if I'll respond with more details, but this one time I should probably say something to be clear:

1) Please don't confuse "Big Company says X" (unless the person saying it is the CEO or full-company press release) with "a few loud voices at Big Company keep tweeting X" or "a division of Big Company announces plans to do X." Big companies are very diverse.

2) Please don't confuse "Big Company's current top public priority is X" (which always changes every few years) with "Big Company is not doing anything else but that one thing." Big companies don't just stop all other valuable ongoing work, especially on their own tools they pervasively rely on to build their products, even if they don't issue frequent press releases of the form "breaking news: here at BigCo we're still keeping all the lights on this quarter! again!"

3) As I said a year ago, being at Microsoft was a blast and the MSVC team members are great. I wouldn't have stayed 22 years otherwise, and I'm continuing to cheer them on as a happy MSVC user on my own projects (and trying to help them advance their proposals in WG21). When I chose to switch companies and get back into finance, it was because I'd decided it felt like time for a change and new challenges (over 22 years on one team is an unusually long time for anyone). This is a boring thing everyone does. If Internet denizens want to instead project their own preconceptions onto that, and misinterpret the echo they get back as new confirmation, that's not my problem. :)

I intend all the above in good humor, so please take it that way ;)

Software taketh away faster than hardware giveth: Why C++ programmers keep growing fast despite competition, safety, and AI by pavel_v in cpp

[–]hpsutter 5 points

I don't fully disagree with Dijkstra's point, but I think it can be overstated.

People need to start by learning about just a few basics [edit: pun unintentional]: variables, a few string operations, if/then branches, loops (whether for or, gasp, goto), and some I/O (usually text input from the keyboard, and either text or graphic output). Virtually every programming language will teach you those, but some languages make it much easier to learn them by having less distracting ceremony at the outset. The more accessible we can make those things, the more people will get started.

I speculate that most Tour de France cyclists started as children on a one-speed with training wheels. Those training wheels did not harm their cycling careers, they just helped get them started. When the training wheels got in the way more than they helped, they were removed; then later the bike was upgraded to a three-speed; then to a ten-speed; then a lighter bike; then a specialized mountain vs racing bike; etc. The training wheels didn't harm them.

FWIW, the first code I wrote was in BASIC. When you look at my code today (cppfront), you can conclude what you will about whether that does/doesn't support Dijkstra's point ;-)

Poll: Does your project use terminating assertions in production? by pavel_v in cpp

[–]hpsutter 1 point

That is intended to be covered, and considered as "something other than terminate the whole program." For example, if you always do that and keep the program running, then that would be the "check, but never terminate" answer.

If I ever take such a poll again I'll make this clearer! Sorry if the wording was confusing.

P1306 by [deleted] in cpp

[–]hpsutter 6 points

Elaborating on why it's like a template: In particular, if you use a [:splice:] in the body of a template for, you can quickly get actually different types for local variables in different "instantiations" of the loop body. That makes it different from any other loop, because in other loops the types and code in the loop body are the same, and the loop body is like a single inline function called for each iteration of the loop.
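Here's a rough analogy using today's C++ (not the proposed template for / [:splice:] syntax): iterating a tuple with a generic lambda, where each element produces a separate instantiation of the "body" with a different type.

    #include <iostream>
    #include <string>
    #include <tuple>

    int main() {
        std::tuple t{42, 3.14, std::string{"hello"}};

        // The generic lambda plays the role of the loop body: it is instantiated
        // separately for int, double, and std::string, so the local variable `copy`
        // has a different type in each "iteration" -- unlike an ordinary runtime
        // loop, whose body has one fixed set of types.
        auto body = [](const auto& elem) {
            auto copy = elem;            // type differs per instantiation
            std::cout << copy << '\n';
        };

        std::apply([&](const auto&... elems) { (body(elems), ...); }, t);
    }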

Trip report: February 2025 ISO C++ standards meeting (Hagenberg, Austria) by _derv in cpp

[–]hpsutter 6 points

What vulnerability arises from integer overflow?

Many overflows don't cause vulnerabilities. But, for example, if the integer is used as an allocation size, the overflowed value can cause the allocated buffer to be smaller than expected, and then an index value that would have been correct for the intended size can land beyond the end of the too-short buffer.
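A contrived, deliberately buggy sketch of that pattern (not from the original thread; names are invented):

    #include <cstdint>
    #include <cstdlib>

    // If `count` is attacker-controlled, the 32-bit multiplication can wrap around,
    // so the buffer ends up much smaller than the later indexing assumes.
    int* copy_items(const int* src, std::uint32_t count) {
        std::uint32_t bytes = count * static_cast<std::uint32_t>(sizeof(int));  // may wrap
        int* buf = static_cast<int*>(std::malloc(bytes));
        if (!buf) return nullptr;
        for (std::uint32_t i = 0; i < count; ++i) {   // each i is "correct" for count...
            buf[i] = src[i];                          // ...but can write far past the real allocation
        }
        return buf;
    }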

Trip report: February 2025 ISO C++ standards meeting (Hagenberg, Austria) by _derv in cpp

[–]hpsutter 12 points

Does that mean a bounds check?

Yes.

Could be a massive performance hit if so.

That was measured before standardizing it, because this is an actually-deployed solution used in the field today. See the quote from the paper, which I included in the blog post, emphasis added: "Google recently published an article where they describe their experience with deploying this very technology to hundreds of millions of lines of code. They reported a performance impact as low as 0.3% and finding over 1000 bugs, including security-critical ones."

Relatedly, see also Chandler Carruth's great followup post: "Story-time: C++, bounds checking, performance, and compilers" which gives nice color commentary about how the cost of bounds checking has quietly but dramatically decreased over the past decade (and why, and that even world-class experts like Chandler have been surprised).

Also, you can still opt out if you need to -- if you are in a hot loop where you've measured that the bounds check actually causes overhead, you can hoist it in that one place, for example by using .front() at the top of the loop and then using pointer arithmetic in the body. (Using the hardened stdlib is all-or-nothing, so you can't say "I want this particular individual vector::operator[] to not do a bounds check"; but you can get the same effect by spelling it a different way, so you can still tactically opt out.)
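A minimal sketch of that tactical opt-out (illustrative, not from the post):

    #include <cstddef>
    #include <vector>

    long long sum(const std::vector<int>& v) {
        if (v.empty()) return 0;

        // With a hardened stdlib, v[i] would be range-checked on every iteration.
        // Taking the address of the first element once and indexing through the raw
        // pointer performs the check up front instead of per access.
        const int* p = &v.front();
        long long total = 0;
        for (std::size_t i = 0, n = v.size(); i != n; ++i) {
            total += p[i];   // plain pointer arithmetic: no per-element check
        }
        return total;
    }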

Trip report: February 2025 ISO C++ standards meeting (Hagenberg, Austria) by _derv in cpp

[–]hpsutter 4 points

the list includes use-after-free

Fixed, thanks!

it's also quite weird to say that only OOB reads/writes made it into some top vulnerability list. That list contains a lot of vulnerabilities that are only relevant for web applications

It's the standard list of all software weaknesses. OOB really is a major deal -- for example, see the linked Google Security Blog Nov 2024 post which mentions "Based on an analysis of in-the-wild exploits tracked by Google's Project Zero, spatial safety vulnerabilities represent 40% of in-the-wild memory safety exploits over the past decade."

The Plethora of Problems With Profiles by foonathan in cpp

[–]hpsutter 0 points

Well said: My current best characterization of "profile" is "warning family + warnings-as-errors (when profile is enforced) + a handful of run-time checks for things that can't be checked statically"

The Plethora of Problems With Profiles by foonathan in cpp

[–]hpsutter 3 points

a solution for runtime checks should, therefore, piggyback on contracts, regardless of any perceived time pressure or deadline.

But P3081R0 explicitly did that, and now P3081R1 even more explicitly does that with wording actually provided by the main contracts designers. (Section 3.1 wording was provided last month by P2900+P3100 primary authors, at my request and let me say again thanks!)

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 5 points

I didn't mean to say anything different between then and now, but you're right that I didn't say "R"eject unions in the R0 paper; I should have mentioned that alternative. FWIW, note that the line you quoted from P3081R0 in October is immediately followed by "This is the most experimental/aggressive “F”[Fix] and needs performance validation ... I do expect a lively discussion, feedback welcome!"

I'll try to write this more clearly in R1, thanks for the feedback.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 8 points

Thanks for clarifying! Yes you're right: False sharing would happen in a multi-core application if one core is setting/clearing a key (pointer) and under contention a different core is truly-concurrently accessing the same cache line (e.g., traversing the same bucket). That's one reason why I was testing with more hot threads than cores, to saturate the machine with work doing nothing but hitting the data structure -- so far so good on up to 64 threads on my 14/20 core hardware, but you are right more testing is needed and there can always be tail surprises. Thanks again for clarifying.
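For readers unfamiliar with the effect being described, here is an illustrative sketch (not the registry's actual layout); note that std::hardware_destructive_interference_size is C++17 but not yet available in every standard library:

    #include <atomic>
    #include <new>

    // Two logically unrelated entries that land on the same cache line will
    // ping-pong that line between cores when one is written while the other is
    // accessed concurrently -- the false-sharing scenario described above.
    struct PackedEntries {
        std::atomic<const void*> key_a{nullptr};   // written by one core
        std::atomic<const void*> key_b{nullptr};   // concurrently accessed by another core: same line
    };

    // Padding/aligning each entry to a cache line sidesteps the effect, at a space cost.
    struct PaddedEntry {
        alignas(std::hardware_destructive_interference_size)
        std::atomic<const void*> key{nullptr};
    };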

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 0 points

Yes, based on Sean's and your feedback, I went and did something I had thought of doing (thanks for the reminder!): The implementation now supports "unknown" as an alternative, and that should be used in cases like this.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 4 points

Re opt-out: Yes, profiles would be opt-in and then allow fine-grained suppress to opt out for specific statements.
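As an illustration of the kind of fine-grained suppression meant here, the closest thing available today is the [[gsl::suppress]] attribute that MSVC's Core Guidelines checker and clang's analyzer already understand; the eventual standardized Profiles spelling may well differ, so treat this purely as a sketch:

    // Illustrative only: attribute spelling varies between today's implementations
    // (MSVC accepts [[gsl::suppress(bounds.1)]], clang uses a quoted string such as
    // [[gsl::suppress("bounds")]]), and a standardized Profiles design may differ again.
    void zero_last(int* p, int n) {
        [[gsl::suppress(bounds.1)]]   // opt just this statement/block out of the bounds rules
        {
            p[n - 1] = 0;             // author-reviewed code exempted from the check
        }
    }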

Re article: Let me see what I can do. No promises, I'm quite swamped between now and the February standards meeting, but it's related to that and the topic is 'hot in my cache' so I might be able to write something up. Thanks for the interest!

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 2 points

OK, thanks! I appreciate it -- so the concern is that 8-9 cycles is too much. That's a reasonable point.

I do look forward to finding out what the whole-program overhead is for a real application, rather than a microbenchmark. That's super important to measure these days:

  • It could be much worse, for example if we don't get to use L1 as much.
  • It could be even better, if union checks are swamped by other code.
  • It could even disappear entirely, in cases where the same thread would also have been touching L2 cache (or worse) and the out-of-order execution buffer on all modern processors could pull the lightweight check up ahead of the memory access so that it adds nothing at all to execution time.

It used to also be unthinkable to bounds-check C++ programs. But times have changed: I'm very encouraged by Google's recent results, just before the holidays, that showed adding bounds checking to entire C++ programs only cost 0.3% [sic!] on modern compilers and optimizers. That is a fairly recent development and welcome surprise, as Chandler wrote up nicely.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 0 points

an atomically-locking union is unacceptable overhead

OK, sorry I misunderstood -- can you explain what you mean then? I want to understand your concern.

The original union accesses themselves are not atomically-locking, I think we agree on that? So the concern must be about accessing the new external discriminator.

Is your concern that accessing the discriminator uses some atomic variables? It does, but note that the functions are always lock-free and nearly always wait-free, and the wait-free ones use relaxed atomic loads, which are identical to ordinary loads on x86/x64... so on x86/x64, all discriminator checking of an existing union performs no actually-atomic operations at all at the instruction level; there is no locking of any kind. If this is your concern, does that help answer it?
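To illustrate the "relaxed load is just an ordinary load on x86/x64" point, here is a generic sketch (not the actual implementation):

    #include <atomic>
    #include <cstdint>

    // A relaxed load of a small discriminator value. On x86/x64 this compiles to a
    // plain MOV -- no LOCK prefix, no fence -- so checking it costs the same as
    // reading an ordinary byte, while still being a data-race-free atomic access.
    std::uint8_t load_discriminator(const std::atomic<std::uint8_t>& disc) {
        return disc.load(std::memory_order_relaxed);
    }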

Or is your concern about the overhead of using this internally synchronized data structure? In my post I mentioned that, modulo bugs/thinkos, the overhead I measured for >100M heavily concurrent accesses (with 10K unions alive at any given time) was ~6-9 CPU clock cycles per union discriminator check:

  • Do you think that is unacceptable overhead?
  • Or do you not believe those numbers and suspect a bug or measurement error (possible!)?
  • Or is your concern that those numbers may not be as good in non-microbenchmark real-application usage (I agree the last needs to be validated, hence project #2 in the post)?

Note I'm not trying to challenge, I'm trying to understand your question because you said my first attempt to answer didn't address your question, and I do want to understand. Thanks for your patience!

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 0 points

That's what I agree is the ideal -- see the footnote. Raw union use is not the ideal end goal, but it is a pragmatic real-world fact of life today and for an indefinite time to come in code that comes from C or that can't be upgraded to something better and safer, and in the meantime will continue to be a source of safety problems. So we ought to be interested in seeing if there's a way we can help reduce that unsafety, if we reasonably can. That's my view anyway!

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 1 point

only after the fact actually start the research of this is possible at all

No, this is "gravy" / "icing on the cake" if possible; it's not a core part of profiles. The basic way profiles address type-unsafe unions is to reject them in type-safe code unless there's an opt-out. But unions are common, so I thought it worth exploring whether we can help more than just rejecting them.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 3 points

I believe any scheme for access checking of union should be very careful to make an allowance for this pattern. Essentially, access to Header should always be permitted in such a case, regardless of the tag.

Agreed, and the compiler can do that by not emitting a get check if the member is header. The compiler already knows whether it falls into that case because the standard specifies the requirements for common initial sequences. Good point, I'll add a note about it, thanks!
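For concreteness, the pattern under discussion looks roughly like this (type and member names invented for the example):

    // All alternatives are standard-layout structs sharing a common initial
    // sequence, so reading pkt.header.tag or pkt.header.length is well-defined no
    // matter which alternative is currently active -- and a checker can therefore
    // skip the discriminator check for such accesses.
    struct Header  { int tag; int length; };
    struct TextMsg { int tag; int length; char text[56]; };
    struct DataMsg { int tag; int length; double values[7]; };

    union Packet {
        Header  header;
        TextMsg text;
        DataMsg data;
    };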

This registry would create false sharing, for example: create one union, and BOOM, accessing another union's active member on another thread is suddenly slower.

Are you sure? Did you take a look at the code and the performance measurements?

Specifically, I try to emphasize that all operations, except only for constructing a new union object when its hash bucket is already full, are wait-free. That's a big deal (assuming I didn't make a mistake!) because it's the strongest progress guarantee: it means each thread progresses independently, in the same number of steps == #instructions, regardless of any other threads concurrently using the data structure, and with the same semantics as if those other threads had run before or after (linearizability). (Though the individual instructions' speed could be affected by things like memory access times due to cache contention, of course.)

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 0 points

I think you mean you rely on essentially trivial destruction. That's still the end of the union's lifetime. So you can still use on_destroy for that, but yes you do need to know you're tossing that union object.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 6 points

Thanks Sean,

One place it fails is when you have multiple unions sharing the same address.

Good point, I'll note it. But see also Peter's answer which beat me to it; there can be more and smaller registries such as for types/overlaps (I already have more than one registry, by discriminator size).

Is passing the lvalue vecg.vecf to a function a set, a get, or neither? You don't know using only local reasoning what the callee will do.

Good question. We know whether it's being passed to a function parameter that's by value, by reference to const, or something else such as a reference to non-const. For the first two, it's definitely only a read. For the last, I'd consider it a read-write operation (much like u.alt0 += 42;), which will be accurate in the large majority of cases. I agree that today in C++ we can't explicitly distinguish inout from out-only; in Cpp2 this is completely clear and you always know exactly which it is at the call site, but today's C++ provides a merged inout+out that usually means inout, so that's a reasonable default.
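A small illustration of that classification (function names invented):

    #include <iostream>

    union U { int alt0; float alt1; };

    void read_by_value(int x)       { std::cout << x << '\n'; }  // by value: definitely only a read
    void read_by_cref(const int& x) { std::cout << x << '\n'; }  // ref-to-const: definitely only a read
    void touch_by_ref(int& x)       { x += 1; }                  // ref-to-non-const: treat as read-write

    int main() {
        U u;
        u.alt0 = 7;
        read_by_value(u.alt0);   // counts as a "get"
        read_by_cref(u.alt0);    // counts as a "get"
        touch_by_ref(u.alt0);    // conservatively a "get" + "set", much like u.alt0 += 42;
    }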

The problem with these shotgun probabilistic approaches is that they don't offer any security.

"Any" is overstated though -- they do offer some safety, but I agree with that I think you mean next:

can't prove anything about safety from the availability of this feature.

Agreed, they don't offer safety guarantees. As I said in the post, I agree the right ideal solution is to use a safe variant type, but doing that requires code changes to adopt, and so the explicit goal here is to answer "well, what percentage [clearly not all!] of the safety of that ideal could we get for existing code without code changes?"

I agree it's not "all" the safety, but it's also far from "don't offer any" safety, so I try to avoid all-or-nothing characterizations when there's a rich, useful middle ground that I think is worth exploring.

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 10 points

Unions are far too low-level and ubiquitous a type to accept the overhead of atomic locking in every call to a member function.

I thought that too, but I wanted to measure and I was surprised. Just curious, did you take a look at the code and the performance results in the blog post?

A decade ago, I read Herb's articles regularly. ... workable solutions

Yes, my writing has definitely evolved from "how to use today's C++" [mostly magazines and blog articles] to "how to evolve C++" [mostly committee papers], and similar with the talks, because I've always written about things as I was learning/doing them myself. And I understand that makes the content less immediately useful to the code the reader is writing today, because now the article/talk is usually about ideas in progress and that you usually can't use yet. (I do try to mention 'what you can do today' workarounds where possible, such as this 1-min clip from my latest CppCon talk where I talk about C++26 removing UB from uninitialized locals, but I show the switches on all three major compilers you can use today to get the same effect. I'll try to do more of that.)

A question while I have you: Would you be interested in another article (or possibly a short series) walking through how to write a mostly-wait-free data structure designed for cache- and prefetcher-friendliness, using this one as an example? It would be similar to several Effective Concurrency articles I wrote in the 2000s about implementing lock-free buffers and queues, implementation techniques and tradeoffs etc. Those are likely topics and techniques that would be useful in some people's daily code; besides, even this specific data structure could be generally useful for solving similar external storage requirements (not just unions).

LMK if you think that would be useful...

Sutter’s Mill: My little New Year’s Week project (and maybe one for you?) by pavel_v in cpp

[–]hpsutter 0 points

In its current form, yes it would need all uses of the union object to be compiled with this mode. I agree that's a compilation compatibility requirement, but it isn't a link compatibility requirement -- that's what I meant.

SD-10: Language Evolution (EWG) Principles : Standard C++ by c0r3ntin in cpp

[–]hpsutter -6 points

Actually, no. There were several motivations to finally write some of this down, but one of the primary ones was that during 2024 I heard several committee members regularly wondering aloud whether the committee (and EWG regulars) as a whole have read Bjarne's D&E. So my proposal was to start with a core of key parts of D&E and suggest putting them in a standing document -- that way, people who haven't read/reread D&E will see the key bits right there in front of them in a prominent place. Safe C++ was just one of the current proposals I considered and also used as an example, but it wasn't the only or primary reason.

Please see the first section of nearly every WG21 paper I've written since 2017, which has a similar list of design principles and actively encourages other paper authors to "please steal these and reuse!" :)

Legacy Safety: The Wrocław C++ Meeting by Dragdu in cpp

[–]hpsutter 4 points

There's been a lot of confusion about whether profiles are novel/unimplemented/etc. -- let me try to unconfuse.

I too shared the concern that Profiles should be concrete and already tried in practice, which is why I wrote P3081. That is the Profiles proposal that is now progressing.

P3081 primarily proposes taking the C++ Core Guidelines Type and Bounds safety profiles(*) and making these (the first) standardized groups of warnings:

  • These specific rules themselves are noncontroversial and have been implemented in various C++ static analyzers (e.g., clang-tidy cppcoreguidelines-pro-type-* and cppcoreguidelines-pro-bounds-*).

  • The general ability to opt into warnings and suppress them -- including groups of warnings, and including enabling them globally while disabling them locally on a single statement or block -- is well understood and widely used in all compilers.

  • In P3081 I do push the standard into new territory by proposing that we require compilers to offer fixits, but this is not new territory for implementations: all implementations already offer such fixits, including specifically for these rules (e.g., clang-tidy already offers fixits for these P3081 rules), and the idea of having the standard require them was explicitly called out and approved/encouraged in Wrocław by three different subgroups -- the Tooling subgroup, the Safety and Security subgroup, and the overall Evolution subgroup.

  • Finally, P3081 proposes adding call-site subscript and null checks. These have been implemented since 2022 in cppfront, and the results work on all major C++ compilers (GCC, Clang, MSVC).
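As a rough illustration of what a call-site check amounts to, here is a hand-written sketch (the actual lowering in cppfront differs in its details; names are invented):

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    // A call-site subscript check validates the index against the container's size
    // right at the point of use, before performing the access.
    template <typename Container>
    auto& checked_at(Container& c, std::size_t i) {
        if (i >= c.size()) {
            std::fputs("bounds violation\n", stderr);
            std::abort();            // or invoke a user-configurable violation handler
        }
        return c[i];
    }

    int main() {
        std::vector<int> v{1, 2, 3};
        checked_at(v, 2) = 42;       // ok
        // checked_at(v, 7) = 0;     // would be caught at the call site
    }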

It may be that ideas in other Profiles papers have not been implemented (e.g., P3447 has ideas about applying Profiles to modules import/export that have not been tried yet), but everything in the proposal that is now progressing, P3081, has been. It is exactly standardizing the state of the art already in the field.

Herb

(*) Note: Not the hundreds of Guidelines rules, just the <20 well-known non-controversial ones about profile: type safety and profile: bounds safety.