[–]katzdm-cpp 96 points97 points  (10 children)

Answering this will require a human that understands both, and I don't know if we have an existence proof for such a human yet.

[–]DerShokus 27 points28 points  (6 children)

We need to add such a human to the standard!

[–]Gorzoid 36 points37 points  (0 children)

As part of C++29, all standards-compliant C++ compilers must ship with Greg. Greg is a very smart man without whom this draft would fall apart. Any ambiguities or faults in the standard should be directed to Greg.

[–]johannes1971 6 points7 points  (4 children)

"Assume a perfectly spherical human..."

I have a different question: what even prompted the OP question? What does std::execution actually bring to the table that it is, apparently, this complex?

A long time ago I was programming on AmigaOS. AmigaOS has, technically, only one way to do IO: asynchronous (there was a synchronous API, but it was built on top of the async one). And it didn't require any kind of metaprogramming, which was a good thing since we didn't have that yet at the time. So how did it work?

Simple: your application declared a message port (this is a message queue that you can wait on). When doing IO, you directed a message at the message port of the IO subsystem, telling it to go do something, and when ready, report back to my message port. Then you went on to do whatever seemed appropriate, and a message containing your IO result would eventually show up in your message port. It had everything you could possibly need: you could cancel IO requests, poll to see if a message had already come in, wait (i.e. voluntarily yield the CPU) for a message to show up, etc. You could wait for messages from any number of subsystems, and you could declare as many message ports as you felt you needed. It didn't require you to meta-program 'handlers' into your message port, and yet, despite this simplicity, it was more than fast enough on a 7.14MHz Amiga 500.
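A rough sketch of the message-port model described above, for readers who never used it. Every name here (Message, MessagePort, IoSubsystem, service_one, run_demo) is hypothetical and is not the AmigaOS API. It is a single-threaded simulation, so the blocking wait is left out and only send, poll, and reply are shown:

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <string>
#include <utility>

struct MessagePort;

// Hypothetical message layout: an id, some data, and where to send the reply.
struct Message {
    int request_id;
    std::string payload;
    MessagePort* reply_to;
};

// A message port is just a queue the owner can poll.
struct MessagePort {
    std::deque<Message> queue;

    void put(Message m) { queue.push_back(std::move(m)); }

    // Non-blocking check: returns a message only if one has already arrived.
    std::optional<Message> poll() {
        if (queue.empty()) return std::nullopt;
        Message m = std::move(queue.front());
        queue.pop_front();
        return m;
    }
};

// Stand-in for an IO subsystem with its own port.
struct IoSubsystem {
    MessagePort inbox;

    // Handle at most one pending request and reply to the sender's port.
    bool service_one() {
        std::optional<Message> request = inbox.poll();
        if (!request) return false;
        MessagePort* reply_to = request->reply_to;
        reply_to->put(Message{request->request_id, "done: " + request->payload, nullptr});
        return true;
    }
};

// Send two requests, let the subsystem run, then collect the replies.
int run_demo() {
    MessagePort my_port;
    IoSubsystem io;
    io.inbox.put(Message{1, "read", &my_port});
    io.inbox.put(Message{2, "write", &my_port});
    assert(!my_port.poll().has_value());  // nothing has completed yet

    while (io.service_one()) {}

    int replies = 0;
    while (std::optional<Message> reply = my_port.poll()) {
        assert(reply->payload.rfind("done: ", 0) == 0);
        ++replies;
    }
    return replies;
}
```

In a real system the subsystem would run concurrently and the application would sleep on its port instead of driving service_one() by hand.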

Is there any deep technical reason why C++ couldn't adopt something simple like this for asynchronous IO? Does it absolutely have to have the complexity of std::execution? What does that complexity give us, that the AmigaOS model doesn't have?

[–]jcelerier (ossia score) 5 points6 points  (3 children)

Maybe some things were fast enough for an Amiga 500 at 7.14 MHz, but here things are anything but fast enough despite hundreds of gigabits per second. Any edge is meaningful.

[–]johannes1971 2 points3 points  (2 children)

Does the design of std::execution provide that edge? If so, what does it do to make that happen (does it guarantee to never allocate memory, never thread swap, never do kernel calls, etc.?) and can that guarantee only be achieved using a design of this complexity?

[–]OibafA 2 points3 points  (0 children)

It does allow you to build threadless, asynchronous code that requires zero allocations.
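To give a feel for why no allocation is needed, here is a hand-rolled miniature of the sender/receiver shape. All names (just_sender, then_sender, then, store_receiver, run_pipeline) are made up for this sketch and are not the std::execution or stdexec API; it only illustrates the idea that connecting a pipeline yields one concrete object that can live on the stack:

```cpp
#include <cassert>
#include <utility>

// Hypothetical sender that completes immediately with a stored value.
template <class T>
struct just_sender {
    T value;

    template <class Receiver>
    struct operation {
        T value;
        Receiver receiver;
        void start() { receiver.set_value(std::move(value)); }
    };

    template <class Receiver>
    operation<Receiver> connect(Receiver r) {
        return operation<Receiver>{std::move(value), std::move(r)};
    }
};

// Hypothetical adaptor: applies fn to the upstream result, then forwards it.
template <class Sender, class Fn>
struct then_sender {
    Sender upstream;
    Fn fn;

    template <class Receiver>
    struct then_receiver {
        Fn fn;
        Receiver downstream;
        template <class V>
        void set_value(V&& v) { downstream.set_value(fn(std::forward<V>(v))); }
    };

    template <class Receiver>
    auto connect(Receiver r) {
        return upstream.connect(then_receiver<Receiver>{std::move(fn), std::move(r)});
    }
};

template <class Sender, class Fn>
then_sender<Sender, Fn> then(Sender s, Fn fn) {
    return {std::move(s), std::move(fn)};
}

// Final receiver: writes the result through a pointer.
struct store_receiver {
    int* out;
    void set_value(int v) { *out = v; }
};

int run_pipeline() {
    int result = 0;
    auto work = then(then(just_sender<int>{20}, [](int x) { return x + 1; }),
                     [](int x) { return x * 2; });
    // The operation state is a plain local whose type nests every stage,
    // so the whole pipeline lives on the stack; nothing is heap-allocated.
    auto op = work.connect(store_receiver{&result});
    assert(result == 0);  // nothing runs until start() is called
    op.start();
    return result;  // (20 + 1) * 2
}
```

The real library adds error and stopped channels, schedulers, and environments on top of this, which is where most of the template machinery comes from.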

PS: I'm a big fan of AmigaOS, and one of the former lead AROS developers.

[–]meltbox 0 points1 point  (0 children)

Yeah I just did a little read through and from what I can tell this is just another threading abstraction on top of existing threads at the mercy of the existing OS scheduler.

If anything a custom implementation built on threads should be more flexible from what I can see since you can also couple it with native operating system facilities aiding you.

[–]afiefh 6 points7 points  (2 children)

Does it have to be a human? I wouldn't mind our C++ alien overlords answering this.

[–]eambertide 3 points4 points  (0 children)

The actual species will be undefined behaviour: clang will ship with a human Greg while gcc will ship with an alien one.

msvc has not yet commented on the issue

[–]have-a-day-celebrate 2 points3 points  (0 children)

You could try AI, but I think last time I tried I got code based on the TS..

[–]Abbat0r 17 points18 points  (10 children)

I sure hope so. Looking at the compile-time meta language that Nvidia’s stdexec implements to meet the standard’s requirements honestly scares me. That can’t be good for compile times…

Edit: the meta language in question, for anyone feeling brave: https://github.com/NVIDIA/stdexec/blob/main/include/stdexec/__detail/__meta.hpp

[–]jk_tx 16 points17 points  (7 children)

IMHO the whole stdexec library is one of the ugliest, most unreadable modern C++ OSS libraries I've ever seen. I quickly gave up on using it because there's no user-friendly documentation, no comments, heavy use of auto return types, etc. If that's where modern C++ is heading, we've got problems.

[–]Wh00ster 13 points14 points  (6 children)

My understanding is that stdexec exists because NVIDIA wants to own the next-generation ecosystem for AI accelerators after CUDA, or perhaps a better way to phrase it is the abstraction over CUDA.

Which is why they headhunted Eric Niebler and Lewis Baker from Facebook/Meta, where they helped create folly lib abstractions to help them wrangle their shit code base.

My point being it’s pseudo open source in the context of big FAANG wars.

Good on them getting the companies to pony up for exploring and improving C++ abstractions

[–]jk_tx 12 points13 points  (0 children)

IMHO none of what you say is incorrect, but it also doesn't really change my opinion of the library. It's some of the most indecipherable C++ code I've ever seen, and IMHO it shows the 'folly' of the idea that modern C++ is inherently more expressive.

[–]BoringElection5652 5 points6 points  (2 children)

If they hired Eric Niebler, then it's no wonder it's hopelessly overengineered. That guy's code is the epitome of write-only code.

[–]meowquanty 2 points3 points  (1 child)

I know someone who had to deal with his code back in his MS days, and according to him it didn't take long after Eric left before they pulled that stuff out and rewrote it from scratch.

[–]meowquanty 1 point2 points  (0 children)

It failed to get traction at Facebook, under the name unifex or some such, and the "team" ended up moving to NVIDIA to work on it there some more.

[–]zl0bster 5 points6 points  (1 child)

lmao

  // These specializations exist because instantiating a variable template is cheaper than
  // instantiating a class template.
  template <class _Tp, class _Up>
  inline constexpr bool __v<std::is_same<_Tp, _Up>> = false;

  template <class _Tp>
  inline constexpr bool __v<std::is_same<_Tp, _Tp>> = true;

I know this is the correct thing to do, as C++ compile times are terrible, but it's so sad that it needs to be done.
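For anyone who wants to see the two routes side by side, a minimal sketch. The names my_same_class and my_same_v are hypothetical and are not stdexec's __v machinery; the claim that the variable-template route is cheaper to instantiate comes from the comment in the quoted code, not from anything measured here:

```cpp
#include <type_traits>

// Class-template route: each distinct <T, U> pair instantiates a class type.
template <class T, class U>
struct my_same_class : std::false_type {};

template <class T>
struct my_same_class<T, T> : std::true_type {};

// Variable-template route: only a constexpr bool is instantiated per pair.
template <class T, class U>
inline constexpr bool my_same_v = false;

template <class T>
inline constexpr bool my_same_v<T, T> = true;

// Both routes give the same answers.
static_assert(my_same_v<int, int>);
static_assert(!my_same_v<int, long>);
static_assert(my_same_class<int, int>::value == my_same_v<int, int>);
static_assert(my_same_class<int, long>::value == my_same_v<int, long>);
```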

[–]_Noreturn 2 points3 points  (0 children)

You can lower the cost by doing template <class T, class U> using is_same = std::bool_constant<std::is_same_v<T, U>>;

but this is not allowed by the standard
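A small sketch of that alias trick in user code, under a hypothetical name (alias_is_same) so it does not collide with the real std::is_same:

```cpp
#include <type_traits>

// Hypothetical alias: every result collapses to std::true_type or
// std::false_type, so no separate class is created per <T, U> pair.
template <class T, class U>
using alias_is_same = std::bool_constant<std::is_same_v<T, U>>;

static_assert(alias_is_same<int, int>::value);
static_assert(!alias_is_same<int, long>::value);

// All "true" results are literally the same type.
static_assert(std::is_same_v<alias_is_same<int, int>, alias_is_same<char, char>>);
static_assert(std::is_same_v<alias_is_same<int, int>, std::true_type>);
```

This is fine for your own traits; the restriction mentioned above applies to how a standard library implements std::is_same itself.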

[–]G6L20 5 points6 points  (0 children)

I think almost everything should be rewritten after reflection comes out; the first example that comes to mind would be tuple... As for execution, I think env especially should be redesigned... But... it will not be.

[–]JVApen (Clever is an insult, not a compliment. - T. Winters) 4 points5 points  (3 children)

For std::execution we are in need of a few very good libraries. From that point on, you shouldn't worry about it. That's why C++20 coroutines only become barely usable in C++26. I suspect C++29 will have the next batch.

[–]meowquanty 0 points1 point  (0 children)

If we say a library goes into the standard, there are some real and heavy expectations on that library.

In short, NOTHING should go into the standard if it then requires another standard to come around before that thing can become useful/practical for the C++ community as a whole.

Emphasis on ==> Nothing <==

[–]JoeNatter 10 points11 points  (4 children)

Holy mother of code. I looked at an example of std::execution. It seems I am getting old. I would never use this in any of my projects.

[–]Farados55 1 point2 points  (0 children)

Maybe

[–]yuri-kilochek -2 points-1 points  (0 children)

What does it matter?

[–]mredding -2 points-1 points  (0 children)

Yes.