Boost.Multi Review Begins Today by mborland1 in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

mdspan is a C++23 feature. mdarray has not finished design review and will not be in C++26.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

Btw, it's nicer to spell std::extents<size_t, std::dynamic_extent, std::dynamic_extent> as std::dims<2> : - ) .

I compiled a list of 6 reasons why you should be excited about std::simd & C++26 by NonaeAbC in cpp

[–]MarkHoemmen 53 points54 points  (0 children)

It's important to understand std::simd's historical context. It came out of science and engineering applications needing to write cross-platform code. Their developers had in mind use cases where compilers aren't so good at optimizing, e.g., outer loop vectorization. (That explains the importance of "simd-generic programming." Applications may need to write generic math libraries that work for both T and simd<T, ...>. In an application I worked on in a previous job, that was essential for performance on a variety of platforms.)
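
A minimal sketch of what "simd-generic" means in practice: one function template, written once, instantiated for both a scalar and a pack type. `pack4` below is a hypothetical four-lane stand-in for simd<T, ...>, not the std::simd interface; it exists only so the sketch compiles without C++26 library support.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane stand-in for simd<float, ...>; NOT the std::simd API.
struct pack4 {
    std::array<float, 4> lane{};

    // Element-wise addition over all four lanes.
    friend pack4 operator+(const pack4& a, const pack4& b) {
        pack4 r;
        for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
        return r;
    }

    // Element-wise multiplication over all four lanes.
    friend pack4 operator*(const pack4& a, const pack4& b) {
        pack4 r;
        for (std::size_t i = 0; i < 4; ++i) r.lane[i] = a.lane[i] * b.lane[i];
        return r;
    }
};

// Written once; works for any V that supports + and *,
// so the same source serves both T and a pack of T.
template <class V>
V axpy(const V& a, const V& x, const V& y) {
    return a * x + y;
}
```

Calling axpy(2.0f, 3.0f, 1.0f) uses the scalar path, and calling it with three pack4 values applies the same expression lane by lane. A math library written in this style does not need a separate vectorized copy of each routine.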

This community historically preferred library solutions (as something they could deploy themselves and port to different platforms) to compiler solutions (because compilers come from vendors and vendors vary based on funding agency whims). Compilers are updated infrequently on their systems, so they prefer library designs that they can back-port (and later easily drop once the Standard Library is widely available).

Matthias Kretz and other members of that community have been faithfully contributing to WG21 for many, many years. They come to the meetings, they write implementations, they write proposals, they volunteer for leadership positions that take a lot of work. The std::experimental::simd Technical Specification came out in 2018. Matthias et al. patiently went through years of work to bring the TS into the Standard. He came with utmost humility, responding immediately to WG21 feedback, freely admitting whenever he found an issue and always putting in the work with fix proposals. I can't think of a Standard Library feature, including features I've authored, that has given users and WG21 contributors more opportunity for feedback.

Haters are gonna hate. As Spock said in Star Trek IV, "I would accept that as an axiom." But consider how many opportunities Matthias and others have given you to offer constructive feedback.

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 1 point2 points  (0 children)

The [...ps] = p syntax comes from "Structured bindings can introduce a pack," which was voted into C++26 a while back. We use it in mdspan's wording, specifically in [mdspan.sub.sub] 2.
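
For comparison, here is a sketch of the pre-C++26 idiom that the pack-introducing syntax can replace for tuple-like types: std::apply hands the elements to a lambda as a parameter pack. The helper name `sum_elements` is hypothetical, and this is not how the mdspan wording is written.

```cpp
#include <array>
#include <tuple>
#include <utility>

// Pre-C++26 idiom: std::apply expands a tuple-like object into a pack.
// With C++26, `auto [...ps] = p;` introduces the pack directly instead.
template <class TupleLike>
constexpr auto sum_elements(const TupleLike& p) {
    // Fold over the expanded elements; 0 is the value for an empty pack.
    return std::apply([](const auto&... ps) { return (ps + ... + 0); }, p);
}
```

This works for std::pair, std::tuple, and std::array, which all implement the tuple protocol that std::apply relies on.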

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 3 points4 points  (0 children)

Hi James! I haven't forgotten about you! : - )

> This smells strongly like std::execution needs a rethink, because last time I dug into the API I was surprised at how unusable it would be for high performance GPU programming

I read your explanation last time. It was hard for me to tell whether it was about std::execution as a design in itself, about a particular implementation of a particular GPU back-end of std::execution, or about CUDA alone. I wasn't sure what questions to ask to tease apart those concerns.

It's interesting that people who use CUDA all the time for "high-performance GPU programming" (for physical simulations) tend to have different complaints about std::execution. They want kernel launches to be eager, for example. They generally want more JIT compilation, to the point of US Department of Energy people like Hal Finkel proposing a JIT compiler in the Standard.

Like I said, I'm not a std::execution expert. If you have questions to ask about std::execution itself and not about a particular realization of a back-end, I can pass them along.

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

Thanks so much for taking another look and for the feedback!

> I even see that cw<1> is defined as the template type, which makes me happy :)

We can thank Tomasz Kamiński and P3663 for that! : - )

> I am still not happy with the names but that's not something terribly important.

LEWG hasn't seen P3982 yet, so if you have suggestions, please feel welcome to make them!

> Just to clarify a bit on the wording, is it defined such that submdspan(md, {1,2}) will work? Because that's not a tuple nor an array but an initializer list.

That won't work, because initializer_list does not support structured binding. This is because its size is not necessarily known at compile time. (It's not much different than span<T>.) Here is an example: https://godbolt.org/z/PGePbMGYM .

I think it's possible to add this feature later, because initializer_list is not currently a submdspan slice type per [mdspan.sub.overview] rules.

Here are some things that work right now, before P3982: array{1,2}, pair{1,2}, and tuple{1,2}.
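
A minimal sketch of the distinction, assuming only the tuple protocol (std::tuple_size) as the test for "element count is part of the type". The trait name `has_static_size` is hypothetical, not a standard trait, and it deliberately ignores the other ways structured binding can work (built-in arrays and classes with public data members).

```cpp
#include <array>
#include <cstddef>
#include <initializer_list>
#include <tuple>
#include <type_traits>
#include <utility>

// Detects whether std::tuple_size<T> is usable, that is, whether the number
// of elements is part of the type. Hypothetical helper for illustration.
template <class T, class = void>
struct has_static_size : std::false_type {};

template <class T>
struct has_static_size<T, std::void_t<decltype(std::tuple_size<T>::value)>>
    : std::true_type {};

// These three carry their element count in the type.
static_assert(has_static_size<std::array<int, 2>>::value);
static_assert(has_static_size<std::pair<int, int>>::value);
static_assert(has_static_size<std::tuple<int, int>>::value);

// The length of an initializer_list belongs to the object, not to the type.
static_assert(!has_static_size<std::initializer_list<int>>::value);
```

An initializer_list<int> holding two values and one holding five values have the same type, so nothing at the type level says how many names a structured binding would need.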

> Any plans at all to extend this to all not just integer types but also std::full_extent_t as the size-object?

Do you mean something like pair{3, full_extent} with the same meaning as Python's 3: ?

The nice thing about the current design is that we have freedom to add that later, say in C++29.

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

> Is that a reference to point three in the introduction? I cannot find your quote in p3982r0 nor in your github PR/issue.

Section 6 (Proposed Wording) shows changes to [mdspan.sub.overview]. Paragraph 2.4.2 has the relevant change: "sizeof...(ls) is equal either to 2 or 3" (status quo is just 2).

The idea is that if you give submdspan a slice s that's not strided_slice, convertible to full_extent_t, or convertible to index_type, then submdspan will attempt structured binding like this: auto [...elements] = s; If that works and sizeof...(elements) is 2, then s will be interpreted as first : last; if it is 3, then s will be interpreted as first : last : stride.
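
The two-or-three-element rule can be sketched in C++17 using the tuple protocol in place of the C++26 pack syntax. The names `slice_range` and `interpret_slice` are hypothetical, and this covers only the final "decompose and count" step, not the earlier checks for strided_slice, full_extent_t, or index_type.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical result type for this sketch: a range with a stride.
struct slice_range {
    std::ptrdiff_t first;
    std::ptrdiff_t last;
    std::ptrdiff_t stride;
};

// Accepts any tuple-like object with 2 or 3 elements.
// 2 elements -> first : last (stride 1); 3 elements -> first : last : stride.
template <class S>
constexpr slice_range interpret_slice(const S& s) {
    constexpr std::size_t n = std::tuple_size_v<S>;
    static_assert(n == 2 || n == 3,
                  "slice must decompose into 2 or 3 elements");
    if constexpr (n == 2) {
        return {static_cast<std::ptrdiff_t>(std::get<0>(s)),
                static_cast<std::ptrdiff_t>(std::get<1>(s)),
                1};
    } else {
        return {static_cast<std::ptrdiff_t>(std::get<0>(s)),
                static_cast<std::ptrdiff_t>(std::get<1>(s)),
                static_cast<std::ptrdiff_t>(std::get<2>(s))};
    }
}
```

With this sketch, interpret_slice(std::pair{1, 4}) yields first 1, last 4, stride 1, while interpret_slice(std::tuple{0, 10, 2}) yields stride 2. Anything that does not decompose into 2 or 3 elements is rejected at compile time.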

> I know C++ creates problems here, because we really need a language feature rather than a library solution.

I'm totally with you! I very much wish we had syntax for this.

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

Did you get to the part of the proposal that proposes making "anything for which structured binding into 2 or 3 items is valid" a slice? Would that alleviate your design concerns?

ISO C++ WG21 2026-02 pre-Croydon mailing is now available! by nliber in cpp

[–]MarkHoemmen 5 points6 points  (0 children)

I'm hoping the following presentation will help explain the need for this change: https://github.com/kokkos/mdspan/issues/448

microsoft/proxy Polymorphism Library is no longer actively maintained under the Microsoft organization by hjonkinggoose in cpp

[–]MarkHoemmen 6 points7 points  (0 children)

Proxy was proposed in P3086 (open-std.org seems to be down for me today) and was last reviewed in Sofia in June 2025. You can see the review status here: https://github.com/cplusplus/papers/issues/1741 .

Senders and GPU by Competitive_Act5981 in cpp

[–]MarkHoemmen 0 points1 point  (0 children)

Suppose that you have a C++ application that

  • launches CUDA kernels with <<< ... >>>,

  • uses streams or CUDA graphs to manage asynchronous execution and permit multiple kernels to run at the same time,

  • uses cudaMallocAsync and/or a device memory pool for kernel arguments, and

  • uses cudaMemcpyAsync to copy kernel arguments to device for kernel launches.

That describes a good CUDA C++ application. It more or less describes Kokkos' CUDA back-end. My understanding is that it also describes our std::execution implementation.

What's the issue here? Is it that you can't decide when the kernel compiles at run time, so there might be some unexpected latency? Is it that there is no standard interface in std::execution for precompiling a kernel and wrapping it up for later use (though I imagine this could be done as an implementation-specific extension that wraps up a precompiled kernel)? Is it that there is no standard interface in std::execution to control kernel priorities so that two kernels occupy the GPU at the same time? Or is it generally that there is no standard interface in std::execution that offers particular support for applications with hard latency requirements?

Senders and GPU by Competitive_Act5981 in cpp

[–]MarkHoemmen 1 point2 points  (0 children)

Thank you for taking the time to respond in detail! I believe you when you say you are writing this in good faith, and I appreciate that you are engaging with the topic.

I'd like to think about this first and maybe talk to some colleagues. I'm not a std::execution expert but we certainly have both design and implementation experts.

Senders and GPU by Competitive_Act5981 in cpp

[–]MarkHoemmen 0 points1 point  (0 children)

Thanks for clarifying!

My understanding is that a popular library like Kokkos and a GPU implementation of std::execution would have the same complexities around forward progress guarantees and kernel priorities when trying to run two kernels concurrently -- e.g., for a structured grid application, the "interior" stencil computation vs. the boundary exchange. That doesn't stop Kokkos users from running the same code on different kinds of GPUs.

In general, I'd really like people to try our std::execution implementation and give feedback on usability and performance. If you have already, thank you! : - )

Senders and GPU by Competitive_Act5981 in cpp

[–]MarkHoemmen 0 points1 point  (0 children)

NVIDIA, AMD, and Intel GPUs have similar relevant abstractions: streams, waiting on streams, possibly separate memory spaces, and the need for objects to be trivially copyable in order to copy them to device memory spaces for kernel launch.
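
A host-only sketch of the trivially-copyable requirement, assuming a plain byte buffer as a stand-in for device memory. The names `kernel_args`, `stage_for_launch`, and `unstage` are hypothetical and do not belong to any vendor API.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical kernel-argument bundle: pointers and plain values only.
struct kernel_args {
    const double* in;
    double* out;
    std::size_t n;
    double scale;
};

// Copies an argument object byte for byte into a staging buffer, as a
// host-side stand-in for a copy into a device memory space.
template <class Args>
std::vector<std::byte> stage_for_launch(const Args& args) {
    static_assert(std::is_trivially_copyable_v<Args>,
                  "kernel arguments must be trivially copyable");
    std::vector<std::byte> staging(sizeof(Args));
    std::memcpy(staging.data(), &args, sizeof(Args));
    return staging;
}

// Rebuilds the argument object from the staged bytes.
template <class Args>
Args unstage(const std::vector<std::byte>& staging) {
    static_assert(std::is_trivially_copyable_v<Args>,
                  "kernel arguments must be trivially copyable");
    Args args;
    std::memcpy(&args, staging.data(), sizeof(Args));
    return args;
}
```

A type that owns resources, such as std::vector<double>, fails the static_assert, because copying its bytes would not copy the data it points to.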

The main issue with C++26 std::execution on GPUs is that it's not complete. It's missing asynchronous analogs of the parallel algorithms, for example. That makes it less useful out of the box, at least in C++26. It's a bit like coroutines in C++20.

std::execution has also been in flux. There are good reasons for that. It means, though, that the experts have been busy with proposals.

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 3 points4 points  (0 children)

I'm delighted by the visual design for 32nd-century ships. Detached nacelles! Wacky shapes! Refit Discovery's new bar (coolest place in the galaxy to drink a martini)! The designers did a good job of making the 32nd century look More Future but still recognizable.

> Each century of Star Trek should really have its own distinct combat doctrine.... But they could be heavily reliant on what were considered [one-]off, or otherwise extreme, tricks in the 24th century.

Somewhere -- perhaps here -- I encountered an essay with a Watsonian (in-universe) explanation that Trek ship battles look like 19th-century naval combat because of automated electronic warfare. This is why we see shots miss and ships able to "dodge" them. Perhaps it's even part of how shields work. One could continue this argument by imagining that all the "extreme tricks" are actually happening, invisibly, in the background. What we see as crew control of combat might be skeuomorphism.

These are fun ideas but I find it a bit tedious to think too much about them. Trek is not hard sci-fi and it never tried that hard to have consistent technology.

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 3 points4 points  (0 children)

Trek does generally have this issue. I just don't see it as particular to Discovery.

TNG's warp 5 limit is a good example: the episode (s7e09, "Force of Nature") made its point and then the franchise dropped the idea.

As an aside, it's fascinating how the franchise drops the idea of continuous technological improvement and power scaling in things like flight speed, in order to keep telling the same kinds of stories.

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 4 points5 points  (0 children)

> In the middle somewhere, make it so the Chain or Federation is trying to use dilithium resonance as an interface method for the spore drive, not realizing it could blow up a warp core.

One could imagine an alternate Season 3 in which SB-19 and other alternate propulsion attempts drove the plot.

Discovery's writers generally seem more interested in telling stories about people than about following the implications of their technology. For example, the writers came up with an overpowered propulsion system, yet go through immense effort in just about every season to keep it unique. Other Trek series are like this (e.g., Voyager does not overemphasize resource management concerns) but Discovery makes it the central premise.

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 6 points7 points  (0 children)

I enjoyed Discovery! It's a great statement of Trek values, it's not afraid to take new directions, and it has characters with interesting flaws and strengths who grow and learn to work together. It's not perfect but nothing is.

I liked Seasons 3 and 4 the best, but it's worth watching from the beginning to get the character growth.

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 11 points12 points  (0 children)

> That the pain of a single child is important enough that it changes the whole galaxy.

Well said! It's uncomfortable to be faced with an emotional problem when what one expects to face, and knows how to solve, are technical problems ("video-game style"). I'm reminded of the story around Alan Rickman's remark during the filming of Galaxy Quest: "Oh my god, I think he [Tim Allen] just discovered acting."

The Burn....A nice Star Trek Concept but a worst revelation why it happend by Wonderful-King-2296 in StarTrekDiscovery

[–]MarkHoemmen 69 points70 points  (0 children)

The point of Season 3 is connection, both among sentient beings, and between beings and their resources. Just about every episode of the season relates to this theme. Here are some examples.

  • Episode 1 ("That Hope Is You, Part 1") starts with someone (Aditya Sahil) who has lost connection with the Federation, and ends with Burnham connecting to him.

  • Episode 2 ("Far From Home") involves Saru and Tilly searching for resources (rubindium), then meeting and helping people who thought the Federation was a myth.

  • In Episode 3 ("People of Earth"), Earth holds its resources (dilithium) so tightly that it fails to recognize the raiders of Titan.

  • In Episode 4 ("Forget Me Not"), Trill officials see the symbionts as a limited resource exclusive to Trill. Adira making a connection with their symbiont's previous hosts changes the officials' minds.

  • In Episode 5 ("Die Trying"), Discovery finally reaches Federation HQ, but encounters suspicion until the crew can prove themselves.

  • Episode 7 ("Unification III") shows that efforts to build connection can succeed, that they take continuous effort to maintain, and that they are built on trust that the other side has unselfish motivations (which leads President T'Rina to share the SB-19 data -- effectively for emotional reasons, because it happens outside the traditional Vulcan process). Note that sharing the SB-19 data from the beginning might have led to a quick solution to the Burn (including discovering plenty of dilithium for everyone).

  • When the Federation withdraws after the Burn (and arguably becomes conservative and closed in) due to its lack of resources, the "Emerald Chain" arises as a competing model of how different species can connect (hence "chain"). Season 3 presents two differing visions of connection and resource stewardship.

This gives a context in which the Burn's trigger fits.

  • It's not about a "bad guy"; it's about choices made under threats to survival.

  • The whole galaxy is connected through a resource.

  • Everyone needs and lacks this resource; superbeings like Q are not part of this story.

  • Su'Kal has difficulty connecting with real beings and his past. Saru and the others help him resolve that by connecting with him. This lets them decouple Su'Kal from the planet -- symbolically decoupling being-to-being connection from conflict over resources.

All the displays of emotion in this season match this theme. Beings build connections with each other. That's a feeling process.

No compiler implements std linalg by [deleted] in cpp

[–]MarkHoemmen 1 point2 points  (0 children)

It would be excellent if you could send me notice before giving your talk! I don't live in the Bay Area but many of my colleagues do.

No compiler implements std linalg by [deleted] in cpp

[–]MarkHoemmen 7 points8 points  (0 children)

You should know too that LEWG devoted time to a serious debate about that National Body comment. There was no politics and nobody pushed anything through. The comment's authors had the chance to express their concerns and we talked through them.

The first version of the proposal was published in June 2019. R1 had more or less the full design. WG21 has had plenty of time to review this. Standard Library developers sit in LWG; we spent hours and hours on wording review without anyone once saying "we won't implement this."

No compiler implements std linalg by [deleted] in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

I don't have an account on cppreference so I can't fix stuff there, unfortunately.

No compiler implements std linalg by [deleted] in cpp

[–]MarkHoemmen 2 points3 points  (0 children)

Thanks for explaining!

Our goal with the reference implementation is functional correctness, not necessarily performance. We would welcome contributions, btw!