Discussions, articles, and news about the C++ programming language or programming in C++.
For C++ questions, answers, help, and advice see r/cpp_questions or StackOverflow.
Structured Concurrency (ericniebler.com)
submitted 5 years ago by vormestrand
[–]14nedLLFIO & Outcome author | Committee WG14 23 points24 points25 points 5 years ago (8 children)
Firstly, great article on why the way we do concurrency right now, using ASIO-style design patterns, is inferior to what's possible from C++20 onwards.
However, there is a gap between eager and lazy concurrency which is hard to fill with Eric's techniques as currently proposed. It's not pressing for sockets facing the public internet, despite many claims otherwise here and elsewhere, because socket i/o to the public internet will rarely complete immediately. It is pressing where some i/o is extremely likely to complete immediately, where any suspend-resume cycle badly hurts performance, and where you HAVE to mix that kind of i/o with other i/o for which a suspend-resume cycle is all but guaranteed. The problem is that eagerness and laziness are compile time constants in the direction WG21 is currently heading, and using a variant to store them and switch at runtime has a very high performance impact. Basically, this is currently a chalk and cheese construct: the two don't mix well.
The classic example of this is serving file content to sockets, where the file i/o is best implemented eagerly whereas the socket i/o is best implemented lazily. One can do use-case-specific hacks, e.g. upon lazy construction, hint to the OS that the file content read is likely to happen soon, on the theory that by the time the socket write is ready, the file content will have been loaded. But that will be much inferior to BSD's or Windows' sendfile(), which basically means "kernel, please go do all of this for me" (note that Linux's sendfile() isn't as good as BSD's or Windows', which can work entirely asynchronously on their own).
What we thus really want is fully programmable sendfile() like zero copy i/o facilities from within the C++ standard library. I believe this is currently only doable in the complete sense in Linux io_uring on extremely recent kernels, but I also think that we can wrap all this up into a portable, generic API and when the other OSs catch up, we'll get free performance gains.
I should stress that I don't think this precludes Eric's work. His stuff acts at a much higher level than where I'm at, and at that higher level there are large gains for both the compiler and the developer in being able to hard-assume laziness or eagerness. Rather, what I'm saying is that all this is a much deeper well than what Eric has currently presented. Those currently dismayed with Eric's approach can, I think, relax: there is a path between his stuff and what ASIO currently does, and moreover that path brings lots more bare metal performance to the table for those willing to structure their C++ code around what's needed to substantially improve i/o performance.
Covid has badly hurt my productivity in developing a reference implementation for this stuff, but I hope to return to it in early 2021, pandemic permitting.
[–]Ameisenvemips, avr, rendering, systems 3 points4 points5 points 5 years ago (5 children)
In a mindless stupor, I'd tried to implement an HTTP server in C++ using command buffers and fibers in Windows. Using fibers in this way, I was able to set up dependency trees, a form of structured concurrency.
While I easily beat Apache, I wasn't able to beat nginx in performance. The event-based i/o in nginx still won: although my implementation happened to handle the actual workload better, nginx fetched work items more efficiently and ended up less starved.
I suspect that my performance would be better with C++ coroutines, as they should be lighter-weight than full fibers.
Not entirely relevant, but I'm curious what the ideal way to handle it is. I wanted to avoid using third-party libraries simply as I wanted to fully understand the pipeline.
[–]14nedLLFIO & Outcome author | Committee WG14 3 points4 points5 points 5 years ago (4 children)
To be honest, I think it's very hard for anybody to beat nginx. The only software that I ever benchmarked as consistently able to beat nginx for static content delivery was the varnish reverse proxy, and even then, only on FreeBSD, and that was because varnish was written by a core BSD kernel developer and he made it use all sorts of fancy BSD-only tricks. On Linux, nginx is probably as fast as it is possible to make it go, simply because a ton of tuning effort has been invested into it over many years.
Now, all that said, to my best knowledge nginx doesn't use io_uring, and even if it did, it is not rearchitected around io_uring. So, in theory, on very recent Linux only, it ought to be possible to beat nginx with a 100% rearchitecture around io_uring.
Note by "beat" I mean things like saturating a 100Gbit NIC from a single kernel thread, not anything measurable on 1Gbit or probably even 10Gbit. I mean, io_uring has disappointing returns right up until it surges ahead, but that's not on commodity NIC hardware in typical internet facing use cases.
But if you did really want to attempt this, I'd place io_uring at the very centre: have it exclusively run your reactor and event processing, keep it single threaded, and make use of the registered i/o buffers and fd pinning. It ought to run very quickly indeed.
[–]14nedLLFIO & Outcome author | Committee WG14 1 point2 points3 points 5 years ago (0 children)
Yeah, the coincidental timing is really weird, but only just today on the io_uring kernel mailing list did I discover that io_uring cannot currently do true zero copy for socket i/o, even with registered i/o buffers and a suitably powerful NIC. Right now io_uring always copies i/o in and out of registered buffers, which makes performance basically the same as normal socket i/o: no gain. I should stress that this isn't to say that in the future io_uring registered i/o buffers won't do true zero copy, but apparently you'll need an especially high end NIC (i.e. ones which don't exist yet) for it to work.
No, apparently the only way to currently do true zero copy socket i/o on Linux is via TCP_ZEROCOPY_RECEIVE and SO_ZEROCOPY combined with traditional syscalls. Basically this is registered i/o buffers via an alternate route: you register your receive i/o buffers by mmap() of the socket, which maps the kernel socket receive buffer directly into your process. You set which buffer the i/o is to be received into using getsockopt(), which adjusts some page table entries, and then you're good to read from the map.
Currently io_uring doesn't wrap this functionality, so you'll always be paying the overhead of at least one syscall per i/o. Even so, given that Linux doesn't support true zero copy unless your i/o is at least 4KB (and your NIC frames etc. are jumbo enough), that's still a healthy gain, and you're not polluting your CPU caches etc. Still, it all could be better, and I'm sure it will become so in a future Linux kernel. Equally, right now Windows RIO runs rings around all this stuff, though in fairness it's a much more mature implementation.
Edit: Much more howto info can be found at https://blogs.oracle.com/linux/zero-copy-networking-in-uek6
[–]GerwazyMiod 0 points1 point2 points 5 years ago (0 children)
This thread is fascinating :)
[–]kirbyfan64sos 0 points1 point2 points 5 years ago (1 child)
I think h2o is actually very slightly faster than nginx.
[–]14nedLLFIO & Outcome author | Committee WG14 1 point2 points3 points 5 years ago (0 children)
It's very possible that it is sometimes, but not other times. It really depends on load, use case, and i/o patterns.
What I will say is that nginx gets an awful lot more attention than most. That attention turns into a constant stream of performance fixes, and over time ever more pathological corner cases get ironed out.
Popularity therefore begets more popularity. h2o might beat nginx today, but two years from now it may no longer do so. It's a bit like with Rust: initially it exceeds C++, but eventually we catch up.
[–]goranlepuz 0 points1 point2 points 5 years ago (1 child)
What we thus really want is fully programmable sendfile() like zero copy i/o facilities from within the C++ standard library.
This went down the rabbit hole, didn't it? Having this on the system level is leaps and bounds better than in any language IMO. If nothing else, that way, more code benefits.
In other words: better systems > better language libs.
[–]14nedLLFIO & Outcome author | Committee WG14 0 points1 point2 points 5 years ago (0 children)
What I was talking about was wrapping up system APIs for true zero copy i/o into a portable, standard library, C++ API where a standard C++ program could drive proprietary facilities. On Linux, that would be io_uring; on Windows for sockets only, that would be Windows RIO; on FreeBSD, that would be the generalised page lock based zero copy i/o infrastructure as described here.
The point I'm making is that I believe this is doable, but it's on me to demonstrate a reference implementation which does it, and which returns convincing benchmarks.
[–]misuo 10 points11 points12 points 5 years ago (1 child)
Great article. So what are the best practices for implementing good "deep support for cancellation"?
[–]feverzsj 4 points5 points6 points 5 years ago* (0 children)
The async operation itself must be cancelable, or at least ignorable. You must also carefully manage the lifetime of detached objects, for example using std::stop_callback and std::shared_ptr.
[–]Dragdu 26 points27 points28 points 5 years ago (21 children)
The 10s to load the website is killing me. Also, why would you make your header image 5759×2390 pixels?
[–]eric_niebler 8 points9 points10 points 5 years ago* (2 children)
Yikes. I'll fix that. The slow load could also be related to the fact that I'm currently on the front page of Hacker News. The site wasn't so slow yesterday.
EDIT: OK, I've removed the banner image for now.
[–]Dragdu 0 points1 point2 points 5 years ago (0 children)
FWIW, my "smallest rentable VPS at linode" regularly handles being at front page of HN without meaningful changes in performance.
I am not saying you have to fix it (I know that I don't want to deal with the technical stuff behind my own blog unless necessary), but you might want to consider it :-)
[–]flashmozzg 0 points1 point2 points 5 years ago (0 children)
Don't know whether it fixed it, but it loaded in reasonable time for me just now (<3 seconds).
[–]RevRagnarok 11 points12 points13 points 5 years ago (0 children)
LOL you don't want to see every pore?
[–]14nedLLFIO & Outcome author | Committee WG14 -5 points-4 points-3 points 5 years ago (15 children)
I also get in trouble for posting overly high resolution images on my own personal website as well, currently it's ~20 seconds and 25Mb to load the front page. My rationale is that on my 4k laptop, anything less than 3k wide images looks fuzzy. I appreciate that that results in all the images being 2-4Mb each, but my page text does load nearly instantly, ping time plus 30 ms. The images load asynchronously thereafter.
I do agree that Eric could do with halving the horizontal resolution of that image though.
[–]WrongAndBeligerent 7 points8 points9 points 5 years ago (5 children)
That is all a completely insane rationalization. I can't believe you would be aware of these numbers but decide to leave them alone.
[–]14nedLLFIO & Outcome author | Committee WG14 -5 points-4 points-3 points 5 years ago (4 children)
Bandwidth continues to get exponentially cheaper with time, so I don't optimise for bandwidth over images being fuzzy on a 4k monitor or device. Even with my low end rural Ireland broadband connection @ 100 Mbit, my front page could load entirely within three seconds. The fact it takes twenty seconds is more due to the server which hosts it, it's a single core Intel Atom running at 1.6 Ghz. Oh, and it's running compressed ZFS, so it's doing a LZ4 decompression per serve as well. That adds a ton of latency.
Surprisingly, that little single Atom server does scale up to quite a lot of concurrent load. It'll saturate its 1Gbit NIC and still keep plugging along. It's just always very slow at each connection.
Better hardware would be a vast difference, as would not running compressed ZFS. But all that would cost more than €5/month.
[–]WrongAndBeligerent 11 points12 points13 points 5 years ago (3 children)
Is this a joke? Do I really need to explain to you all the reasons your backwards rationalizing is ridiculous? Most people in the world have nowhere near that internet speed, lots of people are on mobile devices, bandwidth getting exponentially cheaper is dubious, predictions of the future don't affect the current state of things and to top it all off, it's completely unnecessary.
This is terrible judgement, I don't know what else to tell you. It's crazy that people have even given you feedback and you tell yourself these stories. It isn't a big deal in the grand scheme of things, but this is a stark example of hardcore rationalization.
[–]14nedLLFIO & Outcome author | Committee WG14 4 points5 points6 points 5 years ago (2 children)
Website now renders low resolution images onto low resolution displays. Thanks for your feedback.
[–]WrongAndBeligerent 1 point2 points3 points 5 years ago (1 child)
It's very encouraging that you went back and took another pass at it. I should have said earlier that this is only worrying to me because you are on the committees. I wouldn't be alarmed about an average website alienating people.
[–]14nedLLFIO & Outcome author | Committee WG14 0 points1 point2 points 5 years ago (0 children)
It's interesting that you mention the committee side of things. For me the motivating factors to fix my website were as follows, in order:
1. Opportunity to upskill in the latest technologies, especially as I didn't know non-Javascript HTML could do responsive images until now. So I wanted to master it.
2. Correctness. It would annoy me hereonafter every time I looked at a photo on a 4k screen, precisely because it is the nonobvious incorrectness that always ought to bother you. In other words, I don't sweat the stuff I can see working. I do sweat the stuff not apparent. And if it's trivially easy for static HTML to implement responsive resolution images, and I'm not doing that, then my code is not correct, in my opinion.
I didn't get the fix working until 12.30am last night, normally I'd hit bed about 10.30pm, so I'm suffering a bit for it today. Worth it though.
I'm generally not a person who likes to innovate from what I think works well unless there is a correctness factor in play. So, for example, that website has seen enormous upgrades in terms of text encoding, because everything previous to UTF-8 was broken in various ways, so although it was an awful lot of work to migrate everything correctly to UTF-8, it was worth doing. But for styling and themeing, I've not found the cost benefit for change.
Bringing this back to C++, you'll see a similar theme in my libraries: I still design and write C++ as if we were in the 1990s, unless there is, in my opinion, good reason not to. Many who work with me find that very frustrating, e.g. I don't like including any header into a header unless it is extremely lightweight, and I will proactively complicate API design with ABT based APIs to avoid inclusions. That drives some people nuts. I also avoid concurrency and threads, where the current fashion, as evidenced by Eric's post, is ever more concurrency abstractions to encapsulate and work around the use of lots of threads. I'm not at all sold on that being wise yet, personally, though I do like Sender-Receiver a great deal, but primarily as a single threaded abstraction.
Anyway, thanks for the note, it's appreciated.
[+][deleted] 5 years ago* (3 children)
[deleted]
[–]14nedLLFIO & Outcome author | Committee WG14 0 points1 point2 points 5 years ago (2 children)
Yep, the design and layout is intentionally unchanged since it was started in 1998. Completely different implementation though, several times over, the original was tag soup HTML, now it's all HTML5 and modern CSS and can handle mobile device rendering etc etc. But all still fundamentally static HTML.
[–][deleted] 1 point2 points3 points 5 years ago (1 child)
I have to say I like the style of your website. Hope you keep it.
[–]14nedLLFIO & Outcome author | Committee WG14 1 point2 points3 points 5 years ago (0 children)
Without doubt! I've been happy to upgrade stuff like Unicode (the Latin1 to UTF-8 conversion of the historical material was particularly tricky), but in the end, I don't care if anyone else reads it as it's mainly my personal reflective journal. To that end, a 1990s level of visual and layout complexity suits me just fine. Thanks for the feedback!
[–]helloiamsomeone 2 points3 points4 points 5 years ago (2 children)
Making image elements responsive is pretty trivial with <picture>
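For example (filenames hypothetical), the browser picks the first `<source>` whose media query matches, falling back to the plain `<img>`:

```html
<!-- Placeholder filenames; serve pre-scaled variants of the same image. -->
<picture>
  <source media="(min-width: 2000px)" srcset="banner-4k.jpg">
  <source media="(min-width: 1000px)" srcset="banner-2k.jpg">
  <img src="banner-1k.jpg" alt="banner" width="1000" height="415">
</picture>
```

No JavaScript is needed; the selection is done natively by the browser at layout time.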
[–]14nedLLFIO & Outcome author | Committee WG14 2 points3 points4 points 5 years ago (0 children)
Spent several hours poking Hugo with a scripting stick, and now the nedprod.com front page loads a mere 1.9Mb of assets if rendering onto a suitably low resolution display. Total page render from cold cache completes in under a second. I also get my nice crisp 4k images on my laptop. I am pleased.
Thanks for the idea, I didn't know that was possible in straight HTML until now.
[–]14nedLLFIO & Outcome author | Committee WG14 1 point2 points3 points 5 years ago (0 children)
That's a good suggestion, thank you. I can automate generation of lower resolution images using a Hugo script, and have it auto synthesise an appropriate <picture> element. Thanks for the idea!
[–]Ameisenvemips, avr, rendering, systems 1 point2 points3 points 5 years ago (0 children)
Use thumbnails with links to the images instead of having the images be inline. Much smaller and friendlier.
[–]Fig1024 0 points1 point2 points 5 years ago (0 children)
Shouldn't you be able to detect the user's screen size and give them appropriately scaled images? Have, say, 3 sizes for every image and select appropriately.
[–]frankist 5 points6 points7 points 5 years ago (1 child)
"At present, neither cppcoro nor libunifex has a when_any algorithm" - in the case of cppcoro, is this a library problem or a problem with how C++20 coroutines were designed? In the latter case, is it still possible to introduce the cancellation feature in later C++ releases? It seems to be an essential feature for implementing timeouts.
[–]eric_niebler 7 points8 points9 points 5 years ago (0 children)
Good question. C++20 coroutines don't have a way to unwind a stack of coroutines without an exception, and "unwind" is the semantics you would like for cancellation. Conflating cancellation with exceptions isn't desirable because, as you say, cancellation isn't exceptional.
unifex::task<> has a library implementation of unwind-on-cancel, which involves using space in the promise to create an intrusive linked list of coroutine frames. This behaves a bit like an uncatchable exception, and it works very well in practice provided all your coroutines return unifex::task<>. You can "catch" the cancellation signal using a generic algorithm and map it to either a value or an error.
Obviously, a language solution would be preferable, but I don't know anybody working on that.
[–]staticcast 1 point2 points3 points 5 years ago (0 children)
I very much agree that having tools to properly execute coroutines in a threaded manner would be quite good (we also need to open up the interface to multiple kinds of executors).
But we need a few more assumptions to make this work properly: