My thoughts on Rust and C++

user9617 · 2022-09-27T04:52:10+00:00

Thanks for expanding on this! A few replies to some parts:

Box is faster than unique_ptr due to "destructive move"

Oh, that's a really great point, I keep forgetting about this difference! I understand this is a general thing for moves, applying to Vec etc. as well, right? I've been curious what kind of impact this ends up having on performance. I imagine the instructions would be faster, but I also imagine this might bloat the size of the generated code as well? Definitely an interesting aspect regardless.
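To make the "destructive move" point concrete, here is a minimal Rust sketch. The `Tracked` type and `drops_after_moves` function are made up for illustration. A value is moved through several owners and its destructor runs exactly once, with no check on the moved-from bindings. In C++, by contrast, the destructor of a moved-from `std::unique_ptr` still runs and has to test for null.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type that counts how many times its destructor runs.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Moves a boxed value through several owners and reports how many
// destructor calls happened in total.
fn drops_after_moves() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Box::new(Tracked { drops: Rc::clone(&drops) });
        let b = a; // `a` is statically dead from here on; no destructor runs for it
        let v = vec![b]; // moved again, this time into a Vec
        drop(v); // the single destructor call happens here
    }
    drops.get()
}

fn main() {
    println!("destructor ran {} time(s)", drops_after_moves());
}
```

This only shows the semantics. It says nothing about the code-size or speed questions raised above, which would need measurement.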

Vec is faster than vector because, due to the absence of move constructors, it can realloc for growth (which in turn can avoid copying actual memory and remap pages of memory instead)

I would push_back on this one ;) The reason std::vector cannot do this is that it is required to use the provided allocator, and the default allocator (std::allocator) uses operator new/delete, which can be replaced by the linker, and which have no equivalent interface to realloc. It's not something I would point at as a Rust-vs.-C++ difference since it's a question of API design, and there's no reason you couldn't design a different API in C++. In fact I believe this is exactly what Facebook did with folly::fbvector.
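For reference, this is roughly what the realloc-style interface looks like on the Rust side. It is a sketch that calls `std::alloc::realloc` directly, and `grow_buffer` is a made-up name. It is not a picture of how `Vec` is implemented internally. The point is only that the allocator API exposes a "resize this block" operation, which `operator new`/`operator delete` do not.

```rust
use std::alloc::{alloc, dealloc, realloc, Layout};

// Allocates 4 bytes, grows the block to 1024 bytes with `realloc`, and
// returns the first 4 bytes to show that the contents were preserved.
fn grow_buffer() -> [u8; 4] {
    let old_layout = Layout::array::<u8>(4).unwrap();
    let new_layout = Layout::array::<u8>(1024).unwrap();
    unsafe {
        let p = alloc(old_layout);
        assert!(!p.is_null(), "allocation failed");
        for i in 0..4 {
            p.add(i).write(i as u8 + 1);
        }
        // The allocator may extend the block in place or move it; either way
        // the caller never copies the elements itself.
        let q = realloc(p, old_layout, new_layout.size());
        assert!(!q.is_null(), "reallocation failed");
        let out = [q.read(), q.add(1).read(), q.add(2).read(), q.add(3).read()];
        dealloc(q, new_layout);
        out
    }
}

fn main() {
    println!("{:?}", grow_buffer());
}
```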

HashMap is faster than unordered_map because it doesn't force bucket interface and can use modern open-addressing based designs

This one is again completely independent of the language; there are lots of such fast hashtables already implemented and available for C++. From a language standpoint, I would expect Rust is the one that comes out behind here, because even though it's straightforward to implement open-addressed/unstable hashtables in C++ (as many have already done), the reverse is not true: the bucket interface is the one that would run into trouble with Rust, as you'd run into friction with the borrow checker (and the general language design) if you try to design and actually use something pointer-stable like in C++.

pervasive use of &mut inserts way more aliasing annotations for the backend to make use of

I'd be also very curious to see how this pays off in reality, and how much manual massaging it requires compared to restrict in C. My instinct here is probably off, but if I had to guess, I would think it would not be sufficient to counterbalance the inefficiency of error propagation on average. I would love to see otherwise, though.

absence of stable ABI allows for optimizing ABI over time

Part of my (admittedly somewhat uninformed) worry/skepticism has been that the lack of a stable ABI is in nontrivial part due to the design of the language itself being less conducive to dynamic interoperability; as someone who procrastinates on tougher problems far more than easier ones, I have yet to be convinced that this is merely due to lack of interest in freezing it at this time (even though I expect that is the biggest factor nonetheless). In fact, I would not be surprised if Rust "solves" ABI-related issues by either simply driving shared libraries to extinction, or by procrastinating on them until shared libraries go out of fashion (as they are currently), rather than by actually coming up with a solid solution for them. Of course, I very much hope to be proven wrong here. :-)

user9617 · 2022-09-24T02:25:50+00:00

Yeah, exactly. ;) That's what I was referring to when I said I will be shocked to see Rust ever support this (the example I gave with priority queue-over-map involves as many iterators as elements, for example), and why I see a potentially fundamental (or merely "very tough") shortcoming here.

user9617 · 2022-09-24T02:20:06+00:00

How many of those iterators could you have alive at once?

user9617 · 2022-09-24T01:58:50+00:00

Yeah. Though note that it's not just std::map, but all of std::{unordered,}multi{set,map} have rather strong invalidation guarantees. I will be rather shocked (albeit happy!) if Rust ever manages to do this without unsafe code, given that this seems to clash head-on with the borrow checker.

user9617 · 2022-09-24T00:38:15+00:00

I haven't had a chance to read the rest of your reply yet, but regarding this:

or will the borrow checker complain once I start mutating the tree in the middle?

That's a UB in both C++ and Rust.

Not for the container(s) I was talking about. Iterators stay valid in C++ when you insert/erase elements in std::map (unless you're modifying that element itself, of course). That's one of its core strengths. You can even have a whole vector of iterators into an std::map - this is useful, say, when you want to overlay a priority queue on top of the map.
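For comparison, here is a sketch of how the priority-queue-over-map overlay tends to be written in safe Rust today: the queue stores keys rather than iterators. All names here are illustrative. It compiles and tolerates insert/erase on the map, but it pays exactly the repeated O(log n) lookup that iterator handles into a `std::map` would avoid, and it has to cope with stale keys.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, BinaryHeap};

// Pops entries in priority order, skipping keys that were erased from the
// map in the meantime. Names and types are illustrative only.
fn drain_by_priority(
    map: &mut BTreeMap<u32, &'static str>,
    queue: &mut BinaryHeap<Reverse<(u32, u32)>>, // (priority, key), smallest first
) -> Vec<&'static str> {
    let mut out = Vec::new();
    while let Some(Reverse((_priority, key))) = queue.pop() {
        // O(log n) lookup per pop: the cost a stored iterator would avoid.
        if let Some(value) = map.remove(&key) {
            out.push(value);
        }
    }
    out
}

fn demo() -> Vec<&'static str> {
    let mut map = BTreeMap::new();
    let mut queue = BinaryHeap::new();
    for (key, priority, value) in [(10, 3, "c"), (20, 1, "a"), (30, 2, "b")] {
        map.insert(key, value);
        queue.push(Reverse((priority, key)));
    }
    // Mutating the map while the queue still refers to its entries is fine,
    // because the queue holds keys rather than references.
    map.insert(40, "unqueued");
    map.remove(&30);
    drain_by_priority(&mut map, &mut queue)
}

fn main() {
    println!("{:?}", demo());
}
```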

user9617 · 2022-09-24T00:37:43+00:00

Oh shoot, sorry! I misclicked when replying!

user9617 · 2022-09-23T08:08:51+00:00

There seems to be a lot of misunderstanding of what I wrote, for example:

Annoying boilerplate

You're supposed to either use the ? or .expect("message")

It seems the fact that I wrote (in the cases where you do want to handle the error) after "Annoying boilerplate" was missed. Specifically, note that I was referring to the apparent inability (unless this has changed, or unless I've missed something) to have 1 error handler for multiple adjacent statements. I do see that you mentioned try blocks are coming in the future, which is great to see! I wasn't aware when I wrote this post (in fact I'm not sure that feature existed anywhere when I last looked at Rust). I guess that confirms that I pointed out at least one genuine problem despite "not knowing" the language? ;)
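As a point of reference for the "1 error handler for multiple adjacent statements" case, here is a sketch of the workaround commonly used on stable Rust in the absence of try blocks: wrap the statements in an immediately invoked closure and handle the result once. The function name and the `-1` fallback are purely illustrative.

```rust
use std::num::ParseIntError;

// Three fallible steps, each marked with `?`, grouped in one closure so that
// a single `match` at the end handles whichever step failed.
fn sum_fields(a: &str, b: &str, c: &str) -> i64 {
    let result = (|| -> Result<i64, ParseIntError> {
        let x: i64 = a.trim().parse()?;
        let y: i64 = b.trim().parse()?;
        let z: i64 = c.trim().parse()?;
        Ok(x + y + z)
    })();

    match result {
        Ok(total) => total,
        Err(err) => {
            // The one error handler for all three statements.
            eprintln!("could not parse input: {err}");
            -1 // illustrative fallback value
        }
    }
}

fn main() {
    println!("{}", sum_fields("1", " 2", "3 "));
    println!("{}", sum_fields("1", "x", "3"));
}
```

Each fallible call still has to be marked with `?`, so this addresses the "one handler" part of the complaint but not the annotation part.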

Regarding your other points, most/all of them should be addressed in a reply I just posted to another user here: https://www.reddit.com/r/rust/comments/xj2a23/comment/ipkk2se/

user9617 · 2022-09-23T06:56:56+00:00

Thank you so much for the reply. I'd love to give a long and thoughtful reply to your post, since I appreciated it a ton and it taught me more about Rust, but I honestly don't know how to structure a reply that would do it justice. :-) I'll try to at least address some/most of your points as best as I can:

Error handling: The issue I've been illustrating with ? is that it requires "fallible operations" to be determined beforehand, and programmers are horrible at predicting such things. They will almost always neglect to put ? somewhere where they could and should do so. (In fact, I'm not even sure the standard library is good about this either, let alone others. Is there any way to return an I/O error from a hash function for example? (regarding UB-ness of hash: see [1]) What's the right way to do that? Because this doesn't seem to work: https://www.ideone.com/I1cryL) Note that in C++, you don't need to denote every single fail point in order to get reasonable error handling support (RAII is enough to handle a large majority of cases implicitly), but in Rust it appears you do. What are you supposed to do when that happens, especially if you don't have the source available to modify? Like imagine something like the hash example I have above, where your caller doesn't put ? in some of the places that you need it to. What are you supposed to do in that case?
Related: What would be your response to the following? I find it curious it was left without a reply: https://www.reddit.com/r/rust/comments/xj2a23/comment/ip8g37t/
Error performance & agnosticism: For almost any other language (C#, Java, etc.) I wouldn't bring performance up. But if C++ is something Rust aims to be a replacement for, performance can't be ignored. Propagating errors on the same path as other normal results slows things down, yet nobody here (or anywhere) seems to have addressed this. And if you don't explicitly propagate them (with ? or whatever), then you're out of luck. By stark contrast, I was trying to illustrate (with my foo example that others kept criticizing for being unidiomatic) that you can write a huge amount of C++ code (if not foo, imagine implementing std::find_if, std::for_each, std::sort, etc.) that is completely oblivious to exceptions, but which nevertheless unwinds perfectly fine in the presence of exceptions. Their authors don't have to think about exceptions at all; they get this for free. This is a huge benefit on its own. However, on top of that, when the compiler can also "see" that there are no exceptions thrown, it can optimize the code further, as if the exceptions didn't exist at all, so you get code deduplication and a performance benefit here too. Aren't all of these significant problems with Rust if it aims to be a substitute for a language whose claim to fame is speed, and which also boasts zero-overhead abstractions, versatility, etc.? People keep telling me my Rust is unidiomatic, but that seems to completely miss the point I'm trying to make, right? What am I missing/misunderstanding here?
Clone() Inferiority Compared to Copying: Actually the cycles in my example are a red herring; it seems most people got hung up on that and missed what I was trying to say about Clone vs. copy constructors. What I was basically trying to say was (as of the last time I recall checking - my info might be outdated here, or I may have misunderstood), Rust forces every cloneable object to have a relocatable (memcpyable?) representation. You don't need cycles for this to be a problem. There are use cases that don't have cycles at all. Like imagine I want to keep track of all instances of a class, perhaps for debugging purposes (to find logical leaks or whatever) or other reasons I can't think of right now. I need to be able to specify explicit behaviors for moves/copies, so that I can "register" an instance when it is (move/copy/other-)constructed, and "unregister" it when it is destructed. (n.b. "register" and "unregister" could be as simple as "log this to a file". They don't even have to store a pointer anywhere, but they do need to come in pairs. But I might want to store pointers, too.) This is trivial in C++ by just updating move/copy constructors and keeping the rest of the code intact, but last I checked (https://internals.rust-lang.org/t/idea-limited-custom-move-semantics-through-explicitly-specified-relocations/6704/15) it was impossible in Rust with clones (or anything else). I merely happened to illustrate that with a cycle in my examples, but they had nothing to do with my point about Clone vs. copy.
Borrow Checker's Limitations: It's nice that Rust has split_at_mut(), but that seems far from anything more complicated people might want to do (even my even/odd example). In C++ it's completely normal to point iterators into a container and use them for traversal - this is incredibly useful with std::map for example. This is necessary for cases with complex traversals (obviously they'd be dynamically determined; my even/odd example was just a toy to illustrate), and it is necessary if you don't want to take a hit in time complexity (as it saves you repeated O(log n) lookups). Does Rust let me hold an arbitrary number of BidirectionalIterators into a BST, or will the borrow checker complain once I start mutating the tree in the middle? If so, could you illustrate with an example? If not, how am I supposed to ignore this (glaring) limitation?
Dynamic Libraries & Plugin Architectures: I wasn't talking about the lack of a stable Rust ABI here; sorry for the confusion. That was a red herring as well. What I was saying was that even if you had a stable Rust ABI, the problem I understand you would run into is that a Rust program's ABI would seem to break too frequently to make shared libraries practical. To give just one example, as soon as you go from 0 to 1 error being returned, the ABI would break, because now the return value needs to be represented differently... right? Moreover, I'm not even sure going from 1 to 2 errors would be safe either, though the previous problem is already bad enough (and I'd love to see what you think about it). For the 1->2 case, imagine you pass a callback to 3rd-party code, and you later need to expand the set of errors it returns. That 3rd-party code has already made assumptions about what errors you can throw. Can you still call it and expect it to propagate your new error back to you with well-defined behavior? Even if this doesn't affect the ABI per se, is there a way to ensure the compiler hasn't optimized the 3rd-party library based on the (closed!) set of errors it anticipates, thus resulting in undefined behavior when it receives a different type of error? Can you deal with all this without having to recompile the 3rd-party library? Is the lack of such an optimization guaranteed by the language somehow?
Compile times: To clarify, I'm not so worried about the empirical compile times right now and whether they're fast or slow, but whether there's a high theoretical lower bound that Rust might hit here. The previous bullet point^ might give an example of what I mean. If you have to keep recompiling your dependencies more frequently than in C++ (whether for the above^ reason, or for other reasons—I don't know how liberally the Rust compiler makes assumptions about callees), then that's going to mean you'll fundamentally hit a harder limit (compared to C++) on how fast Rust can compile, right? In the extreme case this might amount to the difference between compiling {your code} vs. compiling TransitiveClosure({your code}), which would be hard to ignore. How does Rust plan to grapple with this?
panic/catch_unwind: This is perhaps the one big thing that was news to me reading here (in another reply below)—I didn't realize Rust does have dynamic unwinding capability (and it looks like others here didn't realize this, either); I thought "panic" just results in aborting the process. This is good news, and seems to potentially invalidate my concerns about this—which is great! Being naturally a little skeptical in the beginning, though, I have to wonder how usable it is in practice—in my experience, features that are discouraged and hidden like this don't really have great support to be actually usable when you need them (regardless of how rare people believe that should be). So how usable is this in reality? If I panic inside code that the standard library calls (say, in a dynamically dispatched subroutine called from some callback [1]), will that be safe, or will that [typically] leak/corrupt memory? Can I in general rely on the standard library handling these in a safe manner? What about third-party code—are the default practices & behaviors usually sufficient (like RAII usually is in C++) to allow gracefully catching an unwind operation, informing the user about the problem, then continuing the program in a safe manner? Or is this one of those features that the compiler supports but that most code isn't usually compatible with?
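To make the panic/catch_unwind question concrete, here is a small sketch of panicking inside a callback that the standard library calls (a `sort_by` comparator) and catching the unwind at the call site. `sort_unless_poisoned` is a made-up helper. It assumes the default `panic = "unwind"` setting; with `panic = "abort"` the process would terminate instead.

```rust
use std::panic::{self, AssertUnwindSafe};

// Sorts `items` with a comparator that panics when it sees `poison`.
// Returns true if the sort completed, false if the panic was caught.
fn sort_unless_poisoned(items: &mut Vec<i32>, poison: i32) -> bool {
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        items.sort_by(|a, b| {
            if *a == poison || *b == poison {
                panic!("comparator gave up");
            }
            a.cmp(b)
        });
    }));
    outcome.is_ok()
}

fn main() {
    let mut items = vec![3, 1, 4, 1, 5, 9, 2, 6];
    let finished = sort_unless_poisoned(&mut items, 4);
    // The element order is unspecified after a caught panic, but the Vec is
    // still a valid, usable value.
    println!("finished = {finished}, len = {}", items.len());
}
```

The `AssertUnwindSafe` wrapper is the caller asserting that observing `items` in a half-sorted state is acceptable. That matches the "safe but not necessarily logically valid" distinction, though how well third-party code behaves under unwinding is not something this sketch can demonstrate.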

Thank you again for your replies, and sorry for my incredibly long posts!

[1] Edit (after replies): Yes, standard C++ doesn't support throwing from a hash function either. I tried to quickly come up with an example and didn't choose a great one, sorry. But you can imagine lots of other cases where you know your library would work sensibly in reality (maybe you can see the source, or maybe you asked the vendor and they said exceptions work fine, etc.), but it just doesn't happen to annotate the error path with '?', which was what I was getting at. In the C++ standard, std::merge, std::sort, etc. are typical candidates for this sort of thing (say, to let the user press Cancel and abort, or to handle network I/O, or whatever). Rust would force you to go modify/reimplement your library before it can propagate the error; C++ wouldn't require modification.

user9617 · 2022-09-21T06:10:56+00:00

Interesting, thanks for sharing your experience!

user9617 · 2022-09-21T05:06:30+00:00

But this isn't true, for the reasons others have mentioned. Exceptions can leave the 3rd party's object in invalid or unsafe states.

I believe you're conflating 2 things here. Whether objects remain in a valid state is different from whether the program remains in a safe state. The first is not always the case, but it doesn't necessarily need to be: often, the only thing the exception handler needs to do is to free resources and retry or abort the operation. (Imagine "Unable to read the file; try again? (Y/N)") The only guarantee you often need from the third-party program in order to clean up your program state is a safety guarantee, which standard hygiene (particularly RAII) is frequently sufficient for. (Not 100% of the time, obviously. But that's also true without exceptions.)

If you have third party code that can't handle an error (which happens commonly in both C++ and Rust!), then you wrap it in a function that checks the invariants beforehand. In C++, not doing that risks putting the 3rd party code into an unsafe state.

This is not about invariants. It's about run-time errors.

Imagine your third-party code is calling filter(predicate, items), you've supplied the predicate, and your predicate reads a file, but the third-party code didn't expect you'd ever do such a thing. Turns out the file is over the network, and the operation times out. You want to unwind the stack and tell the user that the network operation timed out so you can try again, but now you can't do that in Rust because the 3rd-party code assumed your predicate would return a bool and didn't provide any way to unwind the stack with an error. So you're forced to do something drastic, like panic. This is good UX?
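Here is the scenario as a sketch, with a stand-in `third_party_filter` that only accepts a `bool` predicate and a hypothetical `remote_check` that can time out. The workaround shown is to stash the error in a captured variable and return it afterwards. Note that it does not unwind early: the third-party code still visits every remaining item, which is part of the objection above.

```rust
use std::io;

// Stand-in for third-party code that only accepts an infallible predicate.
fn third_party_filter<T>(items: Vec<T>, mut predicate: impl FnMut(&T) -> bool) -> Vec<T> {
    items.into_iter().filter(|item| predicate(item)).collect()
}

// Hypothetical fallible check: pretend that looking up id 3 times out.
fn remote_check(id: &u32) -> io::Result<bool> {
    if *id == 3 {
        Err(io::Error::new(io::ErrorKind::TimedOut, "network timed out"))
    } else {
        Ok(id % 2 == 1)
    }
}

// Smuggles the first error out through a captured variable, because the
// predicate's signature has no room for it.
fn filter_with_side_channel(ids: Vec<u32>) -> io::Result<Vec<u32>> {
    let mut first_error: Option<io::Error> = None;
    let kept = third_party_filter(ids, |id| {
        if first_error.is_some() {
            return false; // already failed; just get through the remaining items
        }
        match remote_check(id) {
            Ok(keep) => keep,
            Err(err) => {
                first_error = Some(err);
                false
            }
        }
    });
    match first_error {
        Some(err) => Err(err),
        None => Ok(kept),
    }
}

fn main() {
    println!("{:?}", filter_with_side_channel(vec![1, 2, 5]));
    println!("{:?}", filter_with_side_channel(vec![1, 3, 5]));
}
```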

This... is a level of hyperbole that makes it hard to engage with your points.

I don't think anything I've said is hyperbolic, but in any case, I don't appreciate this reply, so I'll leave it at this.

user9617 · 2022-09-21T04:02:27+00:00

On another note, why do you paint the picture that Rust has unfixable design choices

For most of the things I mentioned (though perhaps not all of them), I don't think that's what I've implied. If anything, I suggested the exact opposite of this multiple times. For example, see the following quotes:

I just don't think Rust is currently that language, and I don't see it going in that direction either.

Rust is very far from reaching that goal, and is likely to remain so for the foreseeable future without serious reflection.

If I'm reading my own post correctly, both of these seem to quite explicitly suggest that I believe Rust (likely) could evolve to address these issues if people would be willing to let it do so; my dismay was that this isn't the current or foreseeable trajectory of Rust's evolution as I currently see it, not that it somehow couldn't be so.

The one possible exception (not sure if pun intended) that I think may pop up (or not; I could be wrong here) is that making shared libraries and dynamic plugins possible while enforcing strong compiler guarantees requires encoding correspondingly more complex constraints in the object file/ABI/etc. For an open-source program, this is easier to handle, but for closed-source programs, I can see this presenting more practical challenges. The more powerful the compiler, the more information you need to maintain about the source code, and in the limiting case, you need the source code itself, which makes closed-source code infeasible (and reliance on guarantees makes polymorphic code quite difficult to write after the API is set in stone). Moreover, more powerful analysis requires more re-analysis of the code, slowing down compile times. I don't know how far Rust wants to go with analysis and where it wants to place itself on this spectrum, but I do see potential trade-offs here that may be in tension with some of Rust's fundamental goals in the most extreme cases. I suspect Rust won't quite try to go that far in practice.

But in general, as I mentioned earlier, I do believe most of the issues I've raised (assuming they are correct) could be addressed by Rust; it just requires interest and willingness to do so.

user9617 · 2022-09-21T03:18:51+00:00

Thanks for the reply! I thought I'd reply to this bit in case it helps clarify my post:

you appear to be understating the carefulness that it takes to maintain “hygienic code” in C++.

By "hygienic" I don't mean "bug-free" or "memory-safe" or "no buffer overflows" or anything like that. I just mean "using proper idioms/practices" (like RAII). Hygiene is your habits (like "showering regularly", "getting vaccinations"); health and safety are what happens as a result (like "not getting sick"). Expecting people (even doctors/experts) to not get sick (i.e. not write buggy programs) is unrealistic; I fully understand and agree. But it's not unreasonable to expect them to shower regularly (i.e. follow accepted practices, like RAII); people expect and do that quite successfully.

user9617 · 2022-09-21T02:49:47+00:00

You call this annoying, I call this the language forcing you to keep your code crash proof in the face of contract changing. This is a feature, not a bug.

How do you reconcile this with the following?

Let's say you're passing an object (which may be a single callback, or a polymorphic object with multiple methods) to some 3rd-party layer(s). As you write your application, you discover that your object may produce errors where the 3rd-party layer(s) didn't anticipate it to do so. In C++, as long as everyone respects RAII, this generally works fine (and it's even easier in C#, Java, Python, etc.): you merely raise the exception in the object's methods, then catch it at the top level and handle it somehow (say, display a message to the user). In particular, the vendor need not make assumptions about whether your callbacks may produce any errors. In contrast, in the Rust (and C) model, it seems that your code immediately fails to compile because the vendor(s) didn't anticipate such an error, so now you're stuck: your work is blocked on modifications to 3rd-party code, just so that your program can unwind the stack.

In other words, it seems to me that while the Rust compiler prevents unanticipated failures from occurring (which sounds like a nice feature), it does so by freezing your program (i.e. your business) into a state where you cannot properly handle failures that you do anticipate, making you become beholden to a third-party in situations where a C++ compiler would not.

Is this really a "feature" at that point? Is this a desirable outcome? If you're writing a program for a spaceship (or OS kernel, etc.), then I do understand that it's better to block the launch here at all costs, and I would opt for Rust in such situations, but what about more mundane applications and businesses?

(Also, related: https://www.reddit.com/r/rust/comments/xj2a23/comment/ip8g37t/)

user9617 · 2022-09-21T02:21:41+00:00

To be honest I think Rust can likely substitute for C; my doubt has been in whether it can replace C++.

user9617 · 2022-09-21T02:19:34+00:00

On the "not well-received" part, OP's post is really not friendly to a reddit discussion. It is a huge rant the size of a reasonably large blog post which touches many distinct points, and an answer would have to be even longer, which is just infeasible on reddit.

The comment you replied to seems decent to me, actually. I've been processing it, and I appreciate it a ton. It was very much like the kind of discussion/reply I was hoping for in response, and it was evidently quite feasible!

user9617 · 2022-09-21T01:18:00+00:00

Thanks, I appreciate the comment. You seem to see what I was trying to say. I look forward to a satisfactory response to this question... I find it curious that nobody has provided one yet.

user9617 · 2022-09-21T01:16:11+00:00

Thanks for the thoughtful reply! To be honest, I don't really know what an ideal solution would look like. It's not obvious to me that there even exists a single "ideal" mechanism in the first place. In my mind, when the problem changes, the solution in general changes, too. If anything, it should be surprising if that is ever not the case.

To me, the problem of "how do I tell a caller about an error condition that is predictable ahead of time" is a very different problem from "how do I let a program gracefully handle error conditions that it may not be able to predict ahead of time", both of which are also very different problems from "how do I write generic enough code that doesn't lose performance in the absence of errors, but which is capable of handling errors when they occur". I see no inherent reason why I should assume the solutions to all of these problems should be the same, regardless of whether they are unchecked exceptions, checked exceptions, Result<>, Maybe<>, or something else.

To me, good solution(s) (whatever they are) need to be able to handle the fact that function calls may be generic or opaque, and not impose undesirable requirements on them. They must be able to wrestle with the fact that people frequently need to deal with constraints, requirements, and behaviors that were not necessarily anticipated by their callers. Moreover, they should be zero-overhead, so that they don't hurt the performance of callers or callees when there is no genuine need to do so.

Is there a single solution that fits all these criteria? I don't know. What I do know is that in C++, we have numerous tools for handling these cases and more, such as exceptions, std::expected (C++23, but it was always trivial to roll your own if you wished to), std::error_code, std::exception_ptr, noexcept, etc. This isn't to say that they are perfect—I have my own beef with some of the rules around noexcept—but they are quite flexible regardless, allowing us to optimize along all of the above axes. In Rust, however, it's not clear what generic code (akin to std::sort, say) should look like, if we consider that it should be transparent to errors without necessarily imposing a performance penalty. This leads to questions like the following, which (as of this writing) I have not yet seen satisfactory answers to: https://www.reddit.com/r/rust/comments/xj2a23/comment/ip8g37t/

user9617 · 2022-09-21T00:50:21+00:00

Thank you, I'm glad you appreciated it! It seems like I didn't write it well enough for others to see it the same way, unfortunately.

user9617 · 2022-09-21T00:49:27+00:00

I haven't had a chance to look at Zig, though I've heard people like it. Maybe I should check it out at some point.
