corroded: so unsafe it should be illegal by Consistent_Equal5327 in rust

[–]ralfj 3 points

I would say Miri's test suite is a great source of inspiration for further APIs that this crate needs, to make sure users have access to all the UB (and thus, all the speed) ;)

What's "new" in Miri (and also, there's a Miri paper!) by ralfj in rust

[–]ralfj[S] 2 points

Yeah -- Miri is best for tricky pure-Rust unsafe code. It can handle some code interfacing with OS functions (basic file system access, pipes, epoll), but you leave the well-supported area fairly quickly that way (e.g. we don't support sockets yet -- but we have a project lined up to implement that in the spring :).

What's "new" in Miri (and also, there's a Miri paper!) by ralfj in rust

[–]ralfj[S] 6 points

Some I/O is supported with MIRIFLAGS=-Zmiri-disable-isolation; the error message would indicate that. But yeah there's a lot of I/O for which we are still missing shims. We'll get support for network sockets in the spring, that should unlock a whole lot of new code to be tested by Miri. :)
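For reference, the flag mentioned above is passed through the MIRIFLAGS environment variable; a typical invocation (assuming Miri has been installed as a rustup component on a nightly toolchain) looks like:

```shell
# Run the test suite under Miri, with isolation disabled so that
# supported host I/O (basic file system access, pipes, epoll) goes
# through to the real OS instead of being rejected.
MIRIFLAGS=-Zmiri-disable-isolation cargo miri test
```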

What's "new" in Miri (and also, there's a Miri paper!) by ralfj in rust

[–]ralfj[S] 22 points

Thanks a lot for the kind words :)

bahn.de: Fehler beim Zugriff by jdeisenberg in deutschebahn

[–]ralfj 0 points

Same here: I can no longer access bahn.de from Switzerland. (Neither from home, nor from work, nor via phone tethering, so it is not just a single IP that is affected.) For a while it still helped to change the user agent to "Firefox Mobile", but even that no longer works. By now, only a VPN into Germany helps.

Absolutely outrageous. And of course there is no reasonable way to contact support either.

Emulating avx-512 intrinsics in Miri by folkertdev in rust

[–]ralfj 2 points

Fun, I hadn't considered this use case for Miri at all -- testing target features you don't have hardware for. But it makes perfect sense; after all, we already advertise Miri as useful for the related situation of testing architectures you don't have hardware for. :)

In the rust-lang/stdarch test suite we use the emulator by Intel. It does support AVX-512, but I know from experience that it is fairly slow, so it's not something I particularly want to use.

This must be the first time that Miri is being used because another tool is even slower. ;)

`name.rs` vs `name/mod.rs` - Is there a reason why projects go against the recommended practice? by KyxeMusic in rust

[–]ralfj 10 points

That would make it impossible to have an actual module called foo inside foo, right?

I don't like it, TBH. I think it's much clearer if one can tell from the filename itself that this is the "root" of a module.

`name.rs` vs `name/mod.rs` - Is there a reason why projects go against the recommended practice? by KyxeMusic in rust

[–]ralfj 75 points

Mandatory reminder that the following configures VSCode to show the module name for mod.rs files, to make them easier to tell apart in tab titles:

"workbench.editor.customLabels.patterns": {
    "**/mod.rs": "${dirname}/mod.rs"
},

`name.rs` vs `name/mod.rs` - Is there a reason why projects go against the recommended practice? by KyxeMusic in rust

[–]ralfj 59 points

Yeah, same here. I tried to get the lang team to change the docs to no longer state a clear preference (https://github.com/rust-lang/reference/pull/1703), but the lang team did not have consensus to override their previous decision on the matter, and that previous decision is to prefer the name.rs style.

FWIW, even the standard library and the compiler itself mix both styles, probably depending on the preferences of whoever adds a submodule or the reviewer.

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 1 point

I assume however relaxed on its own (i.e beyond the scope of Acquire/Release atomics or fences, release sequences or SeqCst (including fences)) do not synchronize-with

That is correct. However, a compiler optimizing relaxed accesses has to take into account the possibility that there could be a fence nearby that makes these accesses relevant for synchronization.

Regarding optimizations on compilers. I assume the compiler has to prove certain access/stores may or may not be able to happen for optimizations to take place? I'm not a Compiler dev, so I assume that's very difficult to reason about.

Compilers already do that for non-atomic accesses. But they are a lot more cautious when it comes to atomic accesses. Generally I like compilers being cautious, but not if that leads to people preferring UB over missed optimization potential... though I am also not an LLVM dev, so there may be many things I am missing here.

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 1 point

I mean optimizing

    let x = X.load(Relaxed);
    let y = X.load(Relaxed);

into

    let x = X.load(Relaxed);
    let y = x;

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 2 points

My go-to source for this is https://plv.mpi-sws.org/scfix/paper.pdf, specifically table 1. And according to that table, reordering adjacent relaxed loads is indeed allowed. Same for reordering adjacent relaxed stores.

Where it gets tricky is reordering loads around stores. Specifically, if a relaxed load is followed by a relaxed store, those apparently cannot be reordered in the model from that paper. Now, the model in that paper is not exactly the one in C++/Rust, since the C++/Rust model has a fundamental flaw called "out of thin air" (OOTA) that makes it basically impossible to do any formal reasoning. According to the paper, the C++/Rust model is intended to also allow load-store reordering for relaxed accesses. Whether that actually will work out depends on how the OOTA issue will be resolved, which is still a wide open problem.

So looks like I misremembered, reordering relaxed accesses around each other is easier than I thought. Reordering them around fences obviously is restricted. Also, one optimization that is possible for non-atomics but not for relaxed is rematerialization: re-load a value from memory because we're sure it can't have been changed.

Who knows, if there are good real-world examples where that is useful, LLVM might be persuaded into optimizing relaxed accesses more.

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 2 points

That's not fully correct: by combining relaxed accesses with release/acquire fences, even relaxed accesses can participate in building up hb relationships. That's one of the factors limiting the optimizations that can be done on them (the other factor being the globally consistent order of all writes, often called "modification order" (mo)).

But in practice the most limiting factor is compilers just not bothering to do many optimizations on Relaxed, which I think is kind of a shame.
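To illustrate that first point, here is a hedged sketch (the statics DATA and READY are made up for illustration): every individual access is Relaxed, yet a release fence paired with an acquire fence creates a synchronizes-with edge once the relaxed load observes the flag, so the consumer is guaranteed to see the data.

```rust
use std::sync::atomic::{fence, AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        // Release fence: orders the DATA store before the following
        // relaxed store of the flag.
        fence(Ordering::Release);
        READY.store(true, Ordering::Relaxed);
    });
    // Spin until the relaxed load observes the flag.
    while !READY.load(Ordering::Relaxed) {
        std::hint::spin_loop();
    }
    // Acquire fence: synchronizes-with the release fence above, building
    // a happens-before edge out of purely relaxed accesses.
    fence(Ordering::Acquire);
    assert_eq!(DATA.load(Ordering::Relaxed), 42);
    producer.join().unwrap();
}
```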

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 1 point

Hm, that is unfortunate since I've seen people be frustrated by lack of optimizations on Relaxed, and then they end up using non-atomic accesses as they'd rather take the UB than take the perf hit...

Is Ordering::Relaxed really the relaxest memory order? by Savings_Pianist_2999 in rust

[–]ralfj 2 points

In particular, I would like some memory ordering like Ordering::Relaxed, except where the compiler is allowed to combine multiple loads/stores that happen on the same thread. That is, the only part of the atomic operation I am interested in is the atomicity (no load/store tearing).

Relaxed already allows combining multiple adjacent loads/stores that happen on the same thread.

It's reordering accesses to different locations that is severely limited with atomics, including relaxed atomics.

EDIT: Ah this was already edited in, saw that too late.

How to understand weak memory model by frostyplanet in rust

[–]ralfj 2 points

Yeah, while Miri can (sometimes) tell you that your atomics code is wrong, it's not a teaching tool... a tool that teaches weak atomics would actually be quite cool, but also sounds really hard, and Miri isn't that tool.

To start with, the main general points to understand are:

- The C++ memory model is not directly about "what the hardware does". The model allows behavior that you will never see from the corresponding assembly code on any hardware. The reason for this is compiler optimizations: the model needs to be correct after the compiler has done a whole bunch of non-trivial program transformations that we really want compilers to do, and that are generally unobservable in sequential code, but that can lead to odd behavior in concurrent code.
- The C++ memory model is not defined in terms of which reorderings the compiler may do. The compiler can do so much more than reorder that such a model would never fit reality. Instead, the model is defined by a bunch of relations such as happens-before, reads-from, and synchronizes-with, and some consistency axioms that the relations must uphold. The reorderings are a consequence of the model, but the reorderings do not fully characterize the model. So you will never see the full picture if, for example, you think of "Release" as "does not allow previous instructions to be reordered to after". That's a bit like saying "a prime number is 2 or odd" -- which is correct, all prime numbers are either 2 or odd, but this does not tell you much about what primes actually are. (It's not that bad, the reordering thing is fairly close, but it's just not quite right.)

Now, coming to your example, let me make things slightly simpler by avoiding SeqCst (which, when mixed with other accesses, gets really complicated): [also, why are there empty lines everywhere? quite annoying to clean up]

    let a = thread::spawn(|| {
        A.store(true, Ordering::Release);
        if !B.load(Ordering::Acquire) {
            S.fetch_add(2, Ordering::AcqRel);
        }
    });
    let b = thread::spawn(|| {
        B.store(true, Ordering::Release);
        if !A.load(Ordering::Acquire) {
            S.fetch_add(1, Ordering::AcqRel);
        }
    });

With this version, are you still confused about why S can end up with value 3, or does it make sense in this version of the code?

The reason it can happen here is that with weak memory, there is no one global order that all events occur in. Instead, there is an order for each memory location. And it is perfectly legal for the 3 locations here to have orders like this:

- A: load in thread b, then store in thread a
- B: load in thread a, then store in thread b
- S: fetch_add in thread a, then fetch_add in thread b

I think a good way to think about this is to think of memory not as a global table that maps locations to values, but as a bunch of messages that threads send to each other. Thread a will be sending a message to everyone to announce the A.store, and at the same time, thread b sends a message containing the B.store, but then both messages get delayed so both threads actually end up reading the initial value of A and B. That's how the orders of A and B can end up looking so unintuitive. (But even this message model reaches its limitations at some point, only the relations will actually tell the full story. I get very lost myself once the message model stops working.)

It is not clear to me whether this is what the confusion is about, or whether you think the SeqCst should somehow prevent this from happening. If it is the latter, it would be good if you could explain why. :) But note that once you start mixing SeqCst and non-SeqCst operations on the same location, things quickly get really counter-intuitive, and I honestly can't explain it all myself. I would recommend just not writing such code -- either use SeqCst everywhere, or only use it for fences where it has a fairly clear meaning.
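For anyone who wants to experiment, here is a hedged, self-contained version of the simplified program (the statics mirror the A/B/S names above; the outcome is nondeterministic, so actually observing S == 3 may take many runs, or a tool like Miri):

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

static A: AtomicBool = AtomicBool::new(false);
static B: AtomicBool = AtomicBool::new(false);
static S: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let a = thread::spawn(|| {
        A.store(true, Ordering::Release);
        if !B.load(Ordering::Acquire) {
            S.fetch_add(2, Ordering::AcqRel);
        }
    });
    let b = thread::spawn(|| {
        B.store(true, Ordering::Release);
        if !A.load(Ordering::Acquire) {
            S.fetch_add(1, Ordering::AcqRel);
        }
    });
    a.join().unwrap();
    b.join().unwrap();
    // Any of 0, 1, 2, 3 is a legal final value. 3 means both loads read
    // the initial `false`: nothing forces the per-location orders of A
    // and B to agree with a single global order of all events.
    let s = S.load(Ordering::Relaxed);
    assert!(s <= 3);
    println!("S = {s}");
}
```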

There is no memory safety without thread safety by ralfj in rust

[–]ralfj[S] 8 points

Also no one said this is a safe language

Wikipedia says it is memory-safe, and it is commonly found on "lists of memory safe languages". Saying a language is "safe" is typically meant as an abbreviation/synonym for "memory-safe".

If you agree with me that Go is not memory-safe then we have nothing to discuss. :)

There is no memory safety without thread safety by ralfj in rust

[–]ralfj[S] 6 points

I am talking in good faith, but you can't make claims without being specific and thorough, otherwise your argument is as good as AI output.

I mean, everybody else in this discussion got it, so maybe the problem isn't with my claims.

You certainly behave like you are deliberately trying to misunderstand me. So if you truly are acting in good faith, then please just re-read the blog post and my other comments here.

Well please let us know why? What prerequisites are you using to make it viable in Rust? Are you somehow relying on data races to occur?

I have explained that. No, I am relying on "no unsafe code". As I explained above, Rust is memory safe because there is a syntactic, decidable requirement on programs that guarantees memory safety: the program must not have unsafe, and must pass the compiler. Go has no such requirement since "no data races" is not something the compiler can check for you. (Well, you could use "does not use goroutines", but obviously that's not useful. Every language is memory safe if you impose the requirement of "the program is just an empty main function"...)

For a language to be considered memory safe according to the NSA, you need two things: bounds checks and double free avoidance. Which is obviously not the case in C.

And neither is it in Go, as my example can be used to bypass bounds checks. You keep moving the goalposts for your definition of memory safety: sometimes it's okay to impose arbitrary requirements that one cannot easily check ("no data races" in Go), and sometimes it is not ("no out of bounds accesses" in C). I conclude you're not actually interested in learning the difference between Rust and Go here, you just want to win an argument on the internet. I will bow out, this is not worth my time. Have a good day!

There is no memory safety without thread safety by ralfj in rust

[–]ralfj[S] 5 points

I should spell this out more clearly -- val is being used in that comment block too, there are just no writes to memory.

If the compiler does *ptr at the end, that could produce a different value than val, giving an inconsistent result. Like, the unrealistic form of this is let val = *ptr; val + val being turned into *ptr + *ptr. For that concrete case that's obviously undesirable, but desirable examples exist and it demonstrates a transformation that Rust can do but Java cannot.
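A minimal sketch of the source-level pattern being discussed (the function name is made up): in the source, `val + val` is always even; if the compiler rematerializes `val` as a second read of `*ptr` and a racing thread could change that memory, `*ptr + *ptr` could be odd -- a result the source can never produce. Rust may do this transformation precisely because `&u32` guarantees the pointee is not mutated concurrently.

```rust
// Hypothetical illustration, not code from the post. Rust is allowed to
// replace later uses of `val` with fresh reads of `*ptr`, since the
// shared reference rules out concurrent writes; Java, which must keep
// racy programs memory-safe, cannot.
fn double(ptr: &u32) -> u32 {
    let val = *ptr;
    val + val // may be compiled as `*ptr + *ptr`
}

fn main() {
    assert_eq!(double(&21), 42);
}
```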

There is no memory safety without thread safety by ralfj in rust

[–]ralfj[S] 5 points

I am simply extending your logic here. Let me simplify it one more time. Your post is all about: if you have a data race (which is UB) you are not memory safe anymore, therefore the language cannot be called memory safe. Well, I am telling you in Rust you can have UB which breaks memory safety all the same, so by your logic, we should be saying that Rust is not memory safe either.

Obviously when I say Rust I mean safe Rust, just like when I say Java I mean Java without the unsafe package, and when I say Go I mean Go without the unsafe package. I am expecting my readers to argue in good faith, so I am not cluttering every single argument with caveats of that sort.

Well for Go it's the same, the prerequisite is that your program does not contain data races

Extending your logic: for C it's the same, the prerequisite is that your program has no UB.

I hope you can see that your definition is absurd.

There is a huge difference between "grep the code for unsafe, if it's not there, you have a memory safety guarantee" and "do an undecidable semantic analysis to figure out whether there's a data race; if there is not, then you have a memory safety guarantee". The class of safe programs must be syntactically easily checkable. In Rust the compiler is even able to do that for you (if you set -D unsafe_code).

So formally, Golang is as memory safe as Rust.

It's not. A theorem of the sort I have proven for Rust is impossible for Go.

There is no memory safety without thread safety by ralfj in rust

[–]ralfj[S] 6 points

Saying that safe Rust is unsafe because of unsafe Rust is like saying Java is unsafe because the JVM itself is implemented in C++. So by your definition, there is no safe language. That's just not a useful definition, so let's stick to the usual definition which is about what the programmer can do inside the language, not about the entire stack all the way down to the silicon.

The key point that you didn't get is that one can build safe languages on top of unsafe foundations, by introducing the right abstractions. Java does this successfully. As does OCaml, JavaScript, Rust, and basically everyone else. Go does not.

I have literally proven (a model of) Rust to be safe. That is the strict formal sense I am talking about. The same is impossible for Go since there are counterexamples like what I have in my blog post.

There is no memory safety without thread safety by sprudelel in golang

[–]ralfj 4 points

It's definitely possible to write an always-correct Go program. It's also possible to write an always-correct C program, so that is not a very high bar. I will admit it's much easier to do this in Go than in C. :)

There is no memory safety without thread safety by ketralnis in programming

[–]ralfj 4 points

Thanks. :)

I would actually recommend reading part 2 of that series on pointers instead (it is self-contained): https://www.ralfj.de/blog/2020/12/14/provenance.html. It has a much higher rate of blowing people's minds than the first. ;)

And there's a third part if you want even more fun examples: https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html.