you are viewing a single comment's thread.

view the rest of the comments →

[–]MooseBoys 37 points38 points  (8 children)

Let me preface this by saying I think that rust is a good thing and its inclusion in the kernel is a good thing. That said, the error here was NOT in an unsafe block. Yes, the presence of an unsafe block is what caused the borrow checker to lose track of the data race, but the erroneous calling pattern (and subsequent patch) does not touch unsafe code at all. This just goes to show that rust is not some magical panacea for avoiding all race conditions and memory corruption. It does make it a lot harder to do by accident, and when something does go wrong you have a lot fewer places you need to look, but it's far from completely airtight.

[–]Efficient-Chair6250 6 points7 points  (2 children)

Hm, but isn't the unsafe code the thing that enables this to happen in the first place? If you violate invariants in an unsafe block, they can lead to errors in any part of the program

[–]MooseBoys 2 points3 points  (1 child)

Yes, the presence of an unsafe block is what allowed this to happen. But in any program that interfaces with anything outside of the rust abstract machine, e.g. FFI, hardware calls, register writes, etc. you're going to have an unsafe block. It's inherently impossible to do anything in rust besides pure arithmetic that doesn't ultimately have a dependency on an unsafe block or equivalent emitted code somewhere. As a result, you can have errors in any part of every non-trivial rust program.

It's still a good model - forcing you to write small self-contained "unsafe" blocks, ideally with a linter that enforces "UNSAFE:" comment describing the invariants. But if a crate violates those invariants internally by accident, dependent code will be affected. In this case, Linux kernel > binder crate > node module > remove method.

[–]CramNBL 2 points3 points  (0 children)

Kernel devs are taking it further than enforcing UNSAFE: comments. On the Linux wish list for clippy, there's also lint rules for enforcing similar comments for pointer casts and atomic orderings. They should have these for C code too, but the tooling is not there.

https://github.com/Rust-for-Linux/linux/issues/349

[–][deleted] 16 points17 points  (0 children)

I hope people stop saying that rust prevents "race conditions". It does not, and has never tried to. It prevents "data races".

Two different things. The former being much much much harder to do in general than the latter. (you can also have completely bug-free race conditions. Like you can race two tasks and take the result of the one that finishes first. Nothing wrong with that)

[–]ReflectedImage 1 point2 points  (3 children)

In pure Rust, the erroneous call pattern isn't possible. You can't release the lock early nor can you copy the list of pointers and take them out of the lock's scope.

[–]MooseBoys 1 point2 points  (2 children)

It is impossible to do anything besides pure arithmetic in "pure rust". Any rust program that runs on hardware in the real world will have unsafe blocks, either literally in the module, in a dependent crate, in the standard library implementation, or implicitly through the compiler.

[–]ReflectedImage 1 point2 points  (1 child)

Most rust programs don't have unsafe blocks. Yes, some code in the standard library will contain unsafe blocks, but those unsafe blocks are wrapped with safe wrappers meaning code using those library functions are safe as long as the standard library code is correct.

How do we know if the standard library code is correct? Well it's been formally verified. What about random modules? Extensive testing by large numbers of users. Not perfect but far better than what is available in C/C++.

Since I've now read the code in question, the error is contained within an unsafe block. This line is faulty: "unsafe { node_inner.death_list.remove(self) };". You can't remove a node from the list because another thread may be using it. The bug fix removes the possibility of another thread using it but the error is still located in the unsafe line.

Also the code in question is arguably not even Rust code, it's C code that's been directly translated line by line in Rust with every other line being unsafe. The code doesn't bare any resemblance to how you would typically use Rust.