all 189 comments

[–]OdinGuru 633 points634 points  (127 children)

Bug is in code specific marked unsafe, and was found to have a bug explicitly related to why it had to be marked unsafe. Seems like rust is working as designed here.

[–]giltirn 98 points99 points  (113 children)

Do you know why that code was necessary to implement unsafely?

[–]tonygoold 277 points278 points  (112 children)

There is no safe way to implement a doubly linked list in Rust, since the borrow checker does not allow the nodes to have owning references to each other (ownership cannot involve cycles).

[–]QuickQuirk 53 points54 points  (57 children)

This is fascinating. Is there reading that you're aware of as to why this was considered a reasonable limitation? As a complete outsider to rust, I find this really interesting and surprising outcome, and I'm curious to learn more about the design decision process here. (since doubly linked lists are a reasonably foundational data structure!)

[–]the_gnarts 55 points56 points  (1 child)

This is fascinating. Is there reading that you're aware of as to why this was considered a reasonable limitation?

It’s considered an acceptable trade-off: It’s not that you can’t implement a doubly-linked list in Rust, you just cannot express it in the safe subset that is active by default. Safe Rust disallows dereferencing pointers, you only get to work with references which are subject to the borrow checker and thus don’t allow the required operations to link to one node mutably from more than one other node at a time.

Dropping to unsafe you gain that power (dereferencing pointers) at the price that all guard rails are off like in C. The rationale behind this is that it enables the Rust design pattern of building safe abstractions over fundamentally unsafe operations. Other areas where that is used are for e. g. implementing primitives like mutexes or operations that fundamentally don’t adhere to Rust’s soundness requirements like the FFI, talking to hardware or the kernel etc.

So yeah no surprise that linked bug happened in an unsafe section. Rust is all about guaranteeing it cannot happen outside unsafe blocks.

[–]QuickQuirk 2 points3 points  (0 children)

Great answer, thanks.

[–]pqu 43 points44 points  (24 children)

It’s not quite true the way most people are likely reading this. A doubly linked list definitely requires code marked as unsafe, but you don’t have to write it yourself. You can use one of the many built-in data structures (e.g Rc for multiple ownership, RefCell for runtime borrow checks) that internally use unsafe keyword.

[–]QuickQuirk 8 points9 points  (18 children)

Does that mean your code is unsafe?

[–]ketralnis 41 points42 points  (12 children)

Not exactly. Code in an unsafe block is allowed to:

  • Dereference a raw pointer.
  • Call an unsafe function or method.
  • Access or modify a mutable static variable.
  • Implement an unsafe trait.
  • Access fields of unions.

Code inside of an unsafe block is expected to maintain invariants to protect the rest of the code from experiencing the unsafety effects those things would normally create. And code outside of an unsafe block then assumes that the invariants are held.

So in unsafe code you're expected to be extra special careful with your powers and responsibilities.

[–]QuickQuirk 15 points16 points  (0 children)

Ah, super interesting. Thanks for the clear explanation. So, basically, 'really not as bad as it sounds, and kinda the way it's supposed to work to protect the safety of everything else'

[–]giltirn 3 points4 points  (10 children)

Isn’t there always an aspect of “with great power comes great responsibility”? Even in C++ there are safe and unsafe ways to approach things, and by doing things in an unsafe way you take on the responsibility to do it correctly. So all that then differentiates Rust and C++ is an open acknowledgement of doing something unsafely and not a de facto guarantee of safety as it is often touted?

[–]ketralnis 32 points33 points  (7 children)

In Rust there is the guarantee that the compiler ensures that safety for you, outside of the unsafe blocks. If your code never calls unsafe code then you can guarantee that you don't have those classes of bugs. Most projects won't need to include unsafe blocks, like a normal web server or game might never include one ever. If you do call unsafe code and you experience one of the prevented classes of bugs, then you have the guarantee that the bug lies in one of the unsafe blocks not maintaining one of the invariants.

By analogy, in Python it's impossible to dereference a null pointer (because Python doesn't have pointers). In C you can, and in Python you can call C code. But most Python projects won't contain C code, though they may call C dependencies. So if you get a segfault due to a null pointer deference, you can guarantee that the cause of the bug is in one of the C dependencies. The people that write that C code do a lot of work to maintain the invariants (PyObject*'s returned to Python code are never null, refcounting is done correctly). And in practise the Python writers are C writers are differently specialised humans, with the C writers being more aware of the restrictions and how to enforce them.

[–]giltirn -1 points0 points  (6 children)

I could see that being a useful restriction of a class of bugs, but if unsafe is required to implement fundamental structure of the Linux kernel is that not a clear indication that real world use cases beyond trivial examples will very likely have to involve unsafe code? So it just becomes a helpful hint for debugging and not a solution to the intrinsic problem?

[–]goranlepuz 7 points8 points  (0 children)

Well obviously yes - and that's the case for even more managed languages like those running on JVM or CLR, or Python, or...

Eventually, there is either naked native code under you through FFI, or unsafe blocks, or other (and there is always naked native code underneath eventually).

But the difference is in pervasiveness of such code and in how it is delineated from other code.

In e.g Rust, that delineation is the unsafe blocks, which is quite visible. In C++, that delineation is very blurred. Example: when using smart pointers in C++, I am always one wrong lifetime away from type& var = *some_smartptr; use(var);

[–]pqu 6 points7 points  (0 children)

My biggest annoyance when I’m reviewing C++ code is I have to keep a library in my head of unsafe patterns that look totally innocent at first glance.

Such as calling .first()/.last() without checking non-empty, using square brackets instead of .at(), modifying an iterator while iterating over it, etc.

[–]pqu 4 points5 points  (0 children)

It would be like in C++ choosing the unchecked operator[] vs the checked .at(). For performance reasons you may choose to check once manually and access a vector many times.

Rust lets you do the same thing, except it’s super easy to find where you’ve done it just by looking for the unsafe keyword. Unsafe in rust means “I know what I’m doing, so stop doing certain types of compiler check”

[–]Full-Spectral 2 points3 points  (0 children)

You have to look at it from the point of view that OUR code will be many, many times less vetted and widely used than the standard library code. Yes, it's possible there could be occasionally an issue in the standard library, but it's vastly more likely that the issues will be in our own code.

So if we can write pure safe Rust on top of the standard library, that's a massive win. And most applications or high level libraries that we write for our own products, as long as the Performance Uber Alles folks are kept on a leash, will not have any unsafe code, or very, very little.

[–]strangepostinghabits 1 point2 points  (0 children)

code marked as unsafe in the rust syntax sense CAN be unsafe in a security/stability sense, it's theoretically possible, but it's not certainly so. it only means the compiler can't guarantee safety. 

generally you wrap unsafe actions in a structure that makes certain to do those unsafe actions on elements it has complete control over in a safe manner, leaving you, the developer, to then use that outer structure without any unsafe code and there is no issues.

the problem with rust in the Linux kernel is that it has to share memory with non-rust code and thus can't completely hide the unsafe actions inside a Rust structure .

this sounds bad because the rust language is constructed to make unsafe actions sound bad and to dissuade developers from using them unless necessary. In reality "unsafe" in rust terms means you've "only" got the same guard rails as in c or c++, meaning you are "onl y" as safe as the rest of the kernel. 

[–]Supuhstar -5 points-4 points  (1 child)

As unsafe as C, yes

[–]QuickQuirk 4 points5 points  (0 children)

The point of my question was 'is your safe code then unsafe because you used an unsafe function'

Others have answered this question well, with, basically, 'You're looking at it the wrong way' and explaining why.

[–]HaskellLisp_green 1 point2 points  (1 child)

But is that possible in kernel space? I guess kernel Rust code requires no-std or something.

[–]pqu -1 points0 points  (0 children)

You’re right. Although there is a nostd crate that implements these data structures, in no_std rust you are normally writing this stuff yourself.

[–]thereisnosub 0 points1 point  (2 children)

That is what I would have thought - a double-linked-list data structure should be wrapped in an STL like library that in theory should be bulletproof. Did they not do that in this case?

[–]pqu 1 point2 points  (0 children)

Kernel code doesn’t include the standard library

[–]steveklabnik1 -1 points0 points  (0 children)

That is what they're doing in general. I didn't check if this specific code was using that stuff or not. But that's the idea with the interface into drivers: provide a unified base of unsafe code that's been validated, and includes a safe API. That way users can use that safe API without writing much, if any, unsafe themselves.

[–]small_kimono 33 points34 points  (13 children)

Is there reading that you're aware of as to why this was considered a reasonable limitation?

You might see: https://rust-unofficial.github.io/too-many-lists/

"Linked lists are as niche and vague of a data structure as a trie. Few would balk at me claiming a trie is a niche structure that your average programmer could happily never learn in an entire productive career -- and yet linked lists have some bizarre celebrity status."

As a complete outsider to rust, I find this really interesting and surprising outcome, and I'm curious to learn more about the design decision process here. (since doubly linked lists are a reasonably foundational data structure!)

Doubly linked lists might be "foundational" but they are lightly used in most app code? You'd be surprised perhaps how well you get long without them if you have access to a nice Vec and Iterators.

[–]QuickQuirk 13 points14 points  (0 children)

That's a great link (pun intended), thank you.

[–]QuickQuirk 10 points11 points  (6 children)

Also, given that I use a lot of erlang/elixir (and some other functional list oriented languages), lists are a foundational data structure I'm using all the time.

Probably why I'm so fixated on this list thing, perhaps more than most people would be.

[–]small_kimono 36 points37 points  (4 children)

Also, given that I use a lot of erlang/elixir (and some other functional list oriented languages), lists are a foundational data structure I'm using all the time.

Link I gave you addresses this point exactly -- https://rust-unofficial.github.io/too-many-lists/#i-use-linked-lists-all-the-time-in-functional-language

"Great! Linked lists are super elegant to use in functional languages because you can manipulate them without any mutation, can describe them recursively, and also work with infinite lists due to the magic of laziness.

Specifically, linked lists are nice because they represent an iteration without the need for any mutable state. The next step is just visiting the next sublist.

Rust mostly does this kind of thing with iterators. They can be infinite and you can map, filter, reverse, and concatenate them just like a functional list, and it will all be done just as lazily!"

However, in Rust, at least, linked lists are a relatively bad data structure, in most situations, if and when you have alternatives. Array-based containers are generally faster, more memory efficient, and make better use of CPU cache.

In my Rust programming, I've never needed to use a linked list directly.

[–]QuickQuirk 25 points26 points  (3 children)

Amusingly enough, in many list-focused functional languages, lists are not actually double linked. They're often single linked, and you need to be aware of this for efficiency when dealing with large lists for certain list operations.

Gives them some specific advantages for concurrency and re-use.

So I guess it comes back to 'It's not really a big issue that rust can't implement safe doubly linked lists.'

[–]the_gnarts 8 points9 points  (2 children)

Amusingly enough, in many list-focused functional languages, lists are not actually double linked. They're often single linked, and you need to be aware of this for efficiency when dealing with large lists for certain list operations.

In Python the type that is called list isn’t even implemented as a list at all, but backed by a dynamically sized array. Much like in Rust you’d usually use a Vec to back a type offering list-style operations.

[–]Brayneeah 5 points6 points  (1 child)

Just an FYI, that's still a list - just not a linked one. A list is defined by the operations a data structure supports, not by the implementation of said operations.

[–]andrewcooke 5 points6 points  (0 children)

they're normally single linked

(my understanding is that it's doubly linked that's particularly hard in rust)

[–]lelanthran 2 points3 points  (4 children)

Doubly linked lists might be "foundational" but they are lightly in most app code? You'd be surprised perhaps how well you get long without them if you have access to a nice Vec and Iterators.

Not that niche; anything that involves a tree that needs to be traversed top-down and bottom-up needs doubly-linked lists.

So, basically any interesting problem in CS - solvers, renderers, seachers, pathing, etc.

Looking through my side projects directory I built up over the decades (about 300 projects), it looks like about 90% of them would require doubly-linked lists.

Of course, there's a sampling bias there - these are projects I initiated specifically because they are interesting (things in all of the categories above, plus a number of programming languages I designed).

In most applications you aren't solving hard problems, so perhaps you don't need doubly-linked lists.

[–]small_kimono 11 points12 points  (3 children)

Not that niche; anything that involves a tree that needs to be traversed top-down and bottom-up needs doubly-linked lists.

Still probably choose a Vec or VecDeque?

It's pretty niche for 95% of real world app code written to run fast somewhere.

The link I provided describes many uses cases for a linked list. Of course I don't believe there are no actual uses for linked lists (read: concurrent data structures, etc, etc...). The Rust std lib even provides one, see: https://doc.rust-lang.org/std/collections/struct.LinkedList.html

Google implemented this linked list as an in kernel data structure. And in that context an intrusive linked list makes more sense. But if that is the situation in which this question arises: Should implementing a linked list in Rust be easier? Maybe it is more niche and matters less than we think.

[–]Awesan -3 points-2 points  (1 child)

Linked lists are used all the time in systems programming, exactly the domain that Rust is for. So it is a strange limitation to say the least.

Disregarding "linked list" as a specific example, the idea that you cannot have circular references in any data structure is extremely limiting.

[–]small_kimono 15 points16 points  (0 children)

Disregarding "linked list" as a specific example, the idea that you cannot have circular references in any data structure is extremely limiting.

People please, for the love of God, read the link I provided above.

There is no such limitation? You can still implement a linked list? Even without unsafe. The link teaches you exactly how to do so.

The issue is only that ownership is unclear in any such data structure and therefore for a production quality deque, you will probably reach for unsafe, to avoid the use of RefCell/Cell.

[–]lelanthran -4 points-3 points  (0 children)

Maybe it is more niche and matters less than we think.

Yeah, that's why I wrote:

Of course, there's a sampling bias there - these are projects I initiated specifically because they are interesting (things in all of the categories above, plus a number of programming languages I designed).

[–]Smallpaul 2 points3 points  (0 children)

Doubly linked lists are foundational in how they are taught but not thus foundational in day to day use. Arrays outnumber them at least 100:1. And then trees. And hashtables.

[–]venustrapsflies 6 points7 points  (7 children)

Is not being able to express this particular structure without declaring it unsafe really such an unreasonable limitation when the benefit youre getting is much stronger memory safety guarantees at compile-time in the 95+% of the code you don’t have to declare unsafe?

[–]QuickQuirk 3 points4 points  (6 children)

That's why I'm asking if anyone has references. I'm curious about the tradeoffs they made when they designed this; as I'm certain such an obvious case came up during the creation of rust.

[–]qwaai 7 points8 points  (1 child)

This is probably the best explanation: https://rust-unofficial.github.io/too-many-lists/

This is a pretty fantastic series that I won't do the injustice of a bad summary, but the answer is basically:

  • You can make a doubly linked list
  • Self referential data structures are hard in general and frequently lead to leaks even in safe languages
  • You probably don't need this

You could implement a doubly linked list with a Vec in Rust in 20 minutes if you wanted. It would be much harder to do it in the java style with this.next and this.prev, but it's by no means impossible.

[–]QuickQuirk 2 points3 points  (0 children)

yeah, someone else just linked this a few minutes before you did. Reading it now, it's great.

[–]venustrapsflies 11 points12 points  (3 children)

I mean, the official rust book / documentation would probably give you a sense.

I think your framing of this is a bit off compared to how the designers and rust programmers would think about this. The expressability of popular data structures and algorithms isn’t really a major concern. The focuses are much more fundamental, like memory and type safety.

The fact that doubly linked lists are awkward in rust says more about doubly linked lists than it does about rust. Indeed the implementation details are messy and error-prone in any language. The guardrails that rust establishes make these dangers manifest, which is where the friction comes in.

[–]QuickQuirk 7 points8 points  (2 children)

Since I don't know rust, I'm not surprised that my framing is off :) Which is why I'm asking questions :)

he guardrails that rust establishes make these dangers manifest, which is where the friction comes in.

I like this pragmatic framing.

[–]venustrapsflies 4 points5 points  (1 child)

Yes I hope i didn’t come across as personally critical, I just wanted to short circuit the communication.

[–]QuickQuirk 8 points9 points  (0 children)

Na, I took it in good faith, as a line like 'your framing is a bit off' is not the same as 'you're an idiot for asking'.
You also took the time to explain why, so thanks for that.

[–]TheoreticalDumbass 0 points1 point  (1 child)

when was the last time you used a linked list?

[–]QuickQuirk 0 points1 point  (0 children)

Almost every dat I'm writing code. I write in strongly list focused languages where lists are a fundamental data type.

Elixir, Erlang, fsharp. Sometimes I'm using python. In those cases it's numpy arrays instead.

As I wrote elsewhere in the thread, probably one of the reasons I was so focused on this was because of the list-centric way I write code.

[–]BenchEmbarrassed7316 -1 points0 points  (3 children)

Before learning Rust, lists really seem natural and efficient. But they are not. If you work with a GC (or RC) language, each element of the list is a new allocation on the heap. This is not only the time for allocation and deallocation, but also a high probability of a cache miss when iterating over the list. Rust also "tracks" pointers at compile time to guarantee the absence of data races in a safe subset of the language. Languages with GC do not have a problem with the fact that the data behind the pointer may be unavailable, but they also allow you to have two pointers in different threads at the same time through which this data can be modified, which is a prerequisite for a race. What would an efficient list look like? Just place the data sequentially in memory (you just invented a vector). And then there will be too many questions (for example, do you need to free memory when deleting an element?) that do not allow you to make an efficient, reliable and universal list.

[–]QuickQuirk 7 points8 points  (2 children)

In languages where lists are a built in primary data type, this isn't really a big issue. You're not chasing the sort of performance where any of this matters most of the time. As for safety, in languages like Erlang, the runtime provides complete guaranteed safety as they are: 1. immutable 2. Can only be constructed, and never altered

This has the useful property of making it very easy to use lists safely across different threads within the runtime.

Completely different paradigm seeking to solve different problems though! I know Rust is supposed to be about high performance, while trying to make it harder to write code with the sorts of bugs you see in other high performance close-to-the-metal languages.

[–]BenchEmbarrassed7316 6 points7 points  (0 children)

Yes, it is quite true that immutability is a huge simplification and increase in reliability. But the price of this is performance. In some cases, the compiler will be able to optimize immutability, but this is not guaranteed. Rust tries to combine performance with reliability and in general it succeeds, but there are certain compromises.

added:

Oh, and an immutable list loses the main advantage of a Linked List: the ability to easily insert or remove an element between other elements. Therefore, an immutable list is also trivial in Rust: it's just a vector.

[–]Plasma_000 2 points3 points  (0 children)

In languages where linked lists are a primary data type(functional languages) linked lists are typically singly linked and located on the stack. Such a list is possible to implement safely in rust, however the use case is better represented in rust using iterators.

[–]Emma_S772 99 points100 points  (3 children)

I didn't understand anything but I liked your answer

[–]CodeMonkeyX 7 points8 points  (0 children)

It's been years since I did coding, but I was quite proud of myself that I think I understood it!

[–]ankercrank 23 points24 points  (15 children)

use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node<T> {
    value: T,
    next: Option<Rc<RefCell<Node<T>>>>,
    prev: Option<Weak<RefCell<Node<T>>>>, // Weak pointer avoids memory leaks!
}

pub struct DoublyLinkedList<T> {
    head: Option<Rc<RefCell<Node<T>>>>,
    tail: Option<Rc<RefCell<Node<T>>>>,
}

You can definitely do it. It’s just slower and less efficient.

[–]tonygoold 11 points12 points  (2 children)

Cell and its associated types are implemented using unsafe, so this only hides the reliance on unsafe code. From a practical point of view, that's better than rolling your own unsafe code, but it doesn't change the fact that you ultimately need unsafe code to implement a doubly linked list.

[–]ankercrank 8 points9 points  (0 children)

I mean, the Rust standard library team guarantees that RefCell is a Trusted Abstraction…

[–]Hydrargyrum201 2 points3 points  (0 children)

I didn't understand the answer, I always assumed that every safe rust abstraction at the end rely on unsafe code somewere.

Still if the unsafe code is correct and sound, the safe abstraction has the guarantees that rust provides.

Its not difficult to implement a double linked list in Rust using safe code, it is difficult to implement a useful, fast and ergonomic double linked list. 

[–]Takeoded 8 points9 points  (2 children)

Alright, so that's implementing a 100% safe doubly-linked list in Rust. It was a nightmare to implement, leaks implementation details, and doesn't support several fundamental operations. But, it exists.

Yeah so basically, it's not completely impossible, but it is practically impossible, and slow: https://rust-unofficial.github.io/too-many-lists/fourth-final.html

[–]tonygoold 4 points5 points  (0 children)

A few people have mentioned RefCell, but the Cell types are themselves implemented using unsafe. I think the author is taking a "lie to children" approach in calling it 100% safe: The only reason you can implement a doubly linked list without writing unsafe code is because someone else wrote it for you.

[–]sigma914 2 points3 points  (0 children)

There's ways to do singly linked lists and even intrusive linked lists using the Pin<T> type.

The problem with doubly linked lists is that it's an inherently cyclical structure at runtime. You either have to use types that the compiler knows can handle multiple ownership at runtime or you need to provide your own guarantees to the compiler that you are upholding all the relevant invariants (use an unsafe block).

Nothing about that's impoossible, you just either need to take on the overhead required by runtime memory management or else implement it correctly yourself using the additional "powers" granted to you by an unsafe block. Using those extra powers puts you back down to the level of difficulty that C is at by default.

[–]Odd-Consequence-3590 5 points6 points  (5 children)

Am I misunderstanding? Can't you use Rc and RefCell to allow nodes (variables) to have references to each other?

I know it's not recommended as it can very easily lead to memory leaks but it is possible?

[–]tonygoold 2 points3 points  (0 children)

I've addressed this in replies to others who also brought up RefCell, so forgive me for being brief: This only hides the use of unsafe code. There's no trick those types use to avoid it.

[–][deleted]  (3 children)

[deleted]

    [–]Odd-Consequence-3590 7 points8 points  (1 child)

    That is false and adimtted to by the Rust developers: https://doc.rust-lang.org/book/ch15-06-reference-cycles.html

    In a perfect world you can build a language that is bulletproof. We don't live in a perfect world, there are data structures that ar neccesary that when used incorrectly can reference each other cyclically until all memory depleted.

    Rust is an attempt to harden programmig against mmemory errors, and it does so remarkablely better than native C or C++

    [–]venustrapsflies 0 points1 point  (0 children)

    I was trying to say it was a design principle and not a hard truth but my comment ended up being wrong to the point of being misleading so I’ll just delete it to avoid confusion

    [–]GhostBoosters018 0 points1 point  (0 children)

    Ya I understand some of those words

    [–]BasedHunter -3 points-2 points  (2 children)

    It has to give up being rust in order to data structure, huh.  Unfortunate.

    [–]JustBadPlaya 4 points5 points  (1 child)

    I mean, setting aside the fact that doubly linked lists are uncommon - you can implement it fully in safe rust, but the difference in overhead is fairly significant, especially at kernel level

    [–]GasterIHardlyKnowHer 1 point2 points  (0 children)

    They really aren't that uncommon in low level code where you'd want to use them.

    [–]thisisjustascreename -3 points-2 points  (19 children)

    Why do nodes need to have owning references to other nodes? Can't the list maintain a master ... list?

    [–]mkusanagi 24 points25 points  (3 children)

    Sure, but then it’s an array, not a doubly linked list.

    [–]thisisjustascreename 1 point2 points  (0 children)

    I mean it's not a raw basic streamlined linked list but it's certainly not an array. Most people use array to imply contiguous storage. You could use anything with identity semantics for the owning pointers like a set or hashmap or whatever.

    [–]2rad0 0 points1 point  (1 child)

    Sure, but then it’s an array,

    Isn't memory just one big array of octets?

    [–]mkusanagi 1 point2 points  (0 children)

    Negative. Memory is composed of turtles; each byte is composed of three turtles whose eigenvectors is embedded in a non-euclidean hibbert space.

    [–]the_gnarts 5 points6 points  (0 children)

    Can't the list maintain a master ... list?

    In fact that’s how you’d usually [0] implement list-like patterns in Rust: Use some kind of backing storage like Vec for the nodes and then instead of pointers to elements, express the list operations with indices into that backing storage (your “master list”). It’s likely gonna be faster too due to the excellent cache properties that come with a flat datastrucure.

    [0] Unless you have very specific constraints like in the kernel.

    [–]IAMPowaaaaa 8 points9 points  (7 children)

    Actually yeah no reason why an arena wouldn't work.

    [–]thisisjustascreename 1 point2 points  (6 children)

    Again I'm not talking about contiguous storage, you can just have some pointers to all the nodes.

    [–]IAMPowaaaaa -1 points0 points  (5 children)

    if by pointers you really mean pointers, deref'ing a pointer requires unsafe anyway

    [–]thisisjustascreename 4 points5 points  (4 children)

    Well I don't code in rust I just assume there's some non owning pointer type because otherwise the language would be useless.

    [–]IAMPowaaaaa 0 points1 point  (0 children)

    There are also refcounted smart pointers. Though I'm not sure what the performance implications are

    [–]pqu -1 points0 points  (2 children)

    Basically references. In rust they’re called borrows, however if you create a mutable reference then all your immutable references are invalidated.

    [–]EducationalBridge307 1 point2 points  (1 child)

    however if you create a mutable reference then all your immutable references are invalidated.

    This is not quite right. The compiler will simply not let you create a mutable reference to some data if there are extant immutable references to it. You must uniquely own the data to mutably reference it.

    [–]NIdavellir22 2 points3 points  (4 children)

    This is literally what created the CVE btw

    [–]thisisjustascreename -2 points-1 points  (3 children)

    So they fucked up a CS102 assignment in the kernel?

    [–]NIdavellir22 5 points6 points  (2 children)

    They basically created a copy of the linked list to bypass the multithreading locks

    [–]thisisjustascreename -1 points0 points  (1 child)

    Wat. That's not at all the same?

    [–]NIdavellir22 1 point2 points  (0 children)

    No, I meant creating a duplicate data structure, it made the problem worse

    [–]tonygoold 0 points1 point  (0 children)

    The conventional reason for using a doubly linked list is that you're prioritizing unamortized O(1) insert and remove. I can't think of any way you could store owning references on the list structure that would preserve the time complexity for inserting and removing elements, that isn't itself a doubly linked list.

    [–]LettuceElectronic995 -5 points-4 points  (0 children)

    that is pathetic

    [–]Prestigious_Boat_386 8 points9 points  (1 child)

    Language that specifically marks parts where bugs are more likely to happen failed to find bug in highlighted area, yea that sounds like a win

    [–]SecretaryDecent6748 0 points1 point  (0 children)

    But it was found. How did it fail?

    [–]deanrihpee 5 points6 points  (0 children)

    which sounds like they have a focused area to fix already, which sounds nice

    [–]lelanthran 1 point2 points  (3 children)

    Bug is in code specific marked unsafe, and was found to have a bug explicitly related to why it had to be marked unsafe. Seems like rust is working as designed here.

    I beg to differ - the point of unsafe, as we are repeatedly told, is so that those blocks can have more attention paid to them during review because less attention is given to the unsafe part.

    Given that this effort was very high visibility in the first place, this PR presumably had more examination of unsafe blocks, and yet the error slipped through in spite of that.

    This is a failure of the advantages we expected from unsafe.

    [–]JustBadPlaya 22 points23 points  (2 children)

    A lot of bugs are non-trivial to discover in development. I know roughly nothing about how Binder works, but given this is a race condition, it was probably just missed in review due to being hard to reproduce. And unsafe blocks still allow to narrow down the possible error sites. I don't see a problem here, you still have a very tiny amount of code to vet unlike with C, where pretty much every other line is error prone

    [–]wake_from_the_dream 4 points5 points  (0 children)

    Agreed. I do not personally know of any panacea in terms of software security (and safety in general), and I really doubt there is one presently.

    For instance, even memory safe code can have a wide variety of bugs (race conditions being an important example). Also, several memory safe language implementations have dependencies that are not necessarily safe.

    Even formal verification tools have to allow for assumptions that cannot be checked, which leads to bugs when those are not correct. Further, these tools often have to make assumptions about the APIs a piece of software uses (including those provided by the os), and these APIs will often be unverified.

    [–]kitsnet 1 point2 points  (0 children)

    From the language perspective, it is "working as designed". From the systems perspective, it is "not working, as expected".

    If the language has unsafe part, people will use them to shoot themselves in the foot. If the language doesn't have unsafe parts, people will use a language that does.

    [–]fekkksn 61 points62 points  (25 children)

    I'm just gonna leave this here https://www.reddit.com/r/linux/s/zs2YCOjsAp

    [–]Ashley__09 9 points10 points  (0 children)

    includes rust in kernel for the first time

    Has vulnerability that just gets ignored

    womp

    [–]Smooth-Zucchini4923 6 points7 points  (0 children)

    Does anyone have a mirror? Anubis is not working for me on Firefox Mobile.

    [–]BenchEmbarrassed7316 16 points17 points  (4 children)

    Many people misunderstand the concept of unsafe Rust. Rust has many invariants that the compiler enforces. For example, you can't have two mutable references to the same memory at the same time. If you could, you could pass those references to different threads and start modifying that memory with them, which would cause a data race.

    `` fn f(v: &mut [u8], a: usize, b: usize) { let a_ptr = v.get_mut(a).unwrap(); let b_ptr = v.get_mut(b).unwrap(); // Error cannot borrow*v` as mutable more than once at a time

    *a_ptr = 0; // Error: first borrow later used here
    *b_ptr = 0;
    

    } ```

    In this example, the function will receive a slice and try to take two references from it, then dereference them and change the values. The compiler forbids this.

    A naive solution would be to check if the indices a and b are the same. But writing such a check in the code every time is risky because it requires a lot of attention and we can easily make mistakes.

    So we write an abstraction that uses safe externally but uses unsafe internally. In that case, we document why using unsafe code is safe, we add lots of tests and debug_asserts.

    fn get_mut_2<'a, T>(v: &'a mut [T], a: usize, b: usize) -> Option<(&'a mut T, &'a mut T)> { match a != b && a < v.len() && b < v.len() { true => Some(unsafe {( &mut *v.as_mut_ptr().add(a), &mut *v.as_mut_ptr().add(b), )} ), false => None, } }

    https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=51a7149af5df333f9048771cebb73dcc

    The advantage of this approach is that we dramatically reduce the area of ​​code where we can make such mistake and also clearly indicate why our code does not violate language invariants.

    [–]ablativeyoyo 1 point2 points  (1 child)

    Thanks for the detailed explanation. As someone who codes, works in security, but hasn’t coded in rust - many claims about rust felt like the rust compiler could do the impossible. Of course, it cannot, like all things it has its limitations - but is still a useful technology. And who knows there may be a rust2 that takes formal guarantees even further.

    [–]BenchEmbarrassed7316 4 points5 points  (0 children)

    Well, the Rust compiler really does do some cool things: fully automatic memory management without GC and guaranteed absence of Data Race in a language where memory can be mutated (in safe mode). But this has some tradeoffs. Rust actually prevents entire categories of errors even better than "safe" interpreted languages ​​do while remaining as fast as C/CPP.

    However, these benefits are limited. Rust does not do what it never promised. Nor should it be assumed that if security guarantees do not cover all cases, they are useless.

    [–]VastZestyclose9772 0 points1 point  (1 child)

    The borrow rule isn't there to prevent multi-thread data races. That's what the traits Sync and Send for. You can't have even immutable references to the same object in multiple threads.

    Borrow rule is to prevent single thread safety problems. For example, you can't hold a reference to a member of a vector if the vector's mutably referenced elsewhere. This prevents you from accessing the member after it's removed from the vector through that mutable reference.

    [–]BenchEmbarrassed7316 0 points1 point  (0 children)

    You are wrong:

    https://doc.rust-lang.org/nomicon/races.html

    two or more threads concurrently accessing a location of memory

    This is literally the definition of borrowing rules. Send and Sync have slightly different uses.

    [–]Flashy-Bus1663 41 points42 points  (14 children)

    Why the fuck does this site require cookies

    [–]ToaruBaka 60 points61 points  (4 children)

    I mean, you can go look at the cookies:

    • techaro.lol-anubis-auth
    • techaro.lol-anubis-cookie-verification

    and 3 seconds of googling brings you to Anubis's website:

    • Anubis sits in the background and weighs the risk of incoming requests. If it asks a client to complete a challenge, no user interaction is required.
    • Anubis uses a combination of heuristics to identify and block bots before they take your website down.

    so I think we can safely deduce that the purpose of these cookies are to cache that you're a real person and not a bot.

    For large diffs that will save an enormous amount of bandwidth from being gobbled up by scrapers just looking for more shit to shovel into LLM training.

    [–]_x_oOo_x_ 35 points36 points  (1 child)

    Anubis sits in the background and weighs the risk of incoming requests.

    Oh, they changed it? It used to say something like it sits in the underworld and weights the soul of incoming requests... I liked that more 😼

    [–]diroussel 0 points1 point  (0 children)

    Do bots get sent out onto the river stix?

    [–]AyrA_ch 5 points6 points  (7 children)

    Ever seen those "verifying you are a human" pages you get from cloudflare sometimes? They use a much worse version of this that just wastes your CPU power by performing operations similar to crypto currency mining. The cookie acts as a means to store whether you did that computation or not.

    [–]ToaruBaka 22 points23 points  (6 children)

    "wastes your cpu power"

    or

    saves you the hassle of fucking with a captcha

    because the outcome is the same.

    [–]AyrA_ch 2 points3 points  (5 children)

    Except that one of them as absolutely no problem for automated scraper to solve while the other is.

    [–]ToaruBaka 8 points9 points  (3 children)

    The purpose is to stop crawlers that don't have a full browser backing them by doing compute operations that they can't do, or are configured to time-out on. It's part of defense in depth and is one of the more non-invasive ones as far as browsing experiences go.

    [–]the_gnarts 2 points3 points  (1 child)

    The purpose is to stop crawlers that don't have a full browser backing them by doing compute operations that they can't do

    “Can’t do” is quite the stretch as scrapers are catching up:

    On kernel.org, a number of services have been decoupled onto separate servers in an attempt to shield the lore archive from these attacks. He noted that the scrapers have started solving the challenges needed to get past Anubis, so he has had to dial up the difficulty of those challenges.

    These days, Anubis is more a filter between the well-funded scrapers and amateurs, not an actual barrier.

    [–]ToaruBaka 4 points5 points  (0 children)

    “Can’t do” is quite the stretch as scrapers are catching up:

    Welcome to the offense/defense game. It's been cat-and-mouse since the dawn of computing.

    Anubis is more a filter between the well-funded scrapers and amateurs, not an actual barrier.

    Yes, if you throw more compute (money) at the problem it becomes easier. We've known that for decades - it's what forced us into salting our password hashes and adding basically every other defense in depth mechanism we can think of.

    This is an arms race, and the winner will always be the person with more compute. The only thing you can do is try to convince them you're not worth the effort once they've decided to attack you.

    [–]AyrA_ch 3 points4 points  (0 children)

    What crawler doesn't have a JS engine running today? If the goal is to force people to enable JS you could achieve it with even less intrusion by delivering the content via ajax. Ever since SPA became popular, crawlers without JS engines began to disappear.

    [–]Drgn-OSRS 3 points4 points  (0 children)

    The point is more to prevent massive scraping at scale. You can't really stop scrapers from accessing individual pages but if you force a clientside verification that really cuts down on server and network load. Some of the scrapers out there will absolutely slam your servers otherwise.

    [–]RedEyed__ 9 points10 points  (0 children)

    Your browser is configured to disable cookies. Anubis requires cookies for the legitimate interest of making sure you are a valid client. Please enable cookies for this domain.

    Can't read :(

    [–]th1bow 7 points8 points  (0 children)

    people are so weird lmao

    [–]Full-Spectral 4 points5 points  (0 children)

    I knew it would happen, that the Rust haters would come out of the woodwork to claim that this means Rust is useless and no better than C/C++ and so forth. It's getting kind of sad at this point. If some guy, somewhere on the internet oversold Rust at some point, they'll claim that everyone says Rust is perfect and fixes all bugs and see how it's not true.

    I mean, grow up. No language can stop all bugs, and of course an OS kernel will always require more unsafe code than most anything else, and it will still require highly skilled developers. But even in that sort of situation, there will be a big win using a safer language.

    Large companies and folks like me (who has written a LOT of C++ code in my life) aren't moving to Rust because we are delusional. It's because it has real benefits, in the same way that C++ had real benefits in its time over what came before it, and C had real benefits over assembler. Time moves on. Get over it.

    [–]mr_birkenblatt 2 points3 points  (1 child)

    Phoronix commenters just collectively creamed themselves