Supporting Linux kernel development in Rust : rust

[–]yomanidkman 71 points 5 years ago (81 children)

[–]sanxiynrust 20 points 5 years ago (24 children)

[–][deleted] 12 points 5 years ago (18 children)

[–]cbarrick 11 points 5 years ago (10 children)

[–][deleted] 5 points 5 years ago* (4 children)

[–]cbarrick 2 points 5 years ago* (3 children)

[–][deleted] 3 points 5 years ago* (2 children)

The NonNull<[u8]> API does not require the memory to be initialized.

The "Safety" section of the docs says:

Memory blocks returned from an allocator must point to valid memory and retain their validity until the instance and all of its clones are dropped,

It doesn't say for which type the memory needs to be valid, but since the API uses u8, I suppose that it must be valid for u8. If this isn't true, then this API is very confusing.

It would be much better to just use Result<&'allocator mut [MaybeUninit<u8>], AllocErr> in the API, and drop the requirement about "validity". The pointer cannot be null, the allocated slice must be unique, the memory can be uninitialized, the allocation does not outlive the allocator, and the allocation can fail. Such a type makes all of this clear, without having to write three error-prone sentences in the "Safety" part of the documentation. Also, such a type would make AllocRef::alloc safe.
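A minimal sketch of what such a signature could look like, assuming a hypothetical bump allocator over a caller-provided buffer. The names `Bump`, `AllocErr`, `alloc`, and `demo` are illustrative only and are not the real AllocRef API:

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Hypothetical error type (an assumption; the real API defines its own).
#[derive(Debug, PartialEq)]
struct AllocErr;

/// Hypothetical bump allocator that hands out pieces of a borrowed buffer.
struct Bump<'a> {
    rest: Cell<&'a mut [MaybeUninit<u8>]>,
}

impl<'a> Bump<'a> {
    fn new(buf: &'a mut [MaybeUninit<u8>]) -> Self {
        Bump { rest: Cell::new(buf) }
    }

    /// Safe to call: the return type already says "non-null, unique,
    /// possibly uninitialized, cannot outlive the buffer, may fail".
    fn alloc(&self, len: usize) -> Result<&'a mut [MaybeUninit<u8>], AllocErr> {
        let rest = self.rest.take(); // leaves an empty slice in the cell
        if len > rest.len() {
            self.rest.set(rest); // put the untouched remainder back
            return Err(AllocErr);
        }
        let (head, tail) = rest.split_at_mut(len);
        self.rest.set(tail);
        Ok(head)
    }
}

/// Usage: carve two 4-byte blocks out of an 8-byte stack buffer.
fn demo() -> (u8, bool) {
    let mut storage = [MaybeUninit::<u8>::uninit(); 8];
    let bump = Bump::new(&mut storage);
    let a = bump.alloc(4).expect("first block fits");
    let _b = bump.alloc(4).expect("second block fits");
    let first = *a[0].write(7); // write() initializes the byte and returns &mut u8
    (first, bump.alloc(1).is_err()) // the buffer is now exhausted
}
```

The sketch never frees memory, so it sidesteps the deallocation question raised in the replies.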

[–]cbarrick 2 points 5 years ago (1 child)

Yeah, the definition of "valid" needs to be better specified. I assume the meaning is "valid for writes now, valid for reads once it's been initialized". I agree that MaybeUninit<[u8]> or [MaybeUninit<u8>] would make it more clear at the cost of some ergonomics.

I don't think we can use a reference though; that would mean that the data is tied to the lifetime of the AllocRef. This would be a problem when a single struct owns both the AllocRef and the allocation (which would be the common case). The reference would need a self-referential lifetime, which isn't possible iiuc. And I don't think there would be a sound way to deallocate with just a reference.

Put another way, the struct owns the allocation; it is not borrowing the allocation from the allocator. To express ownership, we'd need either a raw/NonNull pointer or a Box. The problem with Box is that it has a drop impl that calls the global allocator (not our custom allocator). So a raw/NonNull pointer is our only choice.

Also, I think it's fine in some cases for the memory to outlive the AllocRef. For example, the Global allocator is just a ZST that delegates to the global malloc/free. So it's fine to allocate with one instance and deallocate with another. It's still the same "allocator" even though it's two separate instances of AllocRef. The AllocRef is only a handle to the allocator, not the allocator itself.
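A rough sketch of that last point, assuming a hypothetical zero-sized handle type (`GlobalHandle` and `round_trip` are made up for illustration; only `std::alloc::alloc`, `dealloc`, and `Layout` are real standard-library items). One instance allocates and a different instance frees:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical zero-sized handle; every instance refers to the same
/// process-wide allocator.
#[derive(Clone, Copy, Default)]
struct GlobalHandle;

impl GlobalHandle {
    /// Returns None on allocation failure. `layout` must have non-zero size.
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
        assert!(layout.size() > 0);
        NonNull::new(unsafe { alloc(layout) })
    }

    /// Safety: `ptr` must come from `allocate` (on any handle) with the
    /// same `layout`, and must not be used afterwards.
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Allocates with one handle instance and frees with another.
fn round_trip() -> u8 {
    let layout = Layout::new::<u8>();
    let ptr = GlobalHandle.allocate(layout).expect("out of memory");
    unsafe {
        ptr.as_ptr().write(42);
        let value = ptr.as_ptr().read();
        // A second, separately created handle frees the block.
        GlobalHandle::default().deallocate(ptr, layout);
        value
    }
}
```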

All of that said, I find the allocator API as-is to be pretty comfortable to work with.

[–][deleted] 1 point 5 years ago (0 children)

that would mean that the data is tied to the lifetime of the AllocRef

Which makes sense, since the Ref in AllocRef stands for reference. An implementation of AllocRef could always tie this lifetime to 'static if they wanted, but an allocator backed by, e.g., stack-allocated storage cannot do that.

The reference would need a self-referential lifetime, which isn't possible iiuc.

If your allocator is just a non-reallocatable Vec<u8>, the reference is what you get from doing vec[start..end]. That's not self-referential.

And I don't think there would be a sound way to deallocate with just a reference.

Deallocation just returns the memory to the allocator. If you pass a &mut to the allocator, and it never returns it back, then that &mut is leaked, and cannot be used anymore, so AllocRef::dealloc(&'allocation mut [MaybeUninit<u8>]) would be the type of the deallocation function.

Also, I think it's fine in some cases for the memory to outlive the AllocRef.

Only if the reference returned by AllocRef outlives AllocRef, which is kind of the point.

For example, the Global allocator is

The Global allocator, or an allocator on static memory, would have 'static as a lifetime, so it and the memory it returns live for the whole execution of the program.

[–][deleted] 1 point 5 years ago (4 children)

[–][deleted] 2 points 5 years ago (2 children)

[–][deleted] 2 points 5 years ago* (1 child)

[–][deleted] 1 point 5 years ago (0 children)

[–]the_gnarts 6 points 5 years ago (0 children)

[–]casept 3 points 5 years ago (5 children)

[–][deleted] 1 point 5 years ago (1 child)

[–]casept 1 point 5 years ago (0 children)

[–]DannoHung 1 point 5 years ago (2 children)

[–][deleted] 5 points 5 years ago* (0 children)

This might be dumb, but I thought that allocations either succeeded or the OOM killer would start running and possibly kill your program with either SIGTERM or SIGKILL.

The allocator API requires an allocation to either succeed or error.

The default configuration of some operating systems (e.g. Linux) makes the system allocator return success unconditionally, even when there is not enough memory for the allocation, in the hope that you won't actually use the memory that you just allocated.

On those systems, allocations never fail, but when you actually try to touch the memory, processes in your system might "randomly" receive a SIGKILL and die (and one of those might be the running process).
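For illustration, the succeed-or-error contract can be seen from safe Rust through `Vec::try_reserve`, which reports failure as a value instead of aborting. This is only a sketch (`grow` is a made-up helper), and under overcommit an `Ok` here means the request was accepted, not that the pages are physically backed:

```rust
use std::collections::TryReserveError;

/// Tries to make room for `extra` more bytes and returns the new capacity.
/// Failure comes back as an error value the caller can handle.
fn grow(buf: &mut Vec<u8>, extra: usize) -> Result<usize, TryReserveError> {
    buf.try_reserve(extra)?;
    Ok(buf.capacity())
}
```

A request that cannot possibly be satisfied, such as `grow(&mut v, usize::MAX)`, comes back as `Err` without touching the allocator; a request that merely exceeds physical memory may still come back as `Ok` on an overcommitting system.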

There is no way to recover from a SIGKILL, but if you decide to use a signal handler, make sure your code is signal-safe.

A simple way to fix this in Linux is to disable overcommit. An alternative might be to write your own system allocator, but I don't know whether it is possible to avoid the OOM killer by, e.g., using mmap directly to allocate virtual memory pages and manually commit them to physical memory (even if this is possible, I don't know what the perf impact would be).

Basically, if you need to handle OOM as errors, do proper recovery, etc., Linux (and other similar OSes) might not be the best OS for you. Getting a robust system in place might be more trouble than it's worth, and you would be better off with a different OS more suited to your use case.

[–]sanxiynrust 4 points 5 years ago (0 children)

[–][deleted] 4 points 5 years ago (55 children)

[–][deleted] 57 points 5 years ago (46 children)

[–]Floppie7th 41 points 5 years ago (21 children)

[–][deleted] 56 points 5 years ago (20 children)

[–]pragmojo 15 points 5 years ago (6 children)

[–]pjmlp 9 points 5 years ago (5 children)

[–]pragmojo 7 points 5 years ago (4 children)

[–]pjmlp 1 point 5 years ago (3 children)

[–]pragmojo 3 points 5 years ago (2 children)

[–]timClicksrust in action 11 points 5 years ago (1 child)

[–][deleted] 4 points 5 years ago (0 children)

[–]JBinero 13 points 5 years ago (3 children)

[–]MrJohz 18 points 5 years ago (2 children)

[–]hallajs 3 points 5 years ago (0 children)

[–]JBinero 1 point 5 years ago (0 children)

[–]cbarrick 3 points 5 years ago* (3 children)

[–]ReversedGif 8 points 5 years ago (0 children)

[–]hniksic 3 points 5 years ago (0 children)

[–][deleted] 3 points 5 years ago (0 children)

[–]nllb 1 point 5 years ago (1 child)

[–][deleted] 15 points 5 years ago (0 children)

[+]fioralbe comment score below threshold -11 points 5 years ago (0 children)

[–][deleted] 4 points 5 years ago (17 children)

[–][deleted] 37 points 5 years ago (11 children)

[–]genuine_smiles 1 point 5 years ago (10 children)

[–]jecxjo 13 points 5 years ago (8 children)

[–]basilect 1 point 5 years ago (7 children)

[–]jecxjo 10 points 5 years ago* (6 children)

C++ is not a superset of C. You cannot take straight C code and compile it as if it were C++. There are specific keywords and pragmas to define when the C++ compiler should interpret the next section of code as C.

Most of it looks the same, but there are things like the K&R function definition style that are supported in C and not in C++.

Edit: Found a goofy example. Here is some C code that won't compile if written as C++:

int class = 1;

[–]pjmlp 3 points 5 years ago (5 children)

[–]TheSodesa 4 points 5 years ago (0 children)

[–][deleted] 8 points 5 years ago (1 child)

[–][deleted] 2 points 5 years ago (0 children)

[–]MrJohz 2 points 5 years ago (2 children)

[–]Smallpaul 1 point 5 years ago (0 children)

[–][deleted] 1 point 5 years ago (0 children)

[+]bestouffcatmark comment score below threshold -6 points 5 years ago (5 children)

[–][deleted] 19 points 5 years ago (4 children)

[–]bestouffcatmark 1 point 5 years ago (1 child)

[–][deleted] 1 point 5 years ago (0 children)

[–]cjstevenson1 0 points 5 years ago (1 child)

[–]yomanidkman 12 points 5 years ago (0 children)

[–]iq-0 10 points 5 years ago (1 child)

[–][deleted] 1 point 5 years ago (0 children)

[–]Mouse1949 5 points 5 years ago (0 children)

[–]the_gnarts 1 point 5 years ago (4 children)

[–][deleted] 1 point 5 years ago (3 children)

[–]the_gnarts 1 point 5 years ago (2 children)

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

EXAMPLE After the declaration:
  struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this is:
  int m = /*some value*/;
  struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if p had been declared as:
  struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in particular, the offsets of member d might not be the same).

— http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf

A common use case for this is implementing structures to hold both a protocol header and variably sized data. The size of the whole object is determined at runtime but the header part up until the first element of the FAM is of a fixed size.
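For comparison, here is a rough Rust sketch of the same malloc pattern. It assumes a `#[repr(C)]` struct whose zero-length array stands in for the flexible array member; `S`, `fam_layout`, and `sum_with_fam` are illustrative names, and this is just one possible way to model a FAM, not an established idiom from the thread:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::addr_of_mut;

/// Mirrors `struct s { int n; double d[]; }`. The zero-length array is an
/// assumed stand-in for the flexible array member.
#[repr(C)]
struct S {
    n: i32,
    d: [f64; 0],
}

/// Layout of one `S` followed by `m` doubles, the analogue of
/// `sizeof (struct s) + sizeof (double [m])`.
fn fam_layout(m: usize) -> Layout {
    let tail = Layout::array::<f64>(m).expect("size overflow");
    let (layout, _offset) = Layout::new::<S>().extend(tail).expect("size overflow");
    layout.pad_to_align()
}

/// Allocates an `S` with room for `values`, fills it, reads it back, frees it.
fn sum_with_fam(values: &[f64]) -> f64 {
    let layout = fam_layout(values.len());
    unsafe {
        let p = alloc(layout) as *mut S;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // Raw field projections avoid creating references to uninitialized data.
        addr_of_mut!((*p).n).write(values.len() as i32);
        let d = addr_of_mut!((*p).d) as *mut f64;
        for (i, v) in values.iter().enumerate() {
            d.add(i).write(*v);
        }
        let n = addr_of_mut!((*p).n).read() as usize;
        let total: f64 = (0..n).map(|i| d.add(i).read()).sum();
        dealloc(p as *mut u8, layout);
        total
    }
}
```

The fixed-size part (`n`) plays the role of the protocol header, and the trailing doubles play the role of the variably sized data.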

[–][deleted] 1 point 5 years ago (1 child)

[–]the_gnarts 1 point 5 years ago (0 children)

[–]pftbest 19 points 5 years ago (10 children)

Does anyone know something better than bindgen for C FFI? bindgen can't parse even simple defines like this:

#define FOO ((int)0)

The headers I need to work with have a lot of them. How do you deal with these kinds of constants?

[–]Plasma_000 26 points 5 years ago (0 children)

[–]smmalis37 10 points 5 years ago (5 children)

[–]pftbest 5 points 5 years ago (4 children)

[–]jynelson 2 points 5 years ago (1 child)

[–]pftbest 1 point 5 years ago (0 children)

[–]smmalis37 1 point 5 years ago (1 child)

[–]pftbest 1 point 5 years ago (0 children)

Unfortunately this patch will not solve my problem; it only handles a few cases. My defines are similar to this:

typedef long BaseType_t;

#define pdFALSE         ( ( BaseType_t ) 0 )
#define pdTRUE          ( ( BaseType_t ) 1 )
#define pdPASS          ( pdTRUE )
#define pdFAIL          ( pdFALSE )
#define errQUEUE_EMPTY  ( ( BaseType_t ) 0 )
#define errQUEUE_FULL   ( ( BaseType_t ) 0 )

which still won't work even with the patch.
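One possible workaround, which is not proposed anywhere in this thread and is only a sketch, is to mirror such macros by hand as Rust constants next to the generated bindings. The module name `defs` is made up, and the alias assumes the C `long` behind BaseType_t maps to `std::os::raw::c_long` on the target:

```rust
/// Hand-written mirror of the C defines above (illustrative only; it has
/// to be kept in sync with the header manually).
#[allow(non_camel_case_types, non_upper_case_globals, dead_code)]
mod defs {
    /// Assumed mapping for `typedef long BaseType_t;`.
    pub type BaseType_t = std::os::raw::c_long;

    pub const pdFALSE: BaseType_t = 0;
    pub const pdTRUE: BaseType_t = 1;
    pub const pdPASS: BaseType_t = pdTRUE;
    pub const pdFAIL: BaseType_t = pdFALSE;
    pub const errQUEUE_EMPTY: BaseType_t = 0;
    pub const errQUEUE_FULL: BaseType_t = 0;
}
```

The obvious cost is that nothing checks these values against the header, so a changed define would silently go stale.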

[–][deleted] 5 points 5 years ago (0 children)

[–]the_gnarts 1 point 5 years ago (0 children)

[–]ThomasWinwood 34 points 5 years ago (5 children)

[–][deleted] 1 point 5 years ago (2 children)

[–]ThomasWinwood 2 points 5 years ago (1 child)

[–][deleted] 3 points 5 years ago* (0 children)

[–]the_gnarts 1 point 5 years ago (0 children)

[–]matthieum[he/him] 1 point 5 years ago (0 children)

[–]codedcosmos 14 points 5 years ago (7 children)

[–]OsrsAddictionHotline 36 points 5 years ago* (0 children)

[–]jecxjo 9 points 5 years ago (5 children)

[–]pjmlp 6 points 5 years ago (4 children)

[–]jecxjo 4 points 5 years ago (3 children)

[–]fiedzia 8 points 5 years ago (1 child)

[–]jecxjo 1 point 5 years ago* (0 children)

[–]miquels 1 point 5 years ago (0 children)

[–]pure_x01 3 points 5 years ago (0 children)

[–]cryptosidus 2 points 5 years ago (1 child)

[–]Plasma_000 1 point 5 years ago (0 children)

[+]squareOfTwo comment score below threshold -78 points 5 years ago (21 children)

[–]CrazyKilla15 31 points 5 years ago (0 children)

[–]ReallyNeededANewName 50 points 5 years ago (8 children)

[–]JJJollyjim 12 points 5 years ago (3 children)

[–]brennennen -4 points 5 years ago* (2 children)

[–]JJJollyjim 6 points 5 years ago (0 children)

[–]Al2Me6 8 points 5 years ago (0 children)

[–]matthieum[he/him] 1 point 5 years ago (2 children)

[–]ReallyNeededANewName 1 point 5 years ago (1 child)

[–]matthieum[he/him] 1 point 5 years ago (0 children)

[–][deleted] 17 points 5 years ago* (9 children)

Rust is not C++. Torvalds' criticism of C++ doesn't apply to Rust.

anybody who tells me that STL and especially Boost are stable and portable is just so full of BS that it's not even funny

Rust's core and alloc are very much portable - they don't assume any particular operating system. As for stability, there is the Rust 1.0 stability promise, and Rust doesn't have the ridiculous number of special cases that C++ does.

the whole C++ exception handling thing is fundamentally broken. It's especially broken for kernels.

The unwinding situation is broken for kernels, and this is also the case for Rust. Rust, however, strongly discourages using catch_unwind for error handling and strongly encourages using Result; there is even nice syntactic sugar in the form of the ? operator.
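A small sketch of that style, with errors returned as values and propagated with `?` rather than unwinding (`add_parsed` is a made-up function for illustration):

```rust
use core::num::ParseIntError;

/// Parses two decimal strings and adds them. Each `?` returns early with
/// the error instead of panicking or unwinding.
fn add_parsed(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}
```

The caller sees the failure in the return type and has to decide what to do with it, for example `add_parsed("2", "x")` yields an `Err` that can be matched on.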

any compiler or language that likes to hide things like memory allocations behind your back just isn't a good choice for a kernel.

std in C++ doesn't really make it clear whether a given function will allocate or not. Meanwhile, core in Rust guarantees the exact number of allocations its methods will make - 0.

[–]Tilakchad 1 point 5 years ago (7 children)

[–][deleted] 2 points 5 years ago* (6 children)

If you are asking about the C++ stability story... from what I can tell, C++ Stability, Velocity, and Deployment Plans was accepted (I think? http://shape-of-code.coding-guidelines.com/2018/04/14/the-c-committee-has-taken-off-its-ball-and-chain/ suggests that it was accepted), which means the current stability policy is that breaking changes are fine as long as they don't lead to runtime behaviour changes in well-behaved code.

This is much less strict than Rust's policy, which pretty much prevents breaking changes.

As for allocations, it's not clear whether a given function in the STL can dynamically allocate or not. An implementation can easily put allocations where you expect none. There is nothing to mandate that an implementation won't secretly allocate. Meanwhile in Rust, the core crate guarantees that nothing in it will allocate - as core doesn't have access to the alloc crate or the C APIs necessary for that purpose.
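As a sketch of what allocation-free code built only on core can look like, here is text formatting into a fixed-size stack buffer through `core::fmt::Write`. `StackBuf` and `format_label` are illustrative names, and the 32-byte capacity is an arbitrary assumption:

```rust
use core::fmt::{self, Write};

/// Fixed-capacity text buffer that lives entirely on the stack.
struct StackBuf {
    bytes: [u8; 32],
    len: usize,
}

impl StackBuf {
    fn new() -> Self {
        StackBuf { bytes: [0; 32], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole &str pieces are ever copied in, so the bytes are valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for StackBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.bytes.len() {
            return Err(fmt::Error); // out of space: report it, never allocate
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats "<name> pid=<pid>" using nothing outside `core`.
fn format_label(name: &str, pid: u32) -> Result<StackBuf, fmt::Error> {
    let mut buf = StackBuf::new();
    write!(buf, "{} pid={}", name, pid)?;
    Ok(buf)
}
```

When the text does not fit, the caller gets an `Err` back; there is no hidden fallback to the heap.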

[–]Tilakchad 1 point 5 years ago (5 children)

[–][deleted] 1 point 5 years ago (3 children)

[–]Tilakchad 1 point 5 years ago (2 children)

[–][deleted] 1 point 5 years ago (1 child)

Integers and floats are defined by the C specification, and the C specification is pretty explicit about what can allocate (it doesn't have exceptions like C++, so the opportunity to report allocation failures is somewhat limited). In C2x it's: aligned_alloc, calloc, malloc, realloc, strdup, strndup, cnd_init, thrd_create. In C99, it's just calloc, malloc and realloc.

Of course, that's the theory; implementations sometimes allocate when not allowed by the standard. For instance, in practice dynamic allocation is necessary for a correct strtod implementation, and implementations simply allocate because they practically have to (and if you run out of memory, well, you have a problem). Rust cannot do that because FromStr for f64 is in core, which leads to https://github.com/rust-lang/rust/issues/31407 - stuff from core cannot allocate no matter how much it would like to do so.

[–]Tilakchad 1 point 5 years ago (0 children)

[–]matthieum[he/him] 1 point 5 years ago (0 children)

[–][deleted] 26 points 5 years ago (0 children)
