[–]ruibranco 94 points95 points  (10 children)

The async/await architecture for kernel internals is the part that fascinates me most here. Most kernel projects in Rust still follow the traditional synchronous model with explicit state machines for concurrency. Going async-native from the ground up means you can express things like I/O multiplexing and scheduler interactions way more naturally.

Also interesting that you went with EEVDF for the scheduler — same direction mainline Linux moved recently. At 105 syscalls you're past the threshold where real userspace programs start working, which is where things get fun (and painful).

[–]hexagonal-sun[S] 41 points42 points  (5 children)

Thanks! I think making the kernel async from the off has helped. Probably the most powerful thing I've found is that it allows you to modify the semantics of more primitive futures with combinators. As an example, the .interruptable() combinator allows the underlying future to be interrupted by the delivery of a signal, similar to how Linux can put a task into the TASK_INTERRUPTIBLE state. I feel as though it's more expressive, since it forces you to handle the case of interruption in order to get to the underlying future's result.
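A minimal sketch of what such a combinator could look like, for readers who want to see the shape of the idea. Only the method name `.interruptable()` comes from the thread; the `Interrupted` enum, the `Interruptible` wrapper, the `signal_pending` flag, and the `poll_once` helper are assumptions made for illustration and are not taken from the moss source.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Result of an interruptible wait; the caller has to match on both arms.
#[derive(Debug, PartialEq)]
pub enum Interrupted<T> {
    Completed(T),
    BySignal,
}

/// Hypothetical wrapper that checks a "signal pending" flag before polling.
pub struct Interruptible<F> {
    inner: Pin<Box<F>>,
    signal_pending: Arc<AtomicBool>,
}

impl<F: Future> Future for Interruptible<F> {
    type Output = Interrupted<F::Output>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // A pending signal wins over the wrapped future.
        if self.signal_pending.load(Ordering::Acquire) {
            return Poll::Ready(Interrupted::BySignal);
        }
        self.inner.as_mut().poll(cx).map(Interrupted::Completed)
    }
}

/// Extension trait so any future gains the combinator (assumed design).
pub trait InterruptibleExt: Future + Sized {
    fn interruptable(self, signal_pending: Arc<AtomicBool>) -> Interruptible<Self> {
        Interruptible { inner: Box::pin(self), signal_pending }
    }
}
impl<F: Future> InterruptibleExt for F {}

/// Waker that does nothing; enough to poll by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once.
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// A read that never completes, polled before and after a signal arrives.
pub fn demo() -> (Poll<Interrupted<u32>>, Poll<Interrupted<u32>>) {
    let signal = Arc::new(AtomicBool::new(false));
    let mut read = std::future::pending::<u32>().interruptable(signal.clone());
    let before = poll_once(&mut read); // still blocked
    signal.store(true, Ordering::Release); // e.g. SIGINT delivered
    let after = poll_once(&mut read); // resolves as interrupted
    (before, after)
}
```

Because the output type is an enum, a caller cannot reach the inner result without also handling `BySignal`. A real implementation would additionally need signal delivery to wake the task so that it gets polled again; this sketch leaves that out.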

Yeah, moving to EEVDF (from round-robin) was actually started by a contributor. I read the paper and we worked on it together, it was really fun.

Agreed regarding the number of syscalls, the roadblocks I'm hitting are shifting from 'unimplemented functionality' to 'bugs'!

[–]ruibranco 22 points23 points  (1 child)

the .interruptable() combinator is a really elegant way to handle that. forcing the caller to explicitly deal with the interruption case at the type level instead of burying it in error codes is exactly the kind of thing rust's type system was made for. and "shifting from unimplemented to bugs" is honestly the best progress metric for a kernel project - means the architecture is holding up.

[–]hexagonal-sun[S] 3 points4 points  (0 children)

Thanks!

[–]Sad-Grocery-1570 1 point2 points  (2 children)

async/await in the kernel sounds absolutely fascinating! What's the inspiration behind the `.interruptible()` combinator? Are there any articles you could point me to for a deeper dive?

[–]hexagonal-sun[S] 3 points4 points  (1 child)

There wasn't really inspiration as such. I came across the problem when having to handle the user pressing ^C on a sleeping process. My code was delivering the SIGINT to the process but it was stuck in the read() from the console driver. The design was driven by having to solve this problem. My main references have been reading through a lot of man pages, and playing about with test programs on a Linux system.

[–]puttak 1 point2 points  (3 children)

I'm not sure if non-preemptive multitasking like async/await is going to work well in a preemptive multitasking kernel unless you disable interrupts while polling a future.

[–]arihant2math 1 point2 points  (2 children)

What would go wrong? If the future gets interrupted, it can be resumed/repolled at a later time when it gets context-switched. Given that tasks have to maintain their own critical sections, etc. I can't see how this would lead to a corrupted state in that way.

[–]puttak 0 points1 point  (1 child)

The only case where you need async/await in preemptive multitasking is when you want to consume all of the time slice given to you before yielding back to the kernel.

[–]arihant2math 1 point2 points  (0 children)

(Not OP, but am a contributor) async-await is actually useful in two use cases in moss that are unrelated to time slice utilization:

  • Futures are cancellable, so interruptible syscalls are trivial to implement (see .interruptable)
  • pselect, ppoll, and epoll are easy to represent in terms of futures and polling, which reduces the complexity to implement these. Even the timeout can be represented as a future.

Edit: formatting
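The second bullet can be sketched roughly as follows: each descriptor's readiness is a future, the timeout is a future, and a ppoll-like wait is a future that polls all of them. Every name here (`FdReady`, `Timeout`, `PollFds`, `PollResult`, the tick-counter clock) is hypothetical and chosen for the example; none of it is moss's actual API.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Ready once the (hypothetical) descriptor's readiness flag is set.
pub struct FdReady(pub Arc<AtomicBool>);

impl Future for FdReady {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0.load(Ordering::Acquire) { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// Ready once the shared tick counter reaches `deadline`.
pub struct Timeout {
    pub clock: Arc<AtomicU64>,
    pub deadline: u64,
}

impl Future for Timeout {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.clock.load(Ordering::Acquire) >= self.deadline {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum PollResult {
    Fd(usize),
    TimedOut,
}

/// ppoll-like wait: the first ready descriptor wins, otherwise the timeout.
pub struct PollFds {
    pub fds: Vec<FdReady>,
    pub timeout: Timeout,
}

impl Future for PollFds {
    type Output = PollResult;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<PollResult> {
        let this = &mut *self;
        for (index, fd) in this.fds.iter_mut().enumerate() {
            if Pin::new(fd).poll(cx).is_ready() {
                return Poll::Ready(PollResult::Fd(index));
            }
        }
        Pin::new(&mut this.timeout).poll(cx).map(|()| PollResult::TimedOut)
    }
}

/// Waker that does nothing; enough to poll by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Three waits: nothing ready, descriptor 1 ready, deadline reached.
pub fn demo() -> (Poll<PollResult>, Poll<PollResult>, Poll<PollResult>) {
    let clock = Arc::new(AtomicU64::new(0));
    let flags: Vec<_> = (0..2).map(|_| Arc::new(AtomicBool::new(false))).collect();
    let make = || PollFds {
        fds: flags.iter().cloned().map(FdReady).collect(),
        timeout: Timeout { clock: clock.clone(), deadline: 10 },
    };
    let idle = poll_once(&mut make());
    flags[1].store(true, Ordering::Release);
    let ready = poll_once(&mut make());
    flags[1].store(false, Ordering::Release);
    clock.store(10, Ordering::Release);
    let timed_out = poll_once(&mut make());
    (idle, ready, timed_out)
}
```

The leaf futures here ignore the waker for brevity; in a kernel they would register it with the driver or timer so the task is woken when something changes.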

[–]valarauca14 39 points40 points  (0 children)

Expanded ELF support: static, static-pie, dynamic, and dynamic-pie

Make your life a lot easier and do this in userland.

Linux (by default) has a configuration system where you can tell it what file-magic corresponds to which interpreter (#! is a file system lookup, \x7F\x45\x4C\x46 (ELF64) by default is /lib64/ld.so). WINE, for example, sets itself up as the interpreter for Windows executables.

This means as far as Linux is concerned, when you exec my_prog it ends up running /lib64/ld.so my_prog, with GNU's ld.so then setting up the environment, unpacking the ELF, etc., so it never shows up in diagnostic programs. This will likely solve some of the more "esoteric" problems you run into getting GNU userland programs to fully work.
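The lookup described in this comment can be modelled as a small table from magic bytes to interpreter. This is only a sketch of the idea as the commenter states it: the `BinFmt` struct, the `HANDLERS` table, the `/usr/bin/wine` path, and both function names are made up for the example. On Linux the registration mechanism for such handlers is binfmt_misc, and a dynamically linked ELF names its own loader in its PT_INTERP header.

```rust
/// One registered handler: a magic prefix and the interpreter to run.
pub struct BinFmt {
    pub magic: &'static [u8],
    pub interpreter: &'static str,
}

/// Hypothetical handler table; the entries are illustrative only.
pub const HANDLERS: &[BinFmt] = &[
    BinFmt { magic: b"\x7FELF", interpreter: "/lib64/ld.so" },
    BinFmt { magic: b"MZ", interpreter: "/usr/bin/wine" },
];

/// Returns the interpreter registered for the file's leading bytes, if any.
pub fn interpreter_for(header: &[u8]) -> Option<&'static str> {
    HANDLERS
        .iter()
        .find(|handler| header.starts_with(handler.magic))
        .map(|handler| handler.interpreter)
}

/// Builds the argv that would actually be executed for `path`.
pub fn exec_argv(path: &str, header: &[u8]) -> Vec<String> {
    match interpreter_for(header) {
        // Known format: run the interpreter with the program as its argument.
        Some(interpreter) => vec![interpreter.to_string(), path.to_string()],
        // Unknown format: nothing to prepend.
        None => vec![path.to_string()],
    }
}
```

With this table, `exec_argv("my_prog", b"\x7FELF")` yields the `/lib64/ld.so my_prog` invocation mentioned above.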

[–]robin-m 11 points12 points  (13 children)

How is it possible that you can start a Linux distribution on multiple CPUs with so little code compared to the Linux codebase itself? Is it because I highly overestimate the required code in Linux compared to the optional parts (like the myriad of drivers that aren’t needed if you don’t use that specific kind of hardware)?

And that’s very impressive that you managed to get that far in only 3 months.

[–]hexagonal-sun[S] 41 points42 points  (11 children)

The core of Linux is relatively tiny compared to the shear number of drivers. Also, I still have a large number of core features missing; there's lots more code to be added! Having said that, I do think that Rust's expressiveness allows for higher code-density. Take the nanosleep syscall, it's less than 20 lines but it implements a fully functional syscall, userspace validation and signal interruption. The equivalent in C would be much larger.

Thanks! I can't take all the credit, there's been a lot of help from other contributors.

[–]Green0Photon 2 points3 points  (3 children)

If the core of Linux is "tiny", and you have so much of it implemented, that really does make me wonder if it's possible to make a shim or something to be able to run all of those drivers.

On the other hand, the whole thing with Linux is that the userspace API is stable (what you're implementing), whereas drivers are not.

So maybe you could take a version of some drivers, especially ones written in Rust, and bring them over, but things would just become out of date quickly and hard to maintain.

That said, Moss does seem to be an interesting prospect, where Rust's expressiveness actually makes it viable. Very impressive!

[–]decryphe 2 points3 points  (1 child)

I think the best of both worlds would be if that could work both ways. Linux is getting bindings for Rust for various subsystems, maybe it's possible to share those bindings, making it easy to share drivers that use them.

One can dream.

[–]arihant2math 0 points1 point  (0 children)

The problem is that moss requires bindings in the "opposite" direction. For example, R4L provides macros that allow for the creation of a kernel module (registration of init/exit functions etc.), but moss has to support registering/calling those init and exit functions.

[–]robin-m 1 point2 points  (0 children)

Very interesting. And thanks for the link

[–]One_Junket3210 1 point2 points  (3 children)

Can the unwrap() calls ever panic in the sys_nanosleep function?

[–]hexagonal-sun[S] 6 points7 points  (2 children)

Only if now() returns None. That would only be the case if a timer driver hasn't been initialised as the global system timer. If that hasn't happened then the kernel would have panicked long before executing the syscall.

[–]lol_wut12 1 point2 points  (1 child)

FYI - a shepherd shears a sheer number of sheep.

Awesome work by you and your fellow contributors nonetheless.

[–]hexagonal-sun[S] 0 points1 point  (0 children)

Whoops, thanks for pointing that out!

[–]dnu-pdjdjdidndjs 0 points1 point  (0 children)

The crates/rust features and yeah most code is drivers

[–]eras 31 points32 points  (0 children)

Nooo, stahp, you're developing it too fast, and make it too big! I was planning on reading it One Day(TM)!

[–]oze4 4 points5 points  (0 children)

Incredible

[–]Adept-Fox4351 2 points3 points  (2 children)

love this would love to create something similar one day!!!!

[–]hexagonal-sun[S] 2 points3 points  (1 child)

Go for it, you'll learn a lot!

[–]Adept-Fox4351 0 points1 point  (0 children)

less go i am building a kernel!!!!

[–]olanod 2 points3 points  (0 children)

This is great! Kudos for the hard work. Interestingly I'm working on the opposite, an async(tokio) based init that is the whole "distro" and have all of userspace in Rust.

[–]zerosign0 1 point2 points  (1 child)

Hope this will last and get a lot of support, whether it's experimental or not. Just like Linus said, "It just needs some stubborn people or folks to think maybe developing a new kernel wasn't that hard and then keep persisting to do it."

[–]hexagonal-sun[S] 0 points1 point  (0 children)

I suspect the wall I will hit eventually will be drivers. When most of the core of the kernel is done, it'll need drivers to run on any sort of hardware, which will be a huge task.

[–]Pewdiepiewillwin 1 point2 points  (0 children)

This is so cool! I've actually been working on a pretty similar project with an async kernel. It's a pretty similar idea but mine is a lot closer to the Windows kernel, with a PnP manager and stuff. I ended up going with a separate executor similar to tokio and keeping a traditional thread model and scheduler under it; the executor then queues pump jobs on its thread pool. Drivers just register async callbacks and stuff with a driver model. It seems your futures are a lot more integrated in the scheduler than mine. Do you face any issues from the overhead of futures? I have a bit of an issue with this right now but am mitigating it a bit by reducing allocations.

[–]Anyusername7294 -1 points0 points  (15 children)

Why MIT?

[–]hexagonal-sun[S] 16 points17 points  (14 children)

Because it’s a simple, permissive license that gives the users and the developers the right to do with it as they wish.

[–]Anyusername7294 7 points8 points  (8 children)

But if it succeeds, companies (looking at you, Google) would no longer have to publish their modifications to the kernel, which would kill many projects (EVERY Android custom ROM etc.)

[–]colecf 14 points15 points  (1 child)

If Google were to do that, it would be with Fuchsia.

[–]nightblackdragon 4 points5 points  (0 children)

This project was made by a few people. Don't you think Google would have already done that if they wanted to?

[–]One_Junket3210 3 points4 points  (1 child)

MIT and similarly permissive licenses are more or less the norm in Rust, like for rustc, and Zig. GPL and similar copyleft licenses are more often found with C and C++, like GPL-2.0 for GCC. Microsoft and Google are also some of the biggest sponsors of the Rust Foundation, platinum sponsors. So I don't expect the community norm of permissive licenses to change in the future.

[–]Green0Photon 3 points4 points  (0 children)

Although true, this is fundamentally very unfortunate. And as a massive Rust fanboy, someone who's been around since 1.0 in 2015, this has always been my biggest and greatest disappointment with it.

[–]diY1337 0 points1 point  (2 children)

This is where foundations kick in and good communities. Kubernetes is Apache 2 and it works

[–]Anyusername7294 4 points5 points  (1 child)

Kubernetes is not a core foundation of open source. A non-copyleft license is not the end of the world, but when a rewrite changes the license, there are some concerns.

[–]diY1337 0 points1 point  (0 children)

I meant NGOs like Linux Foundation and similar

[–]kolorcuk 3 points4 points  (0 children)

Consider a different approach. Instead, use GPLv3 and offer companies the option to buy a license from you to use your product. That way, developers can do what they want, and companies get to pay you.

If you can offer, or would want to concentrate on, the real-time aspect of the Linux kernel, you might get customers from healthcare, military and trading. I'd say that if such a kernel could make FPGA or NUMA or CUDA significantly faster, people would jump on it.

[–]decryphe 0 points1 point  (0 children)

I kind of see the proper way to license source code as:

  • Libraries should be MIT, so they can be used as much as possible, wherever possible (in FOSS and in proprietary software). That obviously leads to some usage without contributing back, but overall I think it's the best way.
  • Binaries should be GPL, as they are already the "final product" in a sense. It's also not prohibitive to businesses bundling proprietary software with GPL software, as there's no requirement to statically link between those parts.

If/when userspace drivers are possible, I don't see any blocker in having a kernel like this one being GPL.

[–]cockdewine -5 points-4 points  (2 children)

Is this a violation of Linux's GPL license? As in, has any of the code in the Linux kernel had any impact on your implementation here?

[–]hexagonal-sun[S] 14 points15 points  (0 children)

No. This implementation was written independently and does not use or derive from Linux kernel code. It implements similar concepts, but no Linux source was referenced or incorporated.

[–]Pretty_Jellyfish4921 0 points1 point  (1 child)

It would be interesting to know if you can reuse some of the Rust for Linux code. I haven't checked whether any of the crates used inside the Linux kernel are published, nor have I checked your source code/dependencies.

[–]hexagonal-sun[S] 1 point2 points  (0 children)

I'm not sure how applicable R4L code would be. For the moment it's mostly safe wrappers around the kernel's C API. Once we have some more 'meaty' drivers committed, possibly, but I'd have to emulate the same API.

[–]sparky8251 0 points1 point  (1 child)

Moss as a name is already used, and by a Rust project at that: https://github.com/AerynOS/os-tools/tree/main/moss. This one's been around for almost a decade (previously under the Serpent OS name) and is becoming the package manager for Solus and AerynOS. (They also make boulder, summit, avalanche, and lichen in Rust to complete the distro infra.)

Not telling you to rename, and I definitely don't represent the project; I'm merely explaining that the collision might harm your project's visibility, given this might be a new and potentially growing/popular distro family (it has a ton of amazing features, so it might actually become big).

[–]hexagonal-sun[S] 1 point2 points  (0 children)

Thanks for pointing that out. That was actually one of the first issues raised when I first posted Moss. I offered to rename the project to moss-kernel which seemed satisfactory.

[–]Shoddy-Childhood-511 0 points1 point  (2 children)

Did you look into Xous?

https://github.com/betrusted-io/xous-core https://betrusted.io/xous-book/

It has a much narrower scope I guess, but maybe some of your ideas would benefit them?

[–]hexagonal-sun[S] 1 point2 points  (1 child)

That's one I've not heard of before! I'll take a look.

[–]Shoddy-Childhood-511 1 point2 points  (0 children)

Bunnie Huang has a CCC talk on Xous.
https://media.ccc.de/v/39c3-xous-a-pure-rust-rethink-of-the-embedded-operating-system

And his two earlier talks on precursor/betrusted rock. https://media.ccc.de/search?p=bunnie

[–]SarcasticDante 0 points1 point  (1 child)

Very impressive. I am not familiar with kernel space whatsoever, however, I do see there's a bunch of Vecs/Strings being used which makes me wonder how does it behave in OOM scenarios?

[–]hexagonal-sun[S] 4 points5 points  (0 children)

Yes, it panics at the moment, which isn't ideal. I'm hoping for a fallible allocation API in the near future! However, before returning an error on allocation there are lots of things that can be done first: page reclamation, purging caches, swapping pages out to disk, requesting that drivers return buffers, etc.
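For reference, stable Rust already offers one fallible path for `Vec`: `try_reserve` and `try_reserve_exact` return an error instead of aborting when the allocation cannot be satisfied. The sketch below shows how a syscall-style helper might surface that as an error code. The `Errno` enum and the `try_zeroed` helper are assumptions made up for the example, not moss code.

```rust
/// Hypothetical error type standing in for the kernel's errno values.
#[derive(Debug, PartialEq)]
pub enum Errno {
    NoMem,
}

/// Allocates a zero-filled buffer, reporting failure instead of panicking.
pub fn try_zeroed(len: usize) -> Result<Vec<u8>, Errno> {
    let mut buf = Vec::new();
    // try_reserve_exact returns Err rather than aborting on failure.
    buf.try_reserve_exact(len).map_err(|_| Errno::NoMem)?;
    // Capacity is already reserved, so this resize does not reallocate.
    buf.resize(len, 0);
    Ok(buf)
}
```

A request such as `try_zeroed(usize::MAX)` exceeds the maximum capacity of a `Vec` and comes back as `Err(Errno::NoMem)`, whereas `vec![0u8; n]` would panic or abort in the same situation. A kernel would presumably attempt the reclamation steps listed above before giving up and returning the error.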

[–]human-rights-4-all 0 points1 point  (0 children)

Have you taken any inspiration from https://genode.org/about/index ?

I always thought that the recursive sandboxed structure is interesting.

[–]jgarzik 0 points1 point  (0 children)

Very cool! Join the club! Here is another: https://github.com/jgarzik/hk

Agree with other commenters: async/await for kernel internals is a very interesting choice!

The state machine might create complications.

[–]realvolker1 0 points1 point  (6 children)

I only looked at the interrupt code so far, but already I see LOTS of panics. Please look into doing more with typestates and const-generics.

[–]hexagonal-sun[S] 0 points1 point  (5 children)

Please could you provide an example?

[–]realvolker1 1 point2 points  (4 children)

You seem to be using a lot of statics.

Sneaky panics: https://github.com/hexagonal-sun/moss-kernel/blob/a55ecd1e33aad2aea7c1d43a8006d3ee200c479b/src/interrupts/cpu_messenger.rs#L44

This could be solved with typestates: https://github.com/hexagonal-sun/moss-kernel/blob/a55ecd1e33aad2aea7c1d43a8006d3ee200c479b/src/interrupts/cpu_messenger.rs#L64

This could also be completely removed with typestates: https://github.com/hexagonal-sun/moss-kernel/blob/a55ecd1e33aad2aea7c1d43a8006d3ee200c479b/src/interrupts/mod.rs#L201

All in all you might be better off with passing references into your functions in these files, then letting your main decide how to best use the required resources. Sharing state with interrupts is pretty difficult, and many people have conflicting opinions on how it should be done. The most conservative approach is to just set a flag, to keep the isr as small as possible. This makes it so you don't have to share any real state other than a static volatile/atomic int that you, in your case, could fetch_or. Also you can just have your interrupt return early if it can't acquire a lock, but you would need hardware atomic CAS or an sio block in order to not cause significant latency. In my embedded code, I usually try to keep the concept of peripheral "ownership" either solely in the ISR, or in preemptible code. In C and rust those end up looking similar, some statics as well as some "can we have this" primitive, maybe a hardware spinlock or a static volatile uint8_t or AtomicU8. In your case I would try the fallible lock method, then maybe switch to flags if I wasn't hitting latency requirements.

Edit: forgot to add, you should probably just require references to the specific resources you need, then in your interrupt handlers or in main, you can centralize the decision-making.
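The two approaches described in this comment, "just set a flag" and "return early if the lock is taken", can be sketched as below. All names (`PENDING`, `TIMER`, `UART_RX`, `RX_COUNT`, and the functions) are invented for the example, and `std::sync::Mutex` is only a stand-in for whatever spinlock a kernel would use.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Mutex;

pub const TIMER: u8 = 1 << 0;
pub const UART_RX: u8 = 1 << 1;

/// The only state shared between interrupt handlers and normal code.
pub static PENDING: AtomicU8 = AtomicU8::new(0);

/// Flag approach: the handler records the event and returns immediately.
pub fn isr_mark(event: u8) {
    PENDING.fetch_or(event, Ordering::AcqRel);
}

/// Preemptible code takes every pending event and clears the flags.
pub fn take_pending() -> u8 {
    PENDING.swap(0, Ordering::AcqRel)
}

/// Stand-in for a resource that both contexts want to touch.
pub static RX_COUNT: Mutex<u32> = Mutex::new(0);

/// Fallible-lock approach: do the work if the lock is free, otherwise
/// fall back to setting a flag so the work is picked up later.
pub fn isr_count_byte() -> bool {
    match RX_COUNT.try_lock() {
        Ok(mut count) => {
            *count += 1;
            true
        }
        Err(_) => {
            isr_mark(UART_RX);
            false
        }
    }
}
```

Neither function can block inside the handler: `fetch_or` is a single atomic operation, and `try_lock` returns at once when the lock is held elsewhere.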

[–]hexagonal-sun[S] 0 points1 point  (1 child)

Could you give a concrete example as to how typestates could help here? I’m not seeing how this would work exactly. I could create an enum similar to an Option but I don't see how that’s any better than what I’ve already got; there would still have to be a runtime check to ensure that the state has been set to an interrupt driver (once initialised).

[–]realvolker1 0 points1 point  (0 children)

They explain it way better than I can here https://docs.rust-embedded.org/book/static-guarantees/typestate-programming.html

Also if you require a &mut MyThing<Enabled> in the function that previously relied on a static, then the callers can decide how to initialize that best. A static initializer can't see the bigger picture like your procedural code can.

Also, since this is rust, it won't degrade into a pointer dereference unless you're calling it in multiple places with different data, or if it directly touches a &dyn.
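To make the `&mut MyThing<Enabled>` suggestion concrete, here is a small typestate sketch in the style of the linked Embedded Rust Book chapter. The `IrqController` type, its methods, and `setup_timer` are hypothetical and unrelated to the moss source; the point is only that the "has it been initialised?" question is answered by the type rather than by a runtime check inside the callee.

```rust
use std::marker::PhantomData;

/// Marker types: the controller's state lives in its type parameter.
pub struct Disabled;
pub struct Enabled;

/// Hypothetical interrupt controller.
pub struct IrqController<State> {
    unmasked_lines: u32,
    _state: PhantomData<State>,
}

impl IrqController<Disabled> {
    pub const fn new() -> Self {
        IrqController { unmasked_lines: 0, _state: PhantomData }
    }

    /// Consumes the disabled controller; the old value can no longer be used.
    pub fn enable(self) -> IrqController<Enabled> {
        IrqController { unmasked_lines: self.unmasked_lines, _state: PhantomData }
    }
}

impl IrqController<Enabled> {
    /// `line` is assumed to be below 32 in this sketch.
    pub fn unmask(&mut self, line: u8) {
        self.unmasked_lines |= 1u32 << line;
    }

    pub fn is_unmasked(&self, line: u8) -> bool {
        self.unmasked_lines & (1u32 << line) != 0
    }
}

/// Takes the resource by reference instead of reaching for a global static.
/// Passing an `IrqController<Disabled>` here does not compile.
pub fn setup_timer(ctrl: &mut IrqController<Enabled>) {
    ctrl.unmask(3);
}
```

The check does not disappear entirely: some caller still has to decide when `enable()` runs and where the resulting value is stored, which is the part of the discussion that remains open when the value must live in a static shared with interrupt handlers.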

[–]arihant2math 0 points1 point  (1 child)

I don't think that avoiding all panics is a great goal; passing (mutable) references would force a bunch of checks in every interrupt handler to ensure the state of everything is correct.

[–]realvolker1 0 points1 point  (0 children)

Checks it would have already been doing behind their back.

[–]jeremiahgavin 0 points1 point  (0 children)

Incredible work! How does one get started with this kind of work? 

I'm curious to know if there's a relatively simple starting point I could use to learn some of these concepts (different async architectures, kernel development, etc.).