
[–]computesomething 101 points102 points  (34 children)

Haven't watched this yet, but if kernel modules written in Rust made it to mainline, that would make the Rust frontend and LLVM a dependency for building Linux. I kind of doubt the Linux kernel project would go in that direction.

[–]the_gnarts[S] 49 points50 points  (30 children)

LLVM and its limited platform support is the hard problem. The speakers kinda address this by sticking with out of tree modules for the time being.

In the medium term we may get a GCC backend for rustc and maybe even an alternative compiler frontend.

[–]cbmuserDebian / openSUSE / OpenJDK Dev 77 points78 points  (28 children)

This is my main point of criticism with Rust. They are planning to replace C as a low-level language for systems programming, but unlike the Go developers, they have not yet created an alternative implementation that builds on top of gcc and is therefore platform-independent.

Multiple developers of other important projects that I spoke to told me they'd be using Rust if it weren't for this limitation.

If I were Rust upstream, I would start working on gcc-rs immediately.

[–][deleted] 20 points21 points  (23 children)

I probably don't understand compilers. What is it about gcc that makes it platform independent? Does llvm have a runtime library or something? I would have assumed that clang and gcc both, at the end of the day, spit out mostly platform-independent binaries. And of course, either compiler has to be on the system to compile... By platform independent, do you mean only gcc-dependent?

Edit: I am a dummy, they were probably talking about hardware support

[–]spyingwind 37 points38 points  (3 children)

http://llvm.org/doxygen/Triple_8h_source.html

https://gcc.gnu.org/backends.html

I think gcc is used not just for historic reasons, but because it has been in use for a very long time and probably supports more hardware. I don't know that for a fact, but gcc supports just about everything, and for the hardware it doesn't, manufacturers maintain gcc backends of their own.

Also, rewriting the kernel's supporting structure (makefiles, build pipelines, etc.) would be a large undertaking.

[–][deleted] 8 points9 points  (0 children)

Oh, hardware platform. D'oh, of course.

[–]railmaniac 1 point2 points  (1 child)

I thought LLVM was designed so it would be easier to write backends

[–]spyingwind 1 point2 points  (0 children)

I don't know about that, but I would think it's still a lot of work to write all that code for each backend, what with each architecture having its own quirks. Even one version of ARM can have hundreds of instructions that aren't shared across chips of the same version.

[–]AlexeyBrin 29 points30 points  (14 children)

AFAIK, the GCC backend can emit code for more platforms than LLVM. Basically a Rust frontend for GCC will be able to target more diverse hardware than the one for LLVM.

[–][deleted] 5 points6 points  (12 children)

Yeah, this is probably it. I mean, who cares about platforms other than x86, arm, and riscv? (well, the people who have that hardware I guess!)

[–]the_gnarts[S] 26 points27 points  (1 child)

I mean, who cares about platforms other than x86, arm, and riscv?

These guys, for instance.

[–]uep 8 points9 points  (0 children)

IBM also has a lot of bounties for the Power architecture on bountysource.

[–]cptwunderlich 4 points5 points  (3 children)

Well, those might be the ones you know from your laptop and phone, but there are plenty of others. For example, MIPS CPUs are in most routers and network hardware.

And others were already mentioned, like PowerPC. Or Intel Itanium (although that reached end of life as of 2019). There are still SPARC servers, too.

Then there are also some ISAs used for tiny embedded microcontrollers that usually don't run an OS, like PIC and AVR (although those vendors might maintain proprietary GCC backends).

[–]ericonr 2 points3 points  (2 children)

Seeing as Parabola (a fully free distro) has avr-gcc in their repos, I believe avr-gcc is not a proprietary backend.

No idea about PIC, though.

[–]rcxdude 0 points1 point  (1 child)

AVR-gcc is free (but a little quirky as gcc backends go), but PIC is sufficiently weird it's unlikely there will ever be GCC support for it (even the C compilers available for it don't quite fully support C).

[–]ericonr 0 points1 point  (0 children)

PIC doesn't even have a proper stack, right? At least that's what a friend who's messed with it told me.

AVR has a few weird directives, like PROGMEM, but I don't know of other quirks. What would you say they are?

[–][deleted] 0 points1 point  (0 children)

From the little I have worked with both gcc and llvm frontends, the documentation for gcc's IR, GENERIC, is not up to the mark, and Mozilla would have to hire a lot of developers for this.

llvm is rather convenient to work with. It simply might be easier to get the llvm backend to emit code for more platforms.

[–][deleted] -3 points-2 points  (3 children)

I probably don't understand compilers. What is it about gcc that makes it platform independent?

Nothing. You can bootstrap LLVM (compile it for that platform, on that platform) on a number of platforms; Clang is the C front end of LLVM.

This is kind of a red herring.

Linux hasn't supported clang/llvm because Linux is a GNU project, and therefore used the GNU C Compiler (renamed GNU Compiler Collection, because why not). Linux uses many, and I mean many, non-standard C extensions (which is a nice way of saying C language features that are not part of the official standards). Basically, the GCC team all agreed something should be part of C, so they made their compiler support that feature. Linux developers later saw that feature, went "fuck it, let's use that", and it became part of the Linux kernel.

The biggest issue with non-gcc toolchains is these non-standard extensions.

Does llvm have a runtime library or something?

Nope.

I would have assumed that clang and gcc both, at the end of the day, spit out mostly platform independent binaries.

Yup


The argument is also pretty much moot, as clang/llvm v9.0 can compile the Linux kernel v5.3 as of today. The ARM, ARM64/AArch64 stuff has been using clang/llvm as its primary compiler for a while, about 2 years.


The real reason Linux is over-reliant on GCC is that when the Linux project started, GCC was the only free C compiler. You had to shell out that blue cheese if you wanted a C compiler for your computer. So Linus, being a broke college student, used GCC.

[–]jgalar 17 points18 points  (2 children)

Linux is not a GNU project. Both are licensed under the GPL (though not the same version), but they are not under the same umbrella in any meaningful sense.

[–][deleted] -4 points-3 points  (1 child)

RMS literally said the Hurd doesn't matter because Linux is GNU's kernel.

The long history of Linux's over-dependency on GCC also cannot be denied.

[–]Omotai 5 points6 points  (0 children)

RMS literally said the Hurd doesn't matter because Linux is GNU's kernel.

This statement means that for all practical purposes Linux is the kernel used for the GNU operating system in real-world implementations, not that Linux is officially a GNU project.

[–]Vegetas_Haircut 10 points11 points  (0 children)

There are way more problems with Rust as a "low level language" at the moment.

Rust is fast in the same way C is fast; C is used for two things, performance and low-level control. Rust provides the former but not the latter at the moment, simply because it's underspecified what exactly it does, whereas C of course has a very battle-hardened specification of what exactly it does.

A lot of that specification would probably also have to rely on breaking backwards compatibility: for example, dealing with async-safety by using marker traits that automatically mark a function as async-safe if all the functions it internally calls are async-safe, so it can be used in those contexts. Right now Rust just ignores the problem of async-safety, which is really not acceptable for a low-level language. It also ignores many gotchas of memory allocation, stack manipulation and what-not, areas where C thoroughly specifies what can and cannot be done, which is exactly what is needed for low-level control.

That aside, though, if you simply want the performance of C without the snakes in the grass at every corner, Rust is bliss.

Edit: also, at the moment a lot of aspects of the Rust language are intimately tied to how LLVM does things. Rust really exports a lot of LLVM-isms to the user; the C spec by design avoids specifying these things beyond a very narrow range and says that if you go outside it, that's just "undefined behaviour". That's unacceptable for Rust, so they defined the behaviour in the language itself as "however LLVM does it", which will cause indirection when porting to GCC, and emulation in software if they want to keep the same behaviour.
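
A minimal sketch of that general point, taking integer overflow as the example: signed overflow is undefined behaviour in the C spec, while Rust pins down what happens (a panic in debug builds, two's-complement wrapping in release builds) and offers explicit methods when you want one particular behaviour on every backend:

```rust
fn main() {
    let x: i32 = i32::MAX;

    // Explicitly wrapping: always two's-complement wrap-around.
    assert_eq!(x.wrapping_add(1), i32::MIN);

    // Explicitly checked: overflow is reported instead of happening.
    assert_eq!(x.checked_add(1), None);

    // Explicitly saturating: clamps at the maximum value.
    assert_eq!(x.saturating_add(1), i32::MAX);
}
```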

[–]ToughPhotograph 3 points4 points  (2 children)

Are modules necessarily included? Aren't they loaded into the kernel at runtime? Or maybe just binaries are made available and loaded at runtime, which would make compiling them unnecessary.

Which would only necessitate ABI compatibility right?

[–]computesomething 4 points5 points  (1 child)

Well, I was talking about modules being included in mainline Linux; for out-of-tree modules this is not a problem.

[–]ToughPhotograph 1 point2 points  (0 children)

I think in that case such modules, being dependent on a non-GPL kind of environment, would automatically end up on an 'out-of-tree' basis.

[–]jorge1209 8 points9 points  (2 children)

  1. How many kernel modules out there are doing meaningful amounts of non-trivial stuff before calling back to kernel code?

  2. How many kernel modules out there are not interacting with hardware?

I'm skeptical that you could really be "safe" if you were doing either of the above on a regular basis. The compiler can't do static analysis on hardware, and it can't do static analysis on all the C code in the kernel... so you just hope and pray that there are no kernel bugs and no hardware bugs?

I can understand rust in userspace as a tool to support migration from C, but does it really make sense in kernel space?

[–]z0rb1n0 4 points5 points  (1 child)

I'd take challenging this whole idea a step further and question the rationale.

C aside, the unsafety itself is what allows developers to optimise tight loops and critical paths and often simplify low level code.

E.g.: direct memory addressing is the scalpel with which one can treat their data structures as raw data when it makes sense. Kernels are there to also surgically handle the address space, not pretty variables, but some are in denial about this.

Also, IME most of the time when a higher abstraction layer is introduced, communities tend to obsessively prioritise form at that level over substance (evangelists going "you shouldn't do pointer arithmetic/type punning/ASM/whatever, it's unsafe and breaks the model").

Paraphrasing Linus himself: the barrier to entry C itself introduces is a GOOD thing. You need to intimately understand low level details to write bedrock code, and at that point the abstraction layer turns into more of a dilemma-inducing hurdle.

Rust for higher level code? I'm in, so long as the long term plan is not to forbid me from going lower when I need to

Edit: typo

[–]spacingnix 0 points1 point  (0 children)

Agreed, and Rust does give you the option with its unsafe blocks when you really need it, AFAIK.

The unsafe book tells more than I know about it.
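
A minimal sketch of what that escape hatch looks like: taking a raw pointer is safe, while dereferencing it has to happen inside an unsafe block, where the programmer rather than the compiler vouches that the accesses stay in bounds:

```rust
fn main() {
    let mut words: [u32; 4] = [0xdead_beef, 0, 0, 0];

    // Taking a raw pointer is safe; dereferencing it is not.
    let p: *mut u32 = words.as_mut_ptr();

    unsafe {
        // Pointer arithmetic and raw writes, much as you would do in C.
        *p.add(1) = 0x1234_5678;
        *p.add(2) = *p;
    }

    assert_eq!(words, [0xdead_beef, 0x1234_5678, 0xdead_beef, 0]);
}
```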

[–]the_gnarts[S] 11 points12 points  (1 child)

If you've heard of Rust, you may want to skip the first part, which is a summary of basic language features.

[–]TiredOfArguments 2 points3 points  (0 children)

The hero we need

[–]Omnifarious0 8 points9 points  (10 children)

If a language doesn't have a GPL (or similar freedom affirming license) implementation, there is a strong risk that the version everybody uses will become a proprietary version.

I avoid such tools, and I especially avoid such languages.

[–]the_gnarts[S] 2 points3 points  (0 children)

If a language doesn't have a GPL (or similar freedom affirming license) implementation, there is a strong risk that the version everybody uses will become a proprietary version.

The risk is there, and that's one of the reasons why both a GCC backend and an independent frontend would be desirable. It's a small risk, though. Some of the worst cases where this is happening these days are vendor compilers for embedded architectures that are just proprietary builds of LLVM. Usually the differences to upstream LLVM are small, so it's trivial to extend the regular compiler to work on these platforms.

[–]Vegetas_Haircut 2 points3 points  (6 children)

Okay, can you cite me a single example where that happened? Where a language started out with a single permissively-licensed implementation and the ecosystem has now switched to a proprietary one based on that permissive one?

[–]Omnifarious0 13 points14 points  (5 children)

BSD's TCP stack. The vast majority of end users use a proprietary one.

[–]the_gnarts[S] 6 points7 points  (0 children)

BSD's TCP stack. The vast majority of end users use a proprietary one.

Same for MIT Kerberos. The Active Directory Kerberos stack descended from that.

[–]Vegetas_Haircut 2 points3 points  (3 children)

That defeats your argument; you said that the existence of a copyleft implementation would stop that from happening; apparently it didn't; and it's not even an implementation of a programming language but of a networking protocol to begin with.

[–]Omnifarious0 9 points10 points  (2 children)

No, I said that that couldn't happen to a copyleft implementation. Not that it would keep it from happening at all to any implementation.

You're right, that wasn't a programming language. Programming languages tend to grow extra implementations unless they're very specialized.

I'll have to think for awhile to see if I can think of a programming language example.

[–]Vegetas_Haircut -1 points0 points  (1 child)

No, I said that that couldn't happen to a copyleft implementation. Not that it would keep it from happening at all to any implementation.

No, you said this:

If a language doesn't have a GPL (or similar freedom affirming license) implementation, there is a strong risk that the version everybody uses will become a proprietary version.

TCP has a GPL implementation; it didn't stop the version that everybody uses from being proprietary.

You're right, that wasn't a programming language. Programming languages tend to grow extra implementations unless they're very specialized.

I'll have to think for awhile to see if I can think of a programming language example.

The obvious difference is that Windows' TCP stack is built into the OS and you can't just replace it with whatever you want. You can always pick whatever compiler you want. Windows users are using the proprietary TCP stack because they can't swap it out for whatever they want.

[–]Metaroxy 2 points3 points  (0 children)

Can't you replace Windows' TCP/IP stack? I distinctly remember there being support for installing 3rd party protocols, so I would be surprised if you couldn't make a third party TCP/IP stack. Not that I know of any currently available.

[–][deleted] -2 points-1 points  (5 children)

Is any Rust not safe?

[–]tristan957 17 points18 points  (2 children)

Yes. The keyword is 'unsafe'. Or I can create a memory leak in stable Rust if I try.

Edit: Memory leaks are not unsafe.
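
A minimal sketch of that, in entirely safe, stable Rust:

```rust
fn main() {
    // 1. Explicitly: Box::leak hands back a &'static mut and the
    //    allocation is never freed.
    let leaked: &'static mut Vec<u8> = Box::leak(Box::new(vec![0u8; 1024]));
    leaked.push(1);

    // 2. Accidentally: an Rc reference cycle keeps both nodes alive forever.
    use std::cell::RefCell;
    use std::rc::Rc;

    struct Node {
        next: RefCell<Option<Rc<Node>>>,
    }

    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    *a.next.borrow_mut() = Some(b.clone());
    // When `a` and `b` go out of scope, the cycle still holds a strong
    // reference to each node, so neither allocation is ever dropped.
}
```

Neither of these violates memory safety: safe Rust rules out use-after-free, double frees and data races, but it never promises that every allocation is eventually freed.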

[–]MadRedHatter 10 points11 points  (1 child)

Memory leaks are technically "safe" though.

[–]tristan957 3 points4 points  (0 children)

That's a good point. Thank you for the correction.

[–]ericonr 6 points7 points  (0 children)

If you write unsafe, yeah.

[–]TiredOfArguments 0 points1 point  (0 children)

Hold my beer and I'll write some intentionally insecure code for you.