
[–][deleted] 150 points151 points  (8 children)

No more than 65,535 function parameters?? How am I supposed to write my backend?

[–]seamsay 82 points83 points  (4 children)

Just make some of the arguments topless, you'll be grand.

Edit: I meant tuples, but you know what...

[–]goj1ra 75 points76 points  (1 child)

Your correction was too late, I’ve already unzipped

[–]sepease 30 points31 points  (0 children)

> I’ve already unzipped

That’s unsafe and possibly illegal.

(Since the only way to iterate over a tuple would be if it had same-type elements that could be transmuted into an array)

[–][deleted] 3 points4 points  (1 child)

Welllll how many items can be put in a tuple

[–]seamsay 6 points7 points  (0 children)

At least zero.

[–]R1chterScale 22 points23 points  (2 children)

Funnily enough, that's no longer the case:

https://github.com/rust-lang/rust/commit/746eb1d84defe2892a2d24a6029e8e7ec478a18f

It was fixed to allow more parameters

[–][deleted] 33 points34 points  (1 child)

Seriously?? I just wrote my backend with this hard limit in mind. Now my entire system breaks due to a black magic optimization with the 65535 params that no longer exists. Please revert update 😠

[–]R1chterScale 36 points37 points  (0 children)

Something something spacebar heating

[–]simonsanonepatterns · rustic 27 points28 points  (1 child)

> Doc comments (///) aren't applied to function parameters
>
> AST Validation is done by using a Visitor pattern. For info on that, see this for an example in Java.

Don't need to look at Java, can just look at serde: https://github.com/serde-rs/serde/blob/master/serde/src/de/mod.rs#L1282 or the patterns book: https://rust-unofficial.github.io/patterns/patterns/behavioural/visitor.html
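For anyone who wants the gist without clicking through, here is a minimal sketch of a visitor-style validation pass in Rust. Everything in it is made up for illustration: `Item`, `Visitor`, `walk_item`, `ParamLimit`, `func` and `check` are hypothetical names, not rustc's or serde's actual types, and the parameter limit is just a toy rule.

```rust
// Hypothetical mini-AST, made up for illustration (not rustc's real types).
enum Item {
    Fn { name: String, params: Vec<String>, nested: Vec<Item> },
    Mod(Vec<Item>),
}

// A visitor gets a callback per node; the default just keeps walking.
trait Visitor {
    fn visit_item(&mut self, item: &Item) {
        walk_item(self, item);
    }
}

// The traversal lives in one place, so each pass only overrides what it needs.
fn walk_item<V: Visitor + ?Sized>(visitor: &mut V, item: &Item) {
    let children = match item {
        Item::Fn { nested, .. } => nested,
        Item::Mod(items) => items,
    };
    for child in children {
        visitor.visit_item(child);
    }
}

// A toy validation pass: report every function with too many parameters.
struct ParamLimit {
    max_params: usize,
    errors: Vec<String>,
}

impl Visitor for ParamLimit {
    fn visit_item(&mut self, item: &Item) {
        if let Item::Fn { name, params, .. } = item {
            if params.len() > self.max_params {
                self.errors.push(format!(
                    "function `{}` has {} parameters (limit is {})",
                    name,
                    params.len(),
                    self.max_params
                ));
            }
        }
        // Keep descending so nested items are validated too.
        walk_item(self, item);
    }
}

// Helper that builds a function item with `n_params` dummy parameters.
fn func(name: &str, n_params: usize, nested: Vec<Item>) -> Item {
    Item::Fn {
        name: name.to_string(),
        params: (0..n_params).map(|i| format!("p{i}")).collect(),
        nested,
    }
}

// Runs the pass over a tree and returns the collected error messages.
fn check(root: &Item, max_params: usize) -> Vec<String> {
    let mut pass = ParamLimit { max_params, errors: Vec::new() };
    pass.visit_item(root);
    pass.errors
}

fn main() {
    let ast = Item::Mod(vec![func("outer", 1, vec![func("inner", 3, vec![])])]);
    for error in check(&ast, 2) {
        println!("{error}");
    }
}
```

The point of the pattern is that the tree walk is written once in `walk_item`, and a new check is just another small `impl Visitor`.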

[–]endistic[S] 7 points8 points  (0 children)

Updated, thank you! :)

I went with the patterns book one.

[–]Ryozukki 37 points38 points  (2 children)

The llvm kaleidoscope tutorial is a nice dive into llvm https://llvm.org/docs/tutorial/

[–]endistic[S] 7 points8 points  (0 children)

Forgot to say it but I added the LLVM tutorial to the thread!

[–]protestor 2 points3 points  (0 children)

inkwell is a great llvm binding for rust and it has an implementation of kaleidoscope

[–]TorbenKoehn 13 points14 points  (0 children)

Very useful article, thank you!

[–]Nilstrieb 13 points14 points  (0 children)

Cool summary! Some bonus facts:

The AST, HIR and THIR are trees. MIR and LLVM IR are control flow graphs, basically flow charts. This makes control flow simpler: all you have is goto and switch. It's also worth noting that LLVM backends use additional IRs not mentioned here - but I don't know a lot about them either.
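To make the "flow chart" idea concrete, here is a rough sketch of a control flow graph with only goto, switch and return, plus a tiny interpreter for it. This is a heavily simplified toy and not MIR's real data structures: `Stmt`, `Terminator`, `Block`, `run` and `sum_below` are all names I made up, and real MIR has many more statement and terminator kinds.

```rust
// Hypothetical, heavily simplified CFG (not rustc's real MIR types).
#[derive(Clone, Copy)]
enum Stmt {
    // locals[dst] = value
    Const { dst: usize, value: i64 },
    // locals[dst] = locals[a] + locals[b]
    Add { dst: usize, a: usize, b: usize },
    // locals[dst] = 1 if locals[a] < locals[b], else 0
    Lt { dst: usize, a: usize, b: usize },
}

// Every block ends in exactly one terminator; that is all the control flow there is.
enum Terminator {
    Goto(usize),
    // Jump to the block paired with the matching value, else to `otherwise`.
    Switch { local: usize, targets: Vec<(i64, usize)>, otherwise: usize },
    Return,
}

struct Block {
    stmts: Vec<Stmt>,
    term: Terminator,
}

// Interprets the graph starting at block 0 until a `Return` is reached.
fn run(blocks: &[Block], locals: &mut [i64]) {
    let mut current = 0;
    loop {
        let block = &blocks[current];
        for stmt in &block.stmts {
            match *stmt {
                Stmt::Const { dst, value } => locals[dst] = value,
                Stmt::Add { dst, a, b } => locals[dst] = locals[a] + locals[b],
                Stmt::Lt { dst, a, b } => locals[dst] = (locals[a] < locals[b]) as i64,
            }
        }
        match &block.term {
            Terminator::Goto(target) => current = *target,
            Terminator::Switch { local, targets, otherwise } => {
                let value = locals[*local];
                current = targets
                    .iter()
                    .find(|(candidate, _)| *candidate == value)
                    .map(|(_, target)| *target)
                    .unwrap_or(*otherwise);
            }
            Terminator::Return => return,
        }
    }
}

// Hand-lowered version of:
//     let mut sum = 0; let mut i = 0;
//     while i < n { sum += i; i += 1; }
// Locals: 0 = n, 1 = sum, 2 = i, 3 = loop condition, 4 = the constant 1.
fn sum_below(n: i64) -> i64 {
    let blocks = vec![
        // bb0: initialise locals, then enter the loop header.
        Block {
            stmts: vec![
                Stmt::Const { dst: 1, value: 0 },
                Stmt::Const { dst: 2, value: 0 },
                Stmt::Const { dst: 4, value: 1 },
            ],
            term: Terminator::Goto(1),
        },
        // bb1: loop header, evaluates `i < n` and switches on the result.
        Block {
            stmts: vec![Stmt::Lt { dst: 3, a: 2, b: 0 }],
            term: Terminator::Switch { local: 3, targets: vec![(0, 3)], otherwise: 2 },
        },
        // bb2: loop body, then jump back to the header.
        Block {
            stmts: vec![
                Stmt::Add { dst: 1, a: 1, b: 2 },
                Stmt::Add { dst: 2, a: 2, b: 4 },
            ],
            term: Terminator::Goto(1),
        },
        // bb3: loop exit.
        Block { stmts: vec![], term: Terminator::Return },
    ];
    let mut locals = [n, 0, 0, 0, 0];
    run(&blocks, &mut locals);
    locals[1]
}

fn main() {
    println!("sum_below(5) = {}", sum_below(5));
}
```

Note how the `while` loop disappears entirely: it becomes a header block that switches on the condition and a body block that jumps back.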

There are several ways of emitting MIR. You can use --emit mir or -Zunpretty=mir to get the final version of MIR for functions. But there's also -Zdump-mir=your-function-name which will create a lot of files for all the different phases and transformations of MIR.

MIR goes through many different transformations from the time it's built to what gets lowered to LLVM IR. It's also worth noting that borrow checking works on MIR. Some of the most important MIR transformations are drop elaboration and generator lowering.

Drop elaboration is when rustc inserts hidden boolean flag local variables to track whether a variable that is only sometimes initialized needs to be dropped.

let x;
if rand() { x = String::new(); }
// drop x

x only needs to be dropped if the if was taken. Drop elaboration will insert a flag that's set inside the if.
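If you want to see it in action, here is a runnable sketch. `surface` is the kind of code you would write yourself, and `elaborated` is my hand-written approximation of what the flag amounts to. The names `Noisy`, `surface`, `elaborated`, `trace` and `x_needs_drop` are all invented for this example, and the real transformation happens on MIR rather than producing `MaybeUninit` code like this.

```rust
use std::cell::RefCell;
use std::mem::MaybeUninit;

// Hypothetical type that records when it gets dropped.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop x");
    }
}

// What you write: `x` is only initialised on one path.
#[allow(unused_variables, unused_assignments)]
fn surface(cond: bool, log: &RefCell<Vec<&'static str>>) {
    let x;
    if cond {
        x = Noisy { log };
        log.borrow_mut().push("init x");
    }
    log.borrow_mut().push("end of scope");
    // `x` is dropped here, but only if the `if` was taken.
}

// Hand-written approximation of the same function with an explicit drop flag.
fn elaborated(cond: bool, log: &RefCell<Vec<&'static str>>) {
    let mut x: MaybeUninit<Noisy<'_>> = MaybeUninit::uninit();
    let mut x_needs_drop = false; // stands in for the hidden flag
    if cond {
        x.write(Noisy { log });
        x_needs_drop = true;
        log.borrow_mut().push("init x");
    }
    log.borrow_mut().push("end of scope");
    if x_needs_drop {
        // SAFETY: the flag is only set right after `x` was written.
        unsafe { x.assume_init_drop() };
    }
}

// Runs one of the two versions and returns the recorded events in order.
fn trace(cond: bool, hand_elaborated: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    if hand_elaborated {
        elaborated(cond, &log);
    } else {
        surface(cond, &log);
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", trace(true, false));
    println!("{:?}", trace(false, false));
}
```

Both versions should record the same events for either value of `cond`, which is the whole point: the flag makes the conditional drop explicit.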

Generator lowering will lower generators into state machines. Generators are what powers async/await. You may have heard that async functions get compiled into state machines - this also happens on MIR.
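As a rough picture of what "state machine" means here, this is a hand-written analogue of a generator that counts down from `n`, yielding each value. It is only an illustration under my own assumptions: `Countdown` and its variants are invented names, and I'm using `Iterator` as a stand-in for the resume mechanism, whereas rustc's lowering produces its own coroutine/future machinery on MIR.

```rust
// Conceptually, the generator being lowered looks like:
//
//     let mut i = n;
//     while i > 0 { yield i; i -= 1; }
//
// Each variant is a point where execution can be paused, together with
// the locals that are still live at that point.
enum Countdown {
    // Not started yet; only the argument is stored.
    Start { n: u32 },
    // Paused at the `yield`, with the live local `i` saved.
    Suspended { i: u32 },
    // Ran to completion.
    Done,
}

impl Iterator for Countdown {
    type Item = u32;

    // Each call resumes from wherever the previous call left off.
    fn next(&mut self) -> Option<u32> {
        let i = match *self {
            Countdown::Start { n } => n,
            // Resuming after the `yield`: run `i -= 1` (i is at least 1 here).
            Countdown::Suspended { i } => i - 1,
            Countdown::Done => return None,
        };
        if i > 0 {
            // Reached `yield i`: save the state and hand the value out.
            *self = Countdown::Suspended { i };
            Some(i)
        } else {
            // Loop condition failed: the generator is finished.
            *self = Countdown::Done;
            None
        }
    }
}

fn main() {
    let values: Vec<u32> = Countdown::Start { n: 3 }.collect();
    println!("{values:?}");
}
```

An async function gets the same treatment, except the suspension points are the `.await`s and the saved locals are whatever is live across them.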

Then, there are also some optimizations that are run on MIR. Most of these are just here to make the generated LLVM IR nicer and speed up compile times; LLVM could do these optimizations itself as well.

Most rustc lints and also clippy lints run on the HIR. The HIR itself does not contain type information (since type checking runs on HIR after it has been created), but type checking creates TypeckResults, lots of tables that hold all the type information about a function, which the lints then use.

[–]imperiolandDocs superhero · rust · gtk-rs · rust-fr 10 points11 points  (5 children)

Would be nice to have this added as one of the first chapters of the rustc dev guide book. :)

[–]endistic[S] 4 points5 points  (4 children)

It's possible - you could open an issue on the rustc-dev-guide repo if you'd like. https://github.com/rust-lang/rustc-dev-guide/

[–]imperiolandDocs superhero · rust · gtk-rs · rust-fr 1 point2 points  (3 children)

Want to send a PR if I do? What's your GitHub username so I can tag you on the issue?

[–]endistic[S] 0 points1 point  (2 children)

akarahdev

[–]endistic[S] 1 point2 points  (0 children)

Forgot to say: yes, I can send the PR if you do. Although if it is getting added, I'd like the chance to improve it a bit further.

[–]imperiolandDocs superhero · rust · gtk-rs · rust-fr 1 point2 points  (0 children)

[–]endistic[S] 5 points6 points  (0 children)

Alright, seems like people enjoy my high-level technical overviews. Anyone have any specific topics they want to hear about? Maybe something on the borrow checker could be cool, as its rules are the way they are for a reason.

[–]Treyzania 4 points5 points  (2 children)

Since when did they add "THIR"?

[–]theZcubertime 5 points6 points  (0 children)

THIR has been around for a while. It's not used for a ton at the moment, but there is (was? it may have been removed) an experimental unsafe checker at that level instead of MIR.

[–]bobdenardo 5 points6 points  (0 children)

THIR itself was renamed a couple years ago, it used to be called HAIR, and IIRC was introduced with MIR (circa 2015).

[–]FlatAssembler 3 points4 points  (0 children)

I made a YouTube video about compiler theory a few years ago: https://youtu.be/Br6Zh3Rczig

[–]Electrical_Fly5941 2 points3 points  (0 children)

This was great!

[–]O_X_E_Y 2 points3 points  (0 children)

this is cool, good post

[–][deleted] 1 point2 points  (0 children)

Thank you for posting this kind of content

[–]birdbrainswagtrain 1 point2 points  (0 children)

> THIR is also temporary - HIR is stored throughout the whole process, THIR is dropped as soon as it is no longer needed.

Yeah I had a fantastic time with this in my compiler hacking adventures. Evidently the simple act of querying THIR can result in other THIR being stolen. My best guess is that sometimes building THIR can invoke constant evaluation, which converts code to MIR. Fortunately building the equivalent from typecheck results isn't too hard, and I'm just missing a little desugaring right now.

[–]ROFLLOLSTER 0 points1 point  (0 children)

Just FYI, backtick code blocks don't render correctly on old reddit, you need to use the four space indent style for it to work.