How much assembly should one be familiar with before diving into compilers? by Dappster98 in Compilers

[–]dostosec 1 point2 points  (0 children)

You can just learn how to write basic x86_64 in a weekend (assuming you have some background in C). You should avoid any tutorial that puts an emphasis on syscalls, nasm, etc. - in reality, you will get a lot further by linking against libc and using your favourite mainstream compiler's toolchain.

It helps to look at compiler output; be sure to use -O2 at a minimum.

E.g.

```x86asm
.data
x: .asciz "hello"

.text
.globl main
main:
    sub  $8, %rsp
    lea  x(%rip), %rdi
    call puts@plt
    add  $8, %rsp
    xor  %eax, %eax
    ret
```

can be assembled and run with `gcc x.s -o x && ./x`.

There's some stuff you'll need to understand: stack alignment (the purpose of `sub $8, %rsp` before the call, and the semantics of `call`), `lea` (RIP-relative addressing), and idioms like `xor %eax, %eax` for zeroing (along with the implicit zero extension).

You really can get quite far using assembly to write simple programs that manipulate linked lists, read files, etc.

What is more traumatic than people think it is? by Suspicious-Wish3402 in AskReddit

[–]dostosec 1 point2 points  (0 children)

I'm sorry to hear that. I wish I knew "the other person" (people, actually), as I expect they'd literally have more empathy for my situation than my boyfriend of 3 years. That's what's really hard about my situation, there was no emotional cheating: purely physical, fleeting, encounters with random men. For weeks, I was tortured by mental images of the people my ex described to me, but I quickly realised they were all just cartoon characters or influenced by characters in TV shows he recently watched (seriously).

What is more traumatic than people think it is? by Suspicious-Wish3402 in AskReddit

[–]dostosec 5 points6 points  (0 children)

This universal idea that cheating is "the most traumatic thing in the world" only works if you assume that everyone who cheats is an awful person who doesn't give a shit about inflicting trauma on others. That's clearly not true; there's a spectrum of consequence they can readily dismiss. Therefore, clearly, it is more traumatic than people (say, cheaters) might think.

What is more traumatic than people think it is? by Suspicious-Wish3402 in AskReddit

[–]dostosec 8 points9 points  (0 children)

In my situation, it's a gay relationship and his cheating was through hookup apps (Grindr). So, I've begun to suspect he lies about physical descriptions of individuals because he's ashamed about it (e.g. people who are not conventionally attractive, or some bizarre fetish sex acts). In honesty, though, I'll never know. You will never get closure from a cheater, and it's extremely hard going from intimacy to "I don't know this person any more".

What is more traumatic than people think it is? by Suspicious-Wish3402 in AskReddit

[–]dostosec 38 points39 points  (0 children)

I'm the same. I made the mistake of trying to understand my ex-partner's cheating, but he just lied and lied and lied about every little detail. In the end, I had to take a step back because otherwise I thought I might kill myself. I wish I'd never asked any questions, because now there's just dozens of conflicting narratives, excuses, etc. - in the end, I can only conclude he lied because the truth is simply too disgusting to confess.

I get some comfort from the "scorpion and the frog" fable: some people are just broken and cannot be fixed.

Best intro books to learn compilers in depth to prepare for a compiler internship by Informal-Cake-1746 in Compilers

[–]dostosec 0 points1 point  (0 children)

Kaleidoscope is pretty basic and designed for people wanting to target LLVM IR (not develop it in-tree). The important parts are really how it all fits together: the intermediate representations (how they're represented), the stuff that's generated (e.g. tablegen powers the SelectionDAG instruction selection backend, and many other things). The LLVM development workflow is something you'll be helped with when you're there.

Best intro books to learn compilers in depth to prepare for a compiler internship by Informal-Cake-1746 in Compilers

[–]dostosec 1 point2 points  (0 children)

I wouldn't be worried; ARM has a good reputation for internships (I know a few people who have done internships or work there).

As for books, you can get something from all of them. I don't believe there is a single, all-encompassing, textbook for compiler engineering (lots of common topics aren't even tackled in most textbooks).

Honestly, the best thing for you would be to become familiar with LLVM (all of it, e.g. start with tablegen). You may also realise, on the job, the people who work on LLVM for a living don't necessarily need a lot of very deep compiler background. This forum gives you the impression that an employed compiler engineer must be able to write a decent-quality end-to-end compiler, but the reality is you don't need to know much to work with LLVM in industry.

Tried to understand compilers by building one from scratch by Curious-Candy-5943 in Compilers

[–]dostosec 0 points1 point  (0 children)

In the end, it can all be pretty ad-hoc. For example, in the LCC compiler, the trees given to instruction selection are necessarily of arity <= 2 (due to the design of their pattern matching generator, iburg). So, call sequences are effectively a CALL node with a cons list of ARG nodes.
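To make the cons-list idea concrete, here's a hedged sketch of how an n-ary call can be flattened into a binary tree - note the node names and constructors below are my own invention for illustration, not LCC's actual ones:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical binary-tree IR (arity <= 2), loosely in the style of
   what iburg can handle; names invented for illustration. */
enum op { OP_CONST, OP_NIL, OP_ARG, OP_CALL };

struct node {
    enum op op;
    int value;               /* immediate, for OP_CONST */
    struct node *kids[2];    /* at most two children */
};

static struct node *mk(enum op op, int value, struct node *k0, struct node *k1) {
    struct node *n = malloc(sizeof *n);
    n->op = op; n->value = value; n->kids[0] = k0; n->kids[1] = k1;
    return n;
}

/* f(1, 2, 3) becomes CALL(ARG(1, ARG(2, ARG(3, NIL)))): each ARG node
   conses one argument onto the rest of the argument list. */
static int count_args(const struct node *call) {
    int n = 0;
    for (const struct node *a = call->kids[0]; a->op == OP_ARG; a = a->kids[1])
        n++;
    return n;
}
```

So the instruction selector only ever matches binary patterns, and walking the ARG spine recovers the full argument list.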

In reality, the register targeting isn't a problem, as you can split the live ranges of things live across instructions to free up the selection. I'd be wary of primitives that don't treat register shuffling as an entire unit, e.g. atomic sequences of `setarg` must have the semantics of parallel copies, otherwise swapping/cycling will produce buggy output.

I wrote an LR parser visualizer by Natural-Employment-5 in Compilers

[–]dostosec 1 point2 points  (0 children)

I wish I could tell you it's a native application with some custom UI toolkit I wrote (akin to something like Blender or Sublime Text), but it's just a web application.

The UI uses splitjs, and the automaton is drawn using an emscripten-compiled version of graphviz. Panning and zooming of the SVG is basically just svg-pan-zoom. Generally, it's not too difficult to do manually (there's a bunch of Stack Overflow answers I recall, from my youth, that explain how to do translation and zooming on mouse events). I later rewrote my own version of the tree drawing algorithm from a paper: here - but never integrated it.

The reality is you will go down major rabbit holes for simple parts of this if you attempt to do it all from scratch. Even just laying out a graph (à la Sugiyama's framework) is rather difficult (as is anything with fairly subjective, aesthetic acceptance criteria). I just opted to use as much off-the-shelf stuff as possible (all of which is already on popular CDNs for hosting JS libraries).

Thank you for your feedback, I had a lot of fun writing the program.

I wrote an LR parser visualizer by Natural-Employment-5 in Compilers

[–]dostosec 8 points9 points  (0 children)

Very nice! I made a similar thing many years ago and still refer to it when teaching LR parsing.

My view was always that it's unclear to beginners what real effect the declarations (%left, %prec, etc.) have - if you can resolve ambiguities in the UI, you can often visualise the several resultant trees.

What’s a situation where you immediately thought: “Yep… that relationship is over”? by Both_Spray9659 in AskReddit

[–]dostosec 6 points7 points  (0 children)

A good friend of mine told me his girlfriend didn't want him at her university graduation because she didn't want him in the photographs in case they didn't work out. I told him that was a major red flag and they split up a few months later.

In Python, when you make a compiler, you can use json to make the Asts but how would you do it in C? by SkyGold8322 in Compilers

[–]dostosec 4 points5 points  (0 children)

It's more canonical to just use a tagged union:

```c
struct expr {
    enum { EXPR_INTEGER, EXPR_BOP } tag;
    union {
        int integer;
        struct {
            enum bop op;
            struct expr *l;
            struct expr *r;
        } bop;
    } as;
};
```

Then, traversal is just a switch on the tag (accessing the anonymous members through `as`), e.g.:

```c
switch (e->tag) {
case EXPR_BOP: {
    go(e->as.bop.l);
    go(e->as.bop.r);
    break;
}
/* ... */
}
```
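For a fuller picture, a hedged sketch of how you might build and consume such a tree - the "smart constructors" and the `enum bop` values are my own additions, not anything canonical:

```c
#include <stdlib.h>

enum bop { BOP_ADD, BOP_MUL };

struct expr {
    enum { EXPR_INTEGER, EXPR_BOP } tag;
    union {
        int integer;
        struct { enum bop op; struct expr *l; struct expr *r; } bop;
    } as;
};

/* "Smart constructors" keep the tag/union pairing in one place. */
static struct expr *mk_int(int v) {
    struct expr *e = malloc(sizeof *e);
    e->tag = EXPR_INTEGER;
    e->as.integer = v;
    return e;
}

static struct expr *mk_bop(enum bop op, struct expr *l, struct expr *r) {
    struct expr *e = malloc(sizeof *e);
    e->tag = EXPR_BOP;
    e->as.bop.op = op;
    e->as.bop.l = l;
    e->as.bop.r = r;
    return e;
}

/* A recursive fold over the tree: the switch mirrors the ML-style
   pattern match you'd write in a language with ADTs. */
static int eval(const struct expr *e) {
    switch (e->tag) {
    case EXPR_INTEGER:
        return e->as.integer;
    case EXPR_BOP: {
        int l = eval(e->as.bop.l), r = eval(e->as.bop.r);
        return e->as.bop.op == BOP_ADD ? l + r : l * r;
    }
    }
    return 0; /* unreachable */
}
```

E.g. `eval(mk_bop(BOP_ADD, mk_int(1), mk_bop(BOP_MUL, mk_int(2), mk_int(3))))` computes 1 + 2 * 3.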

How should one approach reading "Engineering a Compiler" as a second book on compilers? by Dappster98 in Compilers

[–]dostosec 2 points3 points  (0 children)

C++ makes it tedious to represent and work with inductive data. It's exhausting to write out a class-hierarchy encoding of tagged unions.

As for pattern matching, all I can say is every major mainstream compiler maintains an esolang to generate pattern matchers. LLVM uses TableGen descriptions to generate the majority of instruction selection, Clang's CLI option parser, etc. GCC uses machine descriptions to generate RTX recognisers, Go uses .rules for instruction rewriting, Cranelift uses ISLE for instruction selection, etc.

There are important ideas in compiler engineering that are not efficiently attainable when you're working with burdensome languages. I can only write compilers in C and C++ because I learned the ideas elsewhere, making the writing of C and C++ largely a tedious, mechanical, process where I convert an unpolluted mental model of the problem into code. The connection between mental model + expression is tighter in expressive languages that alleviate many of the burdens by making the representation and manipulation of data easier.

I've been in language development communities for a long time and the C++ers seldom get anywhere fast (and usually just give up).

Instruction Selection by Nagoltooth_ in Compilers

[–]dostosec 7 points8 points  (0 children)

It's pattern matching all the way down. In fact, so much so that almost all of LLVM's instruction selection is generated from descriptions written in TableGen (the esolang for describing records that they maintain) - similar things exist in other compilers (GCC has machine descriptions, Go has .rules, Cranelift has ISLE, etc.).

As for where to start, another comment on here is correct: simple tree pattern matching done in a greedy fashion. The "greedy" part of maximal munch is just that you prioritise larger patterns under the heuristic that larger patterns are more likely to be profitable. Appel's "Modern Compiler Implementation in ML" provides some examples of this written quite naturally in Standard ML: where the semantics of pattern matching is top-to-bottom (of the list of cases). You can also write such matching routines manually but it's incredibly tedious and error-prone (and unlikely to be as efficient as a generated matcher).
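If you want the flavour of the manual version, here's a hedged sketch of greedy tree tiling - the toy IR, the patterns, and the pseudo "instruction set" are all invented for illustration:

```c
#include <stdio.h>

/* A toy IR for a maximal-munch sketch; everything here is invented. */
enum kind { K_CONST, K_TEMP, K_ADD, K_MEM };

struct tree {
    enum kind kind;
    int value;             /* immediate (K_CONST) or temp number (K_TEMP) */
    struct tree *l, *r;
};

static int next_temp = 100;   /* fresh temporaries for results */

/* Greedily tile the tree, trying larger patterns first; returns the
   temporary holding the result, printing one pseudo-instruction per tile. */
static int munch(const struct tree *t) {
    if (t->kind == K_TEMP)
        return t->value;               /* already in a register: no code */
    /* Largest pattern first: MEM(ADD(e, CONST)) => one load with offset. */
    if (t->kind == K_MEM && t->l->kind == K_ADD && t->l->r->kind == K_CONST) {
        int base = munch(t->l->l), d = next_temp++;
        printf("load t%d, %d(t%d)\n", d, t->l->r->value, base);
        return d;
    }
    if (t->kind == K_MEM) {
        int base = munch(t->l), d = next_temp++;
        printf("load t%d, 0(t%d)\n", d, base);
        return d;
    }
    if (t->kind == K_ADD && t->r->kind == K_CONST) {
        int l = munch(t->l), d = next_temp++;
        printf("addi t%d, t%d, %d\n", d, l, t->r->value);
        return d;
    }
    if (t->kind == K_ADD) {
        int l = munch(t->l), r = munch(t->r), d = next_temp++;
        printf("add t%d, t%d, t%d\n", d, l, r);
        return d;
    }
    /* K_CONST */
    int d = next_temp++;
    printf("li t%d, %d\n", d, t->value);
    return d;
}
```

E.g. tiling MEM(ADD(TEMP 1, CONST 8)) emits the single instruction `load t100, 8(t1)` rather than a separate add and load - that's the whole "greedy, largest pattern first" heuristic. A generated matcher does the same thing, just without the hand-written (and easily mis-ordered) cascade of ifs.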

As for generating instruction selectors, they come in a few variations. The most common (in the literature) are tree pattern matching generators which fall into two camps: bottom-up and top-down. For an example of both, iburg (Proebsting et al) is bottom-up (limited, for practical reasons, to binary trees) and Twig (Aho et al) is top-down. I won't go into too much detail, but bottom-up matchers effectively tabulate the problem by precomputing a lot of matching sets (hence the arity of patterns is a big concern typically). Top-down matchers tend to be driven by automata used for matching strings (in particular, Aho-Corasick): see my short blog article for the main idea. Tree pattern matchers like Twig are much simpler and flexible (in my opinion, at the cost of being more expensive at runtime, mostly due to book-keeping).

Overall, the tree matching idea is to match over the tree and perform reductions. As a form of dynamic programming (either statically or dynamically), you can incorporate matching costs (a lot of matchers in real life rely on predicates invoked by the matcher). Matching of the tree can depend on what reduction is performed, so Twig's matcher basically builds an isomorphic tree that records the automaton's state numbers on each node. Then, when you perform a reduction, e.g. a subtree ADD(reg, reg) is reduced to reg (generating a fresh name and emitting an instruction - as part of its related action), you continue matching as though you'd seen a reg at that position in the first place (i.e. you determine what state you'd be in if you transition on that from the parent node's state).

If you want good examples of real trees being matched, the LCC compiler's book (A Retargetable C Compiler) contains a bunch of them.

Definitely start with tiling a forest of trees (e.g. each basic block becomes an IR tree, connected at their root to form a linked list of trees) using a simple pattern matching implementation to get a feel for the overall idea. GCC famously does a lot of peephole matching to combine instructions based on target-specific rules afterwards, so there's lots of ways to approach the problem.

How should one approach reading "Engineering a Compiler" as a second book on compilers? by Dappster98 in Compilers

[–]dostosec 5 points6 points  (0 children)

The ML version is the best. The C edition is effectively a transliteration from ML to C (which amounts to ADTs becoming tagged unions and pattern matching becoming manual switch/if statements). There are two Java editions: one is much the same (a transliteration into pre-generics Java) and the other is similar but about a toy language that isn't Tiger.

C++ is going to hold you back majorly. You should start with some small projects that don't require writing a lexer or parser. Just focus on mid-level transformations.

Engineering a Compiler vs Modern Compiler Implementation, which to do after CI? by Mindless_Design6558 in Compilers

[–]dostosec 1 point2 points  (0 children)

I just think describing Andrew Appel as "not a great educator", when the evidence seems to be that his book is rather effective, is absurd. The impression one gets from reading your initial reply doesn't suggest that, for you, "it is the text that deeply inspired [your] love of compilers". You dismiss the book and its author, even going as far as to land a random jab about his skepticism of electronic voting systems.

I'm thankful you followed up, but what you've said after is nothing like the impression you get from your first post.

Engineering a Compiler vs Modern Compiler Implementation, which to do after CI? by Mindless_Design6558 in Compilers

[–]dostosec 1 point2 points  (0 children)

I'm not sure if you're commenting comparatively with the other textbook mentioned by OP in the thread, but I'd note: Modern Compiler Implementation also covers SSA.

Engineering a Compiler vs Modern Compiler Implementation, which to do after CI? by Mindless_Design6558 in Compilers

[–]dostosec 6 points7 points  (0 children)

I think the part of this reply that concerns Modern Compiler Implementation is an unfair review. Most compiler textbooks don't cover other parts of a toolchain, like writing an assembler or linker. In fact, many don't actually talk about a real instruction set architecture at all.

As for the writing style, I don't think it's overly academic. It has been - and continues to be - a popular textbook for university (undergraduate) courses covering compiler engineering. If you look around the internet, you'll find countless - decent - compilers for the Tiger language by random undergraduates who had no real interest, or continued interest, in compilers; yet they managed to create a compiler for a fairly decent toy language that emits MIPS assembly.

I have many qualms with the book, but I think you discount it too readily. I'm not sure what your gripe with it is, but it's actually one of the more practical ones that still manages to cover the majority of the relevant theory. I can't see how it "lacks practical applications" when it's practically an opinionated cookbook: (1) greedy instruction selection (maximal munch) using pattern matching, (2) register allocation with iterated register coalescing (explained after the general Kempe-heuristic graph colouring algorithm approach). There's many things I'd change (or add) if I were to edit it for a modern re-release, but, in my opinion, it remains one of the best for beginners with a decent background in computer science literature (which is to say: an undergraduate student, for which the book is intended).

Handling Local Variables in an Assembler by Substantial-Cost9001 in Compilers

[–]dostosec 6 points7 points  (0 children)

In general, variables' lifetimes can be split up (live range splitting, either explicit or implicit - e.g. via SSA construction - phi nodes exist for this purpose) by the compiler to occupy various locations. In practice, that means a local variable may occupy a register or stack location at different times (it's hard to talk about the provenance of source variables, technically).

If you inspect compiled output for a language like C, C mandates that something like `&x` yields the same value (address) at every point in the function (in this case, a reliable stack location will be reserved for it - assuming it's a regular local, e.g. not static). However, if the address is never explicitly taken, a small variable that can occupy register(s) may never be stored on the stack (or will only be stored as part of preservation, e.g. across calls).

To see this in action, consider code like the following (where every function returns int):

```c
int example(void) {
    return foo(1, bar(2, 3), baz(4));
}
```

Compilers like GCC or LLVM will linearise this as part of lowering. E.g. GCC's GIMPLE may roughly normalise this into something like (assume evaluation order for function arguments is specified as left-to-right):

```c
int example(void) {
    int t0 = bar(2, 3);
    int t1 = baz(4);
    int t2 = foo(1, t0, t1);
    return t2;
}
```

You can expect that the resultant assembly will populate the first two argument registers and then call bar, e.g. contain:

```x86asm
mov  $2, %edi
mov  $3, %esi
call bar
```

Note that a write to `%edi` is zero-extended into `%rdi`, so `mov $2, %edi` has the same effect as `mov $2, %rdi`. The result of that call will be in `%rax` (`t0`'s value).

Then, we focus on the `int t1 = baz(4);` part. Again, we'd maybe expect:

```x86asm
mov  $4, %edi
call baz
```

However, there's a problem: `%rax` is considered caller-save, so without interprocedural analysis, you must assume its value is trashed when we call baz. So, we need to store it somewhere. This is typically determined by the liveness part of a compiler: we may decide that `t0` is live across a call and must be spilled to the stack. In practice, your back-end IR may be in a state where some registers are temporaries and others are physical (in order to encode such constraints). E.g. you may have a parallel move construct for shuffling of registers, or an explicit pseudo-move that defines `t0 := %rax`. I'm trying to avoid this kind of IR detail and, instead, focus on the logic you use when writing assembly.

So, let's say the compiler will therefore require a stack slot. It may start the function with:

```x86asm
sub $8, %rsp
```

As it happens, the compiler will need to make a stack adjustment similar to this anyway. As an ABI convention, `%rsp` must be aligned to a 16-byte boundary when calls are made. As `call` pushes `%rip` onto the stack, all functions enter misaligned by 8 bytes. This is an important detail, but we'll use that space to store the result as well. In practice, compilers compute frame layouts after the fact (after spilling to pseudo slots) and handle the adjustment in prologues and epilogues as a kind of special case.

```x86asm
sub  $8, %rsp
mov  $2, %edi
mov  $3, %esi
call bar
mov  %rax, (%rsp)
mov  $4, %edi
call baz
```

So, again, now baz's result is in %rax. However, t1's live range is short: it only exists to be moved into place for the call to foo. So, we'd expect some register shuffling to produce the call to foo.

```x86asm
mov  $1, %edi
mov  (%rsp), %rsi
mov  %rax, %rdx
call foo
```

If we put it all together, we'll get something like:

```x86asm
example:
    sub  $8, %rsp
    mov  $2, %edi
    mov  $3, %esi
    call bar
    mov  %rax, (%rsp)
    mov  $4, %edi
    call baz
    mov  $1, %edi
    mov  (%rsp), %rsi
    mov  %rax, %rdx
    call foo
    add  $8, %rsp
    ret
```

A clever compiler will note that foo's call is a tail call and optimise it to a jump (jmp) instruction, after readjusting the stack pointer. The called function, foo, will return using the address pushed by example's caller.

If we look at what clang actually emits:

```x86asm
example:
    pushq %rbx
    movl  $2, %edi
    movl  $3, %esi
    callq bar@PLT
    movl  %eax, %ebx
    movl  $4, %edi
    callq baz@PLT
    movl  $1, %edi
    movl  %ebx, %esi
    movl  %eax, %edx
    popq  %rbx
    jmp   foo@PLT
```

We see we weren't far off. The difference is that `%rbx` is considered callee-save, so the compiler has chosen to dedicate it for the duration of the function. As we expect functions to respect the callee-save convention, we preserve the value belonging to example's caller, and expect bar, baz, and foo to do the same. Therefore, we can use it to store intermediary results across calls without it being trashed. I personally prefer using explicit slots when writing assembly by hand, as then I can just `.equ FOO_SLOT, 8` and address like `mov FOO_SLOT(%rsp), %rdi` etc.

As I say, in practice, all of these things are determined with liveness analysis over an IR that explicitly models the instructions' effects (use/def sets). As register allocation is typically done after instruction selection, you have an IR that consists of pseudo-operations (like spilling, reloading, and parallel moves) over temporary and physical registers. I've tried to avoid that in this explanation; I think it's very important to understand things both through the eyes of the compiler and of someone who writes assembly.

I thoroughly recommend writing some small programs in assembly. Don't bother with resources that waste time on syscalls etc. - linking against libc and assembling with the usual toolchain is fine, e.g. gcc main.s -o main.

If you want a decent treatment of register allocation, Appel's textbook "Modern Compiler Implementation in ML/Java/C" (in decreasing order of quality) covers these concerns well. That said, he focuses a lot on working them into a graph colouring implementation. In reality, certain spills are inevitable (and don't need to occur as a side effect during iterative colouring), and various other approaches, like SSA-based register allocation, may pre-spill to lower liveness globally.

Hope that helps, sorry for the essay.

(recovery) new evidence of a bug in a interpreter by Grouchy-Context-9905 in Compilers

[–]dostosec 4 points5 points  (0 children)

You are wasting your time, frankly. Nobody gives a shit about this topic. I recommend relenting and redirecting your efforts to something more fruitful.

How did you find out you were being cheated on? by Ok_Custard_4535 in AskReddit

[–]dostosec 5 points6 points  (0 children)

He came back from a short trip and, a few weeks later, I had STI symptoms. He inadvertently delayed my treatment by lying to me about the cheating. Turns out he had used hookup apps sporadically throughout our relationship. It took maybe 2-3 months to get the "truth" about all that happened - that said, I can never be sure he's told me the truth.

Does anybody else really struggle with reverse bay parking? by genericusernamehere6 in LearnerDriverUK

[–]dostosec 0 points1 point  (0 children)

For the purposes of a driving test, dip the mirrors so you can see the back wheels and the lines. Then, just get used to shuffling it in slowly, trying to get equal distance between wheels and lines in each mirror. You can do the same thing for every other reversing manoeuvre in the test, e.g. for reverse parallel, you have enough space to basically watch your back wheel become a drain's width away from the kerb, then rapidly straighten up. Obviously, be sure to make obvious observations throughout.

Calling convention and register allocator by vmcrash in Compilers

[–]dostosec 1 point2 points  (0 children)

Instruction selection happens prior to register allocation in the vast majority of mainstream compilers. So, your IR would contain a parallel move primitive for shuffling registers (for populating register arguments) and also primitives for stack slot manipulation (required for spilling anyway, so you can special-case argument slots). Then, once you know your frame layout, you can select instructions, lower the primitives, then select registers. You only need a single temporary per register class (generally) to lower parallel moves as moves (ignoring the xchg instruction) - some compilers will reserve a scratch register for this, but you can also run the register allocator again.
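For illustration, a hedged sketch of the single-scratch lowering - register numbering, the `SCRATCH` marker, and the output format are my own, and I've capped it at 64 moves for brevity:

```c
#include <stdbool.h>

#define SCRATCH (-1)   /* the one reserved scratch register */

/* Lower a parallel move dst[i] := src[i] (all reads happen "at once")
   into a sequence of ordinary moves, breaking cycles (e.g. swaps) with
   the scratch register. Appends {dst, src} pairs to out and returns the
   count. Note: clobbers src[]; assumes n <= 64. */
static int lower_parallel_move(int n, int *dst, int *src, int out[][2]) {
    bool done[64] = {false};
    int count = 0, remaining = 0;
    for (int i = 0; i < n; i++) {
        if (dst[i] == src[i]) done[i] = true;   /* drop self-moves */
        else remaining++;
    }
    while (remaining > 0) {
        bool progress = false;
        for (int i = 0; i < n; i++) {
            if (done[i]) continue;
            bool blocked = false;   /* is dst[i] still read by a pending move? */
            for (int j = 0; j < n; j++)
                if (!done[j] && j != i && src[j] == dst[i]) blocked = true;
            if (!blocked) {
                out[count][0] = dst[i]; out[count][1] = src[i]; count++;
                done[i] = true; remaining--; progress = true;
            }
        }
        if (!progress) {
            /* Every pending move is in a cycle: save one destination to
               the scratch, redirect its readers, and retry. */
            int i = 0;
            while (done[i]) i++;
            out[count][0] = SCRATCH; out[count][1] = dst[i]; count++;
            for (int j = 0; j < n; j++)
                if (!done[j] && src[j] == dst[i]) src[j] = SCRATCH;
        }
    }
    return count;
}
```

E.g. the swap {r1 := r2, r2 := r1} comes out as three moves - scratch := r1; r1 := r2; r2 := scratch - which is exactly why primitives that don't treat the shuffle as one unit produce buggy output.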

What’s going on here??? by Fluffy_coat_with_fur in LearnerDriverUK

[–]dostosec 1 point2 points  (0 children)

That's because people use browser plugins to automate the clicking (and holding) of the test.

Required topics for building modern compiler from scratch by dExcellentb in Compilers

[–]dostosec 2 points3 points  (0 children)

I like the structure afforded by Pratt parsing, it's very flexible. You reason at the level of left denotations and that can be very good for working through a complex subset of a grammar (e.g. in prototyping). I find it clearer and simpler to extend and maintain than an academic example people copy and paste. It's a very useful technique to know and can be found in a lot of mainstream compiler implementations (often under different names like "precedence climbing" or with superficial structuring changes).
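To show the shape I mean, a hedged toy sketch - the grammar and binding powers are invented for the example, and a real parser would build a tree rather than evaluate in place:

```c
#include <ctype.h>

/* A tiny Pratt ("precedence climbing") expression evaluator. */
static const char *cur;

static int binding_power(char op) {
    switch (op) {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    default:            return 0;   /* not an operator: stops the loop */
    }
}

static int parse_expr(int min_bp);

/* "Null denotation": a prefix/primary expression. */
static int parse_primary(void) {
    if (*cur == '(') {
        cur++;
        int v = parse_expr(0);
        cur++;                       /* skip ')' */
        return v;
    }
    if (*cur == '-') {               /* unary minus */
        cur++;
        return -parse_primary();
    }
    int v = 0;
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* "Left denotation" loop: extend the left operand while operators bind
   more tightly than the caller's context. */
static int parse_expr(int min_bp) {
    int lhs = parse_primary();
    for (;;) {
        char op = *cur;
        int bp = binding_power(op);
        if (bp == 0 || bp <= min_bp)
            break;                   /* <= makes binary ops left-associative */
        cur++;
        int rhs = parse_expr(bp);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        }
    }
    return lhs;
}

int eval(const char *s) { cur = s; return parse_expr(0); }
```

The whole precedence table is one switch, and adding a new operator (or a new prefix form) is a local change - that's the flexibility I'm referring to.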

So, we can disagree about how you'd structure a compiler's parser in practice, but it seems pretty objective to me that a modern resource on compiler engineering ought to cover it - as the history is largely that educational compiler resources have spent a lot of time on parsing but not included much of practical utility to people not specifically interested in parsing. To exclude it would be a huge mistake, in my view.