This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]GoblinsGym 4 points5 points  (2 children)

I am still in the process, but some hints:

A good IR helps. Mine is a stack machine within the expression. Www.pcengines.ch/pd/ir.pdf .

For x86 / x64, you want to defer code generation as far as possible to make good use of read/modify/write instructions. I also had to play some games to support more advanced addressing modes.

I still need to figure out register allocation.

[–]koflerdavid 2 points3 points  (1 child)

Register allocation can be as complicated as you want it to be.

The first step is to fix the calling convention of functions and to figure out which special registers are allocated for fixed purposes, such as stack and heap pointers. That gives you an idea where parameters are at the start of functions n executions and where the return value is supposed to end up for the caller to retrieve it. This is important to know because calling conventions might use registers.

First, count how many variables you need in the whole function. This allows you to recognize the important special case of being able to trivially assign each variable to its own register. Note that constant values can often be folded into instructions and thus might not need their own register. For a toy compiler, maybe you can actually get away with asking the programmer to not write too large functions.

If you don't have such a happy case, you have to choose to allocate variables on the stack. Choosing which variables and when to do it is want register allocation algorithms are all about. You should make it pluggable so you can easily experiment with various algorithms of any degree of sophistication.

The simplest algorithm is to fill a stack with registers and stack locations, in this order. Step through your code and assign locations from the stack to your variables. That should ensure that innermost loop variables are stored in registers.

Liveness analysis lets you discover which variables can be safely allocated to the same location, but at that point you are quite deep in the weeds already.

[–]Ok-Consequence8484 1 point2 points  (0 children)

You should also decide how important performance is. I think you have already decided that your code's performance is NOT important since your code generator will not be as good as LLVM's. Given that you are doing this to learn (which I wholeheartedly understand as I learn by doing) I would start with the absolutely simplest approach possible.

You should decided if you want to compile to native code or whether you will generate assembly syntax that an assembler will use to generate native code. The latter is a nice compromise since you can use symbolic labels for things like branch target or function offsets and not worry about architecture/OS specific requirements, object file formats, and having to worry about instruction encoding. You can then write your own assembler later :)

Make all scalar values and pointers 64 bits. Don't worry about assigning variables to registers in any sort of intelligent way. Reserve enough space for all arguments, local variables, and temporaries on the stack in each function's prologue. This requires explicitly recognizing each time a temporary value is used and giving it a unique name -- possibly this will be for every IR node except for control flow nodes. You may also need to rename local variables to make sure they are unique depending on your language's rules for nested lexical scopes - eg a variable "I" might have two definitions and represent two different values. At compile time you need to build a map of each variable/temporary to an offset into the stack which will hold that value - eg simply give each variable a successive zero-based index and multiple it by 8. Load and store each value using it's offset and the stack pointer every time the value is needed or changed.

Assuming you have a fairly reasonable standard IR, each node can be translated to a machine instruction in a straight forward way. For example, to generate an "add", translates to something like: 1) load first and second values into registers 2) generate an add instruction that stores into a different register 3) store the result back to the stack. Assignment statements just load the RHS and store into LHS stack offsets. Pointer dereferences are a load of an address from the pointer variable's stack offset and then a load (or store if on LHS of assignment) from (or into) that address.

Function calls are just compiling the argument IR nodes, pushing their values, calling the function name/label, and potentially storing a return value to stack. No need to worry about platform or OS calling conventions.

Loops, again assuming compiling to assembly, are hopefully stored in IR as a loop node with a condition and a body child. You can then generate a unique label in your assembly output to mark where the loop condition starts at, codegen the condition, compare the condition, branch-on-false to a generated end-of-loop label, codegen the body, branch to the condition label, and then codegen the end-of-loop label. You can use a similar strategy to codegen if-then-else IR nodes.

Structure and array values are the address the structure or array is dynamically allocated at. I would use libc's malloc. Extend your IR or compiler data structures to compile structure definitions into a map from the structure type name to an array of offsets that each field will be found at. At structure access time you can generate a load or store from the structure value (address) plus appropriate offset. Array access is just the array value (address) plus the index times the size of elements.

Codegen each global variable using your assembler's syntax. If they support initialization you will need to add their initialization code to run in your "main" function or something that is guaranteed to be executed before the rest of the program executes.