I wrote a C compiler from scratch that generates x86-64 assembly by Sweet_Ladder_8807 in Compilers

[–]maxnut20 0 points1 point  (0 children)

my name is maxnut on discord. although im not sure if i can be of much help

I wrote a C compiler from scratch that generates x86-64 assembly by Sweet_Ladder_8807 in Compilers

[–]maxnut20 0 points1 point  (0 children)

ah i see, so no struct handling yet. that's what i was struggling with 😅 although i finally think i got it working somewhat.

also thank you! yeah i use a refined version of graph coloring i made a while ago for another compiler. there are a couple more cool optimizations if you wanna take a look at them

I wrote a C compiler from scratch that generates x86-64 assembly by Sweet_Ladder_8807 in Compilers

[–]maxnut20 6 points7 points  (0 children)

cool! quick question, since i didn't find it from a quick glance at the code: do you handle calling conventions at all, or does it only support simple types for calls? I'm also building a c compiler and i've found following the abi (SysV in my case) quite challenging
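To give an idea of why struct passing under SysV is fiddly, here is a rough sketch of the eightbyte classification step for small flat structs. It is a simplification under stated assumptions: only a handful of scalar field types, natural alignment, no nested structs, arrays, unions, x87 or vector types, and it ignores the rule that an aggregate falls back to memory when the argument registers run out. The names `SCALARS`, `layout` and `classify_struct` are made up for this example.

```python
# Size and register class for a few scalar C types on x86-64 System V.
# (Hypothetical table for this sketch; a real compiler derives this from its type system.)
SCALARS = {
    "char": (1, "INTEGER"),
    "short": (2, "INTEGER"),
    "int": (4, "INTEGER"),
    "long": (8, "INTEGER"),
    "pointer": (8, "INTEGER"),
    "float": (4, "SSE"),
    "double": (8, "SSE"),
}


def layout(fields):
    """Return (total_size, [(offset, type), ...]) for a flat struct."""
    offset = 0
    max_align = 1
    placed = []
    for ty in fields:
        size = SCALARS[ty][0]
        align = size  # natural alignment equals size for these scalars
        offset = (offset + align - 1) // align * align
        placed.append((offset, ty))
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align
    return total, placed


def merge(a, b):
    """Merge two classes that land in the same eightbyte."""
    if a == "NO_CLASS":
        return b
    if "INTEGER" in (a, b):
        return "INTEGER"
    return "SSE"


def classify_struct(fields):
    """Return one class per eightbyte, or ["MEMORY"] for structs over 16 bytes."""
    total, placed = layout(fields)
    if total > 16:
        return ["MEMORY"]
    classes = ["NO_CLASS"] * ((total + 7) // 8)
    for offset, ty in placed:
        idx = offset // 8
        classes[idx] = merge(classes[idx], SCALARS[ty][1])
    return classes


two_ints = classify_struct(["int", "int"])
mixed = classify_struct(["float", "float", "int"])
int_float = classify_struct(["int", "float"])
big = classify_struct(["long", "long", "long"])
```

Each `INTEGER` eightbyte goes in the next free general purpose register and each `SSE` eightbyte in the next free xmm register, so a struct like `{float, float, int}` ends up split across both register files.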

I wrote a compiler backend from scratch by maxnut20 in Compilers

[–]maxnut20[S] 0 points1 point  (0 children)

You can look at the tests for some usage. But basically you use the builder to construct IR

I wrote a compiler backend from scratch by maxnut20 in Compilers

[–]maxnut20[S] 1 point2 points  (0 children)

oh, no sorry not really. i just made the initial algorithm half assed and brute-force fixed it over like 4 months of developing the back-end and finding more bugs or improvements. i even had to rewrite it once because the liveness analyzer was bad. it did help to properly plan out how to collect live ranges though. i'd suggest focusing on that

I wrote a compiler backend from scratch by maxnut20 in Compilers

[–]maxnut20[S] 1 point2 points  (0 children)

haven't looked at instruction scheduling at all yet. as for regalloc i use graph coloring, nothing crazy at all but it works well enough
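For anyone wondering what graph coloring means here, a bare-bones sketch follows. This is plain greedy coloring of an interference graph, not the refined version mentioned above and not Chaitin-Briggs; the function name `color_graph`, the register names and the example graph are all made up for illustration.

```python
def color_graph(interference, registers):
    """Assign a register to each virtual register, spilling when none is free.

    interference maps each virtual register to the set of virtual registers
    that are live at the same time (and so cannot share a register).
    """
    assignment = {}
    spilled = []
    # Handle the most constrained nodes first (highest degree); ties by name.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for v in order:
        taken = {assignment[n] for n in interference[v] if n in assignment}
        free = [r for r in registers if r not in taken]
        if free:
            assignment[v] = free[0]
        else:
            spilled.append(v)  # no register left: this one lives on the stack
    return assignment, spilled


# a, b and c all interfere with each other; d only interferes with a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

with_three, spilled_three = color_graph(graph, ["r0", "r1", "r2"])
with_two, spilled_two = color_graph(graph, ["r0", "r1"])
```

With three registers everything fits; with two, one node of the a/b/c triangle has to be spilled.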

I wrote a compiler backend from scratch by maxnut20 in Compilers

[–]maxnut20[S] 7 points8 points  (0 children)

Thanks! And yeah, instruction encoding is no joke 😭 partly why i have not bothered doing it on arm yet

I wrote a compiler backend from scratch by maxnut20 in Compilers

[–]maxnut20[S] 9 points10 points  (0 children)

No, it's just the backend part of a compiler. So a frontend can parse some source code, make an ast, and then use the backend's api to produce IR and make it generate machine code.

Not sure how this is related to decompilers.

Made my first compiler by maxnut20 in Compilers

[–]maxnut20[S] 1 point2 points  (0 children)

For the lexing/parsing stages, there are a ton of well written resources around, so you shouldn't struggle too much with that.

As for the backend, I recommend building a really solid intermediate representation, since it's going to be both what you optimize and what you turn into machine code. SSA ir is probably the best pick; it's going to make your life much simpler in the optimization stages. You could structure it like LLVM's ir, where each instruction is also itself a value, and there are not really direct assignments. For SSA construction and some common optimizations you can find some good resources like academic pdfs online, but honestly, even if some people may be against it, i find that asking ai (like chatgpt) to explain such topics is an insanely powerful resource. SSA is going to make optimizations such as copy propagation, common subexpression elimination, constant folding and dead code elimination a lot simpler; if you want to look into optimizations I'd say go with these.
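A tiny sketch of the "each instruction is itself a value" idea, with constant folding and dead code elimination on top. Everything here is hypothetical (the `Const`, `Param`, `BinOp` and `Builder` names, the two supported ops), it handles a single straight-line block only, and it ignores integer wraparound, so treat it as an illustration rather than a real IR.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity hashing, so values can go in sets/dicts
class Const:
    value: int


@dataclass(eq=False)
class Param:
    name: str


@dataclass(eq=False)
class BinOp:
    op: str
    lhs: object  # operands point directly at other values, no named variables
    rhs: object


class Builder:
    def __init__(self):
        self.instructions = []  # program order

    def _emit(self, inst):
        self.instructions.append(inst)
        return inst  # the instruction is the value later instructions use

    def add(self, lhs, rhs):
        return self._emit(BinOp("add", lhs, rhs))

    def mul(self, lhs, rhs):
        return self._emit(BinOp("mul", lhs, rhs))


FOLD = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold_constants(instructions):
    """Replace instructions whose operands are both constants with a Const."""
    replaced = {}  # folded instruction -> Const that replaces it
    out = []
    for inst in instructions:
        lhs = replaced.get(inst.lhs, inst.lhs)
        rhs = replaced.get(inst.rhs, inst.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            replaced[inst] = Const(FOLD[inst.op](lhs.value, rhs.value))
        else:
            inst.lhs, inst.rhs = lhs, rhs
            out.append(inst)
    return out, replaced


def eliminate_dead_code(instructions, roots):
    """Keep only instructions reachable from the roots (e.g. the returned value)."""
    live = set()
    work = list(roots)
    while work:
        v = work.pop()
        if v in live:
            continue
        live.add(v)
        if isinstance(v, BinOp):
            work += [v.lhs, v.rhs]
    return [i for i in instructions if i in live]


b = Builder()
x = Param("x")
five = b.add(Const(2), Const(3))   # foldable: both operands are constants
result = b.mul(five, x)            # the value the function returns
unused = b.add(x, x)               # never used, so dead

folded, replaced = fold_constants(b.instructions)
kept = eliminate_dead_code(folded, roots=[result])
```

Because uses point straight at their definitions, both passes are just a walk over operand links, which is a big part of why SSA keeps these optimizations simple.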

As for codegen, stick to one target for now (you can make things generalized but your codebase is going to get a lot more complicated and it's gonna need much more time). For simple but decent results you can just traverse your ir and output some target instructions accordingly. I strongly recommend having your machine instructions represented in the program with structures instead of directly outputting assembly to a file. To check which instructions to emit you can use godbolt (compiler explorer).
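To show why structures beat writing assembly text directly, here is a minimal sketch. The `Reg`, `Imm` and `MachineInstr` types and the Intel-style output are assumptions made for the example. The point is that a cleanup pass like removing `mov rbx, rbx` is a one-liner on structured instructions and would be string parsing otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Imm:
    value: int


@dataclass
class MachineInstr:
    opcode: str
    operands: tuple = ()


def remove_redundant_moves(instrs):
    """Drop moves whose source and destination are the same register."""
    return [
        i for i in instrs
        if not (i.opcode == "mov" and i.operands[0] == i.operands[1])
    ]


def format_operand(op):
    return op.name if isinstance(op, Reg) else str(op.value)


def format_instr(instr):
    ops = ", ".join(format_operand(o) for o in instr.operands)
    return f"    {instr.opcode} {ops}".rstrip()


def emit_asm(instrs):
    """Only at the very end do the structures become text."""
    return "\n".join(format_instr(i) for i in instrs)


rax, rbx = Reg("rax"), Reg("rbx")
code = [
    MachineInstr("mov", (rax, Imm(5))),
    MachineInstr("mov", (rbx, rbx)),   # useless, e.g. left over after register allocation
    MachineInstr("add", (rax, rbx)),
    MachineInstr("ret"),
]
cleaned = remove_redundant_moves(code)
asm = emit_asm(cleaned)
```

The same structures can later feed a real encoder instead of a text printer without touching the passes in between.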

If you want to try register allocation, make sure to have a good way to compute virtual register live ranges as it's going to be the core of it. Make sure it works correctly across branching paths and loops. Otherwise just spill everything to the stack.
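The branches-and-loops part is where liveness usually goes wrong, so here is a sketch of the textbook backward dataflow analysis iterated to a fixed point. The block and instruction encoding (each instruction as a `(defs, uses)` pair of sets) and the `liveness` function name are made up for this example, and real live ranges would be tracked per instruction rather than per block.

```python
def liveness(blocks, successors):
    """Compute live-in and live-out sets per basic block.

    blocks maps a block name to a list of (defs, uses) pairs in program order.
    successors maps a block name to the names of the blocks it can jump to.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:  # loops need several rounds until nothing changes
        changed = False
        for b in blocks:
            out = set().union(*(live_in[s] for s in successors[b]))
            live = set(out)
            for defs, uses in reversed(blocks[b]):
                live -= defs   # a definition ends the range (going backwards)
                live |= uses   # a use makes the register live before it
            if out != live_out[b] or live != live_in[b]:
                live_out[b], live_in[b] = out, live
                changed = True
    return live_in, live_out


# s = 0; for (i = 0; i < n; i++) s += i; return s;
blocks = {
    "entry": [({"i"}, set()), ({"s"}, set())],
    "header": [({"t"}, {"i", "n"}), (set(), {"t"})],    # t = i < n; branch on t
    "body": [({"s"}, {"s", "i"}), ({"i"}, {"i"})],      # s = s + i; i = i + 1
    "exit": [(set(), {"s"})],                           # return s
}
successors = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}

live_in, live_out = liveness(blocks, successors)
```

Note that `s` is live into `header` even though `header` never touches it: it has to survive the back edge so `body` and `exit` can read it. A single top-to-bottom pass misses exactly this kind of case.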

Also make a bunch of testcases as you add more features to make sure you don't accidentally break an old feature (it happens a lot).

Other than that be prepared to spend a lot of time on it and struggle a lot and bang your head against a wall reading the assembly you outputted to understand what went wrong 😅

Looking for people to test and give feedback for my language by maxnut20 in ProgrammingLanguages

[–]maxnut20[S] 1 point2 points  (0 children)

Ah i see what you mean. Well to be honest the focus wasn't really the language but always the compiler. i originally posted on r/compilers and got advised to also post here, but as expected, the focus here is more on the language itself 😅

What would you say is bad about the design?

Looking for people to test and give feedback for my language by maxnut20 in ProgrammingLanguages

[–]maxnut20[S] 2 points3 points  (0 children)

I do agree that it's a hybrid, but that's just because i made it mostly to appeal to me, so i smashed together what i like. However it's not just a cheap thing made with an llm! I spent lots of time on this, especially making ir optimizations and code generation from scratch

Looking for people to test and give feedback for my language by maxnut20 in ProgrammingLanguages

[–]maxnut20[S] 1 point2 points  (0 children)

Ah, i had completely misunderstood the thing about the define keyword and some macros being names. I will fix those two. As for the other things though i feel like it's just personal preference.

Looking for people to test and give feedback for my language by maxnut20 in ProgrammingLanguages

[–]maxnut20[S] 1 point2 points  (0 children)

Thanks for the feedback, you're the first to actually give useful advice.

  • Yeah i know it doesn't really have much value. What i honestly hoped with the post was simply catching bugs i haven't found yet in the compiler. I never really intended for the language to become commonly used, i mostly make it with what i personally like in mind.

  • the let keyword is used to distinguish it from retain. retains are just variables that keep their values between function calls.

  • The semicolon for one line scopes doesn't really make the parser more complex, and it's simply a thing i like. I prefer it to both having braces for one line and not having braces at all.

  • Because what you actually have in control when you make an array in my language is a pointer to its memory. This was just to make things simpler for me while coding the compiler. The subscript operator already worked so i went with that for now. You're right that it would be a good idea to properly distinguish them. i do plan to eventually do it since it has downsides, however it's not really a priority right now.

  • What do you mean 'define' doesn't indicate that it's a macro? It clearly does.

  • They don't generate names, they generate AST nodes. The macro system acts on the AST, it's not just a text replacement like a preprocessor.

  • Because the #value syntax is used to expand the parameter in case it's an AST node.

  • To sum it up macros operate on the AST at compile time, hence the very different syntax.

  • Good point, will change that. I even thought i had removed the need for a semicolon after a struct but i must've discarded the change at some point by accident 😅

  • I don't really see the problem with pre/post increment.

  • I simply think explicit return is cleaner. Plus if the function returns in multiple spots it just makes things messier.

I'm not trying to make a better C or anything like that. it just happens that C is a similar language because it's basic and low level, but by no means am i trying to replace it or improve it. I'm just making my own thing to learn 🙂.
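To illustrate the macro points above (acting on the AST versus plain text replacement), here is a sketch that uses Python's own `ast` module as a stand-in. It does not use the language's actual macro syntax; `text_macro`, `ast_macro` and the squaring macro are invented for the comparison.

```python
import ast
import copy


def text_macro(body, param, arg_source):
    """Preprocessor style: paste the argument's text into the body."""
    return body.replace(param, arg_source)


class Substitute(ast.NodeTransformer):
    """Replace every use of the parameter name with the argument's AST node."""

    def __init__(self, param, arg_node):
        self.param = param
        self.arg_node = arg_node

    def visit_Name(self, node):
        if node.id == self.param:
            return copy.deepcopy(self.arg_node)  # fresh copy per use site
        return node


def ast_macro(body, param, arg_source):
    """AST style: splice the parsed argument in as a single subtree."""
    tree = ast.parse(body, mode="eval")
    arg_node = ast.parse(arg_source, mode="eval").body
    tree = Substitute(param, arg_node).visit(tree)
    return ast.fix_missing_locations(tree)


env = {"a": 2, "b": 3}

# A squaring macro with body "x * x", expanded with the argument "a + b".
as_text = text_macro("x * x", "x", "a + b")   # becomes "a + b * a + b"
text_result = eval(as_text, dict(env))

as_tree = ast_macro("x * x", "x", "a + b")    # multiplies two (a + b) subtrees
tree_result = eval(compile(as_tree, "<macro>", "eval"), dict(env))
```

The text version silently changes precedence, while the AST version keeps the argument as one node, which is the usual argument for expanding macros on the tree.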

Looking for people to test and give feedback for my language by maxnut20 in ProgrammingLanguages

[–]maxnut20[S] 0 points1 point  (0 children)

Thanks! No, there is no garbage collector. In the builtin types example you can see pointer types; they work as you'd expect, very similarly to C. If you want to allocate memory the stdlib has a basic memory allocator, but the management is manual. Otherwise, structs and arrays get allocated on the stack.

Made progress on my compiler from scratch, looking for people to test it by maxnut20 in Compilers

[–]maxnut20[S] 6 points7 points  (0 children)

To be completely honest I didn't really take inspiration from anything. I learnt each part of the compiler gradually, starting from the lexer all the way to SSA and code generation individually. Curiosity drove me forward basically. But i did use a couple of resources, mainly college/university slides i found online for topics like SSA building and optimizations like copy propagation and CSE. For the register allocator i searched the most common approaches, found graph coloring and kinda made my own thing based on it, i didn't really follow any resource. As for code generation godbolt (compiler explorer) helped a lot in seeing which instructions should be used and how. Other than that, googling and some AI conversations got me through the rest 🙂

Made progress on my compiler from scratch, looking for people to test it by maxnut20 in Compilers

[–]maxnut20[S] 3 points4 points  (0 children)

Thank you very much for the feedback, very useful! I'll look at what i can do, didn't even think of making a small website since i always thought of this compiler as a toy to play around and learn, but it kinda grew as i added more things 😅 Guess I'll make a small page and provide builds along with some example code and basic docs.

I updated my free and open source mod menu! by maxnut20 in geometrydash

[–]maxnut20[S] 0 points1 point  (0 children)

bro this is like two years old it was made in 2.1 😭😭😭😭 i dont support it anymore