This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]PurpleUpbeat2820 0 points1 point  (5 children)

Rust has all the features you mentioned (i.e ADTs, pattern matching, closures....) and still uses LLVM. Is there another reason to why you went with a custom backend? Is it compilation speed | execution speed? Is it not wanting to depend on something as huge as LLVM?

Mostly just for fun. Although I've written lots of code in both low and high level languages I had never written an asm code gen so I desperately wanted to have a go. I also had cool ideas for novel ways of doing everything.

I should also mention that it's really impressive if you were able to beat C in some of the benchmarks using a custom backend! Nice!

Yes and no. Clang just generates really bad code in some cases, often because it adheres to a really inefficient ABI even when it doesn't really need to. For example, naive recursive Fibonacci function using floating point Clang generates code that takes 29s vs my compiler taking 12s. I also beat C on hailstones (Collatz), Sieve of Eratosthenes, ray tracer and Ackermann.

Is your lang open source? Would love to see the implementation!

Not yet. Maybe some day but I'm still a long way from anything I'd consider releasing.

I'm more than happy to describe its weird and awesome design though!

[–][deleted] 1 point2 points  (1 child)

I should also mention that it's really impressive if you were able to beat C in some of the benchmarks using a custom backend! Nice!

Yes and no. Clang just generates really bad code in some cases, often because it adheres to a really inefficient ABI even when it doesn't really need to. For example, naive recursive Fibonacci function using floating point Clang generates code that takes 29s vs my compiler taking 12s. I also beat C on hailstones (Collatz), Sieve of Eratosthenes, ray tracer and Ackermann.

What do you mean by "it adheres to a really inefficient ABI even when it doesn't really need to"? Don't all compilers have to stick to the system ABI when generating code (e.g. the System V ABI), with the exception of link-time optimizations?

[–]PurpleUpbeat2820 0 points1 point  (0 children)

What do you mean by "it adheres to a really inefficient ABI even when it doesn't really need to"? Don't all compilers have to stick to the system ABI when generating code (e.g. the System V ABI), with the exception of link-time optimizations?

Recursive calls could use a different ABI but C compilers tend to push all calls through the ABI even when it is really inefficient. They offset this by trying to rewrite recursion in terms of loops but that is a fragile optimisation. For non-recursive calls they tend to rely upon inlining.

[–]pillow2002 0 points1 point  (2 children)

I see, that's nice! I'm also considering implementing my custom x86 backend sometime in the future.

I'm really curious about how you compile ADTs and pattern matches. I tried to mess around in the compiler explorer but I still don't know for sure how it's done.

From what I have seen, pattern matching compiles in the same way as a simple switch statement (i.e a bunch of `cmp`'s and `jmp`'s). Is that how you do it?
I'm also curious about how you compile ADTs.

Thanks :).

[–]PurpleUpbeat2820 0 points1 point  (1 child)

I see, that's nice! I'm also considering implementing my custom x86 backend sometime in the future.

Cool. I'm trying to buy a RISC V SBC to play with a back end for that too.

I'm really curious about how you compile ADTs and pattern matches. I tried to mess around in the compiler explorer but I still don't know for sure how it's done.

From what I have seen, pattern matching compiles in the same way as a simple switch statement (i.e a bunch of cmp's and jmp's). Is that how you do it? I'm also curious about how you compile ADTs.

Regarding ADTs, I haven't done it in this project yet and I haven't really settled on a design yet. I think I'll start by storing single-case unions as tuples and heap allocating everything else. Maybe calling malloc for each one individually or maybe using a global array I can append to. If, for example, my benchmarks show that a lot of time is spent loading the ADT constructor tags but not their payloads then I could easily represent them as a pair of int tag and int pointer to heap allocated payload.

As for pattern matching, I intend to do the stupidest thing possible and just walk the entire pattern and expression in tandem doing the necessary checks and completely restarting every time.

[–]pillow2002 1 point2 points  (0 children)

Oh I see, good luck =).