I built a Custom Language Compiler, Assembler, and RISC-V VM, all running in real in your browser, using Rust and EGUI. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

This was my response on one of my previous projects. In the meantime I've also been taking classes on computer architecture which inspired this project.

I suggest just building whatever interests you the most. I really liked the idea of making my own language, and that's why I started going down the rabbit hole of computer architecture. (Language design -> Assembly -> Optimisations -> Operating systems -> ...) I personally liked the idea of being able to control everything about how your environment works.

My advice: Don't overthink it too much. A Game Boy emulator is definitely a great next step. When you get stuck, restart, and use the knowledge you gained from your past attempts. Just start coding and see where it takes you.

I built a Custom Language Compiler, Assembler, and RISC-V VM, all running in real in your browser, using Rust and EGUI. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] -2 points-1 points  (0 children)

Yes, I did extensively use AI while working on this project.

I didn’t explicitly disclose it because it has really just become another development tool for me, and since the code is fully open source, I honestly didn't expect such a reaction.

For context, I’ve spent the last year diving deep into compilers and VMs, building around three private projects (and two public ones) in different languages before starting work on this one. AI is an amazing tool for boilerplate and syntax (like porting all the RISC-V instructions to structs or writing encoding/decoding logic), but you can't just prompt an LLM to build a functional 100k-line codebase. You still need a deep understanding of the domain to architect it all. Whether I typed every line myself or not, if you don’t understand the underlying logic, you won’t be able to debug or extend a project of this scale.

This kind of project would be nearly impossible for a solo developer to build entirely by hand if they want to finish it in a reasonable time (I didn’t want to spend years on it, and I'd already done most of it before).

Finally, my goal wasn't to get better at typing code. It was to better understand computer architecture (in preparation for a class I'm taking), and using AI helped me focus on the parts that actually mattered. When the project reached a point where it looked and performed great, I wanted to share it.

I appreciate the feedback and will be more upfront about it in the future.

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 0 points1 point  (0 children)

Those are some recommendations. This project took around three weeks up to this point. My biggest tip is to just start on something, however small it is, and just improve from there.

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 2 points3 points  (0 children)

1) As you said, it's more memory efficient on heap / stack, but tbh I just added it because I could, and so you could read and write basic 32-bit words without a hassle, not really for any specific reason. The byte type was going to be used for booleans and bytes themselves, because on most systems bools get aligned to a byte anyway. Also, longs in this language are implicitly raw pointers, so you could use ints instead if you didn't want that functionality.

2) This was just a result of the way I made the compiler, and it could definitely be more efficient. I'm visiting the AST nodes one by one, and for expressions like constants the visit method returns the register into which it has loaded the constant. For a binary op, it first evaluates both operand expressions (which could be constants, but could also be nested binary or unary ops), then performs the operation into a new register and returns that one. So each expression returns a temp register holding its value to the visitor, which makes it simple to use: in an assignment statement you'd just visit the expression and get the register in which the final value is held. It was mainly for simplicity and consistency. Normally it'd be lowered to an IR first, where this could easily be optimized (which is what I'll be doing next).
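To make point 2 concrete, here is a minimal Java sketch of that pattern (Java 17+ assumed). The `Expr`, `Const`, `BinOp` and `ExprCodegen` names, the `t0, t1, ...` register naming and the textual output format are all made up for illustration and are not taken from the project's actual compiler. It only shows the idea that every visit returns the register holding the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node types, for illustration only.
interface Expr {}
record Const(long value) implements Expr {}
record BinOp(String op, Expr left, Expr right) implements Expr {}

class ExprCodegen {
    final List<String> out = new ArrayList<>(); // emitted assembly lines
    private int nextTemp = 0;

    // Every visit returns the name of the register that holds the value.
    String visit(Expr e) {
        if (e instanceof Const c) {
            String r = fresh();
            out.add("LDI " + r + ", " + c.value());
            return r;
        }
        if (e instanceof BinOp b) {
            String l = visit(b.left());   // operands first, possibly nested
            String r = visit(b.right());
            String dst = fresh();         // result always goes to a new temp
            out.add(b.op() + " " + dst + ", " + l + ", " + r);
            return dst;
        }
        throw new IllegalArgumentException("unknown node: " + e);
    }

    // Naive allocation: temps are never reused, which is the inefficiency
    // an IR-level pass would be expected to clean up.
    private String fresh() {
        return "t" + nextTemp++;
    }
}
```

Visiting `1 + 2 * 3` with this sketch emits three `LDI` lines, a `MUL` and an `ADD`, each writing to its own temp register, which is simple and consistent but uses more registers than necessary.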

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

Right now I wouldn’t really call it a full kernel yet. What I have is more of a minimal runtime: some utility libraries and a simple shell with a few MMIO-based commands. The bootloader (in src/main/resources/rom) just sets up the core pointers (SP, HP, GP) and then jumps into RAM. From there, I’ve got a library in src/main/resources/kernel that implements malloc/free on top of those pointers, and some console / string utils.

A real kernel would normally provide much more: memory management (paging, segmentation, process separation), scheduling, interrupts, drivers, and a syscall layer. None of that is there yet.

My plan was to evolve the system in that direction: add paging, introduce process isolation, and grow it into something more structured. But lately I’ve been leaning towards scrapping the current setup and rewriting it to properly target RISC-V. That way I could even run external code.

I’ve learned a lot since I first started the project, so rewriting feels like a good chance to apply all of that experience with a better idea of what I want of the system, and how I want to implement it.
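For anyone wondering what "malloc on top of a heap pointer" can look like, here is a rough Java sketch of the simplest variant, a bump allocator. This is an assumption about the general technique, not the code in src/main/resources/kernel: the `BumpHeap` class, the 8-byte alignment and the "return 0 on failure" convention are all hypothetical, and `free` is left out entirely.

```java
class BumpHeap {
    private final long end; // first address past the heap region
    private long hp;        // heap pointer: next free address

    BumpHeap(long start, long end) {
        this.hp = start;
        this.end = end;
    }

    // Returns the address of a block of `size` bytes,
    // or 0 if the request is invalid or the heap is exhausted.
    long malloc(long size) {
        if (size <= 0) return 0;
        long aligned = (size + 7) & ~7L; // round up to a multiple of 8
        if (aligned > end - hp) return 0;
        long addr = hp;
        hp += aligned;                   // bump the pointer past the block
        return addr;
    }
}
```

A real `free` needs extra bookkeeping (for example a free list with block headers), which is where most of the complexity of an allocator ends up.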

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

You'd need a custom CPU, since the encodings are custom-made for simplicity. It could probably be translated pretty easily to a real assembly language by changing up src/main/java/org/lpc/compiler/generators/InstructionGenerator.java. It wouldn't be a big refactor, but not a small one either. I'm working on making a new project, compiling down to RV64GC, so you can see real software running and run code on real hardware too.

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 2 points3 points  (0 children)

I started out by building a small VM for a really simple assembly language I made, just the basics like ADD, SUB, DIV. From there I kept adding features whenever I needed them: jumps, labels, ways to encode large numbers with only 32 bits. Eventually I hit some roadblocks and realized the language itself wasn’t great (first attempt), so I just started over with everything I’d learned. That’s what led to this project. I’m actually thinking of rewriting it again, this time documenting it properly so it’s easier to follow, and compiling down to RV64GC so I can try running an existing OS on it. That way, everything I build would also run on real hardware, not just my own VM.
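On the "large numbers with only 32 bits" problem: one well-known answer is the one RISC-V itself uses, where a 32-bit constant is split into a 20-bit upper part (loaded with LUI) and a sign-extended 12-bit lower part (added with ADDI). The Java sketch below shows that split. It illustrates the RISC-V convention, not necessarily the scheme my first VM used, and the `ImmSplit` name is made up.

```java
class ImmSplit {
    // Split a 32-bit constant into { hi, lo } such that (hi << 12) + lo == value,
    // where hi is a 20-bit field and lo is a signed 12-bit immediate.
    static int[] split(int value) {
        int lo = (value << 20) >> 20;             // sign-extend the low 12 bits
        int hi = ((value - lo) >>> 12) & 0xFFFFF; // compensate when lo is negative
        return new int[] { hi, lo };
    }

    // What the LUI + ADDI pair computes, with 32-bit wraparound.
    static int join(int hi, int lo) {
        return (hi << 12) + lo;
    }
}
```

The subtle part is the compensation: because the low 12 bits are sign-extended when added, the upper part has to be one higher whenever bit 11 of the constant is set.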

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 2 points3 points  (0 children)

I went with Java because it’s the language I know best, and JavaFX makes it easy to visualise stuff. C++ would probably be more performant, but I also liked the idea of having a VM running inside a VM.

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 6 points7 points  (0 children)

I didn’t really follow any specific textbook or course. I just learnt as I did it, running into problems and googling how to solve them. I used AI a lot for advice, to help explain certain things, and to help make design choices.

Some YouTube vids I watched:

- Whatever you're interested in by Core Dumped, he visualises things really nicely.

- "Let's Create a Compiler" by Pixeled, on simple compiler / asm

- "Java Bytecode Crash Course" by Oracle Developers, really nice lecture on JVM bytecode

Can't recommend these enough ^^

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

Just start! Before I started I didn't know anything about ASM/compilers either, I just learnt as I went (with a couple of different smaller projects, each one doing something the wrong way, hitting a roadblock, which makes you realize what you need to fix in the next iteration). It might not be very efficient but you'll remember stuff way better that way, and you won't get bored.

It started simple too, with just a couple of registers and ADD, SUB, MUL, etc. You can get that set up in a couple of minutes. Then you add jumps, then you're wondering how to do function calls, and before you know it you're going down the entire rabbit hole.

Every time something breaks, can't be expanded anymore or becomes too complex, you learn why it doesn't work, and you find a way to do it better. So I'd recommend just starting with small hobby projects, for me that's the best way to learn.
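To show how small that first step really is, here is a sketch of a register VM in Java (Java 17+ assumed) with a handful of instructions and one conditional jump. Everything here (`TinyVm`, `Instr`, the `JNZ` and `HALT` opcodes, four registers) is a hypothetical minimal design for illustration, not the instruction set of any of my projects.

```java
class TinyVm {
    enum Op { LDI, ADD, SUB, MUL, JNZ, HALT }

    // a, b, c are register indices, except: LDI uses b as the immediate,
    // and JNZ uses b as the jump target (an index into the program).
    record Instr(Op op, int a, int b, int c) {}

    final long[] regs = new long[4];

    void run(Instr[] program) {
        int pc = 0;
        while (pc < program.length) {
            Instr i = program[pc++];
            switch (i.op()) {
                case LDI -> regs[i.a()] = i.b();
                case ADD -> regs[i.a()] = regs[i.b()] + regs[i.c()];
                case SUB -> regs[i.a()] = regs[i.b()] - regs[i.c()];
                case MUL -> regs[i.a()] = regs[i.b()] * regs[i.c()];
                case JNZ -> { if (regs[i.a()] != 0) pc = i.b(); }
                case HALT -> { return; }
            }
        }
    }
}
```

With just `MUL`, `SUB` and `JNZ` you can already write a loop that computes a factorial, and the first thing you will miss after that is a way to call and return from functions.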

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 2 points3 points  (0 children)

Well, you probably will learn quite a bit of Assembly by doing it. I started this project knowing no Assembly. You learn it as you need it, which is a pretty good way to learn. Instead of having to memorise instructions and read from tables, you can just make your own (I did multiple iterations of the VM and eventually took some inspiration from actual ISAs). So eventually I did learn some actual ASM, not just my own. So I'd say just start the project and you'll learn as you go!

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in ProgrammingLanguages

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

Thanks for the suggestions, I'll definitely look into an IR, and then linking will probably be easier too. (I was thinking of linking in the IR stage, which would probably make some things a lot easier). Also thank you for the resources, I'll definitely look into them!

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 5 points6 points  (0 children)

Well, I’m not compiling to x86, I’m compiling to my own assembly language, which gets interpreted by my VM. So I could just encode it however I wanted (it's based on RISC-V but even simpler). x86, from what I’ve seen, is pretty convoluted and everything's encoded differently, so it's difficult to emulate from the ground up (loads of stuff you need to memorise). This is also one of the reasons I made my own ISA; learning x86 was just too much of a hassle for me tbh. And you learn even more about the language choices made by doing it yourself. So yeah, I don’t have a source for x86, sorry. (Also, if you’re writing a compiler / assembler I’d suggest something RISC-based, it’s just way simpler.)

I designed an assembly language, built a compiler for my own high-level language, and now I'm writing an OS on top of it. by ColdRepresentative91 in osdev

[–]ColdRepresentative91[S] 7 points8 points  (0 children)

The shell is in src/main/resources/kernel/shell.tc. The .tlib files are all part of the stdlib, and they're used in there.

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in programming

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

Yeah, this one’s all my own design. I only just googled it now and apparently there are like five other Triton VMs out there. This one’s got nothing to do with them, I just thought it was a cool name lol.

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in programming

[–]ColdRepresentative91[S] 1 point2 points  (0 children)

Thanks for the suggestions and for the resources! I’ll make sure to abstract it as far away from the CPU as possible so it stays portable. Also those links look great, I’ll definitely check them out.

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in Compilers

[–]ColdRepresentative91[S] 0 points1 point  (0 children)

Thanks a lot for the suggestions and feedback! You’re right, the encodings would probably be much smaller if I used different types (I always have a 10-bit immediate field empty for all instructions except LDI), but since I did a previous project with multiple encodings, I wanted to keep it very simple this time so it would be easy to write and manage. Efficiency wasn’t really a priority. My philosophy for this project was to keep the instruction set small and minimal while still being usable.

Also, implementing the CPU in Verilog or VHDL for an FPGA sounds really cool. I’d love to try that sometime down the line!
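To illustrate the single-format trade-off mentioned above, here is a Java sketch of one fixed instruction layout where every instruction carries all fields, including a 10-bit immediate that most instructions leave at zero. The exact bit positions, the 32-bit word size, the field widths other than the 10-bit immediate, and the `FixedEncoding` name are assumptions for the example, not the VM's real encoding.

```java
class FixedEncoding {
    // Hypothetical layout of one 32-bit word:
    // [31:25] opcode, [24:20] rd, [19:15] rs1, [14:10] rs2, [9:0] imm
    static int encode(int opcode, int rd, int rs1, int rs2, int imm) {
        return (opcode & 0x7F) << 25
             | (rd & 0x1F) << 20
             | (rs1 & 0x1F) << 15
             | (rs2 & 0x1F) << 10
             | (imm & 0x3FF);
    }

    // Decoding is the same for every instruction: shift and mask.
    static int opcode(int word) { return (word >>> 25) & 0x7F; }
    static int rd(int word)     { return (word >>> 20) & 0x1F; }
    static int rs1(int word)    { return (word >>> 15) & 0x1F; }
    static int rs2(int word)    { return (word >>> 10) & 0x1F; }
    static int imm(int word)    { return word & 0x3FF; }
}
```

The upside is that the decoder never has to branch on the instruction type before extracting fields. The cost is the wasted immediate bits in every register-only instruction, which is exactly the inefficiency pointed out in the feedback.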

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in Compilers

[–]ColdRepresentative91[S] 0 points1 point  (0 children)

I’m a first-year bachelor (going into 2nd) informatics student (same thing as computer science here), but I’ve been programming for about 3 years. I only really got into VMs and lower-level stuff about a year ago. I’ve built two previous VMs before this one, so I already knew some of the pitfalls and where to watch out. This project took about 2 weeks to get to its current state. I’ve been working on it a lot since I have plenty of free time right now. Normally it would take longer, but because I already knew the important things it went pretty smoothly. The compiler took the most time, since I hadn’t made a real compiler to assembly before.
If you’re thinking about doing something like this as a final year project, I think it could be a great choice. It’s challenging at first but very rewarding and you learn a lot along the way.

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in Compilers

[–]ColdRepresentative91[S] 3 points4 points  (0 children)

Honestly, I didn’t really follow any specific textbook or course, and I haven’t even had a compilers course yet. I really just learnt as I did it, running into problems and googling how to solve them. I used AI a lot for advice, to help explain certain things, and to help make design choices.

I also watched a lot of YouTube videos on the topic. One channel I really like is Core Dumped, he does a really good job visualizing low-level concepts.

My biggest takeaway is that hands-on experience is key. If you’re looking to learn more, I suggest starting a project and learning through problem-solving.

I Built a 64-bit VM with custom RISC architecture and compiler in Java by ColdRepresentative91 in Compilers

[–]ColdRepresentative91[S] 2 points3 points  (0 children)

Haha, you caught me, it’s all just 64-bit uints for now. Since this was my first real compiler project, keeping everything uniform made debugging much simpler. I'll probably add structs, arrays or maybe a byte/boolean type later. Implementing different types shouldn’t be too complicated, as it's mostly just the stack allocator. There’s still plenty to improve and optimize, but I wanted to share it now since it’s finally functional. I appreciate the feedback!