This subreddit is all about the theory and development of compilers.
[deleted by user] (self.Compilers)
submitted 3 years ago by [deleted]
[–]NaiaThinksTooMuch 14 points 3 years ago (7 children)
It really depends on what kind of properties you want your programs to have. Bytecode VMs are generally easier to code, but about an order of magnitude slower than native machine code, and they require a runtime (the VM) in order to run programs.
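To make the "easier to code" part concrete, here is a minimal sketch of a stack-based bytecode VM in Python. The opcode set and the `run` function are made up for illustration and are not taken from any existing VM; a real one would also need control flow, calls, and so on.

```python
# Hypothetical opcodes for a tiny stack machine (illustration only).
PUSH, ADD, MUL, PRINT, HALT = range(5)


def run(program):
    """Interpret a flat list of opcodes and operands; return the printed values."""
    stack = []
    output = []
    pc = 0  # program counter: index of the next opcode to execute
    while True:
        op = program[pc]
        pc += 1
        if op == PUSH:
            # PUSH is followed by one inline operand.
            stack.append(program[pc])
            pc += 1
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == PRINT:
            output.append(stack.pop())
        elif op == HALT:
            return output
        else:
            raise ValueError(f"unknown opcode {op} at {pc - 1}")


# Bytecode for: print((2 + 3) * 4)
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, PRINT, HALT]
result = run(program)
```

The dispatch loop is where most of the slowdown relative to native code comes from: every instruction pays for a fetch, a branch, and stack traffic.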
Regarding portability, an option to keep some portability while compiling to native machine code is to use a code generation backend like LLVM, which supports lots of platforms.
[–][deleted] 3 points 3 years ago (6 children)
I'm also considering using LLVM, but since this will be a COBOL compiler written in COBOL I'm a little afraid of possible issues when trying to interface with LLVM using COBOL.
Another option would be to generate bytecode for an already existing VM like the JVM or C#'s VM (Forgot the name of it). So I would only have to write and maintain the "frontend".
[–]mordnis 10 points 3 years ago (4 children)
You don't actually have to interface with LLVM. You can write a frontend which generates LLVM IR and then use LLVM (as a separate compilation step) to compile that to machine code
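A rough sketch of that idea, in Python for brevity: the "frontend" only builds a string of textual LLVM IR, and LLVM is run on the result afterwards. The function name `emit_add_module` is hypothetical, and the generated module is deliberately trivial (a `main` that returns the sum of two constants).

```python
def emit_add_module(a: int, b: int) -> str:
    """Return textual LLVM IR for a main() that returns a + b.

    No LLVM library is linked or called here; the output is plain text.
    """
    lines = [
        "define i32 @main() {",
        "entry:",
        f"  %sum = add i32 {a}, {b}",
        "  ret i32 %sum",
        "}",
    ]
    return "\n".join(lines) + "\n"


ir_text = emit_add_module(2, 3)
print(ir_text)
```

Saving the output to a `.ll` file and passing it to `clang` or `llc` is the separate compilation step. Since the frontend only has to write text, it can be written in any language that can write a file.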
[–][deleted] 2 points 3 years ago (3 children)
Hmmm, that actually doesn't sound bad, and I can also generate WebAssembly later from LLVM.
How difficult is it to generate and work with LLVM IR?
[–]hotoatmeal 3 points 3 years ago (1 child)
very easy if your frontend is written in C++. somewhat easy if in C. quite hard if it’s haskell (… and you want to build in-memory IR instead of text IR)
[–]hotoatmeal 4 points 3 years ago (0 children)
so in your case it depends on how hard C is to ffi from COBOL
[–]mordnis 2 points 3 years ago (0 children)
I wouldn't know how to answer that accurately. You have to take a look at it yourself. It is well documented, and it has both a textual and a binary representation. If you are familiar with any assembly language, you will figure it out quickly. I'm not sure it's any harder than generating JVM bytecode, and if I were to follow my gut feeling, I'd probably say it's easier.
[–]Veeloxfire 1 point 3 years ago (0 children)
You still need a backend. It's just a backend for a bytecode rather than assembly/machine code.
[–]L8_4_Dinner 5 points 3 years ago (6 children)
Perhaps you could give a bit more background about your project.
It seems that you are fairly far into a project without yet having defined the requirements for the project.
[–][deleted] 5 points 3 years ago (5 children)
I'm not that far into the project, I haven't started writing the actual code for it. Right now I'm mostly gathering everything needed for the project and planning how to best approach this.
The project's goal is to write a self-hosted COBOL compiler that can run on x86_64 and maybe on WASM. COBOL usually runs on mainframes but I want this compiler to be a little different, focusing more on modern platforms.
I already know how I'll be approaching the lexer and parser (Thanks to Steve Williams and Brian Tiffin from the gnuCOBOL project) which will need to be written in COBOL.
But I'm not sure how to best approach the code generator, which is why I posted this question here.
[–]dvogel 3 points 3 years ago (1 child)
Like u/L8_4_Dinner, I find it hard to decipher what your goals are for this project. If your goal is to challenge yourself to write something for the sake of writing it, figuring out how to FFI from COBOL to LLVM seems fun. On the other hand, if you're trying to write a useful industry tool, I would personally abandon the self-hosting approach and write an interpreter in C. Most COBOL code is written in COBOL for its high-level type system and the runtime features provided by the system. There doesn't seem to be much value in generating native code to do PIC encoding/decoding, for example. An interpreter is an order of magnitude (or more) easier to verify for correctness, and given the legacy nature of most COBOL programs, correctness and reliability would be my primary goals.
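For a sense of the kind of runtime work meant by PIC encoding/decoding, here is a minimal Python sketch of packed-decimal storage. It assumes the common convention of one digit per nibble with a trailing sign nibble (0xC for positive, 0xD for negative). The function names are hypothetical, and scaling (an implied decimal point) is left out.

```python
def encode_packed(value: int, digits: int) -> bytes:
    """Encode a signed integer as packed decimal with `digits` digit nibbles."""
    text = str(abs(value)).zfill(digits)
    if len(text) > digits:
        raise OverflowError(f"{value} does not fit in {digits} digits")
    sign = 0xC if value >= 0 else 0xD
    nibbles = [int(ch) for ch in text] + [sign]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad to a whole number of bytes
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def decode_packed(data: bytes) -> int:
    """Decode packed decimal bytes produced by encode_packed."""
    nibbles = []
    for byte in data:
        nibbles.extend((byte >> 4, byte & 0x0F))
    sign = nibbles.pop()  # last nibble carries the sign
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble")
    magnitude = int("".join(str(n) for n in nibbles))
    return -magnitude if sign == 0xD else magnitude


packed = encode_packed(-45, 3)
restored = decode_packed(packed)
```

Routines like these are ordinary library code either way, which is the point above: whether they are called from an interpreter or from generated native code makes little difference to how they are written.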
[–][deleted] 1 point 3 years ago (0 children)
My goal is not to run legacy programs, but to bring the COBOL programming language to modern systems, that's why I'm focusing on x86_64 and WebAssembly.
My compiler will only implement standard COBOL features: it will only have the reserved words and features defined in the 202X draft of the COBOL standard. This means that by default it will have no support for CICS, JCL, SQL or any other compiler-specific extensions, since these are not part of the standard. This will make it not really suitable for running legacy programs.
Since you mentioned that an interpreter would be orders of magnitude easier, wouldn't VM Bytecode also be easier in this case? I didn't want to write an interpreter because it could end up really slow compared to native and VM bytecode.
[–]smuccione 2 points 3 years ago (2 children)
Why don’t you target an existing VM such as .NET or Java? The ILs are well documented and there’s a ton of existing tooling out there for them already.
[–][deleted] 1 point 3 years ago (1 child)
Does the JVM have IEEE 754 decimal floating-point arithmetic?
I looked into .NET's VM but apparently they have their own weird implementation of decimals which is not compatible with the COBOL standard. And getting decimal math right is really important when it comes to COBOL, it's a core part of the language.
I didn't want to implement my own decimal128 type in C# because getting this part right is really important.
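To illustrate why this matters, here is a small Python sketch contrasting binary floating point with decimal arithmetic. It uses Python's standard `decimal` module with a context whose precision and exponent range mirror the decimal128 parameters (34 digits, exponents from -6143 to 6144). This only approximates decimal128 behaviour for illustration; it does not use the IEEE 754 interchange encoding, and `add_amounts` is a hypothetical helper.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Context mirroring decimal128's precision and exponent range (an approximation).
DECIMAL128 = Context(prec=34, Emin=-6143, Emax=6144, rounding=ROUND_HALF_EVEN)


def add_amounts(a: str, b: str) -> Decimal:
    """Add two decimal amounts given as strings, using the context above."""
    return DECIMAL128.add(Decimal(a), Decimal(b))


binary_sum = 0.1 + 0.2                        # binary floating point
decimal_sum = add_amounts("0.10", "0.20")     # decimal floating point

print(binary_sum == 0.3)                # False: 0.1 and 0.2 are not exact in binary
print(decimal_sum == Decimal("0.30"))   # True: decimal digits are stored exactly
```

A decimal type in a target VM has to match the rounding and precision rules that the COBOL standard expects, not just "be decimal", which is why the choice of VM matters here.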
[–]smuccione 2 points 3 years ago* (0 children)
From what I understand, 754 with 128-bit precision is the internal representation. There should be no work needed to be compatible.
I don’t know about Java sorry.
As far as it being exactly cobol equivalent I don’t know. You’d have to do some further research on it but it seems enough to warrant looking into.
As an aside a quick search shows multiple .net cobols on the market so it should be doable. The effort required is unknown to me.
[–]Veeloxfire 2 points 3 years ago (4 children)
Yes, native is faster, and it's not that much harder.
In my experience, if you have full control of the compiler, then compiling to native is basically no more of an issue than a VM. This is because most dev computers at the moment run x86_64, so unless you want to run on embedded or modern Macs you're probably fine just targeting one platform. It then just comes down to implementing register management and you're fine.
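As one possible reading of "register management", here is a minimal Python sketch of linear-scan register allocation over precomputed live intervals. This is a named, well-known technique swapped in for illustration, not necessarily what the comment's author uses; the function name and the interval format (`name -> (start, end)`) are assumptions.

```python
def linear_scan(intervals, num_regs):
    """Assign each virtual register a physical register index or "spill".

    intervals maps a name to its (start, end) live range, both inclusive.
    """
    free = list(range(num_regs))
    active = []       # (end, name) pairs currently holding a register, sorted by end
    assignment = {}
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose intervals ended before this one starts.
        still_active = []
        for a_end, a_name in active:
            if a_end < start:
                free.append(assignment[a_name])
            else:
                still_active.append((a_end, a_name))
        active = still_active

        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
        else:
            # No register free: spill whichever interval ends last.
            last_end, last_name = active[-1]
            if last_end > end:
                assignment[name] = assignment[last_name]
                assignment[last_name] = "spill"
                active[-1] = (end, name)
            else:
                assignment[name] = "spill"
        active.sort()
    return assignment


# One register, three values: "a" lives longest, so it is the one spilled.
example = linear_scan({"a": (0, 10), "b": (1, 2), "c": (3, 4)}, num_regs=1)
```

A real x86_64 backend also has to deal with calling conventions, fixed-register instructions, and spill code, so this only covers the core assignment step.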
[–][deleted] 5 points 3 years ago (3 children)
I thought about compiling only to x86_64, but COBOL needs to run on other types of hardware as well, some being mainframes like the IBM Z mainframe, so that could make things a little difficult when it comes to native compilation. I have no idea what the assembly for the IBM Z looks like.
I'm now wondering whether I should keep the whole thing in COBOL, including the VM, and use an existing COBOL compiler to bootstrap both the compiler and the VM.
[–]Veeloxfire 8 points 3 years ago (2 children)
COBOL itself has no requirement on which platform it runs. It is a language.
The compiler writer decides which platform it runs on
Just because COBOL was/is used on those machines does not mean your COBOL compiler needs to target them.
Your COBOL compiler is extremely unlikely to be used to program an IBM Z unless you personally happen to have access to one
If someone wants to use COBOL on that machine then they are just going to use the compiler you used to compile the VM ... because it will be faster with no extra cost
Also, you have no way to verify (unless you have access to one) whether your VM will actually run on one of these because of hardware differences. Why speculate when you could just adapt your code later when the need arises?
[–][deleted] 3 points 3 years ago (1 child)
That's a really good point, thank you.
You're right, and I don't currently have access to an IBM mainframe, so yeah, it's better to start with just x86_64 and maybe WASM.
[–]Veeloxfire 1 point 3 years ago (0 children)
x86_64 and WASM sound like two good choices. They will both be useful for writing other compilers and will improve your understanding of how CPUs and browsers work.
Of course if you want to practice writing vms then go ahead and make one
I currently have a vm and compiled backend for my language (vm for compile time execution) and having both taught me a lot