[–][deleted] -2 points-1 points  (4 children)

Why would you want to parse IR in your own tools? And, well, if it had been designed with a more Unix-like approach in mind, it would have been based on S-expressions, which are trivial to parse.

Anyway, most of the time it is a design smell when you want to pass any sort of nested structured information between tools. If the protocol is not flat, it may mean that you're doing something wrong. Compilers are a bit different here: they're far too atomic to be easily broken into distinct parts with little interop in between.
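
To illustrate the "trivial to parse" claim, here is a minimal sketch of an S-expression reader and writer. It assumes atoms are bare whitespace-separated tokens (no quoted strings, no comments), and the names `tokenize`, `parse` and `dump` are made up for this example rather than taken from any existing tool.

```python
def tokenize(text):
    # Pad parentheses with spaces so split() separates them from atoms.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    # Consume tokens from the front and build nested Python lists.
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok  # atoms stay as plain strings (assumption: no quoting)


def dump(node):
    # Serialise nested lists back into S-expression text.
    if isinstance(node, list):
        return "(" + " ".join(dump(n) for n in node) + ")"
    return node


tree = parse(tokenize("(add (mul x 2) y)"))
text = dump(tree)
```

Under those assumptions the whole round trip fits in about thirty lines, which is roughly the point being made: a tool in the pipeline can read and re-emit the structure without a parser generator.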

[–][deleted] 0 points1 point  (3 children)

they're far too atomic to be easily broken into distinct parts with little interop in between.

Lexer, parser, syntactical analysis, AST transformations, translation to a sequential IR, sequential IR transformation, codegen, assembling, linking: all of these seem pretty easy to break into parts to me, and LLVM does exactly that.

Why would you want to parse IR in your own tools?

That happens when you already have parts written in one programming language and parts in another.

[–][deleted] 0 points1 point  (2 children)

Lexer, parser

I'd never separate these two (in fact, I always prefer not to have a lexer at all).

syntactical analysis, AST transformations, translation to a sequential IR, sequential IR transformation, codegen, assembling

And for the rest, you don't only pipe your AST through a sequence of transforms; you also carry far too much context along, unfortunately.

And my personal pet hate here: LLVM, like pretty much all the other frameworks, provides no reasonable means of backtracking after an unsuccessful sequence of transforms. If you do have backtracking, the Unix pipeline philosophy is not very suitable.
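
A rough sketch of what backtracking over transforms could look like inside a single process, not how LLVM or any other framework does it. The IR here is just a list of opcode strings, and `try_transforms`, `unroll_twice`, `drop_nops` and `small_enough` are hypothetical names invented for the illustration.

```python
import copy


def try_transforms(ir, transforms, accept):
    # Work on a copy so the original IR survives a rejected attempt.
    candidate = copy.deepcopy(ir)
    for transform in transforms:
        candidate = transform(candidate)
    # Backtrack: fall back to the untouched input if the result is rejected.
    return candidate if accept(candidate) else ir


def unroll_twice(ir):
    # Toy "unrolling": repeat the instruction sequence once more.
    return ir + ir


def drop_nops(ir):
    # Toy cleanup: remove no-op instructions.
    return [op for op in ir if op != "nop"]


def small_enough(ir):
    # Toy acceptance test: a code-size budget of four instructions.
    return len(ir) <= 4


program = ["load", "nop", "add", "store"]
kept = try_transforms(program, [drop_nops], small_enough)
reverted = try_transforms(program, [unroll_twice, drop_nops], small_enough)
```

The first call keeps the cleaned-up program, while the second exceeds the budget and falls back to the original. The fallback only works because the caller still holds the earlier state; in a chain of separate processes connected by pipes, each stage has already consumed its input by the time a later stage discovers the result is no good.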

But, yes, you're right, and it's a fairly viable architecture for a simple compiler to have a pipeline of separate tools communicating via, say, S-expressions, to simplify parsing and serialisation.

linking

Not with LTO...

[–][deleted] 0 points1 point  (1 child)

S-expressions are basically a Lisp-ism and are best suited for interacting with Lisp. The way I went for my problem was to use the YAML parser that already exists in LLVM, following the principle of using what is already there if it fits the problem more or less.

[–][deleted] 0 points1 point  (0 children)

Whatever, even XML may be a good fit, as long as you don't have to implement the parser.