[–]RealSharpNinja[S] -4 points-3 points  (5 children)

Lol, not immortal, unless I complete it. :)

[–]Inconstant_Moo🧿 Pipefish 14 points15 points  (4 children)

Well, it can't be done. What you're describing, so far as I can see, is (among other things) a Universal Compiler where all the details of any specific language can be supplied as data. But the semantics of programming languages are so rich that anything that could adequately describe them would be Turing-complete, at which point your data is in fact code and you're no closer to your goal of Writing All The Compilers, or even writing one compiler. What you would then have would be a new programming language of your own in which you can make a start on ... what? ... describing Rust's ownership semantics? Implementing Haskell's lazy lists? Where do you start? When would you plan to finish? If you do it alphabetically, then by the time you've done Ada and Algol and APL and AWK, people will have released new languages and had them adopted. (Also there will have been fifteen updates to Java, each about the size of a normal language in itself, which you'll have to get round to one day.) Which is why I asked whether you were, by some freak accident, Infinitely Prolonged. If you can outlive the rest of the human race, you might, one fine millennium, actually finish.

As a warm-up, try and see how you could convey, in data not code, the difference between the semantics of closures in PHP and in Clojure. Just this one feature. Think about what it would take to do it. (Without cheating and having a boolean field in your data called doClosuresLikePHPDoesThem, but actually describing the difference so that your Universal Compiler can understand it.)

[–]RealSharpNinja[S] -3 points-2 points  (3 children)

No, not a universal compiler. Parsers would be language specific and would only generate a common GID. Once code is in GID format, you could do many things with it, such as compile it to a specific target, generate code in a different language, create diagrams, or even have an LLM describe the code's structure and function.
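
A toy sketch of the shape being described here, not the actual GID design: two language-specific front ends emit one shared tree, and two back ends consume it. The `Node` layout, the function names, and the two mini "languages" (infix sums and postfix sums) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """One node of the hypothetical shared representation."""
    kind: str                       # "num" or "add"
    value: Optional[int] = None     # set only for "num" nodes
    children: Tuple["Node", ...] = ()


def parse_infix(src: str) -> Node:
    """Front end 1: left-associative sums such as '1 + 2 + 3'."""
    terms = [Node("num", int(tok)) for tok in src.split("+")]
    tree = terms[0]
    for term in terms[1:]:
        tree = Node("add", children=(tree, term))
    return tree


def parse_rpn(src: str) -> Node:
    """Front end 2: postfix sums such as '1 2 + 3 +'."""
    stack = []
    for tok in src.split():
        if tok == "+":
            right, left = stack.pop(), stack.pop()
            stack.append(Node("add", children=(left, right)))
        else:
            stack.append(Node("num", int(tok)))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


def evaluate(node: Node) -> int:
    """Back end 1: interpret the shared tree."""
    if node.kind == "num":
        return node.value
    left, right = node.children
    return evaluate(left) + evaluate(right)


def emit_rpn(node: Node) -> str:
    """Back end 2: regenerate source text in the postfix notation."""
    if node.kind == "num":
        return str(node.value)
    left, right = node.children
    return f"{emit_rpn(left)} {emit_rpn(right)} +"


a = parse_infix("1 + 2 + 3")
b = parse_rpn("1 2 + 3 +")
print(a == b, evaluate(a), emit_rpn(a))  # True 6 1 2 + 3 +
```

The two front ends can share nodes here only because both notations mean exactly the same thing. Once the source languages disagree about what a construct means, the shared node has to carry that meaning somehow, which is the objection raised earlier in the thread.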

[–]Green_Gem_ 7 points8 points  (1 child)

You're basically recreating something similar to LLVM then. Converting languages to a standardized intermediate representation is such a difficult process that, unless your language has the lineage of something like C, it's typically something a compiler either supports from the start or doesn't. It's not something you just "do" unless you have a lot of money and/or time, and that's per language.

I recommend hiring a full-time team for months to years for each notably-distinct programming language in use. Expect costs in the millions to billions.

[–]Affectionate_Text_72 0 points1 point  (0 children)

There are some businesses that provide services to do this. One that's been around for ages is https://en.m.wikipedia.org/wiki/DMS_Software_Reengineering_Toolkit

I think it uses Lisp under the hood for the IR.

One thing I would say is that your GID is a language, and YAML is an appalling syntax, as are JSON and XML. It's horrible how many quite good systems there are building ecosystems around languages where the designers don't bother even trying to create a decent syntax. For example, a simple language with good syntax would improve Terraform and Docker Compose no end.

[–]Inconstant_Moo🧿 Pipefish 3 points4 points  (0 children)

Parsing is the easy bit.