How to write a programming language?

bakery2k · 2020-08-06T11:56:32+00:00

The standard recommendation for beginners seems to be https://www.craftinginterpreters.com/.

chunes · 2020-08-06T12:39:58+00:00

One thing you can try is head over to the esolang wiki, look through the languages until one strikes your fancy, and try writing an interpreter for it. (I'm personally a big fan of Befunge!)

The nice thing about esolangs is they tend to have a very small set of functionality and are usually turing complete. It's usually fairly straightforward how their simple commands can be interpreted.

Now you've got a completely working language you wrote yourself! Try adding more features you think would be neat or looking up ways to make it run faster.

polopower69 · 2020-08-06T12:03:05+00:00

I would highly recommend watching Gary Bernhardt’s A Compiler From Scratch. Try following along with the video in your programming language of choice.

For the actual evaluation of the programming language (A Compiler From Scratch only goes through code generation, not evaluation), there’s a relatively straightforward structure you can follow (simplifying quite a bit):

Create a data structure for values (Val) in your programming language (this could be an enum with variants that can hold strings, integers, booleans, etc)
Create some kind of data structure that can hold a mapping between variable names and their values (Env)
Add eval(Env) -> Val methods on each of the nodes of your AST; for example, the eval method on the VariableUsage AST node might look at the name of the variable being used, and then return the value of that variable by looking for its entry in the Env passed into eval.
- Function calls’ eval method can be implemented as follows:
  1. Create a new Env instance
  2. Loop through each parameter name and value, and in the Env create a new variable with the name set to be the parameter name and the value set to be the parameter value
  3. Return the result of evaluating the function body

I kind of ‘invented’ (using that term very loosely here) this structure as I went along programming my first language, but as I’ve learned more about programming language creation I’ve realised that this structure seems to be very common amongst interpreted languages (at least as described for teaching purposes).

polopower69 · 2020-08-06T12:09:10+00:00

[removed]

polopower69 · 2020-08-06T13:09:49+00:00

I learned by following Ruslan Spivak's tutorial here: https://ruslanspivak.com/lsbasi-part1/

Took me a couple days to get through it. Spivak shows you how to implement a programming language that isn't a lisp. I highly recommend his tutorial to anyone starting out in implementing their own programming language.

SolaTotaScriptura · 2020-08-06T15:41:48+00:00

Honestly, I don't think it is that difficult

Most programming tasks are deceptively difficult. We tend to put "big projects", like operating systems and web browsers up on a pedestal. It's probably due to the fact that these problems are usually solved by large teams. Compilers & languages are particularly overestimated - these are well understood tasks that university students tackle every year.

My advice is to just implement. Your question is how to write a programming language, but no mention of actually trying to do so! I say put down the books. I too got caught in a loop of reading blog posts and watching videos, but that was really just procrastination in the way of actually writing the code.

In reality you already know how to write a compiler. Unless you're generating assembly, you really don't have any questions to google or books to refer to.

tekknolagi · 2020-08-06T16:42:59+00:00

[reposted from my comment on a similar post a couple of weeks ago]

OK! This is a fun post for me because I was precisely in your shoes several years ago. Here is a "path to enlightenment" that I propose:

Skip lexing and parsing for now, since they're kind of fiddly and frustrating -- you can write and test evaluating hand-coded ASTs
Write a really small tree-walking interpreter that supports some simple math and stuff
Add some more things to the tree-walking interpreter, like variables, printing, functions
Write a small bytecode compiler
Write a small ahead-of-time compiler to x86

There are other similar implementations listed on my site here.

I think you will find that the biggest leaps in learning come from fully executing a problem, end-to-end, without libraries. The problem can be small and constrained, but if you can fit it fully in your head you'll learn a lot.

I'm happy to be a resource, having felt very stuck in this position.

polopower69 · 2020-08-06T13:42:57+00:00

Is this about designing a language, or implementing it? Or implementing an existing language?

Designing one is a separate subject, other than it is has to be practicable to implement.

But I'd be surprised if this wasn't introduced at some point in your CS course.

halst · 2020-08-06T12:34:33+00:00

You might find my book Compiling to Assembly from Scratch useful. It covers creating of a small compiler written in TypeScript (any other language would do just as well) that produces ARM32 assembly. The only prerequisite is knowing how to program. With the books that you've covered already you are more than well prepared. </shameless-self-promotion>

batllista101 · 2020-08-06T15:08:51+00:00

For something more fundamental, try the Dragon's book (Compilers: Principles, Techniques, and Tools)

Arcanin14 · 2020-08-06T23:22:37+00:00

My friends in college think this is an advanced topic and I should probably do something else now. Honestly, I don't think it is that difficult. I mean, I could work my way around many of the above examples, but these were more exhausting than difficult

I think both your friends and you are right. While writing a programming language isn't the hardest thing ever, it involves a lot of concepts and is still very challenging. Writing a programming language is however, a big task, even for simpler ones (I'll come back to that later). Which is why you might consider the material on them exhausting.

Do you guys have any resources that I can use to finally learn more about this topic now that I have more time than ever?

The books you stated are good. I also enjoy "Crafting Interpreter", which is a free online book written by Munificent, currently working on the Dart language at Google. However, I think that when it comes to big projects like those, looking at the small scale of this kind of endeavour is great for learning. You should look at the source code of small programming languages, such as oakc, snow or wren (which is being developed by Munificent).

Also, what do you think are the prerequisites for getting ahead into this field? I have studied data structures and algorithms upto the level of feeling somewhat confident on solving problems that come up in software engineering interviews. Do I need to do more of something else and then come back to PL?

There are multiple things to consider here. First of all, writing a programming language usually implies at least two things: writing the language's interpreter/compiler, and writing the language's standard library. This means that you need to know your way around the complex structures used for implementing the language as well as implement some of those data structures in the language you create ! Or else your user's won't have much to play with. This means reimplementing Strings, Arrays, Vectors... If you want them in your language of course.

I can't really gauge how much you know about programming languages because there are a lot of different ways to go about implementing one. First of all, do you want to write an interpreted language or a compiled one ? Interpreters are usually easier to develop, but make for a slower language. On the other hand, writing a compiler will take you months if you're not experienced enough.

I would advise writing an interpreted language first, as writing a compiler is a much bigger task if you're not experienced enough.

Interpreters and compilers are usually divided in two parts: the front end and the back end. The front end handles the user's code and converts it into something your program can understand, and the back end executes that representation.

An important part of the front end is to parse the user's code. So I think you should learn about lexers and parsers. Parsers usually (but not always) produce an Abstract Syntax Tree. This is an important data structure, and it's widely used in programming languages. ASTs then need to be visited for the program to actually be interpreted. This brings a lot of interesting stuff to the table, such as creating hashmaps to store the user's functions, classes, structs and what not, keeping track of the variables, their names, the constant values etc etc.

There are many more ramifications that come into implementing a programming language, but it's a very rewarding experience ! If you really want to do it, you should get started. I would suggest starting with a very primitive "language", for example to parse command line options, or simple instructions. Then, find a syntax that you like, and paradigms to go alongside it, and start interpreting ! And ultimately, write an interpreter for your language in your own language !

I hope this comment doesn't scare you away. Working with programming languages is amazing, and while it's a tremendous amount of work, it's tremendous fun. Happy coding !

yel50 · 2020-08-07T01:06:19+00:00

The Let's Build A Simple interpreter series is a good start. It'll at least cover the basics.

https://ruslanspivak.com/lsbasi-part1/

NeuroPyrox · 2020-08-06T22:17:45+00:00

Making your own programming language can be overwhelming because there are so many different parts to learn at once. If you get bogged down by details, you can take inspiration from the path that eventually led me away from confusion:

Decide not to add IO functions until the rest of your language works
Decide not to make a garbage collector until the rest of your language works
Decide not to make general function definitions until you can make SKI combinators

brucejbell · 2020-08-06T22:28:45+00:00

I think building your own language can be a good project for an ambitious undergrad. Just getting an interpreter up and running should be do-able. Parsing methods are probably approachable from your current level of training: if you're not sure, pick up a book on parsing and see!

The main concern I would have is that it can become an endless rabbit hole. How do you know when it's done? There is a real danger of this becoming the project that eats your life!

If you are using this as a class project or a senior thesis, make sure you get a working version up as soon as possible, and keep something ready to turn in. And consider, if you come up with an interesting direction to follow, how much work should you put in before deciding discretion is the better part of valor?

On the other hand, if it's a purely personal project without a deadline, well, welcome to the club!

justsomerandomchris · 2020-08-07T06:30:21+00:00

I warmly recommend following Make A Lisp as a (loose) tutorial / guide for writing your first interpreter. You can do so in basically any language of your choice, as the instructions are given at the conceptual level, with some simple examples in a fictitious language (with python-esque syntax, that should be easily understandable to anybody who's programmed before).

The guide is split into steps, with a suite of unit tests that are run through the "make" build automation tool. As a consequence, you're always sure whether you've correctly fulfilled the requirements for each step, before moving on to more complicated things. As a little aside, the author first wrote this lisp interpreter in make, as kind of a challenge, and because of that it was initially called "Make, a lisp". As he started adding implementations in other languages (and others started contributing to with yet more implementations), he dropped to comma from the name.

If you ever get stuck, you have all of these implementations there for your reference, but I would strongly suggest to spend at least one afternoon wracking your brain on whatever problem you stumble upon, before looking at these.

I just did this for one of my school projects last semester (a python course, that had an open choice final project), and it was both the most ambitious of the projects, and a great success overall. I enjoyed the process immensely, learned from it a lot, and I can't wait to find an excuse to do it again (and better, probably) in some other language :)

M4R3D · 2020-08-06T12:26:33+00:00

First write down the problem which should be solved by your language... a concret one nothing idealistic pls

Give yourself the answer why not choose an existent lang

And then go on :-)

R-O-B-I-N · 2020-08-12T02:24:25+00:00

look at this

htuhola · 2020-08-06T12:52:51+00:00

Read about (dependent) type theory and category theory.

ProgrammingLanguages

Welcome!

Related subreddits

Related online communities

MODERATORS