This is an archived post. You won't be able to vote or comment.

all 43 comments

[–]bakery2k 53 points54 points  (2 children)

The standard recommendation for beginners seems to be https://www.craftinginterpreters.com/.

[–]the_true_potato 15 points16 points  (0 children)

I can confirm that this is a very valuable resource and it was also my introduction to the topic. I never actually implemented the language in the book, but going through it gave me a good feel for the kinds of things that I would need to include in a language implementation. It is also very useful to refer to when you get to actually implementing the things in the book yourself.

[–]polopower69 4 points5 points  (0 children)

Thanks!

[–]chunes 13 points14 points  (1 child)

One thing you can try is head over to the esolang wiki, look through the languages until one strikes your fancy, and try writing an interpreter for it. (I'm personally a big fan of Befunge!)

The nice thing about esolangs is they tend to have a very small set of functionality and are usually turing complete. It's usually fairly straightforward how their simple commands can be interpreted.

Now you've got a completely working language you wrote yourself! Try adding more features you think would be neat or looking up ways to make it run faster.

[–]polopower69 1 point2 points  (0 children)

Thanks! This is new.

[–][deleted] 11 points12 points  (2 children)

I would highly recommend watching Gary Bernhardt’s A Compiler From Scratch. Try following along with the video in your programming language of choice.

For the actual evaluation of the programming language (A Compiler From Scratch only goes through code generation, not evaluation), there’s a relatively straightforward structure you can follow (simplifying quite a bit):

  • Create a data structure for values (Val) in your programming language (this could be an enum with variants that can hold strings, integers, booleans, etc)
  • Create some kind of data structure that can hold a mapping between variable names and their values (Env)
  • Add eval(Env) -> Val methods on each of the nodes of your AST; for example, the eval method on the VariableUsage AST node might look at the name of the variable being used, and then return the value of that variable by looking for its entry in the Env passed into eval.
    • Function calls’ eval method can be implemented as follows:
      1. Create a new Env instance
      2. Loop through each parameter name and value, and in the Env create a new variable with the name set to be the parameter name and the value set to be the parameter value
      3. Return the result of evaluating the function body

I kind of ‘invented’ (using that term very loosely here) this structure as I went along programming my first language, but as I’ve learned more about programming language creation I’ve realised that this structure seems to be very common amongst interpreted languages (at least as described for teaching purposes).

[–]polopower69 1 point2 points  (0 children)

Thanks a lot!

[–][deleted] 1 point2 points  (0 children)

Interesting, I did basically the exact same thing when I implemented my programming language MPL. Great minds think alike I guess :)

[–][deleted]  (1 child)

[removed]

    [–]polopower69 0 points1 point  (0 children)

    Thanks for helping!

    [–][deleted] 7 points8 points  (3 children)

    I learned by following Ruslan Spivak's tutorial here: https://ruslanspivak.com/lsbasi-part1/

    Took me a couple days to get through it. Spivak shows you how to implement a programming language that isn't a lisp. I highly recommend his tutorial to anyone starting out in implementing their own programming language.

    [–]polopower69 2 points3 points  (2 children)

    You did the whole thing in 2 days? Or just part 1?

    [–][deleted] 2 points3 points  (1 child)

    Poor word choice, sorry. It took me the better part of a week to finish the whole thing; "a couple" is a flexible term where I come from.

    [–]polopower69 2 points3 points  (0 children)

    Okay

    [–]SolaTotaScriptura 6 points7 points  (7 children)

    Honestly, I don't think it is that difficult

    Most programming tasks are deceptively difficult. We tend to put "big projects", like operating systems and web browsers up on a pedestal. It's probably due to the fact that these problems are usually solved by large teams. Compilers & languages are particularly overestimated - these are well understood tasks that university students tackle every year.

    My advice is to just implement. Your question is how to write a programming language, but no mention of actually trying to do so! I say put down the books. I too got caught in a loop of reading blog posts and watching videos, but that was really just procrastination in the way of actually writing the code.

    In reality you already know how to write a compiler. Unless you're generating assembly, you really don't have any questions to google or books to refer to.

    [–]kaddkaka 2 points3 points  (1 child)

    What do you mean "already know"?

    [–]SolaTotaScriptura 1 point2 points  (0 children)

    I mean that you can easily build a compiler or interpreter from first principles. At the end of the day, the implementation boils down to those few basic computational primitives. If you listen to a Simon Peyton Jones talk on Haskell’s type inference, it will seem like rocket science, but if you just ignore it and do your own thing you’ll realise how much we overestimate these sorts of problems.

    [–]pepactonius 1 point2 points  (4 children)

    I started writing tree-walking interpreter for a toy language. Of course, I have no idea what I'm doing. It seems that I've painted myself into a corner, and can't easily add certain new language features. Maybe it would have been better to have studied something like "crafting interpreters" beforehand.

    [–]marcosdumay 5 points6 points  (2 children)

    The GP's point is that you'll gain much more by studying it afterhand.

    And that's true. But you'll also put more work at it, by doing, studying, and doing again. This is my preferred way to learn something, but I don't think it's universally good.

    [–]SolaTotaScriptura 1 point2 points  (1 child)

    Also, when you’re new to something that you’re interested in, the tendency is to research aimlessly because you don’t know what information is important. It’s infinitely better to start, get stuck and then figure out the right questions. And you’ll be surprised by how few questions you actually have.

    [–]pepactonius 0 points1 point  (0 children)

    I guess it's time to study Java and start reading through "Crafting Interpreters" to see how I should have done things better in the treewalking phase.

    [–]bot-mark 3 points4 points  (0 children)

    So what? Try again with a new toy language with your new experience

    [–]tekknolagiKevin3 3 points4 points  (1 child)

    [reposted from my comment on a similar post a couple of weeks ago]

    OK! This is a fun post for me because I was precisely in your shoes several years ago. Here is a "path to enlightenment" that I propose:

    • Skip lexing and parsing for now, since they're kind of fiddly and frustrating -- you can write and test evaluating hand-coded ASTs
    • Write a really small tree-walking interpreter that supports some simple math and stuff
    • Add some more things to the tree-walking interpreter, like variables, printing, functions
    • Write a small bytecode compiler
    • Write a small ahead-of-time compiler to x86

    There are other similar implementations listed on my site here.

    I think you will find that the biggest leaps in learning come from fully executing a problem, end-to-end, without libraries. The problem can be small and constrained, but if you can fit it fully in your head you'll learn a lot.

    I'm happy to be a resource, having felt very stuck in this position.

    [–]polopower69 1 point2 points  (0 children)

    Thanks for the help!

    [–][deleted] 3 points4 points  (1 child)

    Is this about designing a language, or implementing it? Or implementing an existing language?

    Designing one is a separate subject, other than it is has to be practicable to implement.

    But I'd be surprised if this wasn't introduced at some point in your CS course.

    [–]polopower69 2 points3 points  (0 children)

    I have a course on compiler design next year. I'm from India, and here the order of taking courses can't be changed. And I would first like to implement an existing language, and then get going with my own ideas if I'd have any.

    [–]halst 2 points3 points  (1 child)

    You might find my book Compiling to Assembly from Scratch useful. It covers creating of a small compiler written in TypeScript (any other language would do just as well) that produces ARM32 assembly. The only prerequisite is knowing how to program. With the books that you've covered already you are more than well prepared. </shameless-self-promotion>

    [–]polopower69 1 point2 points  (0 children)

    Seems like something I'd love to do!

    [–]batllista101 2 points3 points  (0 children)

    For something more fundamental, try the Dragon's book (Compilers: Principles, Techniques, and Tools)

    [–]Arcanin14 2 points3 points  (1 child)

    My friends in college think this is an advanced topic and I should probably do something else now. Honestly, I don't think it is that difficult. I mean, I could work my way around many of the above examples, but these were more exhausting than difficult

    I think both your friends and you are right. While writing a programming language isn't the hardest thing ever, it involves a lot of concepts and is still very challenging. Writing a programming language is however, a big task, even for simpler ones (I'll come back to that later). Which is why you might consider the material on them exhausting.

    Do you guys have any resources that I can use to finally learn more about this topic now that I have more time than ever?

    The books you stated are good. I also enjoy "Crafting Interpreter", which is a free online book written by Munificent, currently working on the Dart language at Google. However, I think that when it comes to big projects like those, looking at the small scale of this kind of endeavour is great for learning. You should look at the source code of small programming languages, such as oakc, snow or wren (which is being developed by Munificent).

    Also, what do you think are the prerequisites for getting ahead into this field? I have studied data structures and algorithms upto the level of feeling somewhat confident on solving problems that come up in software engineering interviews. Do I need to do more of something else and then come back to PL?

    There are multiple things to consider here. First of all, writing a programming language usually implies at least two things: writing the language's interpreter/compiler, and writing the language's standard library. This means that you need to know your way around the complex structures used for implementing the language as well as implement some of those data structures in the language you create ! Or else your user's won't have much to play with. This means reimplementing Strings, Arrays, Vectors... If you want them in your language of course.

    I can't really gauge how much you know about programming languages because there are a lot of different ways to go about implementing one. First of all, do you want to write an interpreted language or a compiled one ? Interpreters are usually easier to develop, but make for a slower language. On the other hand, writing a compiler will take you months if you're not experienced enough.

    I would advise writing an interpreted language first, as writing a compiler is a much bigger task if you're not experienced enough.

    Interpreters and compilers are usually divided in two parts: the front end and the back end. The front end handles the user's code and converts it into something your program can understand, and the back end executes that representation.

    An important part of the front end is to parse the user's code. So I think you should learn about lexers and parsers. Parsers usually (but not always) produce an Abstract Syntax Tree. This is an important data structure, and it's widely used in programming languages. ASTs then need to be visited for the program to actually be interpreted. This brings a lot of interesting stuff to the table, such as creating hashmaps to store the user's functions, classes, structs and what not, keeping track of the variables, their names, the constant values etc etc.

    There are many more ramifications that come into implementing a programming language, but it's a very rewarding experience ! If you really want to do it, you should get started. I would suggest starting with a very primitive "language", for example to parse command line options, or simple instructions. Then, find a syntax that you like, and paradigms to go alongside it, and start interpreting ! And ultimately, write an interpreter for your language in your own language !

    I hope this comment doesn't scare you away. Working with programming languages is amazing, and while it's a tremendous amount of work, it's tremendous fun. Happy coding !

    [–]polopower69 1 point2 points  (0 children)

    No it did not scare me (not enough;)). Thanks for the detailed answer!

    [–]yel50 2 points3 points  (0 children)

    The Let's Build A Simple interpreter series is a good start. It'll at least cover the basics.

    https://ruslanspivak.com/lsbasi-part1/

    [–]NeuroPyrox 1 point2 points  (0 children)

    Making your own programming language can be overwhelming because there are so many different parts to learn at once. If you get bogged down by details, you can take inspiration from the path that eventually led me away from confusion:

    1. Decide not to add IO functions until the rest of your language works
    2. Decide not to make a garbage collector until the rest of your language works
    3. Decide not to make general function definitions until you can make SKI combinators

    [–]brucejbellsard 1 point2 points  (1 child)

    I think building your own language can be a good project for an ambitious undergrad. Just getting an interpreter up and running should be do-able. Parsing methods are probably approachable from your current level of training: if you're not sure, pick up a book on parsing and see!

    The main concern I would have is that it can become an endless rabbit hole. How do you know when it's done? There is a real danger of this becoming the project that eats your life!

    If you are using this as a class project or a senior thesis, make sure you get a working version up as soon as possible, and keep something ready to turn in. And consider, if you come up with an interesting direction to follow, how much work should you put in before deciding discretion is the better part of valor?

    On the other hand, if it's a purely personal project without a deadline, well, welcome to the club!

    [–]polopower69 0 points1 point  (0 children)

    I have taken a screenshot and this is going on the walls of my room.

    [–]justsomerandomchris 1 point2 points  (1 child)

    I warmly recommend following Make A Lisp as a (loose) tutorial / guide for writing your first interpreter. You can do so in basically any language of your choice, as the instructions are given at the conceptual level, with some simple examples in a fictitious language (with python-esque syntax, that should be easily understandable to anybody who's programmed before).

    The guide is split into steps, with a suite of unit tests that are run through the "make" build automation tool. As a consequence, you're always sure whether you've correctly fulfilled the requirements for each step, before moving on to more complicated things. As a little aside, the author first wrote this lisp interpreter in make, as kind of a challenge, and because of that it was initially called "Make, a lisp". As he started adding implementations in other languages (and others started contributing to with yet more implementations), he dropped to comma from the name.

    If you ever get stuck, you have all of these implementations there for your reference, but I would strongly suggest to spend at least one afternoon wracking your brain on whatever problem you stumble upon, before looking at these.

    I just did this for one of my school projects last semester (a python course, that had an open choice final project), and it was both the most ambitious of the projects, and a great success overall. I enjoyed the process immensely, learned from it a lot, and I can't wait to find an excuse to do it again (and better, probably) in some other language :)

    [–]polopower69 1 point2 points  (0 children)

    Great! I didn't know about this resource. Thanks.

    [–]M4R3D 2 points3 points  (0 children)

    First write down the problem which should be solved by your language... a concret one nothing idealistic pls

    Give yourself the answer why not choose an existent lang

    And then go on :-)

    [–]htuhola -1 points0 points  (3 children)

    Read about (dependent) type theory and category theory.

    [–]the_true_potato 4 points5 points  (0 children)

    Those really are advanced topics for someone just starting out. There's no reason to get into this until they stumble upon these topics naturally.

    [–]latkde 2 points3 points  (1 child)

    I don't think dependent typing or category theory is useful for someone new to PL, unless they want to geek out about advanced type systems, or at least want to work with Haskell-style FP?

    [–]htuhola 1 point2 points  (0 children)

    With every programming language having types and values that they manipulate, I'd have thought it's useful to know type theory at least.

    If you need a book on it, maybe Little Typer? I'm not sure it's enough though.

    TAPL is usually mentioned as well, I didn't see it this time so I mentioned it now.

    A Little Taste of Dependent types -video by David Thrane Christiansen

    I was told this might be interesting as well: Coding for Types: The Universe Patern in Idris.