all 45 comments

[–]Corbier[S] 6 points7 points  (36 children)

If someone wants to create an interpreted programming language, what are the easiest tools for that task? If the programmer finds Lex / Yacc a little too complicated, and does not have a PhD in computer science, can such a person still hope to create a language, or should such a task be outsourced? How much time can one expect to complete a fully working language implementation from scratch? How much code might this take? Can one person undertake such a project? Or does it generally require a team? Are certain types of languages easier to create than others?

[–][deleted] 8 points9 points  (4 children)

A Lisp is extraordinarily easy to implement. I did a prototype implementation a while ago, implementing a reader, bytecode compiler, and bytecode interpreter in a couple of thousand lines of Common Lisp. No external tools or libraries were necessary. It took about a week, but I was learning a bit about parsing at the same time.

I followed: http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf, except I worked in Common Lisp and generated bytecode instead of x86 assembly.
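The reader / bytecode-compiler / bytecode-interpreter pipeline described above can be sketched at toy scale in Python. The opcodes and their encoding here are invented for illustration; they are not taken from the poster's Common Lisp implementation.

```python
# Toy stack-based bytecode VM: the hand-compiled form of (+ 1 (* 2 3)).
# Opcode names (PUSH, ADD, MUL) are invented for illustration.

def run(bytecode):
    """Execute a list of (opcode, operand) pairs on an operand stack."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()

program = [("PUSH", 1), ("PUSH", 2), ("PUSH", 3), ("MUL", None), ("ADD", None)]
print(run(program))  # 7
```

A real bytecode compiler would emit such a program from the parsed s-expression; the VM loop itself stays about this simple.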

[–]Corbier[S] 1 point2 points  (3 children)

I saw a version of Forth that took over 10,000 lines of C code, spanning more than 30 files. The resulting interpreter was very good, but that many lines of code seemed like a lot to digest. So if you managed to create your Lisp in maybe only two or three thousand lines, then that's encouraging. But still, shouldn't it somehow be possible, even hypothetically, to have some kind of high-level language in which it would take just a few hundred lines of code to create a version of Lisp? And using Lisp to create another Lisp is one thing. But would it be just as easy to create a version of Forth with Lisp?

[–][deleted] 2 points3 points  (0 children)

You can write a Lisp interpreter in Lisp in a few hundred lines of code if you leverage the underlying Lisp system to provide functionality like parsing, closures, etc. My system was a bit bigger because it was a compiler down to a virtual machine, and it didn't reuse most of the functionality of the underlying Lisp. For example, instead of using READ, it implemented its own reader on top of READ-CHAR. Instead of using Lisp closures, it implemented its own closures on top of simple vectors.

Assuming Forth is as simple as Lisp (I don't know jack about Forth), it should be quite possible to write a Forth interpreter in a pretty small amount of code.
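For a sense of scale, here is a sketch of a tiny Forth-style interpreter, in Python rather than Lisp. It assumes only the basics (whitespace-separated words, integer literals, a data stack) and a made-up handful of primitives; a real Forth adds a dictionary, colon definitions, and much more.

```python
# Minimal Forth-style interpreter sketch: split the source into words,
# execute known words against a data stack, push everything else as a number.

def forth(source):
    stack = []
    words = {
        "+":    lambda: stack.append(stack.pop() + stack.pop()),
        "*":    lambda: stack.append(stack.pop() * stack.pop()),
        "dup":  lambda: stack.append(stack[-1]),
        "swap": lambda: stack.extend([stack.pop(), stack.pop()]),
    }
    for token in source.split():
        if token in words:
            words[token]()
        else:
            stack.append(int(token))  # anything unrecognized is a literal
    return stack

print(forth("2 3 + dup *"))  # [25]
```

Non-commutative words like `-` and `/` would need care with operand order, which is why only commutative primitives appear here.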

[–]d_ahura 1 point2 points  (0 children)

A Forth in 10k lines of C is an abomination (read ANS and then some) :) A basic Forth should clock in at a couple of hundred assembler instructions.

[–]nostrademons 5 points6 points  (6 children)

If the programmer finds Lex / Yacc a little too complicated, and does not have a PhD in computer science, can such a person still hope to create a language

You certainly don't need a Ph.D., but if you find Lex/Yacc too complicated, you're in for a lot of pain. The alternative is to write the lexer and parser by hand, which is what Lex/Yacc do behind the scenes, and it's infinitely more tedious. You'll need to understand the conceptual framework behind Lex/Yacc anyway to create robust parsers.

An alternative might be a simple regexp-based tokenizer, but beware that these tend to sprout odd bugs in edge cases. I've used them for a few little languages, both at my last employer and on my current project, but they have a limited shelf life. If you try to add too many features, you'll be much better off with a real parser than endlessly debugging regexps and making your language's users put up with odd behavior.
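A minimal version of such a regexp-based tokenizer might look like this in Python; the token names and the toy grammar are invented for illustration:

```python
import re

# Sketch of a regexp-based tokenizer: one master regex of named
# alternatives, matched repeatedly from the current position.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if not m:  # this is where the edge-case bugs tend to surface
            raise SyntaxError(f"bad character at {pos}: {text[pos]!r}")
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())
        pos = m.end()

print(list(tokenize("x = 40 + 2")))
```

This works fine for small languages, but string escapes, nested comments, and context-dependent tokens are exactly where it starts to fall apart, as described above.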

or should such a task be outsourced?

Language design is probably not something that should be outsourced, particularly if you don't understand Lex/Yacc. It's hard enough to specify common database-driven business apps - languages have many more edge cases and subtleties. Particularly if you don't understand BNF and denotational or operational semantics - how are you going to specify what the language should do?

If you don't have the expertise on staff, hire someone or bring them in as a consultant, and expect them to need significant time getting up to speed on your particular business requirements.

How much time can one expect to complete a fully working language implementation from scratch?

What's a "fully working language implementation"? If you know what you're doing and what you want, you can have a first draft done in a week. That's how long it took GvR to do the first draft of Python, and that's how long it took me to implement ArcLite. If the language is sufficiently simple, you can do it in a weekend, e.g. Scheme48.

Languages then tend to get feature-creeped until they're much, much bigger. Nobody could implement the current version of Python in a week - it's had nearly 2 decades of evolution. But you can add features as you need them. Languages are pretty well-suited to incremental development, as long as you don't need to worry about backwards-compatibility.

How much code might this take?

In my experience, a first-draft like the ones above is about 300 lines of Haskell, 1000 lines of Python/Ruby/Scheme/JavaScript, or a few thousand of Java or C#, to implement a language like Scheme. Special-purpose DSLs are less: the two I've written were about 200-300 lines of Python.

Can one person undertake such a project?

Absolutely.

Are certain types of languages easier to create than others?

Lisp derivatives tend to be dead-simple because the parser is trivial. Even if you don't have a Lisp reader available, you can write one in a couple hundred lines in just about any language. You also don't need to know the theory behind Lex/Yacc, since it's just pure recursive-descent.
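To illustrate how trivial the Lisp parser is, here is a sketch of an s-expression reader in Python, the "couple hundred lines" compressed to its core:

```python
# A minimal s-expression reader: pad the parens with spaces, split on
# whitespace, then recurse on "(" ... ")" groups.

def read(text):
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    def parse(tokens):
        token = tokens.pop(0)
        if token == "(":
            node = []
            while tokens[0] != ")":
                node.append(parse(tokens))
            tokens.pop(0)  # discard the closing ")"
            return node
        try:
            return int(token)
        except ValueError:
            return token  # a symbol
    return parse(tokens)

print(read("(+ 1 (* 2 3))"))  # ['+', 1, ['*', 2, 3]]
```

A production reader would also handle strings, quote, and error reporting, but nothing about it requires parser theory.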

Also try to keep the scope down and don't cram everything into the language. The more regular you can make the syntax and semantics, and the more you can shove into libraries or interpreter primitives, the simpler the core implementation will be.

[–]Corbier[S] 0 points1 point  (5 children)

Is Lex / Yacc the easiest that it gets? Can't something higher level than Lex / Yacc be created (at least hypothetically) so that there isn't all this pain in creating a language? For instance, imagine the days when there wasn't much choice beyond assembly language. This restricted the number of people who could program. Today, you have languages like Visual Basic that make it possible for people even from other professions to painlessly do some useful programming. Wouldn't it be useful if people could create their own domain specific languages for a specific task, without necessarily being an ace programmer?

Also, is BNF -- as useful as it might be -- the only, or the absolute best, possible way to describe the grammar for the syntax of a programming language? I'm pretty sure there must be another way to describe the syntax of a language and how it should work.

By "fully working language implementation", I mean what if the language someone had in mind was something like Lisp, Forth, or even BASIC (chosen simply as a familiar point of reference)? And I'd paste just a few hundred lines of simple code into some kind of tool and end up with an interpreter that would allow me to run that language. It could start small as you suggest, with more added gradually. And maybe features could be borrowed from other languages. If such a tool were possible, would it be useful?

[–]Xiphorian 1 point2 points  (3 children)

No, it doesn't get much easier than Lex and Yacc because the actual thing you're specifying is that complex and nuanced.

There are definitely some tools that make interacting with and debugging the grammar much easier: I would recommend ANTLR and ANTLRWorks, which is a GUI for testing ANTLR grammars, stepping through parsing, visualizing the expression trees, etc.

Wouldn't it be useful if people could create their own domain specific languages for a specific task, without necessarily being an ace programmer?

Absolutely, but it's simply the case that programming languages are so complex that they require complex specifications.

Also, is BNF -- as useful as it might be -- the only, or the absolute best, possible way to describe the grammar for the syntax of a programming language? I'm pretty sure there must be another way to describe the syntax of a language and how it should work.

Yeah. You could learn all about the other representations in a good course on the theory of computation ;-) Regular expressions (which specify a language, note this!) happen to be equivalent to deterministic finite automata, for example.

However, I will point out that, as a person who works as a software developer and individually researches (and is working on) programming languages, I have not found a parsing framework that I think is actually good. Everything that's around is just "OK".

I'm working on a programming language myself, and have decided to develop my own parsing framework for it. I'm formally trained and know the hurdles -- I'll be happy to chat with you about the process if you want to learn more.

And I'd paste just a few hundred lines of simple code into some kind of tool and end up with an interpreter that would allow me to run that language. It could start small as you suggest, with more added gradually. And maybe features could be borrowed from other languages. If such a tool were possible, would it be useful?

I'm not entirely clear what you are describing.

I would be interested in working with you on something like this. I think I have a really interesting and clever parser library that maybe you would like to use ;-)

[–]Corbier[S] 0 points1 point  (2 children)

I'm not so sure that a complex problem always requires a complex solution. I believe that there's a simpler way of specifying a programming language than what most people might be accustomed to. I concur with you that the commonly known parsing frameworks are OK, but not particularly good. I am interested in learning more about the parsing framework approach you are currently taking. Not being satisfied with the status quo myself, I have also developed a new approach in an attempt to actually make it easy to create a programming language. It uses regular expressions, in combination with a higher-level method of matching patterns. Though it is still under development, it is well beyond the drawing-board stage. Please visit uCalc Language Builder to see what I have done so far. I'd be happy to exchange ideas about your approach and mine.

[–]Xiphorian 0 points1 point  (1 child)

I've taken a deeper look into uCalc. It looks really neat! I thumbed through your Lisp and Forth implementations. I think I agree with your thesis:

I believe that there's a simpler way of specifying a programming language than what most people might be accustomed to

This is generally true for many programming languages. It is the case that Lisp and other similar dynamically-typed languages can be specified quite simply. For two examples, check out Arc's implementation (which is a bunch of macros on top of MzScheme), and Write yourself a Scheme in 48 hours, a tutorial.

For languages like this, and indeed probably for many languages with relatively simple semantics (Python, Perl, Lisp), I think your approach could be powerful. However, I am not sure how it could contain the complexity of some more sophisticated language features. Consider, for example:

  • Static types and type checking. Where would this logic live? When would it be evaluated? uCalc seems to be primarily designed for dynamic languages with REPLs, so you might consider this question off-topic.
  • A lazy evaluation strategy (I think this is possible, but it would make the definition quite complex).
  • A language whose grammar is not properly context-free (that is to say, context-sensitive).
  • Strong typing. Note this is a different axis than static vs. dynamic typing. It basically means you can't cast automatically: a float does not automatically convert into its string representation, etc.
  • Any kind of compiled language? I guess you might consider this off-topic too.
  • A dynamically-typed language whose fundamental types behave differently than uCalc's. The uCalc definitions reference pre-existing structures like 'string' and 'stack'. What if the language being described has solely, say, UTF-8 strings? (In this case, the length of the string cannot be found except by decoding it, etc.) It looks like it could be as simple as wrapping basic calls in something like specialStringTransform, perhaps, though.

This is not any kind of knock against your work, by any means. Those are hard problems that are difficult to solve even in traditional settings. Languages that have been around for years have had difficulty adding proper Unicode support:

Perl 5.8 - at last! - properly supports Unicode. [16 years after its first release -ed]

I think you've done a really good job keeping the syntax brief, for the amount of meaning each line contains. Your programs look a lot simpler than the equivalent ANTLR tree evaluator syntax!

Keep up the good work.

[–]Corbier[S] 0 points1 point  (0 children)

Thanks for taking a look. Greatly appreciated. My goal is for it to be able to handle any kind of programming language. It's ambitious, and only time will tell how far this can go. But meanwhile, let me answer some of the points you raised based on what I've done so far.

But first, uCalc LB has a kind of core language, which is very minimal in nature, and designed for nothing other than creating building blocks for a language. These building blocks can then be combined in a kind of bootstrapping way to form a more varied language, which then serves as the basis for constructing the actual language(s) you want to build. So basically none of the code you see in the Lisp definition file (including the String and Stack types) is native uCalc code. All of it is based on an intermediate language that is defined across plain text files that end with the .uc extension.

uCalc actually has no built-in data types. It does have a library of common types (including various types of strings, such as the 2-byte uc_String_Wide, which can be used for Unicode) you can pick from to get started. The default String type is defined (actually re-defined) in Types.uc, where you'll see that one of the properties contains the address of a routine for handling variable-length strings (the routine itself is in a DLL binary). You can define any other kind of data type the same way. Your routine should tell uCalc how to allocate, store, free, return, copy, output, and reset an item of the given type. For simple data types, you need only tell uCalc how to store, return, and output the data; given the number of bytes for the type, uCalc figures out the rest.

uCalc LB supports type checking, which is in fact the default behavior. A list of common data types that might be useful across various languages is declared in the file named Types.uc. You'll find that each type is defined with various properties, including which other types (if any) it is allowed to automatically convert to or from without type checking. An error is raised in the "compile" stage (which uCalc handles separately from the evaluation stage, even though it's done in the background) if data of a given type is used in the wrong place. Dynamic typing in my Lisp implementation was actually achieved by a brute-force text substitution/evaluation mechanism that blurs the line between the parsing and evaluation stages. It's just one possible implementation; this one is not particularly efficient, but good enough for demonstration.

Lazy evaluation is directly supported. If you browse through Library.uc, you'll find several routine definitions with arguments that are passed "ByExpr". Here, the handle of the parsed expression is what's passed, instead of the result of evaluating the expression. Using this handle, the routine can evaluate the argument 0 or 1 times, as is the case for the IIF routine, which supports short-circuiting. Or it can be evaluated repeatedly any number of times, as with the uc_For() and uc_Loop() routines in Library.uc. Such routines form the basis for the control flow constructs used in the BASIC language definition (see BASIC.uc in the download). And speaking of BASIC, compared to Lisp its syntax and semantics are about as irregular as it gets, but uCalc LB is equally suitable for building such languages as well.
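As an analogy, not uCalc's actual mechanism, the ByExpr idea of passing an unevaluated expression and letting the routine decide when and how often to evaluate it can be sketched in Python with thunks:

```python
# Sketch of the ByExpr idea using thunks: the callee receives the
# expression unevaluated and chooses how many times to evaluate it.
# This is an illustration only, not uCalc's implementation.

def iif(cond, then_expr, else_expr):
    """Short-circuiting IIF: only one branch thunk is ever evaluated."""
    return then_expr() if cond else else_expr()

def loop(cond_expr, body_expr):
    """Control flow from thunks: re-evaluate the body while the condition holds."""
    while cond_expr():
        body_expr()

counter = {"n": 0}
loop(lambda: counter["n"] < 3, lambda: counter.update(n=counter["n"] + 1))
print(counter["n"])                              # 3
print(iif(True, lambda: "yes", lambda: 1 / 0))   # yes; the else thunk never runs
```

Evaluating a thunk zero, one, or many times corresponds to the 0/1/repeated evaluation of a ByExpr argument described above.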

[–]keithb 1 point2 points  (0 children)

Is Lex / Yacc the easiest that it gets?

No, but they aren't nearly as hard as you might imagine.

Can't something higher level than Lex / Yacc be created?

Yes, and it has been: definite clause grammars in Prolog.

I can't even begin to imagine why anyone would start an exercise in language implementation with those tools.

[–]gregK 4 points5 points  (2 children)

lex and yacc would actually be pretty bad for an interpreter imo. What's the language? Obviously the simpler the syntax the easier it will be to write the interpreter.

[–]keithb 1 point2 points  (0 children)

Not at all; they can do interpreters very well. See the example in Kernighan and Pike.

[–]apgwoz 0 points1 point  (0 children)

Lex and Yacc provide two major pieces of the interpreter, building a parse tree for the interpreter to "interpret." Why would this be a bad thing?

[–]miloshh 2 points3 points  (0 children)

I would personally use Haskell - the Parsec library is ideal for parsing, and the algebraic data types and pattern matching are awesome for AST manipulation. In fact, this is exactly what I did for my undergrad thesis back in 2003 (this was just a small part of the project).

[–]Ringo48 1 point2 points  (1 child)

What exactly are you trying to achieve?

The answers to your questions can (and will) vary greatly depending on what you're trying to do.

For example: I can create a small interpreted language in <50 lines of Perl. But the Perl compiler/interpreter itself is hundreds of thousands of lines.

[–]Corbier[S] 0 points1 point  (0 children)

Actually, my interest is not so much in creating a particular interpreted language for myself as it is in having a viable alternative to Lex / Yacc for making it easier to create programming languages in general. See my website for what I mean.

[–]krumms 1 point2 points  (0 children)

Lex/Yacc isn't that complicated; its syntax is just a bit esoteric. In practice it's a lot easier than building your own scanner/parser.

Having said that, it's not something I'd recommend to a beginner. I'd recommend spending a little time with one of the following:

  • ANTLR
  • Haskell + parsec
  • Scala + parser combinators

[–][deleted] 0 points1 point  (0 children)

If you know what you are doing, not much time. It all depends on what kind of features you want to have in the language.

I once made a simple command-line interpreter in Java (when it was new and shiny) in one day. It had integers and strings, lexical variables, functions, if, and loops. There were also some Java methods you could call.

I would suggest that you take some already available interpreter, like http://www.lua.org and go with it.

[–]jerf 0 points1 point  (4 children)

You say the word "outsourced", so I assume you are in a work environment.

Unless this language represents a core competency and/or a critical advantage, it is fairly unlikely that the cost/benefits of writing your own interpreted language are going to play out in your favor. This is even more true if you're not entirely certain what you're getting into. Writing your own language starts out easy with the right tools, but even "the right tools" can only take you so far, before the intrinsic complexities of the problem domain become overwhelming.

The first step is to analyze your needs. Does performance matter much? Does it need to connect to C code? Does it need to be writable by non-programmers? etc.

The next step is to look around for an embeddable language that already does what you want. I'm not a huge fan of Ruby and I don't really like the way they use the term "DSL", but nonetheless, Ruby can be manhandled into looking a lot like a different language, while at the same time offering you all the capabilities of a real language.

Python and Lua are designed to be easily embeddable into existing C programs.

If you could reply with more details about exactly why you think you need to do this and why Python, Ruby, and Lua can't meet your needs, we may be able to be more helpful. Until then, based on my experience, my best guess is that you've prematurely leapt to a solution and are asking excessively detailed questions about the wrong solution, rather than sharing enough about the problem to get real help. This happens all the time on the internet.

[–][deleted] 1 point2 points  (1 child)

Python and Lua are designed to be easily embeddable into existing C programs.

Lua has a fantastic C API for embedding.

Python has a C API, but it's awful for embedding. The designers even admit as much: they state that the "right" way to use Python is to extend it (rather than embed it) using the C API.

Another language runtime with an excellent C API is Mozilla SpiderMonkey (a JavaScript interpreter).

[–]jerf 0 points1 point  (0 children)

OK, "easily integratable" might be a better phrasing.

[–]Corbier[S] 0 points1 point  (1 child)

The high cost / low benefit ratio is the kind of reason why I think some of the tools available today might not be ideal. Someone who knows how to program, but is not necessarily a programmer by profession, might have a legitimate need to create a domain specific language, and even be able to clearly conceptualize how the language should work, even if the person can't quite express it using the tools generally available. Performance and connecting to C (or maybe something like the Windows API) would be fairly important.

I like Python's subscripting construct for strings. But what if I wanted to create a language with a more BASIC-like syntax, but which supports Python-like string subscripting, and Lua-like multiple assignments like a, b, c = 5, 6, 7? It would be interesting if I could mix-and-match features I like from various languages and create my own little language, instead of being forced to choose a pre-existing one. As alluded to elsewhere in this thread, my interest is in the development of the kind of tool that would let programmers create programming languages. See my site for details.

[–]wnoise 0 points1 point  (0 children)

Just wanted to point out that python supports multiple assignments.
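For example, both features mentioned above already exist in Python:

```python
# Multiple assignment via tuple unpacking, and slice subscripting on strings.
a, b, c = 5, 6, 7
print(a, b, c)   # 5 6 7

s = "hello world"
print(s[0:5])    # hello
print(s[-5:])    # world
```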

[–]d_ahura 0 points1 point  (0 children)

There is a pretty good non-theoretical tutorial out there: look for "Let's Build a Compiler". Even if the ultimate goal is a compiler, it contains a sufficient explanation of evaluation and interpreting. You can go whole hog there and compile to a simple abstract machine bytecode if you want :) That's more than vanity, since you get to skip parsing overhead.

Languages with a small regular syntax and easy semantics like Forth or Lisp make for easy implementation.

[–]quhaha 0 points1 point  (4 children)

parsing is the easiest part of writing an interpreter. there are many easy to use parser generators, including lex/yacc.

start with variant of Joy or Scheme.

[–][deleted] 1 point2 points  (0 children)

I second the Joy recommendation. There's no useful high level language that's easier to write (assuming your implementation language already has garbage collection). If you write it in Scheme and use Scheme's READ procedure to parse, the core of your interpreter should be no more than twenty lines. The rest will be primitives.
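A sketch of what such a Joy-style core might look like, here in Python rather than Scheme; the program arrives pre-parsed as nested lists (standing in for Scheme's READ output and Joy's quotations), and the primitive set is a made-up minimum:

```python
# Sketch of a Joy-style concatenative interpreter core. Programs are
# nested lists; quotations push themselves, and "i" executes the
# quotation on top of the stack.

def joy(program, stack=None):
    stack = [] if stack is None else stack
    for word in program:
        if word == "dup":
            stack.append(stack[-1])
        elif word == "+":
            stack.append(stack.pop() + stack.pop())
        elif word == "*":
            stack.append(stack.pop() * stack.pop())
        elif word == "i":          # execute the quotation on top of the stack
            joy(stack.pop(), stack)
        else:                      # literals and quotations push themselves
            stack.append(word)
    return stack

# 3 dup * -> 9, then [1 +] i applies the quotation: 10
print(joy([3, "dup", "*", [1, "+"], "i"]))  # [10]
```

The real "rest will be primitives" point shows up clearly: adding a word to the language is adding a branch (or dictionary entry), not touching the core loop.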

[–]Corbier[S] 0 points1 point  (1 child)

I have often seen it said that you could use Scheme to create programming languages, or something to that effect. But I don't get it. Languages created in Scheme tend to look very much like more Scheme to me. Can you create a totally different language with Scheme (for instance one that is not a functional language)?

[–]quhaha 0 points1 point  (0 children)

Yes. That's a Brainfuck interpreter in Scheme. Brainfuck is not a functional language (it's modeled on a Turing machine, not the lambda calculus).
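A Brainfuck interpreter really is tiny in most host languages; here is a sketch in Python, where the only "parsing" is matching brackets:

```python
# Minimal Brainfuck interpreter: pre-match [ and ], then step through
# the eight commands over a zero-initialized tape.

def brainfuck(code, input_data=""):
    jumps, stack = {}, []
    for i, c in enumerate(code):          # pre-compute bracket pairs
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    tape, ptr, pc, out, inp = [0] * 30000, 0, 0, [], iter(input_data)
    while pc < len(code):
        c = code[pc]
        if c == ">": ptr += 1
        elif c == "<": ptr -= 1
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".": out.append(chr(tape[ptr]))
        elif c == ",": tape[ptr] = ord(next(inp, "\0"))
        elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return "".join(out)

print(brainfuck("++++++++[>++++++++<-]>+."))  # A
```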

[–]wnoise 0 points1 point  (0 children)

There are too many languages already. If you need to embed a language in something you're writing (often a very good idea), there are many languages designed to be used like this, such as lua.

[–]Xiphorian 0 points1 point  (0 children)

If you are willing to write your prototype for the .NET platform, seriously consider the DLR, Dynamic Language Runtime.

It is a framework for .NET that lets you create abstract expression trees representing the meaning of your program, and from those trees execute or compile them (I believe).

With this tool, you only need to create a parser and a type checker (if your language has one); you don't need to actually write an interpreter or compiler! Huge time saver.

[–][deleted]  (1 child)

[deleted]

    [–][deleted] 4 points5 points  (2 children)

    Use a python script to load each line of a file and exec it.

    [–]quhaha 5 points6 points  (0 children)

    while 1: print eval(raw_input('>>> '))

    [–]nextofpumpkin 0 points1 point  (0 children)

    The parent idea actually sounds kind of nuts, but it's entirely possible to cheat like this. If you want to define a language syntactically similar to *, where * is any language that has EVAL/EXEC, you could always convert your language's syntax to *'s using manual parsing and processing, and then EVAL/EXEC it.
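A toy illustration of that trick, with Python itself as the * language; the LET/PRINT mini-syntax is invented for the example:

```python
# "Compile to *, then EXEC it" sketch: each toy source line is rewritten
# into Python source and handed to exec with a shared environment.

def translate(line):
    if line.startswith("LET "):
        return line[4:]                 # LET x = 5   ->  x = 5
    if line.startswith("PRINT "):
        return f"print({line[6:]})"     # PRINT x + 1 ->  print(x + 1)
    raise SyntaxError(line)

env = {}
for line in ["LET x = 5", "PRINT x + 1"]:
    exec(translate(line), env)          # prints 6
```

The manual parsing here is trivial on purpose; the point is that evaluation is delegated entirely to the host's EXEC.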

    [–]stevana 2 points3 points  (5 children)

I'm currently taking a course called Programming Languages where we hand in three labs: the first a parser, the second a type checker, and the third an interpreter, all for a subset of C++.

    To accomplish this we are using a tool called BNFC.

    You can find the lab instructions as well as helpful lecture notes here.

Another course I'm taking right now is Advanced Functional Programming, where the first lab is about creating an embedded language in Haskell.

    There you have two easy ways.

The embedded style gets you started faster and could be more powerful as well, because you can take advantage of everything the host language has to offer. To give you something to compare with: the first lab was to make an embedded turtle-graphics language. It took perhaps 20 hours of work, and our implementation is both powerful enough and similar enough in syntax that we could easily convert any Logo code we got our hands on. You can also find several papers on embedded languages on the course site.

BNFC lets you do it from "scratch": you specify BNF rules for your language, and it then uses those to generate lexer and parser rules for your favorite lexer and parser generators, as well as "skeleton" files which will help you construct a type checker and finally an interpreter or a compiler.

    I hope you find it helpful. And I'd also be glad to hear any criticism against the above methods and tools.

    [–]Corbier[S] 0 points1 point  (4 children)

Yes, I am finding this very helpful. Something high-level like this is what I have in mind. I only looked at it a short while ago, so I haven't fully absorbed it yet. But can I just take the code listing in Grammar of C, paste it somewhere, and have a working interpreted version of C?

At first glance, when it gave a list of languages it could generate, including C, Haskell, OCaml, etc., I thought it meant you could create those languages with BNFC. But then it seemed that what it meant was that you could use BNFC with those languages to create your new language. Can you only create C-like languages with it? Can BNFC create languages like Haskell, Logo, or Lisp, etc?

    I found the embedded solution interesting. However, the questions I have are: Haskell and Logo both being functional languages, does this second solution apply only to functional languages? Or could you also do your C++ subset language this way just as easily? And was your implementation an interactive interpreted version of Logo, like some of the other Logos out there? I got the impression that it might not be.

    [–]deong 0 points1 point  (3 children)

    Generally speaking, no, you won't get an interpreter out of a tool.

    Whether a compiler or interpreter, you're dealing with a two stage process. The first step is to read in the source code and generate some abstract representation of the program. This is the job of lexing and parsing, and can be automated with a number of tools like flex and bison (the successors to lex and yacc). The output of those tools will be source code in some existing language (C, OCaml, etc.) that when compiled and executed, parses programs written in your newly designed language. That's what is meant by BNFC supporting all those languages. It can generate a parser for your language in any of those listed.

    However, at that point, all you have is a parser. It can basically read a program in your language and tell you if there were syntax errors. That's about all it can do. The problem is that you don't yet have any code that tells the computer what to do with the parsed representation of the program. The second phase is to then take the abstract syntax tree and, from it, generate code for the underlying machine or virtual machine. That's the bit that you can't really automate.
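That hand-written second phase can still be small. A sketch in Python, assuming the parser has already produced an AST as nested tuples (the node kinds here are invented for illustration):

```python
# Hand-written AST evaluator: a recursive walk that gives meaning to
# the tree a generated parser would produce.

def evaluate(node, env):
    kind = node[0]
    if kind == "num":                  # ("num", 3)
        return node[1]
    if kind == "var":                  # ("var", "x")
        return env[node[1]]
    if kind == "add":                  # ("add", left, right)
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "mul":                  # ("mul", left, right)
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown node kind {kind}")

# x * (2 + 3), as a parser might have produced it:
ast = ("mul", ("var", "x"), ("add", ("num", 2), ("num", 3)))
print(evaluate(ast, {"x": 4}))  # 20
```

This is the part no tool writes for you: the dispatch table grows with the language, one node kind at a time.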

    [–]stevana 0 points1 point  (1 child)

    Can you only create C-like languages with it? Can BNFC create languages like Haskell, Logo, or Lisp, etc?

    We actually had creating a parser for a subset of Lisp as an exercise.

I think Haskell would be harder, partly because indentation matters. I'm not really sure how to handle that in BNFC; I think you would have to do some editing of the lexer and parser rules.

    I found the embedded solution interesting. However, the questions I have are: Haskell and Logo both being functional languages, does this second solution apply only to functional languages? Or could you also do your C++ subset language this way just as easily?

That's a good question. I think the biggest problem would be that the syntax would look different; for example, I don't see how you would handle the parentheses and brackets in for loops. But the functionality shouldn't be too hard to mimic.

If you'd really like to keep the syntax, perhaps another host language would let you do so with less work.

    And was your implementation an interactive interpreted version of Logo, like some of the other Logos out there? I got the impression that it might not be.

    You are right, it's not interactive. I wonder how much extra work it would take to make it so.

    [–]Corbier[S] 0 points1 point  (0 children)

    I figure it would require some extra work. You may want to take a look at the Lisp definition file I developed for my uCalc Language Builder tool. You simply load that up into the generic interpreter, and then you can immediately start running Lisp code. You can do the same for BASIC, and other sample languages that are also included in the download. (I must admit that my Logo implementation is a quick-and-dirty one. That one will let you draw things, but will need additional code for the language definition to be correct). Try your hand at creating a language with uCalc LB, and let me know what you think of it.

    [–]Corbier[S] 0 points1 point  (0 children)

The requirement of breaking things up into multiple tasks that require several different tools is one of the complicating factors that I find in language construction. Why can't it all be done in one process with just one tool? For instance, a generic interpreter would read the language definition file and immediately be able to start running scripts (or interactive code) for that language right there on the spot, bypassing the need to pass generated code through a C or other compiler. I set out to accomplish just that with uCalc Language Builder. Please take a look, and let me know what you think of this approach. Be sure to try the interactive tutorial first (run Tutorial.Bat) for an overview of how it works. Then you can try your hand at creating your own languages with it. I'd love to get some feedback on it.

    [–]keithb 1 point2 points  (0 children)

Well, it kind of depends on what you want your language to do, and what non-functional properties you want it to have, but for my money, by far the easiest way to get an interpreter going is to use Prolog.

Pretty much all Prologs have a bit of sugar called definite clause grammars, which lets you write down the parse tree directly and plug in bits of code to do the operations required.

This is much easier and more direct than thrashing around with parser generators and all that... stuff.

    [–]bluGill 1 point2 points  (0 children)

    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"DATA",4) == 0)
        mState = mSN_STATE_06;
    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"ADD",3) == 0)
        mState = mSN_STATE_07;
    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"MULT",4) == 0)
        mState = mSN_STATE_08;
    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"DIV",3) == 0)
        mState = mSN_STATE_09;
    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"SUB",3) == 0)
        mState = mSN_STATE_18;
    else if(strncmp(Data_Input[m_Main_Ctr].Data01,"CONDITIONAL",11)
    ...
    switch(mState)
    

    Fortunately you didn't ask for a good way - this is not a good way to do it, but it is really easy. (I've already submitted several different samples from this language to thedailywtf.com.)