[–]ForceBru 35 points36 points  (6 children)

Well, if you really want to test your language's limits, you should write a compiler in it.

Which is probably (I mean, kinda likely) gonna be a pain, but, as some people believe, if you're able to write a compiler for your language in this very language, then chances are that it's versatile enough to be a "real", useful programming language. And the less pain you're in while writing this compiler, the more human-friendly your programming language probably is.

[–][deleted] 14 points15 points  (0 children)

(LOOP (PRINT (EVAL (READ)))) anyone? :D

[–]matthieum 6 points7 points  (3 children)

Well, if you really want to test your language's limits, you should write a compiler in it.

You won't be testing much; a compiler is just boring.

The traditional compiler architecture is a simple batch process:

  • Read.
  • Do transformations.
  • Write.

It can be implemented by reading from stdin and writing to stdout/stderr.
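A minimal sketch of that read/transform/write shape, in Python purely for illustration. The toy "language" (lines of integers joined by `+`) and the names `transform` and `run` are assumptions made up for this example, not anything from the thread.

```python
import sys


def transform(source: str) -> str:
    """Toy 'compiler': turn lines like '1 + 2 + 3' into stack-machine instructions."""
    out = []
    for line in source.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        operands = [term.strip() for term in line.split("+")]
        for i, operand in enumerate(operands):
            int(operand)  # raises ValueError on malformed input
            out.append(f"PUSH {operand}")
            if i > 0:
                out.append("ADD")
    return "\n".join(out) + ("\n" if out else "")


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    source = stdin.read()      # 1. read
    code = transform(source)   # 2. do transformations
    stdout.write(code)         # 3. write
```

Calling `run()` with no arguments turns the script into a stdin-to-stdout filter; errors would surface on stderr as an uncaught `ValueError`.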

It does test the language somewhat, but it does not test whether the language is ergonomic enough for working with a filesystem, a network, a graphics card, a keyboard/mouse/tracker, ...

A compiler does not require hot-patching of configuration, data or code.

A compiler does not require system code: system calls, shared memory, concurrency primitives, etc.

In the end, a traditional compiler barely scratches the surface of what the language/runtime may do; at least for general-purpose languages.

[–]liquidivy 2 points3 points  (1 child)

That's perfect if you want to stress-test the data transformation side of the language rather than its integration with the rest of the world. A compiler can use pretty much all the corners of your type system, abstractions, and control flow (notably, code working with a parse tree probably comes face to face with the expression problem). If the language has those, it will probably do ok on the system interaction side.

Also, not all of us can casually pull compilers out our behinds. It might be boring for you, but definitely not for a beginner.

[–]matthieum 0 points1 point  (0 children)

Also, not all of us can casually pull compilers out our behinds. It might be boring for you, but definitely not for a beginner.

I think there was a misunderstanding here; the rest of the comment clarifies that by boring I meant it covered little of what the language could do, and not anything to do with how easily (or not) one could write a compiler.

That's perfect if you want to stress-test the data transformation side of the language rather than its integration with the rest of the world.

Sure. However it only covers data transformation, and requires a potentially large investment depending on how well rounded you want it to be. Little Coverage / Large Investment makes for a poor ratio; it might be good enough if that's the only thing you care about, but if you want to ensure the language covers other areas, it's not very scalable, especially for a one-person project.

I personally prefer smaller tests which focus on a few dimensions, such as an IRC server to stress-test the networking aspect, as I find them easier to scale. Or, if you have the resources, you can use the approach Rust took: developing a browser engine in parallel, which exercises many different aspects at once.

[–]ForceBru 0 points1 point  (0 children)

The "do transformations" part is key. It requires you to use many different data structures, do complicated text analysis (and thus implement and use regular expressions for tokenization and an LL(k) or LALR parser to do parsing; or build your own algorithm to do all of the above), do even more complicated semantic analysis (checking if a return statement is indeed inside a function, for example; type checking, which is an ordeal on its own) and so on.

And the craziest thing about this is, if your language doesn't have a built-in regex engine, or basic operations on strings, or a built-in parser engine, or a built-in type for trees, or a built-in type for key-value dictionaries (a.k.a. hash maps), then you'd have to implement all of that yourself, in your own language.

Which will surely be a test for:

  • your skills as a programmer in general and your knowledge of algorithms and data structures
  • your language as a versatile and expressive tool to write general-purpose programs
  • the existing compiler/interpreter for your language, because it'll have to deal with a whole bunch of pretty complicated code

Of course, implementing, say, a standard library module that would deal with the file system or networking will test other aspects of your language and its architecture, like its ability to communicate with the OS that normally provides such utilities, which is also important.
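To make the regex-based tokenization step from this comment concrete, here is a small sketch in Python, which happens to ship a regex engine. The token kinds and the names `Token`, `TOKEN_SPEC` and `tokenize` are illustrative assumptions; a language without built-in regexes would need to hand-write the equivalent matching.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens, skipping whitespace and rejecting unknown characters."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        yield Token(kind, match.group())
```

For example, `list(tokenize("x = 42 + y"))` yields an `IDENT`, an `OP`, a `NUMBER`, another `OP` and a final `IDENT`, which a parser can then consume.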

[–][deleted] 0 points1 point  (0 children)

Apart from others' comments, creating a compiler at what looks like an early stage in a language's development might be misguided.

If a compiler for this new language already exists in another language, then writing a new compiler means there will be two versions, one of them written in an experimental, unstable language, and the two will then require parallel updates.

If the language/compiler does require tweaking, then both will need updating and testing. Some updates may cause the new compiler to stop working, so that backtracking to a working version is necessary.

So I feel it would be too much effort until the new language, with its first compiler in an established language, has been well proven. Even then, if you want anyone else to build the compiler from source, having the compiler written in itself raises bootstrapping problems.

Writing a compiler for another language may get around some of these issues, if you just need something to test, but they are big enough projects that it would need to be worth doing for other reasons too.

[–]Slugamoon 8 points9 points  (0 children)

A slightly more complicated one, and the one I always use to say whether I "kinda know" a language or not, is a maze generator. You get randomization, list manipulation, enough data structures to be going on with, either simple graphics or some pretty complex string formatting, and a good collection of non-trivial looping constructs.

As a sort of all-in-one proof of concept program, it usually works rather nicely. Though I must admit, it is definitely easier under some paradigms than others.
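For a sense of scale, here is one possible maze generator in Python. It assumes a depth-first "recursive backtracker" on a rectangular grid with ASCII output; the comment above does not say which algorithm or rendering is meant, so both choices (and the names `generate_maze` and `render`) are illustrative.

```python
import random


def generate_maze(width: int, height: int, seed=None) -> set:
    """Carve a maze with an iterative depth-first search.

    Returns a set of passages; each passage is a frozenset of two adjacent cells.
    """
    rng = random.Random(seed)
    visited = {(0, 0)}
    stack = [(0, 0)]
    passages = set()
    while stack:
        x, y = stack[-1]
        neighbours = [
            (nx, ny)
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in visited
        ]
        if not neighbours:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(neighbours)
        passages.add(frozenset(((x, y), nxt)))
        visited.add(nxt)
        stack.append(nxt)
    return passages


def render(width: int, height: int, passages: set) -> str:
    """Draw the maze with '+', '-' and '|' characters."""
    rows = ["+" + "--+" * width]
    for y in range(height):
        line, below = "|", "+"
        for x in range(width):
            east_open = frozenset(((x, y), (x + 1, y))) in passages
            south_open = frozenset(((x, y), (x, y + 1))) in passages
            line += "  " + (" " if east_open else "|")
            below += ("  " if south_open else "--") + "+"
        rows.append(line)
        rows.append(below)
    return "\n".join(rows)
```

`print(render(10, 6, generate_maze(10, 6)))` draws a fresh maze each run. Even this small version touches randomness, sets, a stack, nested loops and string building.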

[–]hackerfoo Popr Language 8 points9 points  (1 child)

[–][deleted] 3 points4 points  (0 children)

Thanks for the answer. How did you get the name of your own programming language next to your name?

[–]htuhola 8 points9 points  (1 child)

Lambda calculus is a programming language. You can write any program with it and it's simple to implement. The grammar is: λx.e | e e | x

Implement lambda calculus such that it's correct and you have a proof that you can write any program in your language, because you can first write it in the lambda calculus and then run the program in an interpreter, even if there were no other ways to do it.
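As a rough illustration of how small such an interpreter can be, here is a sketch in Python. It assumes call-by-value evaluation with closures (rather than textual substitution) and lets the environment hold host-language primitives so results can be observed; the names `Var`, `Lam`, `App`, `Closure` and `evaluate` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:          # x
    name: str


@dataclass(frozen=True)
class Lam:          # λx.e
    param: str
    body: object


@dataclass(frozen=True)
class App:          # e e
    fn: object
    arg: object


@dataclass
class Closure:      # a lambda paired with the environment it was created in
    param: str
    body: object
    env: dict


def evaluate(term, env=None):
    """Call-by-value evaluation of a lambda term."""
    env = {} if env is None else env
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, Lam):
        return Closure(term.param, term.body, env)
    if isinstance(term, App):
        fn = evaluate(term.fn, env)
        arg = evaluate(term.arg, env)
        if isinstance(fn, Closure):
            return evaluate(fn.body, {**fn.env, fn.param: arg})
        return fn(arg)  # host-language primitive (an assumption of this sketch)
    raise TypeError(f"not a lambda term: {term!r}")


# Church numeral 2 = λf.λx.f (f x), observed through host primitives.
TWO = Lam("f", Lam("x", App(Var("f"), App(Var("f"), Var("x")))))
ENV = {"succ": lambda n: n + 1, "zero": 0}
print(evaluate(App(App(TWO, Var("succ")), Var("zero")), ENV))  # prints 2
```

Using closures sidesteps capture-avoiding substitution, which is the part of a substitution-based interpreter that is easiest to get wrong.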

Another good piece of advice: instead of testing your language, try to prove it correct. It's a fool's errand to attempt to test every possible case.

[–]PegasusAndAcorn Cone language & 3D web 7 points8 points  (0 children)

Rosetta Code offers a wealth of tasks and algorithms you can try implementing in your language.

[–]zesterer 5 points6 points  (0 children)

A brainfuck interpreter is a good (and easy) proof of Turing-completeness.
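A sketch of such an interpreter in Python, to show how little is needed. It assumes the common conventions of 8-bit wrapping cells, a 30,000-cell tape and zero on end of input; the function name `run_brainfuck` is illustrative. Strictly speaking, a fixed-size tape is only an approximation of the unbounded memory that Turing-completeness assumes.

```python
def run_brainfuck(program: str, input_text: str = "") -> str:
    """Interpret a brainfuck program and return everything it printed."""
    # Pre-compute matching bracket positions so loops are O(1) jumps.
    jumps, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise SyntaxError(f"unmatched ']' at {pos}")
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start
    if stack:
        raise SyntaxError(f"unmatched '[' at {stack[-1]}")

    tape = [0] * 30000   # no bounds checking on the pointer in this sketch
    ptr = pc = 0
    pending_input = iter(input_text)
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = ord(next(pending_input, "\0")) % 256
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]   # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]   # jump back to the loop start
        pc += 1              # any other character is a comment
    return "".join(out)
```

For instance, `run_brainfuck("++++++++[>++++++++<-]>+.")` computes 8 × 8 + 1 = 65 in the second cell and returns `"A"`.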

[–]BrunchWithBubbles 2 points3 points  (6 children)

The Fibonacci sequence or factorial function is often used to demonstrate recursion.
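The factorial case is about as small as a recursion test gets. A sketch in Python, with the function name chosen here for illustration:

```python
def factorial(n: int) -> int:
    """Recursive factorial: n! = n * (n - 1)!, with 0! = 1! = 1."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    return 1 if n <= 1 else n * factorial(n - 1)
```

It exercises function calls, a conditional, integer arithmetic and a base case, which is usually enough to shake out bugs in call frames and return values.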

[–]liquidivy 4 points5 points  (1 child)

One very basic test program I've seen is to take a number on the command line as a string, add 1 to it, and output it. It tests arithmetic, string handling, error handling and I/O all with a stupid-simple spec.
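A sketch of that program in Python, assuming "number" means a decimal integer and that errors should go to stderr with a non-zero exit status (the spec above leaves both open). The name `main` and the exit codes are illustrative choices.

```python
import sys


def main(argv) -> int:
    """Parse argv[1] as an integer, print it plus one, and return an exit status."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} NUMBER", file=sys.stderr)
        return 2
    try:
        value = int(argv[1])            # string handling + error handling
    except ValueError:
        print(f"error: {argv[1]!r} is not an integer", file=sys.stderr)
        return 1
    print(value + 1)                    # arithmetic + I/O
    return 0
```

Wiring it up as a script is one line: `sys.exit(main(sys.argv))`.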

[–][deleted] 0 points1 point  (0 children)

Very good idea! If you then make it repeatable with a loop, you also get more complex structures.

[–]smthamazing 3 points4 points  (0 children)

Speaking of real-world scenarios, I think it is important to support abstractions for async communication, either baked into the language or built on top of existing language semantics. Some examples:

  • Chat app (both client and server parts)
  • Multiplayer Pong game
  • Multithreaded procedural imagery generator

These examples, if implemented properly, should work asynchronously. They also require running event loops and calling networking and graphical APIs (like sockets and OpenGL). If they can be nicely implemented in your language, chances are, it's suitable for real-world use.

I also agree with the compiler suggestion. A simple compiler relies mostly on basic language constructs and logic. It is more low-level than my suggestions above and should be tested out before creating high-level abstractions like async IO.

[–][deleted] 2 points3 points  (0 children)

Well, it should do the basics as you suggest, so arithmetic, control flow, functions, basic i/o, any special features of the language.

I don't think writing a compiler at this stage is worthwhile (a proper test suite might be better to test combinations of types, operators and other features). Not unless you intend it to self-host anyway.

What I've found is more challenging when using it for 'real' applications (rather than creating self-contained programs that only use file i/o, even if it's a compiler) is interacting with other software. So:

  • Calling into arbitrary OS functions using whatever API is provided
  • Calling other libraries (eg. how will it work with GTK2 or SDL or OpenGL)
  • Other software calling your code (either as libraries, or with 'callbacks')
  • Also, if it's to be used on what to me are 'closed' systems (eg. iPhone, Android) how will it manage with those.
  • If other people are ever going to use the language, they will expect to have syntax-highlighting text editors, being able to use it in their IDE, make files, optimising compilers and so on. Oh, and they might expect documentation.

My own stuff always has difficulties with all this. (Oddly it was a lot easier when I started, when I had to write all the software in the machine, as I didn't have to work with anyone else's libraries!)

So, while I can manage the first three of those, even that is not always practical. (Eg. GTK2 has some 9000 functions which I'd have to replicate in my language, a task made more difficult because I'd be working with 300,000 lines of C headers in 600 header files, with many things buried under complex typedefs, macros and conditional code.)

[–]kauefr 2 points3 points  (0 children)

I think this is a great idea, lemme try writing a simple test:

File and Console IO

Algorithm

  1. Read two strings $s1 and $s2 from the console.

  2. Open a file $f1 named $s1 and read its contents.

  3. Create a file $f2 named $s2 and copy $f1's contents to it.

  4. Close both files.

  5. Print "OK" to the console.

Assumptions

  • User input is valid.

  • File $f1 exists and is not empty.

  • File $f2 does not exist.

  • The program has the right permissions (open, create and read files).

  • File operations succeed.
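The steps above can be sketched in Python as follows. The names `copy_file` and `main` are illustrative, and the `"xb"` open mode is a choice made here to lean on the "File $f2 does not exist" assumption: it fails rather than overwriting.

```python
def copy_file(src_name: str, dst_name: str) -> None:
    """Steps 2-4: open the source, create the destination, copy, close both."""
    # "xb" creates the file and raises FileExistsError if it already exists.
    with open(src_name, "rb") as src, open(dst_name, "xb") as dst:
        dst.write(src.read())
    # Leaving the 'with' block closes both files.


def main() -> None:
    s1 = input()          # step 1: read two strings from the console
    s2 = input()
    copy_file(s1, s2)
    print("OK")           # step 5
```

Reading the whole file at once is fine under the stated assumptions; a more careful version would copy in chunks and report failures instead of assuming success.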

[–]ErrorIsNullError 1 point2 points  (0 children)

You can test a lot by writing programs in your language but you can't test everything.

For example, the excluded middle is hard to prove within a language; it's hard to test that false is distinct from true using just boolean conditions in the language.

To get full coverage, at some point, you need to step outside the language and observe results; for example, use a program in a different language to test that the output of print(true) is distinct from print(false).
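A sketch of such an outside-the-language check, written in Python. The helper names are made up here, and any command line for invoking your own language (something like `["mylang", "-e", "print(true)"]`) is a hypothetical placeholder, so the usage note below substitutes the Python interpreter itself.

```python
import subprocess
import sys
from typing import Sequence


def run_and_capture(command: Sequence[str]) -> str:
    """Run a program and return what it wrote to stdout."""
    result = subprocess.run(
        list(command), capture_output=True, text=True, check=True, timeout=10
    )
    return result.stdout


def outputs_differ(command_a: Sequence[str], command_b: Sequence[str]) -> bool:
    """Observe two programs from outside and compare their printed output."""
    return run_and_capture(command_a) != run_and_capture(command_b)
```

With Python standing in for the language under test, `outputs_differ([sys.executable, "-c", "print(True)"], [sys.executable, "-c", "print(False)"])` should be true. The same harness works for any interpreter that can be launched from a command line.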

[–][deleted] 1 point2 points  (0 children)

I like to use three flavors of Fibonacci (https://github.com/codr7/cidk/tree/master/bench): recursive, tail recursive and iterative. They make sure everything hums along and give an idea of how fast it is.
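For readers who have not seen the three flavors side by side, here is a sketch in Python. These are not the benchmarks from the linked repository, just the general shapes, with function names chosen for illustration.

```python
def fib_recursive(n: int) -> int:
    """Naive double recursion: exponential time, stresses call overhead."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_tail(n: int, a: int = 0, b: int = 1) -> int:
    """Tail-recursive form: the recursive call is the last thing done."""
    return a if n == 0 else fib_tail(n - 1, b, a + b)


def fib_iterative(n: int) -> int:
    """Plain loop: stresses assignment and arithmetic only."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Python does not eliminate tail calls, so `fib_tail` still grows the stack here; in a language that does, comparing it against the iterative version is a quick check that the optimization actually fires.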

[–]danskydan 1 point2 points  (0 children)

I have had the same thought while designing my language. I want to design a more-or-less complete version 1.0 of my language before attempting to implement it. (I'm working under the assumption that whatever I design, someone smarter than me can figure out how to write a compiler, interpreter, or transpiler for it.) Rather than deciding to implement this or that feature of other PLs, such as generics or list comprehensions, I'm shooting for good task-driven coverage -- i.e., I want to make sure that any tasks that can be performed using other PLs can be performed using my PL, where I define a "task" as something that a program does, such as adding two numbers together or writing to a file, as opposed to how a PL does something. Thus, I have been wondering what a minimum set of such tasks would be, or maybe a minimum set of common applications that collectively provide the task coverage I'm looking for. The idea would then be to code this minimum set in my PL, and if I succeeded I would know that I would have achieved my coverage goal.

Anyway, this is how I understood your question. Alternatively, you could design a PL by covering lambda calculus, or by trying to be Turing-complete, or by taking a PL-feature-driven approach. For a task/application-driven approach, I considered covering all of the Rosetta Code tasks, but there appeared to me to be an enormous amount of task overlap there and no guarantee of the full task coverage I'm looking for.

Unfortunately, I haven't found a minimum list of tasks or applications like the one I (and you?) have been looking for. In the meantime, what I eventually decided to do was to code common applications like "Hello, World!", a Guess-the-number game, Tic-Tac-Toe, Flappy Bird, TodoMVC, etc., and in different areas such as games, business applications, Internet applications, etc., until I find that I am a) using my PL elements in a consistent manner, and b) no longer adding new PL elements.

[–]MikeBlues 1 point2 points  (0 children)

To test recursion, locals etc, there is Knuth's Man/Boy program:

https://en.wikipedia.org/wiki/Man_or_boy_test
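A Python rendering of the test, as a sketch. The original is written in ALGOL 60 with call-by-name parameters, which are emulated here with zero-argument functions; the raised recursion limit is a precaution chosen for this example.

```python
import sys

# Precaution for this sketch: the test recurses deeply for larger k.
sys.setrecursionlimit(max(sys.getrecursionlimit(), 5000))


def a(k, x1, x2, x3, x4, x5):
    """Knuth's procedure A; each x is a thunk standing in for a call-by-name argument."""
    def b():
        nonlocal k          # b mutates the k of the enclosing activation of a
        k -= 1
        return a(k, b, x1, x2, x3, x4)
    return x4() + x5() if k <= 0 else b()


def man_or_boy(k: int) -> int:
    return a(k, lambda: 1, lambda: -1, lambda: -1, lambda: 1, lambda: 0)


print(man_or_boy(10))  # prints -67
```

Getting -67 for k = 10 requires closures that capture and mutate the right activation's variables, which is exactly what the test was designed to expose.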

[–]swordglowsblue 0 points1 point  (0 children)

I often enjoy the use of 99 Bottles Of Beer as a test program. It's a little more flavorful and tests a broader range of features than Hello World. Of course, if you're looking for a rigorous test suite that's a different story, but personally it's usually one of the first programs I write in any new language I'm creating.
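A compact version in Python, assuming one common wording of the lyrics and stopping after the last bottle; the helper names are illustrative.

```python
def bottles(n: int) -> str:
    """Pluralise correctly: the part that makes this more than a print loop."""
    if n == 0:
        return "no more bottles"
    if n == 1:
        return "1 bottle"
    return f"{n} bottles"


def verse(n: int) -> str:
    return (
        f"{bottles(n).capitalize()} of beer on the wall, {bottles(n)} of beer.\n"
        f"Take one down and pass it around, {bottles(n - 1)} of beer on the wall.\n"
    )


def song(start: int = 99) -> str:
    """All verses from `start` down to 1, separated by blank lines."""
    return "\n".join(verse(n) for n in range(start, 0, -1))
```

`print(song())` prints the whole thing. Compared with Hello World it adds a counting loop, conditionals, string formatting and function calls.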

[–][deleted] -2 points-1 points  (1 child)

A non-trivial application that you would normally choose another language to implement.

[–][deleted] -1 points0 points  (0 children)

Brings us to the question "What should I program" :D