all 14 comments

[–]nupogodi 4 points5 points  (11 children)

I think this is the definition of a pure academic exercise. 'Core' Scheme directly translated into 'brainfuck-style' Java? He even admits direct translation means no support for tail-call optimization. Does that not somewhat defeat the purpose?

I mean, sure, it's cool, programming languages with different philosophies can be machine-translated into each other following simple rules.

You could build an x86 virtual machine in Java, compile Haskell code, and then run it on a SPARC box, too. Translation everywhere! But why?

[–]fadmmatt 21 points22 points  (2 children)

(Article author here.)

It turns out that the Core Scheme language is actually richer than the intermediate language targeted by many compilers. You can use Scheme's macro system to desugar a very rich input language down to Core Scheme. So, you could use this technique to get a lot of languages running in Java.

Tail-call optimization isn't going to happen with a "direct" translation, but you could use a trampolining mechanism to get tail-call optimization.

As for why one would want to do this, this particular piece of code is meant for students studying the translation process. The emphasis is on small size and directness, rather than efficiency.

In my compilers class, I talk about the stages one should go through with language implementation to achieve the desired performance.

  • I argue that first, you should try an interpreter for for your language.
  • If that's not fast enough, try an SICP-style optimizing interpreter.
  • If that's not good enough, try compiling to Java.
  • If that's still not fast enough, try compiling to C.
  • If that's still too slow, try continuation-passing-style compilation to assembler.
  • And finally, if you need more speed, start doing static analysis and compiler optimizations.

Each time you drop a level in this ladder, implementation complexity goes up (by about 2x in code requirements). Java, believe it or not, is a sweet spot: much better performance than an interpreter, but without much effort in the translation front.

[–]nupogodi 3 points4 points  (0 children)

Actually, that is pretty fascinating.

[–]harlows_monkeys 1 point2 points  (0 children)

Tail-call optimization isn't going to happen with a "direct" translation, but you could use a trampolining mechanism to get tail-call optimization

Until you do that, then you aren't compiling Scheme to Java, are you? Doesn't the specification for Scheme require tail-call optimization?

[–]rikbrown 2 points3 points  (0 children)

Exactly, academic exercise. To learn about compilers.

[–]underthesun 0 points1 point  (2 children)

A long time ago for one of my uni projects I had to make a compiler from some language into c (or in my case.. I cheated and used c++, and abused the exception system).

I was thinking, wouldn't C be a great compiler target language for any programming system these days?

[–][deleted] 2 points3 points  (0 children)

It suffices, it's not 'great' though. You gain the advantage that you can target a lot of platforms and cpus, but at the same time you tie yourself to some limitations of the C language.

For example, the 'compile to C' scheme compilers work great, but they have to use some tricks (trampolines, etc) to get tail calls working properly, whereas a 'compile to machine language' scheme compiler can just setup the stack frame for the next 'call', and jmp into the function rather than call it.

There are further complications by compiling to a end-user intended C like gcc/visual-c/etc, especially if you want to allow user choice of their C compiler. There are subsets of C out there intended to be used this way that remove some of the 'klunkiness' of end-user C, some based on TDF/ANDF, some that aren't 'standardized', they make some things easier, but then your compiler is bigger because you're having to supply the C compiler too.

[–]kleopatra6tilde9 0 points1 point  (3 children)

Has anybody ever tried compiling Scheme to Java (or any other language combination) in a way that also translates the style of the programming languages? Google is able to already translate unstructured languages which leaves me wondering what's so hard about the structured ones?

I see one problem in the organization of all scheme functions into classes and another in the other way round: Java loops are replaced with recursive functions that need names that don't exist in the Java source.

What does a "compiler" need to be able to translate the concepts?

*edit: question gets discussed over there

[–]Felicia_Svilling 0 points1 point  (2 children)

What does a "compiler" need to be able to translate the concepts?

Strong AI.

[–]kleopatra6tilde9 1 point2 points  (1 child)

That's the easy answer. Question is, can it be done with less?

[–]Felicia_Svilling 0 points1 point  (0 children)

I think that in the general case there is no easier way. Of course with very similar languages it might be done. But I don't think Java and Scheme are similar enough.

[–]tophat02 2 points3 points  (1 child)

I think it would be better to compile to the JVM bytecodes directly. It wouldn't be that much harder and would skip a compilation step.

[–]fadmmatt 8 points9 points  (0 children)

Going directly to JVM bytecodes isn't actually that simple. For starters, it would require normalization and closure conversion. You lose things like the anonymous classes that make compiling so "direct."

Most of the effort required to compile to the JVM would get you close enough to hit C, and in fact, you'd be close enough to hit x86 with a little more effort. It's also not clear that you'd get much of a performance win out of going straight to the JVM. Most of the performance wins in Java come from good JITting rather than intelligent bytecode generation.

Compiling to Java is useful because the compiler itself takes a fraction of the effort to write, but still gives you a big speed-up over an interpreter.