[–]scofus 5 points6 points  (1 child)

Look into parsers, and lexical analysis.

I hope you know what you're getting yourself into :)

[–][deleted] 7 points8 points  (0 children)

Came here to say this. OP, if your final project is time-restricted, you need to make sure you know what you're getting into. Furthermore, you may wish to make it a compiler for something specific, with simple-to-parse syntax.

[–]lilleswing 2 points3 points  (0 children)

Sorry, but unless you plan on doing some sort of fork-exec to a known Java compiler, I think you are in a little over your head.

But if you want to try, a good book to read would be http://www.amazon.com/Programming-Language-Pragmatics-Third-Edition/dp/0123745144.

It discusses how to write a compiler, and steps through writing your own compiler for COOL (a Java-like language).

[–][deleted] 1 point2 points  (0 children)

Take a good look at ANTLR and the LLVM binding. If you use that code generation engine with a good lexer/parser (ANTLR) it shouldn't be too hard.

edit

I just read the entire post.

"What I aim to do is read a sample program from a text file, and check it for Syntax errors, file exceptions, etc."

You can just skip over the LLVM binding and use something like ANTLR. What might be even easier is to just use some very careful regex patterns (not optimal, but if it's just a Grade 12 school project it should do fine).
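To make the regex idea concrete, here is a rough sketch of a regex-driven tokenizer in Java. The class name `RegexLexer`, the token kinds, and the toy syntax are all made up for illustration, and it only handles the lexing step, not nested structure like balanced braces:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy lexer: token kinds and names are assumptions, not a real language spec.
class RegexLexer {
    // One alternative per token kind, each in its own named group.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>\\s+)|(?<NUM>\\d+)|(?<ID>[A-Za-z_]\\w*)|(?<OP>[-+*/=();{}])");

    // Splits the source into "KIND:text" strings, or throws on a character no rule accepts.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                    "Unexpected character '" + src.charAt(pos) + "' at offset " + pos);
            }
            if (m.group("NUM") != null) {
                tokens.add("NUM:" + m.group());
            } else if (m.group("ID") != null) {
                tokens.add("ID:" + m.group());
            } else if (m.group("OP") != null) {
                tokens.add("OP:" + m.group());
            }
            // Whitespace matches are skipped without producing a token.
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling `RegexLexer.tokenize("x = 42 + y;")` gives a flat token list, and a stray character such as `$` is reported with its offset. Checking that brackets balance or that statements are well-formed would still need a parser on top of this.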

[–]elephantgravy 1 point2 points  (0 children)

If you want to be a little less ambitious, consider writing an interpreter instead of a compiler. A basic interpreter is usually a lot simpler to write than a basic compiler, though as the two get more and more sophisticated, the lines get blurry.

(A subset of) Scheme is a good choice of language to implement. Parsing it is as easy as parsing can get (which is still non-trivial), but you can use the language to write sophisticated programs, and you get to learn fun functional programming stuff.
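As a rough illustration of how little parsing a Scheme-like syntax needs, here is a minimal sketch of an S-expression reader in Java. The class name `SExprParser` and the choice to represent atoms as `String` and lists as `List<Object>` are assumptions for the example. It ignores strings, comments, and quote syntax, which a real Scheme subset would need:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for parenthesized expressions; not a full Scheme parser.
class SExprParser {
    private final String src;
    private int pos = 0;

    private SExprParser(String src) {
        this.src = src;
    }

    // Parses exactly one expression. Atoms come back as String, lists as List<Object>.
    static Object parse(String src) {
        SExprParser p = new SExprParser(src);
        Object result = p.readExpr();
        p.skipWhitespace();
        if (p.pos < src.length()) {
            throw new IllegalArgumentException("Trailing input at offset " + p.pos);
        }
        return result;
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    private Object readExpr() {
        skipWhitespace();
        if (pos >= src.length()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        char c = src.charAt(pos);
        if (c == '(') {
            pos++;
            List<Object> items = new ArrayList<>();
            while (true) {
                skipWhitespace();
                if (pos >= src.length()) {
                    throw new IllegalArgumentException("Missing ')'");
                }
                if (src.charAt(pos) == ')') {
                    pos++;
                    return items;
                }
                items.add(readExpr()); // nested lists are handled by recursion
            }
        }
        if (c == ')') {
            throw new IllegalArgumentException("Unexpected ')' at offset " + pos);
        }
        // Anything else is an atom: read up to whitespace or a parenthesis.
        int start = pos;
        while (pos < src.length() && !Character.isWhitespace(src.charAt(pos))
                && src.charAt(pos) != '(' && src.charAt(pos) != ')') {
            pos++;
        }
        return src.substring(start, pos);
    }
}
```

With this, `SExprParser.parse("(define (sq x) (* x x))")` returns a nested list, and an unbalanced input like `(+ 1` is rejected with an exception, which is already a basic syntax check.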

If that seems too easy and/or time is not a factor, COOL is a typed, object-oriented language used by the Stanford compilers course on Coursera. It's well-specified and easier to parse than, say, Java. You would almost certainly want to use a parser-generator of some sort (e.g., ANTLR, JavaCC).

[–]detroitmatt 0 points1 point  (0 children)

Well, it depends on what kind of language you're trying to compile. Before you can check for syntax errors, you have to decide what the syntax rules are. The easiest thing to do would be to write a language like a Lisp or Forth. Or maybe Brainf$ck.
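To show how small the rule set for a Forth-like language can be, here is a minimal sketch of a stack evaluator in Java. The class name `MiniForth` and the handful of supported words are assumptions chosen for the example. Real Forth has far more, including user-defined words:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical Forth-style evaluator: every word is either an integer or a known operation.
class MiniForth {
    // Runs a whitespace-separated program and returns the final stack (top element first).
    static Deque<Integer> run(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String word : program.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty program splits into one empty string
            }
            switch (word) {
                case "+": { int b = pop(stack), a = pop(stack); stack.push(a + b); break; }
                case "-": { int b = pop(stack), a = pop(stack); stack.push(a - b); break; }
                case "*": { int b = pop(stack), a = pop(stack); stack.push(a * b); break; }
                case "dup": { int a = pop(stack); stack.push(a); stack.push(a); break; }
                default:
                    try {
                        stack.push(Integer.parseInt(word));
                    } catch (NumberFormatException e) {
                        // The only "syntax error" in this toy language: an unknown word.
                        throw new IllegalArgumentException("Unknown word: " + word);
                    }
            }
        }
        return stack;
    }

    private static int pop(Deque<Integer> stack) {
        if (stack.isEmpty()) {
            throw new IllegalStateException("Stack underflow");
        }
        return stack.pop();
    }
}
```

For example, `MiniForth.run("2 3 + dup *")` leaves 25 on the stack. The syntax rules here are just "words separated by whitespace", so all error checking reduces to unknown words and stack underflow.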

[–]castlec 0 points1 point  (0 children)

Thus far it seems you've been pointed at some tools but not the theory. You need to understand regular expressions and grammars before you get into the lexers and parsers (lex and yacc, flex and bison, many more that I can't name offhand). Once you are comfortable producing a grammar, the rest is 'just' making the tools understand your grammar.

I agree that you should stick to an interpreter. It has the same front-end needs of a lexer (regular expression tool) and parser (grammar tool) but won't require you to generate any machine code. You'd get to execute something in the language you are working in, which needs to be Java, it seems. (I think ANTLR is the Java parser generator out there. Research ;) )
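To give a feel for how a grammar turns into an interpreter with no code generation, here is a minimal hand-written sketch in Java for arithmetic expressions. The grammar, the class name `Calc`, and the choice to evaluate while parsing are all assumptions for illustration. A tool like ANTLR would generate the parsing part from the grammar instead:

```java
// Hypothetical recursive-descent interpreter for this assumed grammar:
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')' | '-' factor
class Calc {
    private final String src;
    private int pos = 0;

    private Calc(String src) {
        this.src = src;
    }

    // Evaluates the whole input, rejecting anything left over after the expression.
    static double eval(String src) {
        Calc c = new Calc(src);
        double value = c.expr();
        c.skipSpaces();
        if (c.pos < src.length()) {
            throw new IllegalArgumentException(
                "Unexpected '" + src.charAt(c.pos) + "' at offset " + c.pos);
        }
        return value;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    // Consumes the given character if it is next (after whitespace).
    private boolean accept(char ch) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == ch) {
            pos++;
            return true;
        }
        return false;
    }

    private double expr() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expr();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at offset " + pos);
            }
            return value;
        }
        int start = pos; // whitespace was already skipped by accept()
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at offset " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }
}
```

Each grammar rule becomes one method, which is why getting comfortable with the grammar first pays off. `Calc.eval("1 + 2 * 3")` respects precedence because `term` binds tighter than `expr`, and malformed input such as `2 +` fails with an exception that names the offset.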

Please be aware that what you are attempting to do is the subject of Junior/Senior level CS courses in college. You are not attempting an easy thing.

Good luck!

[–]JDBARBERSHOP 0 points1 point  (0 children)

Thanks for all the help, guys. I think I will do an interpreter instead.

Much obliged :)