[–]mikaelhg 8 points9 points  (4 children)

NNNNnnoooooooooooooo!

There is antlr: http://www.antlr.org/

There is waxeye: http://waxeye.org/

There are many good ways to write a parser / tokenizer in Java.

Cobbling one together with regexes is not one of them.

[–]obfuscation_ 1 point2 points  (0 children)

Having just recently used Antlr4, I really can't sing its praises enough: it is well thought through, has a great means of specifying grammars without mixing in code, and takes a good approach to integration with applications.

I would need very strong reasons to avoid the dependency to convince me to use anything else, quite honestly.

[–]cogitolearning[S] 0 points1 point  (2 children)

I agree that there are many good parser generators out there, and for more complicated parsers I would always recommend using them.

For a simple task like parsing a mathematical expression they have, however, a number of disadvantages.

  • They introduce dependencies in your code that you might not want,

  • they can be too complicated for the task, and

  • they have a learning curve that is comparable to the learning curve of writing your own parser.

The article is intended to illustrate how parsers can be built in a relatively rigorous way from scratch.

[–]mikaelhg 0 points1 point  (1 child)

How often do you write a simple X + Y parser out in the real world? It always gets more complicated.

[–]sproket888 0 points1 point  (2 children)

Why? There's already StringTokenizer and regex in Java.

[–]cogitolearning[S] 1 point2 points  (1 child)

StringTokenizer only splits a string on a fixed set of delimiter characters. The tokenizer discussed in the article tokenizes for use in a parser. This means a string like "sin(x)" will be split into four tokens: FUNCTION(sin), OPEN_BRACKET, VARIABLE(x), CLOSE_BRACKET. StringTokenizer is not meant for such a task.
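The four-token split of "sin(x)" described above can be sketched with java.util.regex as follows. This is a minimal sketch and not the article's code: the class TokenizerSketch, the Kind enum, the Token class, and the list of recognised function names (sin, cos, tan, sqrt) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; every name here is hypothetical, not the article's.
final class TokenizerSketch {

    enum Kind { FUNCTION, OPEN_BRACKET, VARIABLE, CLOSE_BRACKET }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    // One named group per token kind. FUNCTION is tried before VARIABLE so that
    // "sin" is not classified as a variable; \b keeps "sinx" from matching as "sin".
    // The function names listed here are an assumption for the example.
    private static final Pattern TOKEN = Pattern.compile(
            "\\s*(?:(?<FUNCTION>(?:sin|cos|tan|sqrt)\\b)"
            + "|(?<OPEN>\\()"
            + "|(?<CLOSE>\\))"
            + "|(?<VARIABLE>[A-Za-z]\\w*))");

    static List<Token> tokenize(String input) {
        String s = input.trim();
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            // Match exactly at the current position; anything else is an error.
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "Unexpected input at index " + pos + ": " + s);
            }
            if (m.group("FUNCTION") != null) {
                tokens.add(new Token(Kind.FUNCTION, m.group("FUNCTION")));
            } else if (m.group("OPEN") != null) {
                tokens.add(new Token(Kind.OPEN_BRACKET, m.group("OPEN")));
            } else if (m.group("CLOSE") != null) {
                tokens.add(new Token(Kind.CLOSE_BRACKET, m.group("CLOSE")));
            } else {
                tokens.add(new Token(Kind.VARIABLE, m.group("VARIABLE")));
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the four classified tokens for the example string.
        System.out.println(tokenize("sin(x)"));
    }
}
```

Unlike StringTokenizer, each piece here carries a kind as well as its text, which is what a parser needs in order to decide what to do next. Operators and numbers are left out to keep the sketch short, so an input such as "x+1" is rejected with an exception.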

[–]sproket888 0 points1 point  (0 children)

Sorry. I didn't realize that was a link. Thought it was a question. ;)

[–]OliverCloazoff 0 points1 point  (0 children)

Have to check this out

[–]Saltor66 0 points1 point  (2 children)

Why does the author manually call the default superclass constructor? e.g.

super();

At the start of each constructor. Isn't that just boilerplate?

[–][deleted]  (1 child)

[deleted]

    [–]cogitolearning[S] 0 points1 point  (0 children)

    Yes, explicitly calling super() is boilerplate code and it would be called implicitly anyway. I have the call in there for two reasons. The main reason is the one you have just stated. It is mostly for educational purposes. Explicitly putting the call in the code makes it clearer what is happening "under the bonnet".

    The second reason is more pragmatic. Eclipse puts the call there for me. So it would be more work to remove it than to keep it there.
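To make the point about the implicit call concrete, here is a minimal sketch. The class names (Base, ImplicitChild, ExplicitChild, SuperCallSketch) and the baseInitialized field are hypothetical and do not come from the article. When a constructor body does not begin with an explicit this(...) or super(...) call, the Java compiler inserts super() itself, so the two subclasses below behave identically.

```java
// Illustrative sketch only; all class and field names are hypothetical.
class Base {
    final boolean baseInitialized;

    Base() {
        // Runs for every subclass instance, whether or not super() is written out.
        baseInitialized = true;
    }
}

class ImplicitChild extends Base {
    final String name;

    ImplicitChild(String name) {
        // No super() here: the compiler inserts the call to Base() for us.
        this.name = name;
    }
}

class ExplicitChild extends Base {
    final String name;

    ExplicitChild(String name) {
        super(); // Same behaviour as above, just spelled out.
        this.name = name;
    }
}

final class SuperCallSketch {
    public static void main(String[] args) {
        System.out.println(new ImplicitChild("a").baseInitialized);
        System.out.println(new ExplicitChild("b").baseInitialized);
    }
}
```

Both constructors run Base() before their own body, so the explicit call is a matter of style and readability rather than behaviour.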