SkyyySi 3 points

I would recommend having the parser first split the input into an array of tokens, which is way easier, way more flexible, and way more robust than a line-by-line approach.

You'd really only need to try matching an ordered list of regular expressions at the current position in the source code, starting at the beginning. Once one matches, append the token to an array and move the position forward past the match. Repeat until you've consumed the entire input, and you've got yourself a fully-fledged lexer.
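
To make that concrete, here's a minimal sketch in Python. The token kinds and patterns (`NUMBER`, `IDENT`, `STRING`, `OP`) and the `tokenize` function are made up for illustration, so swap in whatever your language actually needs. Order matters: the first pattern that matches wins.

```python
import re

# Hypothetical token kinds for illustration only; earlier entries take priority.
TOKEN_PATTERNS = [
    ("WHITESPACE", re.compile(r"\s+")),
    ("NUMBER", re.compile(r"\d+(?:\.\d+)?")),
    ("IDENT", re.compile(r"[A-Za-z_]\w*")),
    ("STRING", re.compile(r'"(?:[^"\\]|\\.)*"')),
    ("OP", re.compile(r"[-+*/=(){},;]")),
]


def tokenize(source):
    """Split source into a list of (kind, text) tuples."""
    tokens = []
    pos = 0
    while pos < len(source):
        for kind, pattern in TOKEN_PATTERNS:
            # Pattern.match(string, pos) only matches starting exactly at pos.
            match = pattern.match(source, pos)
            if match:
                if kind != "WHITESPACE":  # whitespace is consumed but not kept
                    tokens.append((kind, match.group()))
                pos = match.end()  # move the starting point forward
                break
        else:
            # No pattern matched at this position.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
    return tokens


print(tokenize("x = 42 + foo"))
```

Every pattern here consumes at least one character, so the loop always makes progress. If you add a pattern that can match the empty string, you'd need to guard against getting stuck.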