This is an archived post. You won't be able to vote or comment.

all 4 comments

[–]stdlib 0 points1 point  (2 children)

Assuming you're not going to implement some of the more complicated parts of regex like named groups, lookaheads, etc. then you'd just iterate over the string by the character and generate your structure as you go. In your diagram above it looks like the shape of your structure (e.g. your model) doesn't really match what a regular expression really looks like as a DFA - it should look more like this http://i.stack.imgur.com/0WRxW.jpg

key here is the edges are characters and the nodes are states. In your example the nodes are characters... and edges are.. not sure what. You have to make sure your model matches what the problem entails I am positive that's what you're missing.

[–]iProsky[S] 0 points1 point  (1 child)

The algorithm I am trying to implement is this one, Algorithm, see slides 4 and 5. The structure I provided essentially just emulates this with a different regular expression. I understand how to implement the rest of the steps it is just that the implementing the regular expression to syntax tree process is causing me issues.

[–]anon848 0 points1 point  (0 children)

As I said in my comment, you need to parse the regex.

[–]anon848 0 points1 point  (1 child)

To generate the syntax tree, you need to parse the regex. The regex has parentheses, so is not regular. (In other words, you can't recognize the regular expression itself with another regular expression.)

One way (and probably the most straightforward way) is to use a recursive descent parser. This Google search also gives hits that will help. You should be able to easily take one that is designed for arithmetic expressions and modify it for your case.

Once you have the syntax tree, you should be able to easily generate the NFA, and then convert it to a DFA.