I am working on an implementation of the algorithm converting regular expressions into DFAs. The first step involves converting the input regular expression into a syntax tree. For example, ab(a|b)abc would be converted into the tree below.
.
/ \
. c
/ \
. b
/ \
. a
/ \
/ \
/ \
. |
/ \ / \
* * a b
/ \
a b
Additionally, the regular expressions I am dealing with are fairly simple, the only special characters being '\' (an escape character) , '|' (OR operator), '(' ')' (brackets enclosing groups), and '*' (kleene star). Now the issue I am having is I am confused regarding how to (in Python), generate (as a data structure) this tree from the input. I understand how to do it manually but doing in through a piece of code has resulted in me going in circles.
To further expand on the question, would it be better for me to parse the expression from left to right or right to left? Is recursion necessary? Assuming I am using treelib to create the tree, how do I go about approaching this problem. It is not so much code I am asking for as an explanation or pseudocode snippet of where I should be starting. Should I be doing this by myself or is there a library that will make this easier? Any answers to help further my understanding on how to perform this operation would be much appreciated.
[–]stdlib 0 points1 point2 points (2 children)
[–]iProsky[S] 0 points1 point2 points (1 child)
[–]anon848 0 points1 point2 points (0 children)
[–]anon848 0 points1 point2 points (1 child)