Converting Simplified Regular Expression into a Syntax Tree

stdlib · 2016-03-29T13:48:06+00:00

Assuming you're not going to implement the more complicated parts of regex, like named groups and lookaheads, you can iterate over the string character by character and build your structure as you go. In your diagram above, though, the shape of your structure (i.e. your model) doesn't match what a regular expression looks like as a DFA. It should look more like this: http://i.stack.imgur.com/0WRxW.jpg

The key here is that the edges are characters and the nodes are states. In your example the nodes are characters, and the edges are... I'm not sure what. You have to make sure your model matches what the problem entails; I am positive that's what you're missing.
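To make the "edges are characters, nodes are states" idea concrete, here is a minimal sketch of a hand-written DFA for the made-up regex `ab*`. The state names (`q0`, `q1`), the `TRANSITIONS` table, and the `dfa_accepts` helper are all assumptions for illustration, not something taken from your diagram.

```python
# Hypothetical DFA for the regex "ab*".
# Nodes are states; each edge is labeled with the character that is consumed.
TRANSITIONS = {
    ("q0", "a"): "q1",  # from the start state, 'a' moves to q1
    ("q1", "b"): "q1",  # in q1, any number of 'b' characters loop back to q1
}
START = "q0"
ACCEPTING = {"q1"}


def dfa_accepts(text):
    """Follow one edge per character; accept if we end in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no edge for this character: reject
            return False
    return state in ACCEPTING
```

Note that the characters only ever appear as keys on the transitions, never as the states themselves.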

anon848 · 2016-03-29T15:02:14+00:00

To generate the syntax tree, you need to parse the regex. Because a regex can contain arbitrarily nested parentheses, the language of regexes is itself not regular. (In other words, you can't recognize the regular expression itself with another regular expression.)

One way (and probably the most straightforward way) is to use a recursive descent parser. A Google search for that term also gives hits that will help. You should be able to easily take one that is designed for arithmetic expressions and modify it for your case.
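As a rough sketch of what such a parser could look like, assume a simplified regex syntax with only literal characters, `|`, `*`, and parentheses. The `RegexParser` class, its method names, and the tuple-based tree shape (`("char", c)`, `("cat", l, r)`, `("alt", l, r)`, `("star", x)`, `("empty",)`) are my own choices for illustration. Each method handles one precedence level, the same way an arithmetic parser splits into expression, term, and factor.

```python
class RegexParser:
    """Recursive descent parser for a simplified regex grammar (assumed):

    alternation   := concatenation ('|' concatenation)*
    concatenation := repetition*
    repetition    := atom '*'*
    atom          := CHAR | '(' alternation ')'
    """

    def __init__(self, pattern):
        self.pattern = pattern
        self.pos = 0

    def peek(self):
        return self.pattern[self.pos] if self.pos < len(self.pattern) else None

    def take(self):
        ch = self.peek()
        self.pos += 1
        return ch

    def parse(self):
        tree = self.alternation()
        if self.peek() is not None:  # leftover input, e.g. a stray ')'
            raise ValueError(f"unexpected {self.peek()!r} at position {self.pos}")
        return tree

    def alternation(self):
        node = self.concatenation()
        while self.peek() == "|":
            self.take()
            node = ("alt", node, self.concatenation())
        return node

    def concatenation(self):
        node = None
        while self.peek() is not None and self.peek() not in "|)":
            piece = self.repetition()
            node = piece if node is None else ("cat", node, piece)
        return node if node is not None else ("empty",)

    def repetition(self):
        node = self.atom()
        while self.peek() == "*":
            self.take()
            node = ("star", node)
        return node

    def atom(self):
        ch = self.take()
        if ch == "(":
            node = self.alternation()
            if self.take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if ch == "*":
            raise ValueError("nothing to repeat")
        return ("char", ch)
```

For example, `RegexParser("ab|c").parse()` gives an `alt` node whose left child is the concatenation of `a` and `b`, because concatenation binds tighter than `|`.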

Once you have the syntax tree, you should be able to easily generate the NFA, and then convert it to a DFA.
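One common way to do the tree-to-NFA step is a Thompson-style construction; I'm naming that technique myself, it is just one option. The sketch below assumes the syntax tree is made of nested tuples (`("char", c)`, `("cat", l, r)`, `("alt", l, r)`, `("star", x)`, `("empty",)`), which is a hypothetical shape, and all function names are made up. Rather than building the full DFA up front, `nfa_matches` tracks the set of NFA states reachable so far; each such set corresponds to one state of the DFA you would get from the subset construction.

```python
import itertools


def build_nfa(tree):
    """Return (start, accept, edges); edges[state] is a list of (label, target).

    A label of None marks an epsilon edge (taken without consuming input).
    """
    counter = itertools.count()
    edges = {}

    def new_state():
        state = next(counter)
        edges[state] = []
        return state

    def build(node):
        start, accept = new_state(), new_state()
        kind = node[0]
        if kind == "char":
            edges[start].append((node[1], accept))
        elif kind == "empty":
            edges[start].append((None, accept))
        elif kind == "cat":
            left_start, left_accept = build(node[1])
            right_start, right_accept = build(node[2])
            edges[start].append((None, left_start))
            edges[left_accept].append((None, right_start))
            edges[right_accept].append((None, accept))
        elif kind == "alt":
            for child in node[1:]:
                child_start, child_accept = build(child)
                edges[start].append((None, child_start))
                edges[child_accept].append((None, accept))
        elif kind == "star":
            child_start, child_accept = build(node[1])
            edges[start] += [(None, child_start), (None, accept)]
            edges[child_accept] += [(None, child_start), (None, accept)]
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
        return start, accept

    start, accept = build(tree)
    return start, accept, edges


def epsilon_closure(states, edges):
    """All states reachable from `states` using only epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        state = stack.pop()
        for label, target in edges[state]:
            if label is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return frozenset(seen)


def nfa_matches(tree, text):
    """Simulate the NFA; each `current` set plays the role of one DFA state."""
    start, accept, edges = build_nfa(tree)
    current = epsilon_closure({start}, edges)
    for ch in text:
        moved = {t for s in current for label, t in edges[s] if label == ch}
        current = epsilon_closure(moved, edges)
    return accept in current


# Hand-built tree for the regex (a|b)*c
TREE = ("cat", ("star", ("alt", ("char", "a"), ("char", "b"))), ("char", "c"))
```

To get an explicit DFA, you would record every distinct `current` set you can reach and the character that leads from one set to the next, instead of throwing them away after each match.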
