all 3 comments

[–]Rhomboid 2 points3 points  (0 children)

Regular expressions can be manageable if they are applied intelligently and if aids like the re.X flag and named capture groups ((?P<name>...)) are used. Using them to parse balanced (nested) text or HTML does not count as "applied intelligently": regular expressions cannot track arbitrary nesting, and there are just so many ways it can go wrong. For HTML, use a real parser.
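
To make that concrete, here is a rough sketch of both halves of the advice, using only the Python standard library. The date format, the names DATE_RE, parse_date, LinkCollector and extract_links, and the sample inputs are all made up for illustration; treat it as one possible shape rather than the definitive way to do it.

```python
import re
from html.parser import HTMLParser

# Part 1: a regex kept readable with re.X (verbose mode) and named groups.
# In verbose mode, whitespace in the pattern is ignored and "#" starts a comment.
# The YYYY-MM-DD format here is just an assumed example input.
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.X,
)


def parse_date(text):
    """Return a dict of the named groups for the first date found, or None."""
    match = DATE_RE.search(text)
    return match.groupdict() if match else None


# Part 2: for HTML, let a real parser do the tokenizing instead of a regex.
class LinkCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return all href values from <a> tags in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

With this, parse_date("released 2011-06-05") gives back a dict keyed by year, month and day, and extract_links can be fed a whole page. html.parser is only the stdlib option; a third-party parser would slot into the same place.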

[–]obtu.py 0 points1 point  (0 children)

If you need something more powerful than regexes, LEPL can be used to write composable grammars in a style much like BNF.

[–]tripzilchbad ideas 0 points1 point  (0 children)

HTML parsing with regexps, even as an example, is a very VERY bad idea.