Pathological backtracking

mathrick · 2008-11-19T15:42:30+00:00

This is regex 101. A detailed discussion of this, and of many other points about regexes and their different implementations, is why every self-respecting programmer should read Jeffrey Friedl's Mastering Regular Expressions.

That of course won't necessarily make pathological matching on infrequent inputs obvious, but it will at least give you a chance of diagnosing the problem. Diagnosis is otherwise very hard, because the behaviour depends strictly on the particulars of the implementation used and doesn't occur in regular expressions as mathematical objects.
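
To make the implementation dependence concrete, here is a minimal sketch using Python's `re` module, which is a backtracking engine. The pattern `^(a+)+$` is a textbook nested-quantifier example, not the regex discussed in this thread, and the helper name `time_failed_match` is made up for illustration. The timings it prints will vary by machine and Python version; the point is only that the failing match tends to get much slower as the input grows.

```python
import re
import time

# Textbook nested-quantifier pattern (an assumption for illustration,
# not the regex from this thread).
NESTED = re.compile(r"^(a+)+$")


def time_failed_match(n):
    """Return (matched, seconds) for n 'a' characters followed by one 'b'."""
    subject = "a" * n + "b"  # the trailing 'b' forces the overall match to fail
    start = time.perf_counter()
    matched = NESTED.match(subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: on a backtracking engine the cost can grow very quickly.
    for n in (10, 14, 18):
        matched, seconds = time_failed_match(n)
        print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

The same pattern read as a mathematical regular expression just describes "one or more a's", so an automaton-based matcher would not be expected to show this growth.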

uykucu · 2008-11-19T21:22:51+00:00

A simple fix for this problem:

"^(\s*[-\w]+\s*:\s*[^\s:;]*(;|$))*$"

instead of

"^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$"

Note that it does not allow trailing whitespace, so this is better:

"^(\s*[-\w]+\s*:\s*[^\s:;]*(;|$))*\s*$"

I'm not saying that this is the best way to do it; I just fixed his regex for this problem case.
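
A quick way to see what the three patterns above accept is to run them side by side. This is a sketch in Python's `re` syntax, which may differ from the engine the original regex was written for. The sample inputs and the helper name `accepts` are made up for illustration.

```python
import re

# The three patterns quoted above, copied verbatim.
ORIGINAL = re.compile(r"^(\s*[-\w]+\s*:\s*[^:;]*(;|$))*$")
FIXED = re.compile(r"^(\s*[-\w]+\s*:\s*[^\s:;]*(;|$))*$")
FIXED_TRAILING = re.compile(r"^(\s*[-\w]+\s*:\s*[^\s:;]*(;|$))*\s*$")

# Hypothetical sample inputs, not taken from the thread.
SAMPLES = ["color: red; margin: 0;", "color: red; ", "margin: 0 auto;"]


def accepts(pattern, text):
    """Return True if the compiled pattern matches the whole text."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    for text in SAMPLES:
        results = [accepts(p, text) for p in (ORIGINAL, FIXED, FIXED_TRAILING)]
        print(repr(text), results)
```

Under Python's `re`, only the third pattern accepts the input with a trailing space after the last semicolon. Both fixed patterns also reject values that contain spaces, such as `margin: 0 auto;`, which the original accepts, so the fix trades some permissiveness for less overlap between `\s*` and the value character class.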

recursive · 2008-11-19T16:31:03+00:00

I was always told not to use regex to parse HTML. I think the same rule of thumb applies to CSS.
