Nested Constructs in Regular Expressions

Today I was looking for a way to match nested expressions with regular expressions, and pretty much came up empty. Then, after a trip to the bookstore to pick up Mastering Regular Expressions by Jeffrey Friedl, I finally found it.

Interestingly, even now that I know what to search for :-), I can't find a single reference to this on the net or on MSDN.

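With the .NET regular expression engine, there are (?<DEPTH>) and (?<-DEPTH>) constructs (balancing groups) that you can use to match nested expressions, such as matching parentheses or matching HTML tags. (?<DEPTH>...) pushes a capture onto a stack named DEPTH, (?<-DEPTH>...) pops one off and fails if the stack is empty, and a trailing (?(DEPTH)(?!)) forces the match to fail if anything is still left on the stack. DEPTH is just a group name; any name will do. Here's a "simple" example that will match nested <div> tags: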

<div>(?>(?<DEPTH>(<div>))|(?<-DEPTH>(</div>))|.?)*(?(DEPTH)(?!))</div>

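In the sample text below, this matches <div>some <div>red</div> text</div> (shown in red in the original post). The first <div> is never closed, so the match starts at the second one: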

before <div>there are  
<div>some <div>red</div> text</div> after
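
To make the bookkeeping concrete, here is a minimal sketch in Python of the counting that the DEPTH group does. Python's re module has no balancing groups, so this is a hand-written scanner rather than the .NET pattern itself, and the function names (match_balanced_from, find_balanced_div) are made up for this illustration. It only mirrors the push/pop logic, not the regex engine's greedy backtracking, so on malformed input (say, a stray extra </div>) the two may disagree.

```python
OPEN, CLOSE = "<div>", "</div>"

def match_balanced_from(text, start):
    """Return the end index of a balanced <div>...</div> beginning at start, or -1."""
    if not text.startswith(OPEN, start):
        return -1
    depth = 0                      # stands in for the DEPTH capture stack
    i = start + len(OPEN)
    while i < len(text):
        if text.startswith(OPEN, i):
            depth += 1             # like (?<DEPTH>...): push
            i += len(OPEN)
        elif text.startswith(CLOSE, i):
            if depth == 0:
                # nothing left to pop, so this closes the outer <div>
                return i + len(CLOSE)
            depth -= 1             # like (?<-DEPTH>...): pop
            i += len(CLOSE)
        else:
            i += 1                 # like the catch-all branch: skip one character
    return -1                      # ran out of text before the outer <div> closed

def find_balanced_div(text):
    """Return the first balanced <div>...</div> span in text, or None."""
    pos = text.find(OPEN)
    while pos != -1:
        end = match_balanced_from(text, pos)
        if end != -1:
            return text[pos:end]
        pos = text.find(OPEN, pos + 1)   # retry from the next <div>
    return None

sample = "before <div>there are <div>some <div>red</div> text</div> after"
print(find_balanced_div(sample))
```

On the sample text, the scan starting at the first <div> runs off the end with one tag still open, so the search moves on to the second <div> and succeeds there, which is the same span the regex picks out.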

This is pretty cool, I've got to say. I really can't do this justice; if you're interested, I recommend you pick up the book!

posted @ 2007-08-15 18:51 lywf