Python Regular Expressions (Part 2)

Lookahead and lookbehind assertions, positive and negative

 

(?=...) Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.

(?!...) Matches if ... doesn’t match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it’s not followed by 'Asimov'.

(?<=...) Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, since the lookbehind will back up 3 characters and check if the contained pattern matches. The contained pattern must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not. Note that patterns which start with positive lookbehind assertions will never match at the beginning of the string being searched; you will most likely want to use the search() function rather than the match() function:

 

>>> import re
>>> m = re.search('(?<=abc)def', 'abcdef')
>>> m.group(0)
'def'
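
The two lookahead forms described above can be checked in the same way. The following is a minimal sketch for Python 3; the variable names are illustrative only, and the sample strings follow the Isaac Asimov example from the text.

```python
import re

# Positive lookahead: 'Isaac ' matches only when 'Asimov' comes next.
# The lookahead text is checked but not consumed, so the match ends at 6.
followed = re.search(r'Isaac (?=Asimov)', 'Isaac Asimov')
not_followed = re.search(r'Isaac (?=Asimov)', 'Isaac Newton')

# Negative lookahead: 'Isaac ' matches only when 'Asimov' does NOT come next.
neg_blocked = re.search(r'Isaac (?!Asimov)', 'Isaac Asimov')
neg_allowed = re.search(r'Isaac (?!Asimov)', 'Isaac Newton')

print(repr(followed.group(0)), followed.end())  # 'Isaac ' 6
print(not_followed)                             # None
print(neg_blocked)                              # None
print(repr(neg_allowed.group(0)))               # 'Isaac '
```

Because the lookahead consumes nothing, a later part of the pattern (or the next search) can still match the text 'Asimov' itself.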

 

This example looks for a word following a hyphen:

 

>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
>>> m.group(0)
'egg'

(?<!...) Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
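
A short sketch of the two points made about lookbehind: a negative lookbehind can succeed at the start of the string, and the contained pattern must have a fixed width. The helper `compiles` and the sample strings are assumptions made for illustration; `re.error` is the exception the standard `re` module raises for an invalid pattern.

```python
import re

# Nothing precedes position 0, so the negative lookbehind succeeds there
# and even match() (which is anchored at the start) finds 'spam'.
at_start = re.match(r'(?<!-)\w+', 'spam-egg')

# A positive lookbehind cannot be satisfied at position 0, so match() fails.
pos_at_start = re.match(r'(?<=abc)def', 'abcdef')

# Words that are not preceded by a hyphen (or by another word character).
words = re.findall(r'(?<![-\w])\w+', 'spam-egg ham')

def compiles(pattern):
    """Hypothetical helper: report whether `pattern` is a valid regex."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

fixed_ok = compiles(r'(?<!abc)def')    # fixed width of 3: allowed
alt_ok = compiles(r'(?<!a|b)c')        # both alternatives have width 1: allowed
star_bad = compiles(r'(?<!a*)b')       # variable width: rejected
range_bad = compiles(r'(?<!a{3,4})b')  # variable width: rejected

print(at_start.group(0), pos_at_start, words)
print(fixed_ok, alt_ok, star_bad, range_bad)
```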

 

 

Example code for positive and negative lookahead and lookbehind:

import re

def testPrevPostMatch():
    # positive lookahead:   (?=xxx)
    # negative lookahead:   (?!xxx)
    # positive lookbehind:  (?<=xxx)
    # negative lookbehind:  (?<!xxx)

    # Note that the input string contains literal backslashes before the quotes:
    # src=\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\"
    qqPicUrlStr             = 'src=\\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\\"'
    # Invalid prefix: the URL is not preceded by src=\"
    qqPicUrlInvalidPrevStr  = '1234567http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\\"'
    # Invalid suffix: the URL is not followed by \"
    qqPicUrlInvalidPostStr  = 'src=\\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA123'
    # The lookbehind requires src=\" before the URL and the lookahead requires \" after it;
    # neither delimiter becomes part of the match.
    canFindPrevPostP = r'(?<=src=\\")(?P<qqPicUrl>http://.+?\.qq\.com.+?)(?=\\")'
    qqPicUrl = ""

    foundPrevPost = re.search(canFindPrevPostP, qqPicUrlStr)
    print("foundPrevPost=", foundPrevPost)  # a match object
    if foundPrevPost:
        qqPicUrl = foundPrevPost.group("qqPicUrl")
        print("qqPicUrl=", qqPicUrl)  # qqPicUrl= http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA
        print("found qqPicUrl here")

    foundInvalidPrev = re.search(canFindPrevPostP, qqPicUrlInvalidPrevStr)
    print("foundInvalidPrev=", foundInvalidPrev)  # foundInvalidPrev= None
    if not foundInvalidPrev:
        print("can NOT find qqPicUrl here")

    foundInvalidPost = re.search(canFindPrevPostP, qqPicUrlInvalidPostStr)
    print("foundInvalidPost=", foundInvalidPost)  # foundInvalidPost= None
    if not foundInvalidPost:
        print("can NOT find qqPicUrl here")
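
To make the effect of the lookarounds in the example above more visible, here is a minimal sketch that compares them with an ordinary capturing group. The sample text, its two shortened URLs, and all variable names are hypothetical; only the shape of the pattern is taken from the example.

```python
import re

# Hypothetical input with two escaped src=\"...\" attributes.
sample = (r'src=\"http://a.photo.store.qq.com/one\" alt=\"x\" '
          r'src=\"http://b.photo.store.qq.com/two\"')

# Lookaround version: the delimiters are required but not consumed,
# so the whole match is just the URL and findall() returns URLs directly.
lookaround_pattern = r'(?<=src=\\")http://.+?\.qq\.com.+?(?=\\")'
urls = re.findall(lookaround_pattern, sample)

# Capturing-group version: the delimiters are consumed, so group(0)
# includes them and the URL has to be read from group(1).
group_pattern = r'src=\\"(http://.+?\.qq\.com.+?)\\"'
with_group = re.search(group_pattern, sample)

print(urls)
print(with_group.group(0))
print(with_group.group(1))
```

Either form works for this input; the lookaround form is mainly convenient when the whole match, rather than a group, is what gets used afterwards.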

 

Example of referencing a named group (backreference) in a Python regular expression

 

import re

def testBackReference():
    # back reference (?P=name) test:
    # (?P=userId) must match exactly the text that (?P<userId>...) captured earlier
    backrefValidStr = '"group":0,"iconType":"NonEmptyDocumentFolder","id":"9A8B8BF501A38A36!601","itemType":32,"name":"released","ownerCid":"9A8B8BF501A38A36"'
    # Here ownerCid differs from the id prefix, so the backreference cannot match
    backrefInvalidStr = '"group":0,"iconType":"NonEmptyDocumentFolder","id":"9A8B8BF501A38A36!601","itemType":32,"name":"released","ownerCid":"987654321ABCDEFG"'
    backrefP = r'"group":\d+,"iconType":"\w+","id":"(?P<userId>\w+)!\d+","itemType":\d+,"name":".+?","ownerCid":"(?P=userId)"'
    userId = ""

    foundBackref = re.search(backrefP, backrefValidStr)
    print("foundBackref=", foundBackref)  # a match object
    if foundBackref:
        userId = foundBackref.group("userId")
        print("userId=", userId)  # userId= 9A8B8BF501A38A36
        print("found userId here")

    foundBackref = re.search(backrefP, backrefInvalidStr)
    print("foundBackref=", foundBackref)  # foundBackref= None
    if not foundBackref:
        print("can NOT find userId here")
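
The same named backreference can be tried on a much smaller input. The sketch below looks for a doubled word; the sample sentences and variable names are made up for illustration, while `(?P=name)`, `\1`, and `\g<name>` are standard `re` syntax.

```python
import re

# (?P=word) must repeat exactly what (?P<word>...) captured.
doubled_named = r'\b(?P<word>\w+)\s+(?P=word)\b'
# Equivalent pattern using a numbered backreference.
doubled_numbered = r'\b(\w+)\s+\1\b'

text = 'Paris in the the spring'

named = re.search(doubled_named, text)
numbered = re.search(doubled_numbered, text)
no_repeat = re.search(doubled_named, 'Paris in the spring')

# In a replacement string, a named group is referenced as \g<name>.
fixed = re.sub(doubled_named, r'\g<word>', text)

print(named.group(0), '|', named.group('word'))
print(numbered.group(0), no_repeat)
print(fixed)
```

Note that `(?P=name)` is only valid inside the pattern; in the replacement string passed to `re.sub` the `\g<name>` form is used instead.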

 

 

 

posted on 2013-08-09 00:34  JasonKwok