Regular Expressions in Python

>>> import re
>>> text = 'i like penny'  # named text rather than str, to avoid shadowing the built-in
>>> pa = re.compile(r'penny', re.I)  # The r prefix marks a raw string; re.I makes matching case-insensitive.
>>> ma = pa.match('penny')
>>> ma.group()
'penny'
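
The session above compiles a case-insensitive pattern but only matches it against the bare word. As a minimal sketch (the variable names here are illustrative, not from the original), the following shows what `re.I` changes and why `match` would not find the word inside a longer sentence: `match` only tries at the start of the string, while `search` scans the whole string.

```python
import re

pattern = re.compile(r'penny', re.I)  # re.I: ignore case when matching

text = 'i like PENNY'

# match() only tries at position 0, so it finds nothing here
at_start = pattern.match(text)

# search() scans the whole string and finds the word despite the case difference
anywhere = pattern.search(text)

print(at_start)          # None
print(anywhere.group())  # PENNY
```

Because `match` returns `None` on failure, calling `.group()` on the result without checking would raise an `AttributeError`.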

Matching single characters:

>>> ma = re.match(r'.', 'a')
>>> ma.group()
'a'
>>> ma = re.match(r'.', 'ab')
>>> ma.group()
'a'
>>> ma = re.match(r'..', 'ab')
>>> ma.group()
'ab'

>>> ma = re.match(r'{[abc]}', '{a}')
>>> ma.group()
'{a}'
>>> ma = re.match(r'{[a-zA-Z0-9]}', '{a}')
>>> ma.group()
'{a}'
>>> ma = re.match(r'\[[a-z]\]', '[a]')
>>> ma.group()
'[a]'

>>> ma = re.match(r'[A-Z][a-z]*', 'Penny')
>>> ma.group()
'Penny'
>>> ma = re.match(r'[_]+[\w]', '_penny')
>>> ma.group()
'_p'

>>> ma = re.match(r'^abc', 'abcd')
>>> ma.group()
'abc'

>>> ma = re.match(r'[\w]*penny$', 'ilikepenny')
>>> ma.group()
'ilikepenny'

>>> ma = re.match(r'\Apenny[\w]*', 'pennyisbeautiful')
>>> ma.group()
'pennyisbeautiful'

>>> ma = re.match(r'abc|d', 'abc')
>>> ma.group()
'abc'
>>> ma = re.match(r'abc|d', 'd')
>>> ma.group()
'd'

>>> ma = re.match(r'[\w]{4,6}@(163|126)\.com', 'penny@163.com')  # \. matches a literal dot; a bare . matches any character
>>> ma.group()
'penny@163.com'
>>> ma = re.match(r'[\w]{4,6}@(163|126)\.com', 'penny@126.com')
>>> ma.group()
'penny@126.com'
>>> ma = re.match(r'<([\w]+>)[\w]+</\1', '<book>python</book>')
>>> ma.group()
'<book>python</book>'

>>> ma = re.match(r'<(?P<mark>[\w]+>)[\w]+</(?P=mark)', '<book>python</book>')
>>> ma.group()
'<book>python</book>'
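
The named-group example above only prints the whole match. A short sketch of how the named group can be read back by name, and how the `(?P=mark)` backreference rejects a mismatched closing tag, might look like this. The group name `body` and the `<page>` test string are assumptions added for illustration; the `>` is also moved outside the group so that `mark` holds just the tag name.

```python
import re

# (?P<mark>...) names a group; (?P=mark) must match the same text again
tag_pattern = re.compile(r'<(?P<mark>\w+)>(?P<body>\w+)</(?P=mark)>')

m = tag_pattern.match('<book>python</book>')
tag = m.group('mark')    # text captured by the group named "mark"
body = m.group('body')   # text captured by the (hypothetical) group "body"

# The closing tag differs from the opening tag, so the backreference fails
mismatch = tag_pattern.match('<book>python</page>')

print(tag, body)   # book python
print(mismatch)    # None
```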

 

1. search(pattern, string, flags=0)  Scans a string for a match and returns the first one found

>>> str1 = 'i like penny,520'
>>> ma = re.search(r'\d+', str1)
>>> ma.group()
'520'
>>> str1 = '520, i like penny, 520'
>>> ma = re.search(r'\d+', str1)
>>> ma.group()
'520'
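
Since `search` returns `None` when nothing matches, real code usually checks the result before calling `group()`. A minimal sketch, where the helper name `first_number` is hypothetical:

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    m = re.search(r'\d+', text)
    return m.group() if m else None

print(first_number('520, i like penny, 521'))  # 520
print(first_number('i like penny'))            # None
```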

2. findall(pattern, string, flags=0)  Finds every match and returns the matched parts as a list

>>> ma = re.findall(r'\d+', str1)
>>> ma
['520', '520']
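
One detail worth knowing: when the pattern contains more than one capturing group, `findall` returns a list of tuples of the groups rather than the full matches. A small sketch, using made-up sample addresses:

```python
import re

text = 'penny@163.com, haynes@126.com'  # sample data, for illustration only

# Two capturing groups: the user name and the mail provider
pairs = re.findall(r'(\w+)@(163|126)\.com', text)

# No capturing groups: the full matches are returned
full = re.findall(r'\w+@\d+\.com', text)

print(pairs)  # [('penny', '163'), ('haynes', '126')]
print(full)   # ['penny@163.com', 'haynes@126.com']
```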

3. sub(pattern, repl, string, count=0, flags=0)  Replaces the parts of the string that match the regular expression with another value

>>> ma = re.sub(r'\d+', '820', str1)
>>> ma
'820, i like penny, 820'
>>> ma = re.sub(r'\d+', '820', str1, count=1)
>>> ma
'820, i like penny, 520'
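
The `repl` argument can also be a function that receives each match object and returns the replacement text, which helps when the new value depends on the old one. A sketch under that assumption, with the hypothetical helper `add_300`:

```python
import re

def add_300(m):
    # m is the match object for one run of digits
    return str(int(m.group()) + 300)

text = '520, i like penny, 520'
result = re.sub(r'\d+', add_300, text)

print(result)  # 820, i like penny, 820
```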

4. split(pattern, string, maxsplit=0, flags=0)  Splits the string at each match and returns the resulting pieces as a list

>>> str1 = 'c,python,c# java:C++'
>>> ma = re.split(r',| |:', str1)
>>> ma
['c', 'python', 'c#', 'java', 'C++']
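
The same split can be written with a character class instead of alternation, and `maxsplit` limits how many splits are made. A brief sketch of both variations:

```python
import re

text = 'c,python,c# java:C++'

# [, :] is a character class: a comma, a space, or a colon
all_parts = re.split(r'[, :]', text)

# maxsplit=2 stops after two splits; the rest of the string stays intact
first_two = re.split(r'[, :]', text, maxsplit=2)

print(all_parts)  # ['c', 'python', 'c#', 'java', 'C++']
print(first_two)  # ['c', 'python', 'c# java:C++']
```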
posted @ 2016-08-11 00:38  PY_Haynes