day_2: re
Common matching rules
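As a rough illustration of the rules this heading refers to, the sketch below exercises a few widely used tokens (\w, \d, \S, character classes, quantifiers, and anchors). The sample string and variable names are made up for this example and are not from the original notes.

```python
import re

# Hypothetical sample text, chosen only to exercise the tokens below
sample = 'user_01 logged in at 09:30'

# \w matches a letter, digit, or underscore; + means "one or more"
first_word = re.match(r'\w+', sample).group()

# \d matches a digit; {2} means "exactly two"
time_part = re.search(r'\d{2}:\d{2}', sample).group()

# \S matches any non-whitespace character (\s is its whitespace counterpart)
words = re.findall(r'\S+', sample)

# [a-z] is a character class; ^ and $ anchor the start and end of the string
is_lowercase_word = re.match(r'^[a-z]+$', 'regex') is not None

print(first_word)         # user_01
print(time_part)          # 09:30
print(words)              # ['user_01', 'logged', 'in', 'at', '09:30']
print(is_lowercase_word)  # True
```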
Common matching functions
re.match(pattern, string, flags): matches from the beginning of the string; returns a match object on success and None on failure
    import re

    content = 'Hello 1234567 World_This is a Regex Demo'
    print(len(content))  # 40
    result = re.match(r'^Hello\s(\d+)\sWorld', content)
    print(result)  # match object: <re.Match object; span=(0, 19), match='Hello 1234567 World'> (shown as _sre.SRE_Match on Python 3.6 and earlier)
    print(result.group())  # Hello 1234567 World
    print(result.group(1))  # 1234567
    print(result.span())  # (0, 19)
    import re

    content = 'Hello 1234567 World_This is a Regex Demo'
    # Greedy matching: .* matches as many characters as possible, leaving a single digit for (\d+)
    result = re.match(r'^He.*(\d+).*Demo$', content)
    print(result.group(1))  # 7
    # Non-greedy matching: .*? matches as few characters as possible
    result = re.match(r'^He.*?(\d+).*Demo$', content)
    print(result.group(1))  # 1234567
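By default . does not match a newline; the re.S flag makes it match newlines as well:

    import re

    content = """Hello 1234567 World_This
    is a Regex Demo"""
    # With re.S, . also matches the newline inside content
    result = re.match(r'^He.*?(\d+).*?Demo$', content, re.S)
    print(result.group(1))  # 1234567
    # Without re.S, . stops at the newline, so re.match returns None
    result = re.match(r'^He.*?(\d+).*?Demo$', content)
    print(result.group(1))  # AttributeError: 'NoneType' object has no attribute 'group'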
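Escaping: when the text to match contains regex metacharacters such as ( ) . \, escape them with \ in the pattern. Prefixing the pattern with r (raw string) only stops Python from interpreting the backslashes in the string literal; it does not escape regex metacharacters by itself.

    import re

    content = '(百度)www.baidu.com'
    # Unescaped: (百度) is read as a capture group, so the literal "(" in content is not matched
    result = re.match(r'(百度)www.baidu.com', content)
    print(result)  # None
    # Escaped: \( \) and \. match the literal characters
    result = re.match(r'\(百度\)www\.baidu\.com', content)
    print(result.group())  # (百度)www.baidu.com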
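When the literal text is long or comes from user input, hand-escaping each metacharacter is error-prone. The standard library's re.escape can build the escaped pattern instead. The sketch below is one possible way to use it; the variable names are illustrative only.

```python
import re

content = '(百度)www.baidu.com'

# re.escape backslash-escapes regex metacharacters so the text is matched literally
literal_pattern = re.escape(content)
result = re.match(literal_pattern, content)
print(result.group())  # (百度)www.baidu.com

# An unescaped . matches any character, which can cause false positives
loose = re.match(r'www.baidu', 'wwwXbaidu')
strict = re.match(r'www\.baidu', 'wwwXbaidu')
print(loose is not None)  # True
print(strict)             # None
```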
re.search(pattern, string, flags): scans the whole string and returns the first successful match; returns None on failure
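A small sketch of how re.search differs from re.match: the sample string is the one used earlier with some extra leading text added (an assumption for this example), so the pattern no longer starts at position 0.

```python
import re

# Leading text added so that the interesting part is not at the start
content = 'Extra text Hello 1234567 World_This is a Regex Demo'

# re.match only tries at the beginning of the string, so it fails here
match_result = re.match(r'Hello\s(\d+)', content)
print(match_result)  # None

# re.search scans forward and returns the first match it finds
search_result = re.search(r'Hello\s(\d+)', content)
print(search_result.group(1))  # 1234567
print(search_result.span())    # (11, 24)
```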
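re.findall(pattern, string, flags): scans the whole string and returns all non-overlapping matches as a list; returns an empty list (not None) when nothing matches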
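To illustrate the return values of re.findall, here is a minimal sketch on a made-up string. Note how the result shape changes when the pattern contains capture groups, and that a failed search gives an empty list.

```python
import re

# Hypothetical sample text for illustration
content = 'Order A-102 shipped on 2018-11-18, order B-7 on 2018-11-20'

# No groups: a list of the full matches
dates = re.findall(r'\d{4}-\d{2}-\d{2}', content)
print(dates)  # ['2018-11-18', '2018-11-20']

# With groups: a list of tuples, one item per group
year_month = re.findall(r'(\d{4})-(\d{2})-\d{2}', content)
print(year_month)  # [('2018', '11'), ('2018', '11')]

# Nothing matches: an empty list, not None
nothing = re.findall(r'\d{8}', content)
print(nothing)  # []
```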
re.sub(pattern, replacement, string): replaces every match in the string with the replacement and returns the resulting string
    import re

    content = '54aK54yr50iR54ix5L2g'
    # Remove every run of digits
    content = re.sub(r'\d+', '', content)
    print(content)  # aKyriRixLg
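re.sub can also reuse captured groups in the replacement and limit the number of replacements with the count argument. The following is a small sketch on an illustrative date string, not part of the original notes.

```python
import re

content = '2018-11-18 12:00'

# \1, \2, \3 in the replacement refer to the captured groups
day_first = re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\3/\2/\1', content)
print(day_first)  # 18/11/2018 12:00

# count=1 replaces only the first match
first_only = re.sub(r'\d+', '#', content, count=1)
print(first_only)  # #-11-18 12:00
```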
re.compile(pattern, flags): compiles a regular expression into a pattern object that can be reused
    import re

    content = '2018-11-18 12:00'
    # Compile once: whitespace followed by a time in HH:MM form
    pattern = re.compile(r'\s\d{2}:\d{2}')
    result = re.sub(pattern, '', content)
    print(result)  # 2018-11-18
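A compiled pattern object also exposes the matching functions as methods (sub, search, match, findall), which is convenient when the same expression is applied to many strings. The sketch below assumes a hypothetical list of timestamp strings.

```python
import re

# Compile once, reuse many times
time_pattern = re.compile(r'\s\d{2}:\d{2}')

# Hypothetical input data for illustration
lines = ['2018-11-18 12:00', '2018-11-19 08:30', '2018-11-20 23:59']

# pattern.sub(replacement, string) is equivalent to re.sub(pattern, replacement, string)
dates = [time_pattern.sub('', line) for line in lines]
print(dates)  # ['2018-11-18', '2018-11-19', '2018-11-20']

# The same object supports search, match, and findall
found = time_pattern.search(lines[1]).group()
print(repr(found))  # ' 08:30'

# Flags are supplied at compile time
greeting = re.compile(r'hello', re.I)
print(greeting.match('HELLO world') is not None)  # True
```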