Regular expressions with the re module

Matching patterns

findall searches the whole string for every match and returns the matches as a list

import re
s = "hello_宇霖_hello"
print(re.findall("hello", s))
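Two details of findall are easy to miss: it returns an empty list when nothing matches, and the matches it returns never overlap. A minimal sketch (the strings "world" and "aaaa" are illustrative assumptions, not from the original example):

```python
import re

s = "hello_宇霖_hello"

# Every non-overlapping match, scanning left to right.
found = re.findall("hello", s)

# No match gives an empty list, not None and not an exception.
missing = re.findall("world", s)

# Matches never overlap: "aaaa" yields "aa" twice, not three times.
pairs = re.findall("aa", "aaaa")

print(found, missing, pairs)
```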

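\w matches a word character: a letter (including Chinese characters), a digit, or an underscore

import re
s = "hello_宇霖_hello"
print(re.findall(r"\w", s))  # raw string, so the backslash reaches re unchanged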
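Whether \w includes Chinese characters depends on the matching mode. For str patterns Python matches Unicode word characters by default, and the re.ASCII flag narrows \w to [a-zA-Z0-9_]. A small sketch, using a slightly different sample string as an assumption:

```python
import re

s = "hello_宇霖_123"

# Default (Unicode) matching: Chinese characters count as word characters.
unicode_words = re.findall(r"\w", s)

# With re.ASCII only ASCII letters, digits and the underscore match.
ascii_words = re.findall(r"\w", s, re.ASCII)

print(unicode_words)
print(ascii_words)
```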

\W matches any character that is not a word character (not a letter, digit, or underscore)

import re
s = "hello_宇霖_hello123!@#$"
print(re.findall(r"\W", s))

\s matches a whitespace character, such as a space, \n, or \t

import re
s = "hello 宇霖 hello\n\t"
print(re.findall(r"\s", s))

\S matches any character that is not whitespace

import re
s = "hello 宇霖 hello\n\t"
print(re.findall(r"\S", s))

\d matches a digit

import re
s = "hello 宇霖 hello123\n\t"
print(re.findall(r"\d", s))

\D matches any character that is not a digit

import re
s = "hello 宇霖 hello123\n\t"
print(re.findall(r"\D", s))

\A and ^ match at the start of the string

import re
s = "hello宇霖_123hello\t\n"
print(re.findall(r"\Ahello", s))
print(re.findall("^hello", s))  # more commonly used
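\A and ^ only behave identically in the default mode. With the re.MULTILINE flag, ^ also matches right after each newline, while \A still matches only at the very start. A short sketch with an assumed two-line string:

```python
import re

text = "hello one\nhello two"

# Default mode: ^ matches only at the start of the whole string.
default_caret = re.findall(r"^hello", text)

# re.MULTILINE: ^ also matches at the start of every line.
multiline_caret = re.findall(r"^hello", text, re.MULTILINE)

# \A ignores re.MULTILINE and still matches only at the very start.
multiline_A = re.findall(r"\Ahello", text, re.MULTILINE)

print(default_caret, multiline_caret, multiline_A)
```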

\Z and $ match at the end of the string ($ also matches just before a trailing newline)

import re
s = "hello宇霖_123hello"
print(re.findall(r"o\Z", s))
print(re.findall("o$", s))  # more commonly used
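The two anchors differ when the string ends with a newline: $ is allowed to match just before that final newline, whereas \Z matches only at the absolute end. A minimal sketch, assuming a string with one trailing newline:

```python
import re

s = "hello\n"

# $ matches at the end of the string and also just before a final newline.
dollar = re.findall(r"o$", s)

# \Z matches only at the very end, which here is after the newline.
strict_end = re.findall(r"o\Z", s)

print(dollar, strict_end)
```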

Matching newlines and tabs

import re
s = "hello宇霖_123hello\t\n"
print(re.findall("\n", s))
print(re.findall("\t", s))

. matches any single character except a newline

import re
s = "h\nllo宇霖_123hallo\t\n"
print(re.findall("h.l", s))
# re.DOTALL makes . match a newline as well
s = "h\nllo宇霖_123hello\t\n"
print(re.findall("h.l", s, re.DOTALL))

[] matches any one character from the set inside the brackets

import re
s = "hello宇霖_1A-2B-3C hello"
print(re.findall("[a-z]", s))  # lowercase a-z
print(re.findall("[A-Z]", s))  # uppercase A-Z
print(re.findall("[A-Za-z]", s))  # upper- and lowercase letters
print(re.findall("[A-Za-z0-9]", s))  # letters and the digits 0-9

[^] matches any one character that is not in the set

import re
s = "hello宇霖_1A-2B-3C hello"
print(re.findall("[^0-9]", s))

* matches 0 or more repetitions (greedy)

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h*", s))
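Because * accepts zero repetitions, findall also reports an empty string at every position where no h is present, which makes the output above look noisy. The sketch below shows this on a shorter, assumed string ("hah") and one way to drop the empty matches; the behaviour shown is that of Python 3.7 and later:

```python
import re

s = "hah"

# "h*" succeeds everywhere: with an "h" where there is one, otherwise with "".
with_empties = re.findall("h*", s)

# Filtering out the empty strings keeps only the real runs of "h".
non_empty = [m for m in with_empties if m]

# Using "+" instead avoids the empty matches in the first place.
one_or_more = re.findall("h+", s)

print(with_empties, non_empty, one_or_more)
```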

+ matches 1 or more repetitions (greedy)

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h+", s))

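? matches 0 or 1 repetition (this is also greedy; a ? placed after another quantifier, as in *? or +?, makes that quantifier non-greedy)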

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h?", s))
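The difference between greedy and non-greedy matching shows up once a quantifier has a choice of how much to consume. A minimal sketch on an assumed tag-like string, comparing .+ with .+?:

```python
import re

text = "<a><b>"

# Greedy: .+ grabs as much as possible, so one match spans both tags.
greedy = re.findall(r"<.+>", text)

# Non-greedy: .+? stops at the first ">" it can, giving one match per tag.
lazy = re.findall(r"<.+?>", text)

print(greedy, lazy)
```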

h{2} matches h repeated exactly 2 times (hh)

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h{2}", s))

{m,n} matches at least m and at most n repetitions

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h{1,2}", s))  # greedy, takes 2 when it can: ['h', 'h', 'hh', 'h', 'hh']

| alternation (or)

import re
s = "hello宇霖_1A-2B-3C hellohhhohh"
print(re.findall("h|o", s))

() matches the expression inside the parentheses and forms a group; with findall, only the captured group is returned

import re
s = "hello宇霖_1A-2B-3C hello hhhohh"
print(re.findall("h(...)o", s))
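How findall treats groups is worth spelling out: one capturing group returns just that group, a non-capturing group (?:...) returns the whole match, and several capturing groups return tuples. A sketch on the same string (the digit/letter pattern is an added illustration):

```python
import re

s = "hello宇霖_1A-2B-3C hello hhhohh"

# One capturing group: findall returns only the group's text.
captured = re.findall(r"h(...)o", s)

# Non-capturing group: findall returns the whole match.
whole = re.findall(r"h(?:...)o", s)

# Two capturing groups: findall returns a list of tuples.
pairs = re.findall(r"(\d)([A-Z])", s)

print(captured, whole, pairs)
```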

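Common methods

search scans the whole string and stops at the first match; it returns a match object (or None if nothing matches), and .group() gives the matched text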

import re
s = "hello yulin hello"
print(re.search("hello", s).group())
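Calling .group() directly on the result raises AttributeError when there is no match, because search returns None in that case. A defensive sketch; the helper name first_match is hypothetical and not part of the re module:

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None when nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m is not None else None

s = "hello yulin hello"

found = first_match("yulin", s)
missing = first_match("world", s)

# The match object also records where the match was found.
position = re.search("yulin", s).span()

print(found, missing, position)
```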

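match only tries to match at the start of the string; it returns a match object (or None if the start does not match), and .group() gives the matched text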

import re
s = "hello yulin hello"
print(re.match("hello", s).group())
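To make the contrast concrete, the sketch below runs match, search and fullmatch on the same string. re.fullmatch is not covered in the text above; it succeeds only when the pattern covers the entire string:

```python
import re

s = "hello yulin hello"

# match looks only at the start, so "yulin" is not found.
at_start = re.match("yulin", s)

# search scans the whole string and finds it.
anywhere = re.search("yulin", s)

# fullmatch requires the pattern to cover the entire string.
partial = re.fullmatch("hello", s)
entire = re.fullmatch(r"hello \w+ hello", s)

print(at_start, anywhere.group(), partial, entire.group())
```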

split splits a string on any separator that the pattern matches

import re
s = "hello yulin,hello!world;50"
print(re.split("[#,!; ]", s))
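Two options of re.split are often useful: wrapping the separator pattern in a capturing group keeps the separators in the result, and maxsplit limits the number of splits. A sketch on the same input:

```python
import re

s = "hello yulin,hello!world;50"

# Plain split: separators are discarded.
parts = re.split(r"[,!; ]", s)

# Capturing group: the separators are kept as their own list items.
with_seps = re.split(r"([,!; ])", s)

# maxsplit=1: split once, leave the rest of the string untouched.
first_only = re.split(r"[,!; ]", s, maxsplit=1)

print(parts)
print(with_seps)
print(first_only)
```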

sub replaces every match with a replacement string

import re
print(re.sub("10", "yulin", "10 is a handsome guy."))
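Beyond a fixed replacement string, re.sub accepts backreferences such as \1, a function that computes each replacement, and a count that limits how many matches are replaced. A sketch with assumed sample strings:

```python
import re

# Backreferences: swap the two numbers around the dash.
swapped = re.sub(r"(\d+)-(\d+)", r"\2-\1", "10-20")

# Function replacement: called once per match with the match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 5 pears")

# count=1: replace only the first match.
first_only = re.sub("hello", "bye", "hello yulin hello", count=1)

print(swapped, doubled, first_only, sep="\n")
```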

compile builds a reusable pattern object from a regular expression

import re
obj = re.compile(r"\w")
print(obj.findall("hello_yulin_hello"))

finditer returns an iterator that yields one match object per match

import re
g = re.finditer(r"\w", "hello_yulin_hello")
for i in g:
    print(i.group())
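The advantage of finditer over findall is that each item is a full match object, so positions are available alongside the text. A minimal sketch on an assumed string:

```python
import re

text = "a12b345"

# Collect the matched text together with its start and end offsets.
hits = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]

for value, start, end in hits:
    print(value, start, end)
```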

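Exercise 1:

import re
s = "1-2*(60+(-40.35/5)-(-4*3)"
# Find the runs of digits (40.35 is split into 40 and 35)
print(re.findall(r"\d+", s))
# Find all numbers, including decimals
print(re.findall(r"\d+\.\d+|\d+", s))
# Find all numbers, including decimals and negative numbers
print(re.findall(r"-?\d+\.\d+|-?\d+", s))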
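The last pattern treats every minus sign as a sign, so the subtraction in "1-2" is read as the number -2. One possible refinement, offered as a sketch rather than the exercise's intended answer, uses a negative lookbehind so that a minus counts as a sign only when it does not follow a digit or a closing parenthesis:

```python
import re

s = "1-2*(60+(-40.35/5)-(-4*3)"

# Original pattern: "1-2" yields "1" and "-2".
naive = re.findall(r"-?\d+\.\d+|-?\d+", s)

# Assumption: a "-" directly after a digit or ")" is the subtraction operator.
# (?<![\d)]) rejects a minus sign in that position.
signed = re.findall(r"(?<![\d)])-\d+(?:\.\d+)?|\d+(?:\.\d+)?", s)

print(naive)
print(signed)
```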

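Exercise 2:

Match a QQ number: QQ numbers start from 10000

import re
qq = input("Enter a QQ number: ")
# Finds QQ-like runs of 5 to 10 digits anywhere in the input
print(re.findall("[1-9][0-9]{4,9}", qq))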
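findall accepts any input that merely contains a QQ-like run of digits. To validate that the whole input is a QQ number, re.fullmatch is one option. In this sketch the function name is_valid_qq is hypothetical, and the 5 to 10 digit range is taken from the pattern above rather than from any official QQ rule:

```python
import re

# Same pattern as above: a non-zero first digit followed by 4 to 9 more digits.
QQ_PATTERN = re.compile(r"[1-9][0-9]{4,9}")

def is_valid_qq(text):
    """Return True only if the entire string looks like a QQ number."""
    return QQ_PATTERN.fullmatch(text) is not None

for candidate in ["10000", "09999", "1234", "12345678901", "abc10000"]:
    print(candidate, is_valid_qq(candidate))
```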

posted @ 2020-11-02 08:50 Ylinn
