Python Regular Expressions

1. Problem
When a response is not JSON, how do you extract just the part of it you need?

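2. Answer
Use a regular expression with re.findall:
.* matches zero or more characters (any character except a newline)
.+ matches one or more characters
() defines a group; when the pattern contains one group, findall returns what the group captured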
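
To make the three pieces of syntax above concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not part of the original example.

```python
import re

# Made-up sample: one empty element and one element with content.
text = "<a></a><b>x</b>"

# .* may match an empty string, so the empty <a></a> element still matches.
star = re.findall(r"<a>(.*)</a>", text)
# .+ needs at least one character, so the empty element does not match.
plus = re.findall(r"<a>(.+)</a>", text)
# With one group in the pattern, findall returns only the group's contents.
grouped = re.findall(r"<b>(.+)</b>", text)
# Without a group, findall returns the whole matched text.
whole = re.findall(r"<b>.+</b>", text)

print(star)     # ['']
print(plus)     # []
print(grouped)  # ['x']
print(whole)    # ['<b>x</b>']
```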

For example, to process an XML response, the code looks like this:

import re

import requests


def testPostXml():
    '''
    Post XML data and extract one element from the XML response.
    :return: None (the extracted text is printed)
    '''
    url = 'http://ws.webxml.com.cn/WebServices/MobileCodeWS.asmx'
    xmlHeader = {'Content-type': 'text/xml'}
    xmlData = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">' \
              '<soap:Body><getMobileCodeInfo xmlns="http://WebXml.com.cn/">' \
              '<mobileCode>18819872234</mobileCode>' \
              '<userID></userID>' \
              '</getMobileCodeInfo>' \
              '</soap:Body></soap:Envelope>'
    response = requests.post(url=url, headers=xmlHeader, data=xmlData)
    xmlResult = response.text
    # The group (.*) captures the text between the opening and closing tags.
    reg = '<getMobileCodeInfoResult>(.*)</getMobileCodeInfoResult>'
    # With one group in the pattern, findall returns a list of the captured strings.
    result = re.findall(reg, xmlResult)
    print(result[0])


testPostXml()

Output:
18819872234：广东 茂名 广东移动全球通卡

The regular expression used here is: <getMobileCodeInfoResult>(.*)</getMobileCodeInfoResult>
findall uses the group to match the content between the opening and closing getMobileCodeInfoResult tags.
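
One caveat with this pattern: .* is greedy, so if the same element appeared more than once on a line, a single match would run from the first opening tag to the last closing tag. The sketch below shows a non-greedy variant. The helper name extract_tag and the sample string are hypothetical and not from the original post; for anything beyond simple extraction, an XML parser is usually the safer choice.

```python
import re


def extract_tag(xml_text, tag):
    """Return the text inside every <tag>...</tag> pair (hypothetical helper)."""
    # re.escape guards against special regex characters in the tag name.
    # .*? is non-greedy, so each match stops at the nearest closing tag.
    # re.DOTALL lets . match newlines, in case the content spans lines.
    pattern = "<{0}>(.*?)</{0}>".format(re.escape(tag))
    return re.findall(pattern, xml_text, flags=re.DOTALL)


# Made-up sample with the same element twice on one line.
sample = "<r>one</r><r>two</r>"

greedy = re.findall("<r>(.*)</r>", sample)
lazy = extract_tag(sample, "r")

print(greedy)  # ['one</r><r>two']
print(lazy)    # ['one', 'two']
```

For the single-element response in the example above, the greedy and non-greedy patterns give the same result.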

3. Extension

The same result can also be obtained with search and compile:

import re

xmlContent = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getMobileCodeInfoResponse xmlns="http://WebXml.com.cn/"><getMobileCodeInfoResult>18819872234：广东 茂名 广东移动全球通卡</getMobileCodeInfoResult></getMobileCodeInfoResponse></soap:Body></soap:Envelope>'
reg3 = '<getMobileCodeInfoResult>(.*)</getMobileCodeInfoResult>'

print("Method 1: findall, using a group to get the data inside")
m3 = re.findall(reg3, xmlContent)
print(m3)

print("Method 2: search")
m = re.search(reg3, xmlContent)
print(m.group(0))  # group(0) is the whole match, tags included
print(m.group(1))  # group(1) is the content captured by the first group

print("Method 3: compile")
c = re.compile(reg3)
r = c.search(xmlContent)
print(r.group(0))
print(r.group(1))

Output:
Method 1: findall, using a group to get the data inside
['18819872234：广东 茂名 广东移动全球通卡']
Method 2: search
<getMobileCodeInfoResult>18819872234：广东 茂名 广东移动全球通卡</getMobileCodeInfoResult>
18819872234：广东 茂名 广东移动全球通卡
Method 3: compile
<getMobileCodeInfoResult>18819872234：广东 茂名 广东移动全球通卡</getMobileCodeInfoResult>
18819872234：广东 茂名 广东移动全球通卡

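As the example shows, the two ways of calling search differ only in their arguments:
When search is called on a pattern object created beforehand with compile, it takes a single argument: the text to search.
When search is called on the re module directly, without compiling first, it takes two arguments: the regular expression and the text to search.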
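
A compiled pattern is most convenient when the same expression is applied to many strings. The following is a minimal sketch of that usage. The response bodies are made up for illustration, and the None check is an addition, since search returns None when nothing matches.

```python
import re

# Made-up response bodies; real ones would come from the web service.
responses = [
    "<getMobileCodeInfoResult>sample one</getMobileCodeInfoResult>",
    "<getMobileCodeInfoResult>sample two</getMobileCodeInfoResult>",
    "<error>no result element</error>",
]

reg = "<getMobileCodeInfoResult>(.*)</getMobileCodeInfoResult>"
pattern = re.compile(reg)  # compile once, reuse for every body

extracted = []
for body in responses:
    m = pattern.search(body)  # compiled pattern: one argument, the text
    # search returns None on no match, so check before calling group().
    extracted.append(m.group(1) if m else None)

# The module-level call takes the expression and the text.
first = re.search(reg, responses[0]).group(1)

print(extracted)  # ['sample one', 'sample two', None]
print(first)      # sample one
```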

Posted 2023-11-28 13:22 by 秒秒开心
