诚意

1. Match the title of an English article, e.g. The Voice Of China
#---->([A-Z][a-z ]+)+
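As a quick check of the title pattern, here is a minimal sketch. The helper name is_title and the stricter variant STRICT_TITLE are illustrative assumptions, not part of the original exercise; the variant is included because the exercise pattern requires at least one lowercase letter or space after every capital, so a title ending in a single-letter word is rejected.

```python
import re

# Pattern from the exercise: repeated runs of
# "one capital letter followed by lowercase letters or spaces".
TITLE = re.compile(r'([A-Z][a-z ]+)+')

# Hypothetical stricter variant: capitalized words separated by single spaces.
STRICT_TITLE = re.compile(r'[A-Z][a-z]*(?: [A-Z][a-z]*)*')


def is_title(text, pattern=TITLE):
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None


print(is_title('The Voice Of China'))    # True
print(is_title('the voice of china'))    # False
print(is_title('Plan A'))                # False: nothing follows the final "A"
print(is_title('Plan A', STRICT_TITLE))  # True
```

fullmatch is used instead of search so that a string with extra text around a title-like fragment does not count as a title.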
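2. Match a URL,
e.g. https://www.baidu.com http://www.cnblogs.com
#----->http[s]?://www\.\w+\.(com|cn)
A more general alternative that accepts several schemes (note that the ? after the scheme group makes the scheme itself optional, so a string starting with :// also matches):
  --------->^((https|http|ftp|rtsp|mms)?:\/\/)[^\s]+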
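The two URL patterns can be tried out as follows. This is only a sketch: the sample strings and the names SIMPLE_URL, ORIGINAL_GENERAL, and REQUIRED_SCHEME are made up for illustration, and REQUIRED_SCHEME is an assumed variant that drops the optional ? so that a scheme must be present.

```python
import re

# First answer from the exercise: http/https, a www host, and a .com or .cn suffix.
SIMPLE_URL = re.compile(r'https?://www\.\w+\.(?:com|cn)')

# Second answer from the exercise, unchanged: the scheme group is optional.
ORIGINAL_GENERAL = re.compile(r'^((https|http|ftp|rtsp|mms)?:\/\/)[^\s]+')

# Assumed variant: same idea, but the scheme is required.
REQUIRED_SCHEME = re.compile(r'^(?:https|http|ftp|rtsp|mms)://\S+')

text = 'Visit https://www.baidu.com or http://www.cnblogs.com today'
found = [m.group(0) for m in SIMPLE_URL.finditer(text)]
print(found)  # ['https://www.baidu.com', 'http://www.cnblogs.com']

# The original general pattern accepts a string with no scheme at all.
print(bool(ORIGINAL_GENERAL.match('://nothing')))                # True
print(bool(REQUIRED_SCHEME.match('://nothing')))                 # False
print(bool(REQUIRED_SCHEME.match('ftp://example.org/file.txt'))) # True
```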

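3. Match a year-month-day date, e.g. 2018-12-06 2018/12/06 2018.12.06
#------>[1-9]\d{3}[\-\/\.](1[0-2]|0?[1-9])[\-\/\.](3[01]|[12]\d|0?[1-9])

The two variants below use a backreference (numbered, then named) so that both separators must be the same character. The dot is written as \. so that it matches only a literal dot rather than any character.
# \d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}
# \d{4}(?P<sep>\-|\/|\.)\d{1,2}(?P=sep)\d{1,2}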
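One way to combine the range checks with a named backreference is sketched below. The pattern and the helper is_date are assumptions made for illustration: the separator is captured once and reused, the dot sits inside a character class so it is literal, and the month and day ranges are limited to 1-12 and 1-31. It does not check calendar validity (for example, 2018-02-31 still passes).

```python
import re

DATE = re.compile(
    r'[1-9]\d{3}'              # four-digit year, not starting with 0
    r'(?P<sep>[-/.])'          # separator: -, / or a literal dot
    r'(?:1[0-2]|0?[1-9])'      # month 1-12, leading zero optional
    r'(?P=sep)'                # same separator again
    r'(?:3[01]|[12]\d|0?[1-9])'  # day 1-31, leading zero optional
)


def is_date(text):
    """Return True when the whole string looks like a year-month-day date."""
    return DATE.fullmatch(text) is not None


for sample in ['2018-12-06', '2018/12/06', '2018.12.06', '2018-1-6']:
    print(sample, is_date(sample))   # all True

for sample in ['2018-12/06', '2018x12x06', '2018-13-06', '2018-00-06']:
    print(sample, is_date(sample))   # all False
```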



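4. Match a 15-digit or 18-digit ID card number
#----->[1-9](\d{16}(\d|[xX])|\d{14})

Anchored variants that must match the whole string (the last character of an 18-digit number may be a digit, x, or X):
# ^([1-9]\d{16}[0-9xX]|[1-9]\d{14})$
# ^[1-9]\d{14}(\d{2}[0-9xX])?$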
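The anchored form can be exercised as in the sketch below. The helper is_id_number and every sample number are invented for illustration; the check covers only the shape of the string (length, leading digit, optional trailing x or X) and does not verify the checksum digit.

```python
import re

# 15 digits, optionally followed by 2 more digits and a final digit or x/X.
ID_NUMBER = re.compile(r'[1-9]\d{14}(?:\d{2}[0-9xX])?')


def is_id_number(text):
    """Return True when the whole string has the shape of a 15- or 18-character ID."""
    return ID_NUMBER.fullmatch(text) is not None


# Made-up sample values, not real ID numbers.
print(is_id_number('110101900307123'))     # True  (15 digits)
print(is_id_number('110101199003071234'))  # True  (18 digits)
print(is_id_number('11010119900307123X'))  # True  (17 digits + X)
print(is_id_number('1101011990030712'))    # False (16 digits)
print(is_id_number('010101199003071234'))  # False (leading zero)
```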


5. From lianjia.html, extract the title, layout, and floor area of each listing. The expected result is:
[('金台路交通部部委楼南北大三居带客厅 单位自持物业', '3室1厅', '91.22平米'), ('西山枫林 高楼层南向两居 户型方正 采光好', '2室1厅', '94.14平米')]
from urllib import request
import re

# Read the local sample page and decode it as UTF-8
ret = request.urlopen('file:///E:/python/模块/模块re/lianjia.html')
res = ret.read().decode('utf-8')
# print(res)
pattern = r'<div class="title">.*?data-sl="">(?P<name>.+?)</a>.*?<span class="divide">/</span>(?P<info>.*?)<span class="divide">/</span>(?P<space>.*?)<span'
# By default . matches any character except a newline; with re.S (DOTALL) it matches newlines as well
rs = re.findall(pattern, res, re.S)
print(rs)
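The pattern defines named groups, but findall returns plain tuples and ignores the names. A minimal sketch of using the names via finditer and groupdict follows; the inline HTML fragment is a trimmed, made-up stand-in for one listing so that the example runs without the local file.

```python
import re

# A trimmed, made-up fragment shaped like one listing in lianjia.html.
html = '''
<div class="title">
    <a class="" href="#" data-sl="">Example listing title</a>
</div>
<div class="houseInfo">
    <a href="#">Example estate </a>
    <span class="divide">/</span>3室1厅<span class="divide">/</span>91.22平米<span class="divide">/</span>简装
</div>
'''

# Same pattern as in the exercise, split across lines for readability.
pattern = re.compile(
    r'<div class="title">.*?data-sl="">(?P<name>.+?)</a>'
    r'.*?<span class="divide">/</span>(?P<info>.*?)'
    r'<span class="divide">/</span>(?P<space>.*?)<span',
    re.S,  # let . match newlines, since each listing spans several lines
)

# One dict per listing, keyed by the group names.
listings = [m.groupdict() for m in pattern.finditer(html)]
print(listings)
# [{'name': 'Example listing title', 'info': '3室1厅', 'space': '91.22平米'}]
```

For real pages an HTML parser is usually more robust than a regular expression, since small markup changes can break a pattern like this one.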

Sample file: lianjia.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
<div class="info clear">
    <div class="title">
        <a class="" href="https://bj.lianjia.com/ershoufang/101103186217.html" target="_blank" data-log_index="1"
           data-el="ershoufang" data-housecode="101103186217" data-is_focus="1" data-sl="">金台路交通部部委楼南北大三居带客厅   单位自持物业</a>
        <span class="new tagBlock">新上</span></div>
    <div class="address">
        <div class="houseInfo">
            <a href="https://bj.lianjia.com/xiaoqu/1111027381816/" target="_blank" data-log_index="1" data-el="region">延静西里 </a>
            <span class="divide">/</span>3室1厅<span class="divide">/</span>91.22平米<span class="divide">/</span>南 北<span class="divide">/</span>简装<span class="divide">/</span>有电梯
        </div>
    </div>
    <div class="flood">
        <div class="positionInfo">低楼层(共15层)
            <span class="divide">/</span>1984年建板塔结合
            <span class="divide">/</span>
            <a href="https://bj.lianjia.com/ershoufang/hongmiao/" target="_blank">红庙</a></div>
    </div>
    <div class="followInfo">859人关注<span class="divide">/</span>30次带看
        <div class="timeInfo"><span class="timeIcon"></span>6天以前发布</div>
        <div class="tag"><span class="subway">近地铁</span><span class="taxfree">房本满五年</span><span class="haskey">随时看房</span></div>
        <div class="priceInfo">
            <div class="totalPrice"><span>570</span></div>
            <div class="unitPrice" data-hid="101103186217" data-rid="1111027381816" data-price="62487"><span>单价62487元/平米</span></div>
        </div>
    </div>
</div>
<div class="info clear">
    <div class="title">
        <a class="" href="https://bj.lianjia.com/ershoufang/101103188116.html" target="_blank" data-log_index="2"
           data-el="ershoufang" data-housecode="101103188116" data-is_focus="1" data-sl="">西山枫林 高楼层南向两居 户型方正 采光好</a>
        <span class="new tagBlock">新上</span><span class="yezhushuo tagBlock">房主自荐</span></div>
    <div class="address">
        <div class="houseInfo">
            <a href="https://bj.lianjia.com/xiaoqu/1111027381123/" target="_blank" data-log_index="2" data-el="region">西山枫林三期 </a>
            <span class="divide">/</span>2室1厅<span class="divide">/</span>94.14平米<span class="divide">/</span><span class="divide">/</span>简装<span class="divide">/</span>有电梯
        </div>
    </div>
    <div class="flood">
        <div class="positionInfo">中楼层(共10层)
            <span class="divide">/</span>2006年建板楼
            <span class="divide">/</span>
            <a href="https://bj.lianjia.com/ershoufang/pingguoyuan1/" target="_blank">苹果园</a></div>
    </div>
    <div class="followInfo">630人关注<span class="divide">/</span>23次带看
        <div class="timeInfo"><span class="timeIcon"></span>6天以前发布</div>
        <div class="tag"><span class="taxfree">房本满五年</span><span class="haskey">随时看房</span></div>
        <div class="priceInfo">
            <div class="totalPrice"><span>495</span></div>
            <div class="unitPrice" data-hid="101103188116" data-rid="1111027381123" data-price="52582"><span>单价52582元/平米</span></div>
        </div>
    </div>
</div>
</body>

 

posted on 2018-08-18 16:19  诚意