Post archive for October 21, 2019 - 数据--熊

October 21, 2019

Summary: Installing the Python crawling and parsing libraries: pip install requests; pip install beautifulsoup4. Commonly used requests methods: requests.request() # the basis for all the methods below; requests.get(url,params=None,**kwarg Read more

posted @ 2019-10-21 23:06 数据--熊 Views (709) Comments (0) Recommendations (0)
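As a small illustration of the `params` argument in the `requests.get(url, params=None, ...)` signature quoted in the summary, the sketch below builds a request without sending it and shows how the parameters end up in the query string. The helper name `build_get_url` and the `example.com` URL are assumptions made for this example, not part of the original post.

```python
import requests


def build_get_url(url, params=None):
    """Return the URL that a GET request with these params would target.

    The request is only prepared, never sent, so no network access is needed.
    """
    prepared = requests.Request("GET", url, params=params).prepare()
    return prepared.url


# Hypothetical URL and parameters, used purely for illustration.
print(build_get_url("https://example.com/search", {"q": "python", "page": 2}))
# prints https://example.com/search?q=python&page=2
```

Calling `requests.get` with the same `url` and `params` would request that same URL.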

The re library

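Summary: Python's re library: the raw string type (a native string in which backslashes are not treated as escape characters) is written by prefixing the string literal with r, as in r'...'. Main functions of the re library: re.search(pattern,string,flags=0) searches a string for the first position that matches the regular expression and returns a match object. *pattern: regular expression ... Read more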

posted @ 2019-10-21 21:16 数据--熊 Views (452) Comments (0) Recommendations (0)
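To make the raw-string and `re.search` description above concrete, here is a minimal sketch. The date pattern and the helper name `first_match` are assumptions chosen for illustration; only `re.search(pattern, string, flags=0)` comes from the post.

```python
import re

# A raw string keeps the backslash literal, so r"\d" is the same as "\\d".
DATE = r"\d{4}-\d{2}-\d{2}"


def first_match(pattern, text, flags=0):
    """Return (matched text, start index) for the first match, or None."""
    match = re.search(pattern, text, flags)
    if match is None:
        return None
    return match.group(0), match.start()


print(first_match(DATE, "posted @ 2019-10-21 21:16"))  # ('2019-10-21', 9)
print(first_match(DATE, "no date here"))               # None
```

`re.search` returns `None` when nothing matches, so it is safer to check the result before calling `group()`.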

Classic regular expressions

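Summary: ^[A-Za-z]+$ a string made up of the 26 letters; ^[A-Za-z0-9]+$ a string made up of the 26 letters and digits; ^-?\d+$ a string in integer form; ^[0-9]*[1-9][0-9]*$ a string in positive-integer form; [1-9]\d{5} a postal code in mainland China, 6 digits; [\u4e00-\u9fa5] matches Chinese characters; \ Read more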

posted @ 2019-10-21 20:10 数据--熊 Views (207) Comments (0) Recommendations (0)
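The patterns listed in the summary can be tried directly with Python's `re` module. The sketch below is one possible way to do that; the dictionary keys and the helper name `matches` are assumptions introduced for this example.

```python
import re

# Patterns copied from the summary; the key names are illustrative only.
PATTERNS = {
    "letters": r"^[A-Za-z]+$",
    "letters_digits": r"^[A-Za-z0-9]+$",
    "integer": r"^-?\d+$",
    "positive_integer": r"^[0-9]*[1-9][0-9]*$",
    "postal_code": r"[1-9]\d{5}",
    "chinese_char": r"[\u4e00-\u9fa5]",
}


def matches(name, text):
    """Return True if the named pattern matches somewhere in text."""
    return re.search(PATTERNS[name], text) is not None


print(matches("letters", "abc"))            # True
print(matches("letters", "abc1"))           # False
print(matches("integer", "-42"))            # True
print(matches("positive_integer", "000"))   # False
print(matches("postal_code", "100081"))     # True
print(matches("chinese_char", "数据"))       # True
```

Note that the last two patterns are unanchored, so they also match inside a longer string; wrapping them in `^...$` would be needed to validate a whole string.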

Python crawler study 1

Summary:
import requests
from bs4 import BeautifulSoup
import bs4
def gethtmltext(url):  # fetch the HTML content; the try/except structure handles exceptions
    try:
        r = requests.get(url,ti Read more

posted @ 2019-10-21 10:00 数据--熊 Views (244) Comments (0) Recommendations (0)
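The summary above is cut off partway through the function, so the original body is not known. The following is only a sketch of how a fetch helper in this style is often written, assuming a `timeout` argument, a status check with `raise_for_status()`, and an empty string as the failure value; none of these details are confirmed by the post, and the name `get_html_text` is chosen here to keep it distinct from the author's function.

```python
import requests


def get_html_text(url, timeout=30):
    """Fetch a page and return its text, or "" if the request fails.

    Assumed behavior for illustration; the original post may differ.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()              # turn HTTP 4xx/5xx into an exception
        r.encoding = r.apparent_encoding  # guess the encoding from the content
        return r.text
    except requests.RequestException:
        # Covers invalid URLs, connection errors, timeouts, and bad statuses.
        return ""
```

Catching `requests.RequestException` rather than a bare `except` keeps unrelated bugs visible while still handling the failures that requests itself raises.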
