Summary:
# pip install beautifulsoup4
from bs4 import BeautifulSoup

html_doc = """ The Dormouse's story The Dormouse's story Once upon a time there were three little sisters; and their names were Elsie,...
posted @ 2019-03-30 11:56
hank-li
Summary:
# pip install beautifulsoup4
from bs4 import BeautifulSoup

html_doc = """ The Dormouse's story The Dormouse's story Once upon a time there were three little sisters; and their names were Elsie, L...
posted @ 2019-03-30 11:53
hank-li
Summary:
import requests
from lxml import etree
import json

class BtcSpider(object):
    def __init__(self):
        self.base_url = 'http://8btc.com/forum-61-'
        self.headers = {
            "User-Ag...
posted @ 2019-03-30 11:40
hank-li
Summary:
from lxml import etree

html = """ 1 子 2 子 3 子 4 子 5 子 """

# 1. Convert the type
x_data ...
posted @ 2019-03-30 11:15
hank-li
Summary:
import re
import requests

# Install lxml, a parsing library for HTML and XML
# pip install lxml
from lxml import etree

url = 'http://news.baidu.com/'
headers = {
    "User-Agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 1...
posted @ 2019-03-30 11:14
hank-li
Summary:
import re
import requests

url = 'http://news.baidu.com/'
headers = {
    "User-Agent": 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Saf...
posted @ 2019-03-30 11:13
hank-li
Summary:
import re

# 1. Split a string
one = 'asdsfsgsh'
# Split on the delimiter 's'
pattern = re.compile('s')
result = pattern.split(one)
# print(result)

# 2. Match Chinese characters
two = '网页是最新版本的,适配移动端'
# In Python, Chinese is matched with a Unicode range, much like [a-z]; quantifiers: * + ?
pat...
posted @ 2019-03-30 11:11
hank-li
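The regular-expression summary above is cut off before its Chinese-matching pattern appears, so the following is only a minimal sketch of the two steps it names. The character range `[\u4e00-\u9fa5]` is an assumption (a commonly used range for CJK ideographs), not necessarily the pattern the post uses, and the names `split_pattern`, `chinese_pattern`, `parts`, and `chinese_runs` are chosen here for illustration.

```python
import re

# 1. Split a string on the delimiter 's' (same input as the summary).
one = 'asdsfsgsh'
split_pattern = re.compile('s')
parts = split_pattern.split(one)

# 2. Match runs of Chinese characters.
# Assumption: the range \u4e00-\u9fa5 stands in for "Chinese characters";
# the original post's pattern is truncated and may differ.
two = '网页是最新版本的,适配移动端'
chinese_pattern = re.compile('[\u4e00-\u9fa5]+')
chinese_runs = chinese_pattern.findall(two)

print(parts)
print(chinese_runs)
```

Because the comma is outside the assumed range, `findall` returns the text before and after it as separate matches rather than one string.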