Post archive for October 12, 2018 - ystraw

October 12, 2018

Summary: 1. Getting child tags: thr_msgs = soup.find_all('div',class_=re.compile('msg')) for i in thr_msgs: print(i) first = i.select('em:nth-of-type(1)') print(first) >>> < Read more

posted @ 2018-10-12 22:21 ystraw Views (7504) Comments (0) Recommendations (0)
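The summary above is cut off before its output, so here is a minimal sketch of the same idea: match every div whose class contains "msg", then pick the first em inside each one. The HTML string and the name first_ems are made up for illustration and are not from the original post; the built-in html.parser backend is assumed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page being scraped.
html = """
<div class="msg-1"><em>first</em><em>second</em></div>
<div class="other"><em>ignored</em></div>
<div class="thr_msg"><span>no em here</span><em>only</em></div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ accepts a compiled regex; a div matches if "msg" occurs in one of its classes.
thr_msgs = soup.find_all("div", class_=re.compile("msg"))

first_ems = []
for msg in thr_msgs:
    # select() always returns a list; em:nth-of-type(1) matches an <em>
    # that is the first <em> among its siblings.
    first = msg.select("em:nth-of-type(1)")
    first_ems.append([em.get_text() for em in first])

print(first_ems)
```

Because select() returns a list even for a single match, index it (or use select_one()) when only one tag is wanted.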

22 - Fixing GBK garbled text in a Python crawler

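Summary: Reposted from: Fixing GBK garbled text in a Python crawler. Today I tried writing a crawler to scrape a novel, 凡人修仙仙界篇 by 忘语. Of course this is not a good thing to do, and everyone should support the official release. The scraping follows the usual routine: first fetch the page source # -*- coding:UTF-8 -*- from bs4 import BeautifulSoup import reque Read more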

posted @ 2018-10-12 22:13 ystraw Views (586) Comments (0) Recommendations (0)
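The summary is truncated before it shows the fix, so the following is only a sketch of the usual cause and remedy, not necessarily the reposted article's method. It uses the standard library alone: the sample string and variable names are illustrative, and ISO-8859-1 is assumed as the wrong codec because it is a common fallback when a page does not declare its charset.

```python
# Bytes as a GBK-encoded page would deliver them (sample text is illustrative).
raw = "凡人修仙".encode("gbk")

# Decoding with the wrong codec raises no error; it silently yields mojibake.
garbled = raw.decode("iso-8859-1")

# Decoding with the page's actual encoding restores the text.
fixed = raw.decode("gbk")

# Text that was already decoded wrongly can be recovered by reversing
# the wrong decode and then applying the right one.
recovered = garbled.encode("iso-8859-1").decode("gbk")

print(garbled != fixed, fixed, recovered)
```

With requests, the corresponding step would typically be to set response.encoding to "gbk" (or to response.apparent_encoding) before reading response.text, or to decode response.content directly.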