August 23, 2018

Crawler: Douban events page (locating elements with BeautifulSoup)

Summary:
    from bs4 import BeautifulSoup
    import requests
    url = 'https://beijing.douban.com/events/week-party'
    response = requests.get(url)
    # with open('douban_party.
Read more

posted @ 2018-08-23 21:27 luwanhe Views (150) Comments (0) Recommendations (0)

Crawler: Xueqiu (locating elements with BeautifulSoup)

Summary:
    from bs4 import BeautifulSoup
    import requests
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chro
Read more

posted @ 2018-08-23 21:24 luwanhe Views (212) Comments (0) Recommendations (0)

Crawler: Lianjia (locating elements with BeautifulSoup)

Summary:
    from bs4 import BeautifulSoup
    import requests
    url = 'https://bj.lianjia.com/ershoufang/c1111027378138/?sug=%E6%B5%81%E6%98%9F%E8%8A%B1%E5%9B%AD%E4%B8%89
Read more

posted @ 2018-08-23 21:22 luwanhe Views (558) Comments (0) Recommendations (0)
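
The Lianjia URL in the summary above carries a percent-encoded `sug` query parameter. As a minimal sketch (not the author's code), Python's standard `urllib.parse` module can decode and re-encode such a value. Only the fragment visible in the truncated summary is used here, so the full original value remains unknown; the helper names `decode_query_value` and `encode_query_value` are hypothetical.

```python
from urllib.parse import quote, unquote

# Only the part of the `sug` value that is visible in the truncated summary;
# the complete value from the original post is not known.
VISIBLE_SUG = "%E6%B5%81%E6%98%9F%E8%8A%B1%E5%9B%AD%E4%B8%89"


def decode_query_value(value: str) -> str:
    """Turn percent-escapes back into text (UTF-8 is the default encoding)."""
    return unquote(value)


def encode_query_value(text: str) -> str:
    """Percent-encode text so it can be placed in a URL query string."""
    return quote(text, safe="")


if __name__ == "__main__":
    decoded = decode_query_value(VISIBLE_SUG)
    print(decoded)
    # Re-encoding the decoded text should reproduce the visible fragment.
    print(encode_query_value(decoded) == VISIBLE_SUG)
```

Decoding the parameter this way makes it easier to see which search term a copied URL actually encodes before reusing it in a crawler.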

August 22, 2018

Douban login with Selenium

Summary:
    from selenium import webdriver
    import time
    import requests
    from lxml import etree
    import base64
    # https://market.aliyun.com/products/57124001/cmapi028447.h
Read more

posted @ 2018-08-22 09:12 luwanhe Views (159) Comments (0) Recommendations (0)

Multiprocess crawling of Xici proxies

Summary:
    import requests
    from lxml import etree
    import time
    import multiprocessing
    # elapsed 84.26855897903442 5
    # elapsed 44.181687355041504 10
    # elapsed 29.013262033462524 20
    # elapsed
Read more

posted @ 2018-08-22 09:11 luwanhe Views (113) Comments (0) Recommendations (0)

August 19, 2018

Crawling Zhihu information (has a bug; advice from experts welcome)

Summary:
    import requests
    from lxml import etree
    import pymysql
    class MysqlHelper(object):
        def __init__(self):
            self.db = pymysql.connect(host='127.0.0.1', port=330
Read more

posted @ 2018-08-19 22:04 luwanhe Views (290) Comments (0) Recommendations (0)

Crawling torrents from Dianying Tiantang (data retrieval is incomplete and has bugs; advice welcome)

Summary:
    import requests
    from lxml import etree
    import pymysql
    from urllib import parse
    class MysqlHelper(object):
        def __init__(self):
            self.db = pymysql.connect(ho
Read more

posted @ 2018-08-19 22:03 luwanhe Views (2307) Comments (0) Recommendations (0)

Crawling Tencent job postings

Summary:
    import requests
    from bs4 import BeautifulSoup
    import datetime
    import re
    import pymysql
    import datetime
    # database wrapper
    class Mydb():
        def __init__(self):
            try:
                self.con
Read more

posted @ 2018-08-19 21:46 luwanhe Views (258) Comments (0) Recommendations (0)

Crawling Meizitu

Summary:
    import requests
    import pymysql
    from lxml import etree
    # database wrapper
    class MysqlHelper(object):
        def __init__(self):
            self.db = pymysql.connect(host='127.0.0.1', po
Read more

posted @ 2018-08-19 21:41 luwanhe Views (169) Comments (0) Recommendations (0)

Crawling Lianjia listings

Summary: 1. Database wrapper
    import pymysql
    class MysqlHelper(object):
        def __init__(self):
            self.db = pymysql.connect(host='127.0.0.1', port=3306, user='root', password='abc
Read more

posted @ 2018-08-19 10:52 luwanhe Views (621) Comments (0) Recommendations (0)
