Web Crawlers - Post Category - 魅力宁波

Logging in to Chouti and Upvoting News with the Scrapy Framework

Summary: import scrapy from scrapy.http import Request from scrapy.selector import Selector class ChoutiSpider(scrapy.Spider): name = 'chouti' allowed_domains = ['chouti.com'] start_urls = ['http...

posted @ 2017-11-13 23:12 魅力宁波

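The Scrapy Crawler Framework

Summary: I. Installing the Scrapy framework. Linux/Mac: simply run pip3 install scrapy. Windows: 1. pip3 install wheel; 2. install the Twisted whl package that matches your Python version and architecture (32-bit or 64-bit); 3. install pywin32; 4. pip3 install scrapy. II ...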

posted @ 2017-11-13 23:05 魅力宁波

Approaches to Asynchronous Non-Blocking I/O

Summary: Twisted example; gevent+requests; asyncio+requests; asyncio+aiohttp; tornado; asyncio example 1; asyncio example 2; grequests; a custom asynchronous I/O module: import select import socket import time c ...
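As a rough illustration of the asyncio approach named in the summary above, here is a minimal sketch using only the standard library. It is not the post's own code: `fake_fetch`, `crawl_all`, and the page names are hypothetical, and `asyncio.sleep` stands in for a real non-blocking network request.

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a non-blocking network wait (hypothetical workload).
    await asyncio.sleep(delay)
    return f"{name} done"


async def crawl_all() -> list:
    # gather runs the coroutines concurrently on one event loop;
    # results come back in the order the coroutines were passed in.
    return await asyncio.gather(
        fake_fetch("page-a", 0.2),
        fake_fetch("page-b", 0.2),
        fake_fetch("page-c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(crawl_all())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to one delay rather than the sum of all three, which is the point of the non-blocking style.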

posted @ 2017-11-08 16:19 魅力宁波

Simulating the WeChat Web Client

Summary: views.py login.html index.html contact_list.html

posted @ 2017-11-07 19:50 魅力宁波

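The BeautifulSoup Module

Summary: I. Basic usage of the BeautifulSoup module. soup=BeautifulSoup(content, 'html.parser') creates the document object, where the second argument names the parser; soup=BeautifulSoup(content, features='lxml'). Both lxml and html.parser are parsers, but lxml is ...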

posted @ 2017-11-07 18:56 魅力宁波

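Logging in to GitHub Automatically

Summary: 0. Key parameters for simulating a login: user-agent; Referer (the URL of the page visited before this one); content-type; host; cookie. I. Automatic login. Mode 1: the cookie is set after a successful login. 1. Get the csrf_token from the login page. 2. POST the username, password, and token. 3. Get the cookie. Mode 2: visit the ...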
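Steps 1 and 2 of mode 1 above can be sketched offline with the standard library. This is only an illustration under stated assumptions: the markup in `SAMPLE_LOGIN_PAGE`, the field names `authenticity_token`, `login`, and `password`, and the `/session` action are hypothetical stand-ins, and a real script would fetch the page and send the POST over HTTP, then read the cookie from the response (step 3).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class TokenParser(HTMLParser):
    """Collects the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name: str) -> None:
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        is_target = tag == "input" and attr_map.get("name") == self.field_name
        if is_target and self.token is None:
            self.token = attr_map.get("value")


# Hypothetical login-page markup; a real page would come from the GET response body.
SAMPLE_LOGIN_PAGE = """
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="demo-token-123">
  <input type="text" name="login">
  <input type="password" name="password">
</form>
"""

# Step 1: pull the CSRF token out of the login page.
parser = TokenParser("authenticity_token")
parser.feed(SAMPLE_LOGIN_PAGE)

# Step 2: build the form body to POST (username and password are made up).
form_body = urlencode({
    "login": "demo-user",
    "password": "demo-pass",
    "authenticity_token": parser.token,
})
print(form_body)
```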

posted @ 2017-11-07 18:11 魅力宁波

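The requests Module

Summary: I. Common parameters of the requests module's request methods: url='' specifies the address to request; params={'key':'value',} passes data in the URL; cookies={'key':'value'} passes cookie values; headers={'key':'value'} passes request headers; data={} | json ...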
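To make the difference between `params`, `data`, and `json` concrete, here is a small sketch of the encodings involved. It uses `urllib.parse` and `json` from the standard library instead of requests itself, so it runs without a network connection. The endpoint and the values are hypothetical.

```python
import json
from urllib.parse import urlencode

base_url = "https://example.com/search"  # hypothetical endpoint
params = {"key": "value", "page": 2}

# params: the dict is encoded and appended to the URL as a query string.
full_url = f"{base_url}?{urlencode(params)}"

# data: a dict is sent as a form-encoded request body.
form_body = urlencode({"user": "demo"})

# json: the dict is serialized to a JSON request body instead.
json_body = json.dumps({"user": "demo"})

print(full_url)
print(form_body)
print(json_body)
```

With requests, the same dictionaries would be passed as the `params=`, `data=`, and `json=` keyword arguments, and the library would do this encoding when it builds the request.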

posted @ 2017-11-07 18:07 魅力宁波
