Post archive for January 2, 2020 - chanyuli

January 2, 2020

Summary:

```python
import requests
from bs4 import BeautifulSoup
import re
from mysql_control import MySQL


# The three steps of a crawler
# 1. Send the request
def get_html(url):
    response = requests.get(url)
    return response


# 2. Parse the data
def parse_da…
```

posted @ 2020-01-02 19:10 chanyuli Views (166) Comments (0) Recommendations (0)

Scraping all videos from the Pear Video homepage

Summary:

```python
import requests
import re
import uuid
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(50)


# The three steps of a crawler
# 1. Send the request
def get_html(url):
    print(f'start: {url}...')
    response = …
```

posted @ 2020-01-02 19:09 chanyuli Views (208) Comments (0) Recommendations (0)
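
The Pear Video excerpt is cut off after the first step, so the following is only a minimal sketch of how a three-step crawler (send the request, parse the data, save the data) might be combined with a `ThreadPoolExecutor`. It is not the post's actual code. To stay runnable offline, the request step is stubbed with a hypothetical in-memory `FAKE_PAGES` dictionary; the `example.com` URLs, the `srcUrl` pattern, and the helpers `parse_index`, `parse_detail`, `save`, and `crawl` are all assumptions made for illustration.

```python
import re
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "site" standing in for real HTTP responses.
FAKE_PAGES = {
    "https://example.com/": '<a href="video_1">a</a> <a href="video_2">b</a>',
    "https://example.com/video_1": 'srcUrl="https://example.com/1.mp4"',
    "https://example.com/video_2": 'srcUrl="https://example.com/2.mp4"',
}

pool = ThreadPoolExecutor(50)


# 1. Send the request (stubbed; a real crawler would call requests.get here)
def get_html(url):
    print(f"start: {url}...")
    return FAKE_PAGES[url]


# 2. Parse the data
def parse_index(html):
    # Collect the detail-page ids linked from the index page.
    return re.findall(r'href="(video_\d+)"', html)


def parse_detail(html):
    # Pull the (assumed) video source URL out of a detail page.
    return re.search(r'srcUrl="(.*?)"', html).group(1)


# 3. Save the data (here: only pick a unique file name, nothing is written)
def save(video_url):
    return f"{uuid.uuid4()}.mp4", video_url


def crawl(index_url):
    video_ids = parse_index(get_html(index_url))
    detail_urls = [index_url + video_id for video_id in video_ids]
    # pool.map fetches the detail pages concurrently and
    # yields the results in the same order as detail_urls.
    pages = pool.map(get_html, detail_urls)
    return [save(parse_detail(page)) for page in pages]


results = crawl("https://example.com/")
```

Keeping each step in its own function means only `get_html` has to change when the stub is swapped for real HTTP requests.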

Scraping information on the Douban Top 250 movies

Summary:

```python
import requests
import re

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
}


# The three steps of a crawler
# 1. Send the request
def get…
```

posted @ 2020-01-02 19:06 chanyuli Views (263) Comments (2) Recommendations (0)
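
The Douban excerpt imports only `requests` and `re`, which suggests that the parsing step relies on regular expressions. The sketch below illustrates that idea on a hypothetical HTML fragment. The markup in `SAMPLE_HTML`, the `PATTERN` expression, and the `parse_data` helper are assumptions, not the post's code, and the real page structure may differ.

```python
import re

# Hypothetical fragment of a list page; the real markup may differ.
SAMPLE_HTML = '''
<div class="item">
  <a href="https://example.com/subject/1/"><span class="title">Movie A</span></a>
  <span class="rating_num">9.7</span>
</div>
<div class="item">
  <a href="https://example.com/subject/2/"><span class="title">Movie B</span></a>
  <span class="rating_num">9.6</span>
</div>
'''

# re.S lets "." match newlines, so one pattern can span a whole item block.
# The non-greedy ".*?" keeps each match inside a single item.
PATTERN = re.compile(
    r'<div class="item">.*?<a href="(.*?)">.*?'
    r'<span class="title">(.*?)</span>.*?'
    r'<span class="rating_num">(.*?)</span>',
    re.S,
)


# 2. Parse the data
def parse_data(html):
    return [
        {"url": url, "title": title, "rating": float(rating)}
        for url, title, rating in PATTERN.findall(html)
    ]


movies = parse_data(SAMPLE_HTML)
```

With three capture groups, `findall` returns one `(url, title, rating)` tuple per item, which maps directly onto one record per movie.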

More on requests and the five bs4 filters

Summary: More on requests; attributes of Response; the five bs4 filters

posted @ 2020-01-02 19:05 chanyuli Views (232) Comments (0) Recommendations (0)
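
The summary names the five bs4 filters without listing them. In the Beautiful Soup documentation, the filters accepted by `find_all` are a string, a regular expression, a list, `True`, and a function. The sketch below tries each one on a small hypothetical document; the `HTML` sample and the variable names are assumptions for illustration only.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample document.
HTML = '''
<html><body>
<p id="p1" class="intro">first</p>
<a href="/a" id="link1">A</a>
<a href="/b" id="link2">B</a>
<b>bold</b>
</body></html>
'''

soup = BeautifulSoup(HTML, "html.parser")

# 1. String filter: tags whose name equals the string.
by_string = soup.find_all("a")

# 2. Regular expression filter: tags whose name matches the pattern.
by_regex = soup.find_all(re.compile("^b"))

# 3. List filter: tags whose name matches any item in the list.
by_list = soup.find_all(["a", "b"])

# 4. True filter: every tag, but no text nodes.
by_true = soup.find_all(True)


# 5. Function filter: tags for which the function returns True.
def has_id_but_no_href(tag):
    return tag.has_attr("id") and not tag.has_attr("href")


by_function = soup.find_all(has_id_but_no_href)
```

The function filter is the most flexible of the five, since it can combine any conditions on a tag's name and attributes.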
