Scraping asynchronously loaded list-page data from knewone
The code is as follows:
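from bs4 import BeautifulSoup
import requests
import time

# Base URL that the asynchronous loading requests; the page number is appended.
url = 'https://knewone.com/things?page='

def get_page(url, data=None):
    # Fetch one list page and parse it.
    web_data = requests.get(url)
    soup = BeautifulSoup(web_data.text, 'lxml')
    imgs = soup.select('a.cover-inner > img')
    titles = soup.select('h4.title > a')
    if data is None:
        # Pair each cover image with its title and print the result.
        for img, title in zip(imgs, titles):
            item = {
                'img': img.get('src'),
                'title': title.get('title'),
            }
            print(item)

def get_more_page(start, end):
    # range() excludes end, so this fetches pages start to end - 1.
    for one in range(start, end):
        get_page(url + str(one))
        time.sleep(2)  # pause between requests

get_more_page(1, 30)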
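The parts that take some thought are finding the URL that the asynchronous loading requests and building a function to control which pages get scraped. This one is also modeled on a tutorial.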
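The page-control part can be tried out without touching the network. The following is only a minimal sketch: build_page_urls, crawl and fake_fetch are hypothetical names that are not in the script above, and fake_fetch stands in for a real request. It shows that range(start, end) stops one page before end.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

# Base URL taken from the script above; the page number is appended to it.
BASE_URL = 'https://knewone.com/things?page='


def build_page_urls(base: str, start: int, end: int) -> List[str]:
    """Return the URLs for pages start .. end - 1 (range() excludes end)."""
    return [base + str(page) for page in range(start, end)]


def crawl(urls: Iterable[str],
          fetch: Callable[[str], object],
          delay: float = 2.0) -> Iterator[Tuple[str, object]]:
    """Call fetch(url) for each URL, sleeping `delay` seconds between calls."""
    for index, page_url in enumerate(urls):
        if index > 0:
            time.sleep(delay)  # no pause needed before the first request
        yield page_url, fetch(page_url)


def fake_fetch(page_url: str) -> str:
    """Hypothetical stand-in for a real fetcher such as get_page."""
    return 'fetched ' + page_url


if __name__ == '__main__':
    # delay=0 keeps the demo instant; a real crawl would keep the pause.
    for page_url, result in crawl(build_page_urls(BASE_URL, 1, 4), fake_fetch, delay=0):
        print(result)
```

Passing the fetcher in as an argument is just one possible design; it lets the loop be checked with a fake before swapping in a function that really calls requests.get.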
