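A Web Scraper Example in Python

This post uses Python's requests library together with BeautifulSoup to build a simple web scraper.

Example 1: scraping the list of Chinese provinces: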

import requests
from bs4 import BeautifulSoup

# Fetch the page HTML; the timeout and status check make failures visible early
page = requests.get('http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2021/index.html', timeout=10)
page.raise_for_status()
soup = BeautifulSoup(page.content, 'html.parser')  # Parse the raw bytes so BeautifulSoup can detect the encoding

# Select the province anchors; "tbody" is left out because html.parser
# does not insert one when the source markup has none
links = soup.select("tr.provincetr td a")
first10 = links[:10]  # Keep only the first 10 anchors
for anchor in first10:
    print(anchor.text)  # Display the text of each anchor
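
The extraction step can also be tried without any network access. The following is a minimal sketch, not the original author's code: `SAMPLE_HTML` is hypothetical markup loosely modeled on a province index table, and `extract_provinces` is an illustrative helper name. It returns each province name together with its link resolved against the index page URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption), standing in for the downloaded page.
SAMPLE_HTML = """
<table class="provincetable">
  <tr class="provincetr">
    <td><a href="11.html">北京市<br/></a></td>
    <td><a href="12.html">天津市<br/></a></td>
    <td><a href="13.html">河北省<br/></a></td>
  </tr>
</table>
"""

BASE_URL = "http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2021/index.html"


def extract_provinces(html, base_url, limit=None):
    """Return (name, absolute_url) pairs for the province anchors in html.

    limit caps the number of results; None keeps all of them.
    """
    soup = BeautifulSoup(html, "html.parser")
    anchors = soup.select("tr.provincetr td a")
    if limit is not None:
        anchors = anchors[:limit]
    return [
        # get_text(strip=True) drops surrounding whitespace;
        # urljoin turns the relative href into an absolute URL.
        (a.get_text(strip=True), urljoin(base_url, a.get("href", "")))
        for a in anchors
    ]


if __name__ == "__main__":
    for name, url in extract_provinces(SAMPLE_HTML, BASE_URL, limit=2):
        print(name, url)
```

Keeping the parsing in a function that takes an HTML string means the selector can be checked against saved pages before it is pointed at the live site.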

Posted 2022-10-26 19:31 by 熊仔其人