Scraping images from a web page with Python

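I finally got around to learning Python.

Following a tutorial I found online, I scraped the images from a web page.

The principle is simple: first fetch the page's HTML source, then run a regular expression over it to collect the image URLs. Finally, use Python's urllib package to download each image.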

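#coding=utf-8
# Written for Python 2; in Python 3, urlopen and urlretrieve live in urllib.request.
import re
import urllib

def getHtml(url):
    # Download the page and return its raw HTML source.
    page = urllib.urlopen(url)
    html = page.read()
    return html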

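def getImg(html):
    # Match src="...JPEG" attributes; [^"] keeps each match inside one attribute value.
    reg = r'src="[^"]{0,200}JPEG"'
    imgre = re.compile(reg)
    imglist = re.findall(imgre,html)
    x = 0
    for imgurl in imglist:
        # imgurl[5:-1] drops the leading src=" and the trailing quote, leaving the URL.
        urllib.urlretrieve(imgurl[5:-1],'%s.jpeg' % x)
        x+=1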

html = getHtml("https://mbd.baidu.com/newspage/data/landingsuper?context=%7B%22nid%22%3A%22news_5071651449410950599%22%7D&n_type=0&p_from=1")

getImg(html)
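
The script above is written for Python 2. As a rough sketch of the same flow under Python 3 (not what the tutorial used), something like the following should work. The names `IMG_RE`, `get_html`, `find_img_urls`, `save_imgs` and the `fetch` parameter are my own, made up for illustration, and decoding the page as UTF-8 is an assumption about the target site.

```python
import re
import urllib.request

# Same idea as the pattern in the script above, but with a capture group so
# no slicing is needed. Illustrative name, not from the tutorial.
IMG_RE = re.compile(r'src="([^"]{0,200}JPEG)"')

def get_html(url):
    """Download a page and return its source as text."""
    with urllib.request.urlopen(url) as page:
        # read() returns bytes in Python 3; UTF-8 is an assumption here.
        return page.read().decode("utf-8", errors="replace")

def find_img_urls(html):
    """Return every image URL captured by IMG_RE, in page order."""
    return IMG_RE.findall(html)

def save_imgs(urls, fetch=urllib.request.urlretrieve):
    """Save each URL as 0.jpeg, 1.jpeg, ... and return the file names.

    fetch is a hypothetical hook so the download step can be swapped out,
    for example when trying the function without network access.
    """
    names = []
    for i, url in enumerate(urls):
        name = "%s.jpeg" % i
        fetch(url, name)
        names.append(name)
    return names
```

Calling `save_imgs(find_img_urls(get_html(url)))` with the page URL follows the same three steps as the original script: fetch the page, match the image URLs, download them.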

posted @ 2017-12-20 14:30 猪是得念来过倒
