Python 3 web crawler: TypeError: cannot use a string pattern on a bytes-like object

import re
from common_p3 import download

def crawl_sitemap(url):
    sitemap = download(url)
    links = re.findall('<loc>(.*?)</loc>',sitemap)
    print('links=',links)
    for link in links:
        print('link=',link)
        html = download(link)
    return

crawl_sitemap('http://example.webscraping.com/sitemap.xml')

TypeError: cannot use a string pattern on a bytes-like object

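(This is mainly a Python version issue.)
In Python 3.x, download(url) returns bytes here, while the pattern passed to re.findall is a str, and the re module does not allow a str pattern to be applied to bytes.
Change 'sitemap = download(url)' to 'sitemap = download(url).decode('utf-8')' (this assumes the sitemap is UTF-8 encoded).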
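
To see the fix without touching the network, here is a minimal sketch. The `sitemap_bytes` value is a made-up stand-in for what download() returns under Python 3, and `extract_links` is a hypothetical helper, not part of common_p3. It reproduces the error, then shows two ways around it: decode the bytes to str, or use a bytes pattern.

```python
import re

# Hypothetical stand-in for the return value of download() under Python 3:
# raw bytes, as produced by reading an HTTP response body.
sitemap_bytes = (
    b'<urlset>'
    b'<url><loc>http://example.webscraping.com/view/1</loc></url>'
    b'<url><loc>http://example.webscraping.com/view/2</loc></url>'
    b'</urlset>'
)


def extract_links(sitemap, encoding='utf-8'):
    """Return the <loc> URLs as str, accepting either bytes or str input."""
    if isinstance(sitemap, bytes):
        # Assumes the sitemap really is encoded with `encoding`.
        sitemap = sitemap.decode(encoding)
    return re.findall(r'<loc>(.*?)</loc>', sitemap)


# 1) A str pattern applied to bytes reproduces the error from the post.
error_message = None
try:
    re.findall(r'<loc>(.*?)</loc>', sitemap_bytes)
except TypeError as exc:
    error_message = str(exc)

# 2) Fix A: decode first, then match with a str pattern.
links = extract_links(sitemap_bytes)

# 3) Fix B: keep the data as bytes and use a bytes pattern instead.
#    The matches are then bytes too, so decode them before reuse as URLs.
bytes_links = re.findall(rb'<loc>(.*?)</loc>', sitemap_bytes)

print('error =', error_message)
print('links =', links)
print('bytes_links =', bytes_links)
```

Decoding once right after the download is usually the simpler choice, since the rest of the crawler can then work with ordinary str values.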

 

posted @ 2017-12-27 11:44  秋华