Common Python Web-Scraping Pitfalls and How to Fix Them
1. HTTP Error 403: Forbidden when sending a request
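import urllib.request
# Send a browser-like User-Agent; some sites reject urllib's default one.
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
req = urllib.request.Request(url=url, headers=headers)
html = urllib.request.urlopen(req).read()  # raw response body as bytes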
Details: https://www.2cto.com/kf/201309/242273.html
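As a slightly fuller sketch of the same fix, the snippet below wraps the request in two small helpers and reports HTTP errors instead of crashing. The names `build_request`, `fetch`, and `DEFAULT_HEADERS` are made up for illustration, and the 10-second timeout is an arbitrary assumption. A site that checks more than the User-Agent may still answer 403.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent copied from the snippet above; any browser string can be used.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0"
}


def build_request(url, headers=None):
    """Return a Request carrying a browser-like User-Agent (hypothetical helper)."""
    merged = dict(DEFAULT_HEADERS)
    merged.update(headers or {})
    return urllib.request.Request(url=url, headers=merged)


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the server replies with an HTTP error."""
    req = build_request(url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # A 403 here suggests the site checks more than the User-Agent header.
        print(f"HTTP {err.code} for {url}: {err.reason}")
        return None
```

Calling `fetch(url)` then returns the page bytes on success and `None` on a 403 or other HTTP error, so the caller can decide whether to retry or skip the page.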
2. Python UnicodeEncodeError: 'gbk' codec can't encode character when saving HTML content
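Replace
f = open("out.html","w")
with
f = open("out.html","w",encoding='utf-8')
Without an explicit encoding, open() falls back to the platform's default encoding (GBK in this case), which cannot represent every character in the page.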
Details: http://www.jb51.net/article/64816.htm
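To make the cause concrete, here is a minimal sketch that writes a string containing a non-breaking space (`\xa0`, a character that often appears in scraped HTML and that GBK cannot encode) to a file as UTF-8. The helper names `save_html` and `can_encode` and the sample string are assumptions for illustration only.

```python
import os
import tempfile


def save_html(path, html):
    """Write html as UTF-8 regardless of the platform's default encoding (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


def can_encode(text, encoding):
    """Return True if every character of text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample page fragment containing a non-breaking space (\xa0).
sample = "<p>price:\xa0100</p>"

if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "out.html")
    print("encodable as gbk:", can_encode(sample, "gbk"))
    print("encodable as utf-8:", can_encode(sample, "utf-8"))
    save_html(out_path, sample)
```

When reading the file back, pass the same `encoding='utf-8'` to `open()`, otherwise the platform default is used again for decoding.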