Python development pitfalls (1): XPath parsing raises ValueError: Unicode strings with encoding declaration are not supported

Traceback (most recent call last):
  File "/Users/*******.py", line 37, in <module>
    BtcSpider().run()
  File "/Users/******.py", line 34, in run
    self.parse_data(data)
  File "/Users/******.py", line 21, in parse_data
    xpath_data = etree.HTML(data)
  File "src/lxml/etree.pyx", line 3161, in lxml.etree.HTML
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

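  I was scraping a forum whose pages declare <meta http-equiv="Content-Type" content="text/html; charset=gb2312">, yet on my Mac the downloaded page only decoded correctly as utf-8. Passing that decoded string to etree.HTML for XPath parsing then raised the error above.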

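Encoding the string back to bytes before handing it to the parser fixes it, as the error message itself suggests ("Please use bytes input"). Here is the code: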

xpath_data = etree.HTML(data.encode('utf-8'))

  Problem solved.
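
For reuse, the same fix can be wrapped in a small helper. This is only a sketch under a few assumptions: the names to_parser_input and parse_html are hypothetical and not part of lxml, and utf-8 is assumed to be the real encoding of the text, as it was for the page above. Passing an explicit encoding to etree.HTMLParser is an optional extra step that should keep the parser from trusting a charset declared inside the document that does not match the bytes it receives.

```python
from typing import Union


def to_parser_input(data: Union[str, bytes], encoding: str = "utf-8") -> bytes:
    """Return bytes for lxml, encoding str input first (hypothetical helper)."""
    if isinstance(data, str):
        # lxml rejects str input that carries an encoding declaration,
        # so convert it to bytes before parsing.
        return data.encode(encoding)
    return data


def parse_html(data: Union[str, bytes], encoding: str = "utf-8"):
    """Parse HTML with lxml, always handing the parser bytes (hypothetical helper)."""
    # Imported lazily so to_parser_input() stays usable without lxml installed.
    from lxml import etree

    # State the encoding explicitly instead of relying on the charset
    # declared in the page, which may not match the actual bytes.
    parser = etree.HTMLParser(encoding=encoding)
    return etree.HTML(to_parser_input(data, encoding), parser)
```

With this in place, the call in parse_data would become xpath_data = parse_html(data), and xpath_data.xpath(...) can be used as before.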

posted @ 2018-12-18 19:46  夏落若的博客