Summary:
#!/bin/python
#coding=utf-8
import urllib,xlrd,lxml.html,re,pymongo,xlwt
fail=open('fail','w')
def getDocument(url,code='utf-8'):
    try:
        doc=lxml.html.fromstring(urllib.urlopen(url).read().decode(code))
        print 'utf-8'
    except:
        doc=lxml.html.fromstring(urllib.urlopen(url).read
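The excerpt above is cut off inside the `except` branch, so the fallback it actually uses is not visible. As a rough illustration of the same "try one encoding, fall back to another" pattern, here is a minimal Python 3 sketch using only the standard library. The function name `decode_with_fallback` and the `gbk` fallback are assumptions made for this example, not taken from the original script.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk")):
    """Decode raw bytes by trying each candidate encoding in order.

    Hypothetical helper: the candidate list is an assumption, since the
    original excerpt is truncated before its fallback is shown.
    Returns a (text, encoding_used) tuple.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This encoding did not fit the bytes; try the next candidate.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


if __name__ == "__main__":
    page_bytes = "中文".encode("gbk")  # stands in for bytes read from a URL
    text, used = decode_with_fallback(page_bytes)
    print(used, text)
```

Catching only `UnicodeDecodeError`, rather than using a bare `except`, keeps network or parsing failures from being mistaken for an encoding problem. The decoded text could then be passed to `lxml.html.fromstring`, as the original function does.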