Three ways to recognize image CAPTCHAs (simulating a Douban login with Scrapy)

1. Read the CAPTCHA by eye, then type it in through input()

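from urllib import request
from PIL import Image

# url is the address of the CAPTCHA image, taken from the login page
request.urlretrieve(url, 'image')  # download the CAPTCHA image
image = Image.open('image')  # open the downloaded image in the program
image.show()  # display the image in the default viewer
captcha = input("Enter the CAPTCHA: ")  # type in the CAPTCHA you see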
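
Once the CAPTCHA has been typed in, it has to be sent along with the other login fields. The sketch below shows one way to assemble that form data. The helper `build_login_form` is hypothetical, and every field name in it (`form_email`, `form_password`, `captcha-solution`, `captcha-id`) is an assumption for illustration; check the real login form in your browser's developer tools before relying on them.

```python
from urllib.parse import urlencode


def build_login_form(email, password, captcha_solution=None, captcha_id=None):
    """Assemble login form fields, adding CAPTCHA fields only when given.

    All field names are assumed for illustration, not taken from the site.
    """
    form = {
        "form_email": email,        # assumed field name
        "form_password": password,  # assumed field name
    }
    if captcha_solution is not None:
        # Text typed by the user, with stray whitespace removed
        form["captcha-solution"] = captcha_solution.strip()
        # Token that ties the answer to the image that was shown (assumed)
        form["captcha-id"] = captcha_id
    return form


# Example values are made up
form = build_login_form("user@example.com", "secret", " apple \n", "abc123")
encoded = urlencode(form)  # body of a form-encoded POST request
print(encoded)
```

In a Scrapy spider, a dict like this could be passed as the `formdata` argument of `scrapy.FormRequest`, which takes care of the encoding itself.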

2. Use a paid recognition service from the Alibaba Cloud marketplace: https://market.aliyun.com/products/57124001/cmapi031940.html?spm=5176.730005.productlist.d_cmapi031940.46da3524VkSlar&innerSource=search_%E6%99%BA%E8%83%BD%E5%9B%BE%E5%83%8F%E6%8A%80%E6%9C%AF#sku=yuncode2594000001 # automatically recognizes Douban image CAPTCHAs

3. Use Tesseract, the open-source OCR engine backed by Google, although its recognition rate is fairly low

import pytesseract
from PIL import Image

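# Path to the Tesseract executable (adjust to your own install location)
pytesseract.pytesseract.tesseract_cmd = r"D:\Tesseract-OCR\tesseract.exe"
image = Image.open("c.PNG")  # the saved CAPTCHA image
# lang='chi_sim' requires the Simplified Chinese language data to be installed
text = pytesseract.image_to_string(image, lang='chi_sim')
print(text)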
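
The string returned by OCR often carries stray spaces, newlines, or a trailing form-feed character, so it usually needs tidying before it is submitted as the CAPTCHA answer. A minimal sketch, assuming the CAPTCHA itself contains no meaningful whitespace; `clean_ocr_text` is a hypothetical helper and the sample string is made up:

```python
def clean_ocr_text(raw):
    """Remove every whitespace character (spaces, newlines, form feeds)."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces drops all of it.
    return "".join(raw.split())


sample = " ab c1\n\x0c"  # made-up stand-in for the OCR output
captcha = clean_ocr_text(sample)
print(captcha)
```

This only removes whitespace. It does not correct misread characters, so the low recognition rate mentioned above still applies.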

posted @ 2019-03-14 10:42 乔儿
