Extracting text from an image
import pytesseract
from PIL import Image

# Path to the Tesseract executable (tesseract.exe, not pytesseract.exe)
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\xxx\AppData\Local\Programs\Tesseract-OCR\tesseract.exe"

# Image to run OCR on
picture = r"C:\Users\xxx\Downloads\tupianshibie2.png"
image = Image.open(picture)

# Extract the text and print it
text = pytesseract.image_to_string(image)
print(text)
I. Installation:
1. Install pytesseract
pip install pytesseract
2. Download and install tesseract-ocr
For example, tesseract-ocr-w64-setup-v5.3.0.20221214.exe
https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.3.0.20221214.exe
3. Configure the path in the script:
pytesseract.pytesseract.tesseract_cmd = r"C:\Users\xxx\AppData\Local\Programs\Tesseract-OCR\tesseract.exe"
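If the install location differs between machines, the hard-coded path in step 3 can be replaced by a small lookup. The following is only a minimal sketch: the helper names find_tesseract and configure_pytesseract are hypothetical, and the second candidate path (C:\Program Files\Tesseract-OCR) is an assumption about where the installer may put the program, not something confirmed above.

```python
import os
import shutil


def find_tesseract(candidates):
    """Return the first usable tesseract executable path, or None.

    Checks each candidate path in order, then falls back to PATH.
    """
    for path in candidates:
        if os.path.isfile(path):
            return path
    # shutil.which returns None when nothing named "tesseract" is on PATH
    return shutil.which("tesseract")


def configure_pytesseract(candidates):
    """Point pytesseract at the executable found by find_tesseract."""
    import pytesseract  # imported here so find_tesseract works without it

    cmd = find_tesseract(candidates)
    if cmd is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd


# Candidate locations; the second one is an assumed install location.
CANDIDATES = [
    r"C:\Users\xxx\AppData\Local\Programs\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
]
```

Calling configure_pytesseract(CANDIDATES) once at the top of the script would then stand in for the manual assignment.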
II. Common errors:
1. pytesseract.pytesseract.TesseractError: (2, 'Usage: pytesseract [-l lang] input_file')
pytesseract.pytesseract.tesseract_cmd must point to tesseract.exe, not pytesseract.exe.
2. pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
pytesseract was installed, but tesseract-ocr itself was not.
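Both errors come down to what tesseract_cmd points at, so they can be checked before OCR is attempted. The sketch below is one possible pre-flight check using only the standard library; the function name diagnose_tesseract_cmd and the wording of its hints are hypothetical, not part of pytesseract.

```python
import os
import shutil


def diagnose_tesseract_cmd(cmd):
    """Return a short hint about why `cmd` may fail, or None if it looks fine."""
    # Normalise backslashes so Windows-style paths are also split correctly
    name = os.path.basename(cmd.replace("\\", "/")).lower()
    if name in ("pytesseract", "pytesseract.exe"):
        # Error 1: the pytesseract script was configured instead of Tesseract
        return "tesseract_cmd points at pytesseract; use tesseract.exe instead"
    if not (os.path.isfile(cmd) or shutil.which(cmd)):
        # Error 2: the executable does not exist at that path or on PATH
        return "tesseract executable not found; install tesseract-ocr or fix the path"
    return None
```

A script could call diagnose_tesseract_cmd(pytesseract.pytesseract.tesseract_cmd) before image_to_string and print the hint if one is returned.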