I. Problems Recognizing Text in Images with pytesseract - 11

1. Installing pytesseract

  • Install directory: d:\python\lib\site-packages
C:\Users\jieqiong>pip install pytesseract
Collecting pytesseract
  Downloading https://files.pythonhosted.org/packages/8b/0d/6efe2a9bddf1b1efe82a86fdd057f4affaeebd14347f32d03bbbbc45821c/pytesseract-0.3.9-py2.py3-none-any.whl
pytesseract requires Python '>=3.7' but the running Python is 3.6.5
You are using pip version 9.0.3, however version 22.2.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

C:\Users\jieqiong>python -m pip install --upgrade pip
Collecting pip
  Downloading https://files.pythonhosted.org/packages/a4/6d/6463d49a933f547439d6b5b98b46af8742cc03ae83543e4d7688c2420f8b/pip-21.3.1-py3-none-any.whl (1.7MB)
    100% |████████████████████████████████| 1.7MB 64kB/s
Installing collected packages: pip
  Found existing installation: pip 9.0.3
    Uninstalling pip-9.0.3:
      Successfully uninstalled pip-9.0.3
Successfully installed pip-21.3.1
You are using pip version 21.3.1, however version 22.2.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

C:\Users\jieqiong>pip install pytesseract
Collecting pytesseract
  Using cached pytesseract-0.3.9-py2.py3-none-any.whl (14 kB)
Requirement already satisfied: Pillow>=8.0.0 in d:\python\lib\site-packages (from pytesseract) (8.4.0)
Collecting packaging>=21.3
  Downloading packaging-21.3-py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB 653 kB/s
Collecting pytesseract
  Downloading pytesseract-0.3.8.tar.gz (14 kB)
  Preparing metadata (setup.py) ... done
Using legacy 'setup.py install' for pytesseract, since package 'wheel' is not installed.
Installing collected packages: pytesseract
    Running setup.py install for pytesseract ... done
Successfully installed pytesseract-0.3.8
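
In the log above, pip first rejects pytesseract 0.3.9 because it requires Python >= 3.7 while the running interpreter is 3.6.5; after pip is upgraded, it falls back to 0.3.8. A minimal sketch of checking the interpreter version before installing might look like this. The function name `meets_minimum` is hypothetical, and the (3, 7) threshold is taken from the pip message above.

```python
import sys


def meets_minimum(version_info, minimum=(3, 7)):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum(sys.version_info):
        print("This interpreter satisfies Python >= 3.7")
    else:
        # Matches the situation in the log: pip ended up installing 0.3.8.
        print("This interpreter is older than 3.7; pip may pick an older pytesseract release")
```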

 

2. Error at runtime

D:\imooc\selenium\read_image.py

# coding=utf-8

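# Package for recognizing text in images
import pytesseract
# Package for loading images
from PIL import Image

# Open the image file as an Image object
image = Image.open("D:/imooc/imooc2.jpg")

# Run OCR to convert the image object into a string
text = pytesseract.image_to_string(image)
print(text)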
PS D:\imooc\selenium> python .\read_image.py
TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
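
The error means the pytesseract package is only a wrapper: it needs the separate Tesseract program to be installed and reachable. One way to confirm whether the executable is on PATH before calling pytesseract is sketched below, using only the standard library. The helper name `find_tesseract` is hypothetical.

```python
import shutil


def find_tesseract(names=("tesseract", "tesseract.exe")):
    """Return the full path of the first Tesseract executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_tesseract()
    if found:
        print("tesseract found at:", found)
    else:
        print("tesseract is not on PATH; install Tesseract-OCR or add its folder to PATH")
```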

 

3. Installing tesseract-ocr

Tesseract-OCR installation, Chinese recognition, and font library training - Jianshu (jianshu.com)

C:\Users\jieqiong>tesseract -v
tesseract 4.00.00alpha
 leptonica-1.74.1
  libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.20 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.3 : libopenjp2 2.1.0
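
If the version needs to be checked from a script, the first line of the `tesseract -v` output can be parsed. The following is only a sketch that assumes the output format shown above; `parse_tesseract_version` is a hypothetical helper, and the sample text is copied from that output.

```python
import re


def parse_tesseract_version(output):
    """Extract the version token from the first line of `tesseract -v` output."""
    lines = output.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"tesseract\s+(\S+)", first_line)
    return match.group(1) if match else None


# Sample taken from the console output above.
sample = """tesseract 4.00.00alpha
 leptonica-1.74.1"""
print(parse_tesseract_version(sample))  # prints 4.00.00alpha
```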

 

4. Modified code

# coding=utf-8

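# Package for recognizing text in images
import pytesseract

# Point pytesseract at the Tesseract executable itself, not just its folder.
# Assumption: tesseract.exe sits directly inside the install directory below.
# A raw string keeps the backslashes from being read as escape sequences.
pytesseract.pytesseract.tesseract_cmd = r'D:\Python\Tesseract-OCR\tesseract.exe'

# Package for loading images
from PIL import Image

# Open the image file as an Image object
image = Image.open("D:/imooc/imooc2.jpg")

# Run OCR to convert the image object into a string
text = pytesseract.image_to_string(image)
print(text)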
PS D:\imooc\selenium> python .\read_image.py
0 6 6: 4
7.bmp

94-9 7 1
22.bmp l
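
pytesseract takes the location of the Tesseract program from `pytesseract.pytesseract.tesseract_cmd`, which should name the executable rather than the folder that contains it. A small sketch of building that path from an install folder follows. The helper `tesseract_exe_path` is hypothetical, and the executable name `tesseract.exe` is an assumption about a standard Windows install.

```python
from pathlib import PureWindowsPath


def tesseract_exe_path(install_dir, exe_name="tesseract.exe"):
    """Join a Windows install folder and the executable name into one path string."""
    return str(PureWindowsPath(install_dir) / exe_name)


cmd = tesseract_exe_path(r"D:\Python\Tesseract-OCR")
print(cmd)
# The result could then be assigned before calling image_to_string:
#   pytesseract.pytesseract.tesseract_cmd = cmd
```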

 

posted @ 2022-09-23 16:23  酱汁怪兽