Deploying a Scrapy environment on Python 3.6.6 and installing image-recognition plugins

I. Install Python 3.6.6
1. Install the required binary packages
yum -y install zlib zlib-devel bzip2 bzip2-devel ncurses ncurses-devel readline readline-devel openssl openssl-devel openssl-static xz lzma xz-devel sqlite sqlite-devel gdbm gdbm-devel tk tk-devel gcc gcc-c++
2. Place Python-3.6.6.tgz in /usr/local/src
3. Unpack the archive
cd /usr/local/src
tar xf Python-3.6.6.tgz
4. Enter the Python-3.6.6 directory
cd Python-3.6.6
5. Configure the build
./configure --prefix=/usr/local/python3
6. Compile and install
make && make install
ln -s /usr/local/python3/bin/python3 /usr/bin/python3
7. Add python3 to the PATH
Open /etc/profile.d/python3.sh (vim /etc/profile.d/python3.sh) and add the following line:
export PATH=$PATH:$HOME/bin:/usr/local/python3/bin
Reload the file:
source /etc/profile.d/python3.sh
Check the version:
# python3 --version
Python 3.6.6
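
Beyond the version number, it can help to confirm that the build actually picked up the development packages installed in step 1, since Python silently skips optional modules whose headers are missing. The sketch below is only an illustration and not part of the original procedure: the helper name check_stdlib_modules and the module list are assumptions chosen to mirror the yum packages above.

```python
import importlib
import sys

# Optional standard-library modules that are only built when the matching
# -devel package is present at ./configure time. This list is an assumption
# that mirrors the yum packages from step 1.
OPTIONAL_MODULES = [
    "zlib", "bz2", "lzma", "ssl", "sqlite3",
    "readline", "curses", "dbm.gnu", "tkinter",
]


def check_stdlib_modules(names=OPTIONAL_MODULES):
    """Return a dict mapping each module name to True if it can be imported."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, ok in check_stdlib_modules().items():
        print(f"{name:10s} {'ok' if ok else 'MISSING'}")
```

If ssl shows up as MISSING, pip3 will generally not be able to download packages over HTTPS, so it is worth fixing before moving on to the next section.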

II. Install the Scrapy framework
pip3 install lxml
pip3 install wheel
pip3 install scrapy
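
Because the system may have more than one Python, it is worth checking that the packages landed under /usr/local/python3 and not in some other interpreter. A minimal sketch, assuming it is run with the python3 built above; the helper name locate_packages is hypothetical.

```python
import importlib.util


def locate_packages(names):
    """Map each top-level package name to its file location, or None if absent."""
    locations = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        locations[name] = spec.origin if spec is not None else None
    return locations


if __name__ == "__main__":
    for name, origin in locate_packages(["lxml", "wheel", "scrapy"]).items():
        print(name, "->", origin or "NOT INSTALLED")
```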

III. Install selenium and PhantomJS
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
tar -xvjf phantomjs-2.1.1-linux-x86_64.tar.bz2
cp -R phantomjs-2.1.1-linux-x86_64 /usr/local/share/
ln -sf /usr/local/share/phantomjs-2.1.1-linux-x86_64/bin/phantomjs /usr/local/bin/
pip3 install selenium
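
To confirm that the symlink in /usr/local/bin works, the PhantomJS binary can be looked up on the PATH and asked for its version. This is a small sketch and not part of the original steps; phantomjs_version is a hypothetical helper name, and the subprocess call is written to stay compatible with Python 3.6.

```python
import shutil
import subprocess


def phantomjs_version(binary="phantomjs"):
    """Return the PhantomJS version string, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (Python 3.6 compatible)
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("PhantomJS:", phantomjs_version() or "not found on PATH")
```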

IV. Install tesserocr and PIL (Pillow)
yum install -y tesseract tesseract-devel leptonica-devel
git clone https://github.com/tesseract-ocr/tessdata.git
mv tessdata/* /usr/share/tesseract/tessdata
pip3 install tesserocr pillow
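
After the mv step, a quick way to see which language models ended up in the tessdata directory is to list the .traineddata files there. A minimal sketch, assuming the /usr/share/tesseract/tessdata path used above; installed_languages is a hypothetical helper name.

```python
from pathlib import Path


def installed_languages(tessdata_dir="/usr/share/tesseract/tessdata"):
    """Return the sorted language codes that have a .traineddata file."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))


if __name__ == "__main__":
    print(installed_languages())
```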

V. Run a quick test
>>> from PIL import Image
>>> import tesserocr
>>> p=Image.open('/opt/20180823090940.png')
>>> s=tesserocr.image_to_text(p)
>>> print(s)
5890
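
The string returned by OCR may carry trailing newlines or stray characters around the recognized code. When the image is known to contain only digits, as in the sample above, the result can be tidied up before use. The helper below is a sketch under that assumption; clean_ocr_digits is a hypothetical name and not part of tesserocr.

```python
import re


def clean_ocr_digits(raw_text):
    """Keep only the ASCII digits from raw OCR output."""
    return re.sub(r"[^0-9]", "", raw_text)


if __name__ == "__main__":
    print(clean_ocr_digits("5890\n\n"))  # -> 5890
```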

posted @ 2018-08-23 15:50  fansik