Integrating the OpenCV Python tool into a scikit-learn MNIST example to support prediction

Background

https://www.cnblogs.com/lightsong/p/14469252.html

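The post above covers the earlier progress: integrating hub data, building a handwritten digit recognition model on the MNIST data, and measuring the prediction accuracy of a logistic regression model.

That model was only trained. To actually use it for prediction, we need a tool that processes an arbitrary handwritten image and normalizes it into the standard MNIST data format.

According to the website, the standard data format is:

shape is (28, 28)
grayscale
pure black background, with the strokes in white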
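
The three properties above can be checked mechanically. The following is a minimal sketch using only numpy; the helper name looks_like_mnist and the "more than half of the pixels are dark" rule are assumptions made for illustration, not part of the MNIST specification.

```python
import numpy as np


def looks_like_mnist(img: np.ndarray) -> bool:
    """Rough check of the three listed properties (hypothetical helper)."""
    if img.shape != (28, 28):      # one channel, 28 rows x 28 columns
        return False
    if img.dtype != np.uint8:      # gray levels stored as 0..255
        return False
    # Black background with white strokes: most pixels should be dark.
    return bool(np.mean(img < 128) > 0.5)


# Synthetic example: a white vertical stroke on a black canvas
digit = np.zeros((28, 28), dtype=np.uint8)
digit[4:24, 13:15] = 255

print(looks_like_mnist(digit))        # True
print(looks_like_mnist(255 - digit))  # False: black stroke on white background
```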

By learning the OpenCV API, we implement this image processing.

https://app.activeloop.ai/datasets/popular

The dataset name is activeloop/mnist

OpenCV

http://www.opencv.org.cn/opencvdoc/2.3.2/html/doc/tutorials/tutorials.html

https://docs.opencv.org/master/d1/dfb/intro.html

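OpenCV is an open-source library that includes several hundred computer vision algorithms.

It includes the following modules:

Core functionality - defines the basic data structures and basic functions used by all higher-level modules.
Image processing - image filtering, geometric image transformations, and color space conversion.
Video analysis - motion estimation, background subtraction, and object tracking.
...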

OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source library that includes several hundreds of computer vision algorithms. The document describes the so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API (C API is deprecated and not tested with "C" compiler since OpenCV 2.4 releases)

OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:

Core functionality (core) - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.

Image Processing (imgproc) - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.

Video Analysis (video) - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.

Camera Calibration and 3D Reconstruction (calib3d) - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.

2D Features Framework (features2d) - salient feature detectors, descriptors, and descriptor matchers.

Object Detection (objdetect) - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on).

High-level GUI (highgui) - an easy-to-use interface to simple UI capabilities.

Video I/O (videoio) - an easy-to-use interface to video capturing and video codecs.

... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.

The further chapters of the document describe functionality of each module. But first, make sure to get familiar with the common API concepts used thoroughly in the library.

python wrapper for OpenCV

https://docs.opencv.org/master/d6/d00/tutorial_py_root.html

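The core of this tool is implemented in C++. To broaden its use, it exposes APIs in several languages, and Python is one of them.

Below is everything the Python API supports.

It covers all of the functionality provided by the libraries listed on the main page.

One interesting find is that the library also provides machine learning algorithms, including the following three models.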

K-Nearest Neighbour

Learn to use kNN for classification Plus learn about handwritten digit recognition using kNN

Support Vector Machines (SVM)

Understand concepts of SVM

K-Means Clustering

Learn to use K-Means Clustering to group data to a number of clusters. Plus learn to do color quantization using K-Means Clustering

Introduction to the Python API

Introduction to OpenCV

Learn how to setup OpenCV-Python on your computer!

Gui Features in OpenCV

Here you will learn how to display and save images and videos, control mouse events and create trackbar.

Core Operations

In this section you will learn basic operations on image like pixel editing, geometric transformations, code optimization, some mathematical tools etc.

Image Processing in OpenCV

In this section you will learn different image processing functions inside OpenCV.

Feature Detection and Description

In this section you will learn about feature detectors and descriptors

Video analysis (video module)

In this section you will learn different techniques to work with videos like object tracking etc.

Camera Calibration and 3D Reconstruction

In this section we will learn about camera calibration, stereo imaging etc.

Machine Learning

In this section you will learn the machine learning functionality inside OpenCV.

Computational Photography

In this section you will learn different computational photography techniques like image denoising etc.

Object Detection (objdetect module)

In this section you will learn object detection techniques like face detection etc.

OpenCV-Python Bindings

In this section, we will see how OpenCV-Python bindings are generated

Image Processing in OpenCV

https://docs.opencv.org/master/d2/d96/tutorial_py_table_of_contents_imgproc.html

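Our problem is feature preprocessing for images, so we focus on the image processing chapter.

The first requirement, converting a color image to grayscale, uses the first topic, Changing Colorspaces.
Resizing the image uses the second topic, Geometric Transformations of Images.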

Changing Colorspaces

Learn to change images between different color spaces. Plus learn to track a colored object in a video.

Geometric Transformations of Images

Learn to apply different geometric transformations to images like rotation, translation etc.

Image Thresholding

Learn to convert images to binary images using global thresholding, Adaptive thresholding, Otsu's binarization etc

Smoothing Images

Learn to blur the images, filter the images with custom kernels etc.

Morphological Transformations

Learn about morphological transformations like Erosion, Dilation, Opening, Closing etc

Image Gradients

Learn to find image gradients, edges etc.

Canny Edge Detection

Learn to find edges with Canny Edge Detection

Image Pyramids

Learn about image pyramids and how to use them for image blending

Contours in OpenCV

All about Contours in OpenCV

Histograms in OpenCV

All about histograms in OpenCV

Image Transforms in OpenCV

Meet different Image Transforms in OpenCV like Fourier Transform, Cosine Transform etc.

Template Matching

Learn to search for an object in an image using Template Matching

Hough Line Transform

Learn to detect lines in an image

Hough Circle Transform

Learn to detect circles in an image

Image Segmentation with Watershed Algorithm

Learn to segment images with watershed segmentation

Interactive Foreground Extraction using GrabCut Algorithm

Learn to extract foreground with GrabCut algorithm

Numeric features of images -- supplementary knowledge

https://zhuanlan.zhihu.com/p/267130193

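What we need is a grayscale image. Each pixel carries a single feature dimension, 0 to 255, which represents the brightness of the pixel: black, white, and the shades in between. It cannot represent color.

A color image has three channels, that is, each pixel carries three feature dimensions, RGB.

Images are divided into binary images, grayscale images, pseudo-color images, and true-color images.

Binary image: pixels take only the values 0 and 1. 0 is black and 1 is white.

Grayscale image: pixel values have 256 levels (0 - 255). In this kind of image the R, G, and B values are equal.

Pseudo-color image: each pixel stores an index into a 256-color table, and the color is looked up in the corresponding palette.

True-color image: the RGB color of each pixel is stored directly.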
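
As a rough illustration of the four image types, here is a small numpy-only sketch. The array values and the palette are made up for the example; only the shapes and the lookup pattern matter.

```python
import numpy as np

# Grayscale: one value per pixel, shape (height, width)
gray = np.array([[0, 128, 255]], dtype=np.uint8)

# True color: three values per pixel, shape (height, width, 3).
# Storing the same level in all three channels gives a gray-looking color image.
color = np.stack([gray, gray, gray], axis=-1)

# Binary: only two levels, here derived from gray with a fixed cutoff
binary = (gray > 127).astype(np.uint8)

# Pseudo-color: each pixel is an index into a 256-entry palette
palette = np.zeros((256, 3), dtype=np.uint8)
palette[:, 2] = np.arange(256)   # hypothetical palette: a ramp in the last channel
pseudo = palette[gray]           # look up a color for every index

print(gray.shape, color.shape, pseudo.shape)  # (1, 3) (1, 3, 3) (1, 3, 3)
print(binary.tolist())                        # [[0, 1, 1]]
```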

https://zhuanlan.zhihu.com/p/36592188

The exploration of the MNIST data in this article shows that the pixels are grayscale data, not binary data.

Changing Colorspaces

https://docs.opencv.org/master/df/d9d/tutorial_py_colorspaces.html

Changing Color-space

There are more than 150 color-space conversion methods available in OpenCV. But we will look into only two, which are most widely used ones: BGR ↔ Gray and BGR ↔ HSV.

For color conversion, we use the function cv.cvtColor(input_image, flag) where flag determines the type of conversion.

For BGR → Gray conversion, we use the flag cv.COLOR_BGR2GRAY. Similarly for BGR → HSV, we use the flag cv.COLOR_BGR2HSV. To get other flags, just run following commands in your Python terminal:

>>> import cv2 as cv

>>> flags = [i for i in dir(cv) if i.startswith('COLOR_')]

>>> print( flags )

Geometric Transformations of Images

https://docs.opencv.org/master/da/d6e/tutorial_py_geometric_transformations.html

When scaling an image, the choice of interpolation matters. Use INTER_AREA for shrinking and INTER_CUBIC for enlarging.

Scaling

Scaling is just resizing of the image. OpenCV comes with a function cv.resize() for this purpose. The size of the image can be specified manually, or you can specify the scaling factor. Different interpolation methods are used. Preferable interpolation methods are cv.INTER_AREA for shrinking and cv.INTER_CUBIC (slow) & cv.INTER_LINEAR for zooming. By default, the interpolation method cv.INTER_LINEAR is used for all resizing purposes. You can resize an input image with either of following methods:

import numpy as np

import cv2 as cv

img = cv.imread('messi5.jpg')

res = cv.resize(img,None,fx=2, fy=2, interpolation = cv.INTER_CUBIC)

#OR

height, width = img.shape[:2]

res = cv.resize(img,(2*width, 2*height), interpolation = cv.INTER_CUBIC)

Lenna show -- image processing demo

https://zhuanlan.zhihu.com/p/49957946

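Installation

Installing OpenCV differs somewhat across platforms and versions. numpy must be installed first; installing Anaconda beforehand is strongly recommended. Then install directly with:
pip install opencv-python
If running import cv2 in your code raises no error, the installation succeeded.

But often this may not work. In that case, consider downloading an installation file from here:

https://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv

Then install from the local file:
pip install opencv_python-3.4.3-cp37-cp37m-win_amd64.whl
The version of the downloaded file must match the version and bitness of your local Python.

If other problems come up during installation, search for the error message online; a solution usually exists, so they are not covered one by one here.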

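Image processing -- reading and writing

The data that cv reads is stored as a numpy object, so numpy's capabilities can be used to crop the image and so on.

We use Lenna, the classic image processing test image, for testing.

You can look up the history of this image yourself.
import cv2 as cv
# Read the image (cv.imread returns None if the file cannot be read)
img = cv.imread('img/Lenna.png')
# Image information
print('Image shape:', img.shape)
print('Image data:', type(img), img)
# Show the image
cv.imshow('pic title', img)
cv.waitKey(0)
# Add text
cv.putText(img, 'Learn Python with Crossin', (50, 150), cv.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 4)
# Save the image
cv.imwrite('img/Lenna_new.png', img)
OpenCV-Python manages image data with the ndarray type from the numpy library, which makes numerical computation and conversion convenient.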
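
Because the image is an ordinary ndarray, cropping and channel access are plain numpy slicing. The sketch below uses a synthetic array as a stand-in for the result of cv.imread, so the sizes and values are assumptions made for illustration.

```python
import numpy as np

# Stand-in for an image from cv.imread: 100 rows x 200 columns, channels in B, G, R order
img = np.zeros((100, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255                  # fill the red channel (index 2 in BGR order)

roi = img[20:60, 50:150]            # crop rows 20..59 and columns 50..149 (a view, not a copy)
blue, red = img[:, :, 0], img[:, :, 2]
flipped = img[:, ::-1]              # mirror left to right

print(roi.shape)                    # (40, 100, 3)
print(red.max(), blue.max())        # 255 0
```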

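Image processing -- color conversion

Common image processing operations:

import numpy as np
# Grayscale
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imwrite('img/Lenna_gray.png', img_gray)
# Binarization
_, img_bin = cv.threshold(img_gray, 127, 255, cv.THRESH_BINARY)
cv.imwrite('img/Lenna_bin.png', img_bin)
# Smoothing
img_blur = cv.blur(img, (5, 5))
cv.imwrite('img/Lenna_blur.png', img_blur)
# Contour extraction
# findContours returns 3 values in OpenCV 3.x and 2 in 4.x; the contours are second to last in both
contours = cv.findContours(img_bin, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)[-2]
img_cont = np.zeros(img_bin.shape, np.uint8)
cv.drawContours(img_cont, contours, -1, 255, 3)
cv.imwrite('img/Lenna_cont.png', img_cont)

These are all common digital image processing methods. OpenCV-Python wraps most of them in ready-made interfaces, so a line or two of code does the job, which is very convenient in real project development.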

Result of the changes

Data processing

Data preparation: provide a color input image of the digit 3

https://github.com/fanqingsong/code_snippet/blob/master/machine_learning/hub/images/three.png

After processing, we get an image that conforms to the MNIST standard

https://github.com/fanqingsong/code_snippet/blob/master/machine_learning/hub/images/three_normal.jpg

Code

https://github.com/fanqingsong/code_snippet/blob/master/machine_learning/hub/mnist_sklearn.py

The last few lines contain the image preprocessing

# Read the image
img = cv.imread('images/three.png')
# Grayscale
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
# Binarization
_, img_bin = cv.threshold(img_gray, 127, 255, cv.THRESH_BINARY)
# Resize
img_resized = cv.resize(img_bin, (28,28), interpolation=cv.INTER_AREA)
# Invert so the background is black and the stroke is white, as in MNIST
img_normal = 255 - img_resized
# Save the image
cv.imwrite('images/three_normal.jpg', img_normal)

# flat to one dimension
one_digit_features = img_normal.reshape((-1))

Full code

"""
Test mnist learning on sklearn with hub data
"""

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import check_random_state

from hub import Dataset
import cv2 as cv


print(__doc__)

# Author: Arthur Mensch <arthur.mensch@m4x.org>
# License: BSD 3 clause

# Turn down for faster convergence
train_samples = 5000

print("loading mnist data from hub...")
mnist = Dataset("activeloop/mnist")  # loading the MNIST data lazily

X = mnist["image"].compute()
y = mnist["label"].compute()

print("------- X.shape", X.shape)
print("------- y.shape", y.shape)


print("now train mnist model...")

random_state = check_random_state(0)
permutation = random_state.permutation(X.shape[0])

print("---------- permutation")
print(permutation)

X = X[permutation]
y = y[permutation]

print("------- X.shape", X.shape)
print("------- y.shape", y.shape)

X = X.reshape((X.shape[0], -1))

print("------- X.shape", X.shape)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=train_samples, test_size=10000)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Turn up tolerance for faster convergence
clf = LogisticRegression(
    C=50. / train_samples, penalty='l1', solver='saga', tol=0.1
)
clf.fit(X_train, y_train)

sparsity = np.mean(clf.coef_ == 0) * 100
score = clf.score(X_test, y_test)

# print('Best C % .4f' % clf.C_)
print("Sparsity with L1 penalty: %.2f%%" % sparsity)
print("Test score with L1 penalty: %.4f" % score)

# now enter predicting part
# Read the image
img = cv.imread('images/three.png')
# Grayscale
img_gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
# Binarization
_, img_bin = cv.threshold(img_gray, 127, 255, cv.THRESH_BINARY)
# Resize
img_resized = cv.resize(img_bin, (28,28), interpolation=cv.INTER_AREA)
# Invert so the background is black and the stroke is white, as in MNIST
img_normal = 255 - img_resized
# Save the image
cv.imwrite('images/three_normal.jpg', img_normal)

# flat to one dimension
one_digit_features = img_normal.reshape((-1))

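# clf was trained on standardized features, so apply the same fitted scaler before predicting
result = clf.predict(scaler.transform([one_digit_features]))
print("------ for three digit picture, the predicted result is ", result)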

Output

The last line of the output shows that the prediction is indeed correct: 3

root@xxx:~/win10/mine/code-snippet/machine_learning/hub# python mnist_sklearn.py

Test mnist learning on sklearn with hub data

loading mnist data from hub...
------- X.shape (70000, 28, 28, 1)
------- y.shape (70000,)
now train mnist model...
---------- permutation
[10840 56267 14849 ... 42613 43567 68268]
------- X.shape (70000, 28, 28, 1)
------- y.shape (70000,)
------- X.shape (70000, 784)
Sparsity with L1 penalty: 68.89%
Test score with L1 penalty: 0.8389
------ for three digit picture, the predicted result is [3]

posted @ 2021-03-04 16:42 lightsong
