Building a CAPTCHA Recognition System with Keras

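In this tutorial we use Keras, a high-level neural network API, to build a CAPTCHA recognition system. Keras ships with TensorFlow as tf.keras and simplifies building and training deep learning models. With a convolutional neural network (CNN), we can process image data efficiently and recognize the characters in a CAPTCHA.

We will build a CNN with Keras for the CAPTCHA recognition task and show how to train and evaluate the model.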

Environment Setup
First, make sure TensorFlow and a few supporting libraries are installed:
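pip install tensorflow numpy matplotlib opencv-python pillow scikit-learn
tensorflow: builds and trains the deep learning model.

numpy: handles array data.

matplotlib: visualizes results.

opencv-python: processes images.

pillow: reads and manipulates images (optional here; the code below reads images with OpenCV).

scikit-learn: splits the data into training and validation sets.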
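
Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch that uses only the standard library; the helper name find_missing and the REQUIRED_MODULES table are illustrative and not part of the original tutorial. Note that some pip names differ from their import names (opencv-python is imported as cv2, pillow as PIL, scikit-learn as sklearn; the latter is listed because the data-splitting step later imports it).

```python
import importlib.util

# Map pip package names to the module names used in import statements.
REQUIRED_MODULES = {
    "tensorflow": "tensorflow",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "opencv-python": "cv2",
    "pillow": "PIL",
    "scikit-learn": "sklearn",
}


def find_missing(modules):
    """Return the pip names of packages whose import module cannot be found."""
    missing = []
    for pip_name, module_name in modules.items():
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module_name) is None:
            missing.append(pip_name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

The check only looks up module specs and does not import TensorFlow, so it runs quickly.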

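Dataset Preparation and Image Preprocessing
To train the model, we need a training set of CAPTCHA images. Preprocessing consists of grayscale conversion, binarization, resizing, and normalization.

(1) Image preprocessing
We first convert the image to grayscale, then binarize it, and finally resize it to a uniform size. OpenCV handles both reading and processing the image.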

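import cv2
import numpy as np

def preprocess_image(img_path, img_size=(64, 64)):
    # Read the image (cv2.imread returns None if the file cannot be read)
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f'Cannot read image: {img_path}')

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarize with Otsu's method (inverted: dark pixels become white)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Resize to a uniform size
    resized_img = cv2.resize(binary, img_size)

    # Normalize pixel values to [0, 1]
    normalized_img = resized_img / 255.0

    return normalized_img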

# Example image path
img_path = 'captcha_images/test1.png'
processed_img = preprocess_image(img_path)

# Display the processed image
import matplotlib.pyplot as plt
plt.imshow(processed_img, cmap='gray')
plt.show()
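In this snippet, OpenCV reads the image, converts it to grayscale, binarizes it with Otsu's method, resizes it to 64x64, and scales the pixel values to [0, 1]. The threshold is inverted, so for a CAPTCHA with dark characters on a light background the characters end up white on black.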
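
To make the binarization step less of a black box, here is a simplified pure-Python sketch of the idea behind Otsu's method: try every threshold and keep the one that maximizes the between-class variance of the two pixel groups. The function names otsu_threshold and binarize_inv are illustrative; OpenCV's own implementation is optimized and may break ties differently, so treat this as an explanation rather than a replacement.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class
    sum0 = 0  # intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break     # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_inv(pixels, t):
    """Inverted binary threshold: values <= t become 255, others become 0."""
    return [255 if p <= t else 0 for p in pixels]


# Toy "image": 50 dark pixels (value 10) and 50 bright pixels (value 200)
toy_pixels = [10] * 50 + [200] * 50
t = otsu_threshold(toy_pixels)
print(t, binarize_inv([10, 200], t))
```

On this toy input any threshold between the two intensity groups separates them equally well, which is why the dark pixels map to 255 and the bright ones to 0 after inversion.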

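Building the CNN Model
Next we build the convolutional neural network (CNN) with Keras. CNNs are well suited to images because they learn to extract image features automatically. In this example we build a simple CNN that recognizes a CAPTCHA character.

(1) Defining the model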

from tensorflow.keras import layers, models

def build_cnn_model(input_shape=(64, 64, 1), num_classes=36):
    model = models.Sequential()

    # Input: 64x64 grayscale image with one channel
    model.add(layers.Input(shape=input_shape))

    # First convolution block
    model.add(layers.Conv2D(32, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))

    # Second convolution block
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))

    # Third convolution block
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))

    # Flatten the feature maps into a vector
    model.add(layers.Flatten())

    # Fully connected layer
    model.add(layers.Dense(128, activation='relu'))

    # Output layer: one probability per character class
    model.add(layers.Dense(num_classes, activation='softmax'))

    # Compile the model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    return model

# Build the model
model = build_cnn_model()

# Inspect the model architecture
model.summary()
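The model has three convolution and pooling blocks, followed by a fully connected layer and a softmax output layer. The output layer has num_classes units, one per character in the CAPTCHA character set; here we assume the characters are 0-9 and A-Z, 36 in total. Because the network has a single softmax output, it classifies one character per image.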
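
The shapes and parameter counts reported by model.summary() can be checked by hand. The sketch below assumes the Keras defaults used in this model: 3x3 convolutions with 'valid' padding and stride 1, and 2x2 max pooling with stride 2. The helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output size of max pooling whose stride equals the pool size."""
    return size // pool


def trace_model(input_size=64, in_channels=1, filters=(32, 64, 128),
                dense_units=128, num_classes=36):
    """Return (final feature map size, flattened length, total parameters)."""
    size, channels, params = input_size, in_channels, 0
    for f in filters:
        # Each filter has 3*3*channels weights plus one bias
        params += 3 * 3 * channels * f + f
        size = pool_out(conv_out(size))
        channels = f
    flat = size * size * channels
    params += flat * dense_units + dense_units       # hidden dense layer
    params += dense_units * num_classes + num_classes  # output layer
    return size, flat, params


print(trace_model())  # → (6, 4608, 687268)
```

The spatial size shrinks 64 → 62 → 31 → 29 → 14 → 12 → 6, so Flatten produces 6 x 6 x 128 = 4608 values, and most of the roughly 687 thousand parameters sit in the first dense layer.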

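Data Loading and Training
To train the model, we load the CAPTCHA images from the file system and one-hot encode the labels. During training the data is fed to the model in batches.

(1) Loading data and encoding labels
Each image is preprocessed with preprocess_image. Each label character is first mapped to an integer class index and then one-hot encoded, because to_categorical expects integers rather than strings.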

from tensorflow.keras.utils import to_categorical
import os

# Character set: digits 0-9 followed by uppercase A-Z (36 classes)
CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Assume the images are stored in the 'captcha_images' folder
def load_data(image_dir, img_size=(64, 64)):
    images = []
    labels = []
    for filename in sorted(os.listdir(image_dir)):
        if filename.endswith('.png'):
            img_path = os.path.join(image_dir, filename)
            img = preprocess_image(img_path, img_size)
            images.append(img)

            # Extract the label. Assumption: each image shows a single
            # character and the file name starts with it, e.g. 'A_001.png'
            label = filename[0].upper()
            labels.append(CHARSET.index(label))

    # Convert to numpy arrays and add the channel dimension: (N, H, W, 1)
    images = np.expand_dims(np.array(images, dtype='float32'), axis=-1)
    labels = np.array(labels)

    # One-hot encode the integer class indices
    labels = to_categorical(labels, num_classes=len(CHARSET))

    return images, labels

# Load the training data
images, labels = load_data('captcha_images')

# Split into training and validation sets
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(images, labels, test_size=0.2, random_state=42)
In the code above, we load the images, one-hot encode the labels, and use train_test_split from scikit-learn to hold out 20% of the data as a validation set.
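
One-hot encoding works on integer class indices, so character labels have to be mapped to integers first. As a minimal sketch of what that encoding looks like, assuming the classes are ordered 0-9 followed by A-Z (the names CHAR_TO_INDEX and one_hot are illustrative):

```python
import string

# Assumed class order: '0'-'9' are indices 0-9, 'A'-'Z' are indices 10-35
CHARSET = string.digits + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARSET)}


def one_hot(char, num_classes=len(CHARSET)):
    """Return a list with 1.0 at the character's class index, 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[CHAR_TO_INDEX[char.upper()]] = 1.0
    return vec


encoded = one_hot('A')
print(len(encoded), encoded.index(1.0))  # → 36 10
```

The same class order must be used when decoding predictions, otherwise indices map to the wrong characters.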

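Training the Model
We train with categorical_crossentropy as the loss function and the Adam optimizer, both set when the model was compiled. Here we use a batch size of 32 and train for 10 epochs.

(1) Training the model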
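
For a single sample with a one-hot label, categorical cross-entropy reduces to the negative log of the probability the model assigns to the correct class. The following is a simplified pure-Python sketch for one sample; the clipping constant eps is an assumption added to avoid log(0), and Keras computes the loss over whole batches.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # Clip probabilities so that log(0) never occurs
    clipped = [min(max(p, eps), 1 - eps) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


label = [0.0, 1.0, 0.0]       # the true class is index 1
confident = [0.1, 0.8, 0.1]   # 80% on the correct class
unsure = [0.4, 0.3, 0.3]      # only 30% on the correct class

print(categorical_crossentropy(label, confident))  # about 0.223, i.e. -ln(0.8)
print(categorical_crossentropy(label, unsure))     # larger, since -ln(0.3) > -ln(0.8)
```

The loss falls toward zero as the predicted probability of the correct class approaches 1, which is what the optimizer pushes the model toward.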

history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))

# Plot training and validation accuracy
import matplotlib.pyplot as plt
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0, 1])
plt.legend(loc='lower right')
plt.show()
This snippet trains the model and plots how training and validation accuracy change over the epochs.
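
Besides plotting, the same history.history dictionary can be summarized in code. The sketch below finds the epoch with the highest validation accuracy and the gap between training and validation accuracy at that epoch; a large gap suggests overfitting. The function name summarize_history is illustrative, and the numbers in sample_history are made up for demonstration, not results from this model.

```python
def summarize_history(history_dict):
    """Return (best epoch, best validation accuracy, train/validation gap)."""
    train_acc = history_dict["accuracy"]
    val_acc = history_dict["val_accuracy"]
    # Index of the epoch with the highest validation accuracy
    best = max(range(len(val_acc)), key=lambda i: val_acc[i])
    gap = train_acc[best] - val_acc[best]
    return best + 1, val_acc[best], gap  # epochs are reported 1-based


# Made-up values with the same structure as history.history
sample_history = {
    "accuracy": [0.50, 0.70, 0.90, 0.95],
    "val_accuracy": [0.40, 0.60, 0.80, 0.75],
}

best_epoch, best_val_acc, gap = summarize_history(sample_history)
print(f"best epoch: {best_epoch}, val_accuracy: {best_val_acc:.2f}, gap: {gap:.2f}")
```

With a real run, pass history.history from model.fit instead of the sample dictionary.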

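Making Predictions
After training, we can use the model to predict new CAPTCHA images. For each new image, the model outputs the predicted class of one character.

(1) Making a prediction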

def predict_captcha(model, img_path):
    # Preprocess the image
    processed_img = preprocess_image(img_path)
    processed_img = np.expand_dims(processed_img, axis=0)   # add the batch dimension
    processed_img = np.expand_dims(processed_img, axis=-1)  # add the channel dimension

    # Predict class probabilities and take the most probable class index
    prediction = model.predict(processed_img)
    predicted_label = np.argmax(prediction, axis=1)

    return predicted_label

# Predict a new CAPTCHA image
predicted_label = predict_captcha(model, 'captcha_images/test1.png')
print(f'Predicted label: {predicted_label}')
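Here we preprocess the image and pass it to the model. argmax returns the index of the most probable class, not the character itself; the index maps back to a character through the character set used for training (assuming the order 0-9 then A-Z, indices 0-9 are digits and 10-35 are letters).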
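
Turning the predicted index back into a character is a lookup in the character set. The following is a minimal sketch, assuming the classes are ordered 0-9 followed by A-Z; decode_prediction is an illustrative helper, and the probability vector here is hand-written rather than real model output.

```python
import string

# Assumed class order: '0'-'9' are indices 0-9, 'A'-'Z' are indices 10-35
CHARSET = string.digits + string.ascii_uppercase


def decode_prediction(probabilities):
    """Return (character, probability) for the most probable class."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CHARSET[index], probabilities[index]


# Hand-written stand-in for one row of model.predict output
probs = [0.0] * 36
probs[11] = 0.9  # index 11 corresponds to 'B'
probs[8] = 0.1   # index 8 corresponds to '8'

char, confidence = decode_prediction(probs)
print(char, confidence)  # → B 0.9
```

With a trained model, one row of the array returned by model.predict can be passed in place of probs.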

posted @ 2025-04-06 16:12 ttocr、com
