CAPTCHA Recognition with TensorFlow and a CNN

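In this article we build a CAPTCHA recognition system with TensorFlow and a convolutional neural network (CNN). CNNs are a standard approach to image recognition because they learn features directly from the pixels and classify them, without hand-crafted feature extraction. Here we use one to recognize the characters in CAPTCHA images.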

Environment Setup
First, make sure TensorFlow and the supporting libraries are installed:

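pip install tensorflow opencv-python numpy matplotlib pillow scikit-learn
tensorflow: builds and trains the deep learning model.

opencv-python: image processing.

numpy: numerical computing.

matplotlib: visualization of results.

pillow: image loading and processing.

scikit-learn: label encoding and the train/validation split (imported by the code later in this article).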
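Before moving on, it can help to confirm that the packages are actually installed in the active environment. The sketch below is one possible check using only the standard library. The helper name installed_versions is made up for this illustration, and the package list mirrors the pip command above plus scikit-learn, which the later code imports.

```python
from importlib import metadata

# Distribution names as used with pip. scikit-learn is included because
# later snippets import from sklearn.
PACKAGES = ["tensorflow", "opencv-python", "numpy", "matplotlib", "pillow", "scikit-learn"]

def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Print a short report for the current environment
for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can be installed with pip before running the rest of the code.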

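Dataset Preparation and Image Preprocessing
We first need a CAPTCHA dataset: a set of images containing characters. Each image goes through a few preprocessing steps: grayscale conversion, binarization, resizing, and normalization.

(1) Image preprocessing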

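import cv2
import numpy as np
import os
from tensorflow.keras.preprocessing.image import img_to_array

def preprocess_image(img_path, img_size=(64, 64)):
    # Read the image (cv2.imread returns None if the file cannot be read)
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {img_path}")

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarize with Otsu's threshold, inverted so dark strokes become white
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Resize to a fixed size (cv2.resize expects (width, height))
    resized_img = cv2.resize(binary, img_size)

    # Normalize pixel values to [0, 1]
    normalized_img = resized_img / 255.0

    # Convert to a float array of shape (height, width, 1)
    img_array = img_to_array(normalized_img)

    return img_array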

# Example image path
img_path = 'captcha_images/test1.png'
processed_img = preprocess_image(img_path)

# Display the preprocessed image
import matplotlib.pyplot as plt
plt.imshow(processed_img.squeeze(), cmap='gray')  # drop the channel axis for display
plt.show()
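This code reads an image, converts it to grayscale, binarizes it, and resizes it to a fixed size. The pixel values are then scaled to the [0, 1] range, which generally makes neural network training more stable.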
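To make the binarization step less of a black box, here is a rough NumPy sketch of what Otsu thresholding followed by inversion and scaling does. It is not OpenCV's implementation (tie-breaking and rounding may differ), and the function names otsu_threshold and binarize_inverted, as well as the tiny synthetic image, are made up for illustration.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t (0-255) that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum_count = np.cumsum(hist)                   # number of pixels with value <= t
    cum_sum = np.cumsum(hist * np.arange(256))    # intensity sum of those pixels
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = cum_count[t]          # size of the "dark" class
        w1 = total - w0            # size of the "light" class
        if w0 == 0 or w1 == 0:
            continue               # one class is empty, nothing to separate
        mu0 = cum_sum[t] / w0
        mu1 = (cum_sum[-1] - cum_sum[t]) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def binarize_inverted(gray):
    """Pixels above the threshold become 0, the rest 255, then scale to [0, 1]."""
    t = otsu_threshold(gray)
    binary = np.where(gray > t, 0, 255).astype(np.uint8)
    return binary / 255.0

# Synthetic image: dark "strokes" (value 30) on a light background (value 220)
gray = np.full((8, 8), 220, dtype=np.uint8)
gray[2:6, 3:5] = 30
normalized = binarize_inverted(gray)
print(otsu_threshold(gray), normalized.min(), normalized.max())
```

After inversion the dark strokes end up as 1.0 and the light background as 0.0, so the character pixels are the ones that carry signal into the network.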

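Label Encoding and Data Preparation
The character labels must be converted to a numeric format. After that, the image and label arrays are ready to be split into training and validation sets.

(1) Label encoding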

from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Assume the CAPTCHA alphabet is 0-9 and A-Z (36 characters in total)
def encode_labels(labels, num_classes=36):
    label_encoder = LabelEncoder()
    labels_encoded = label_encoder.fit_transform(labels)
    labels_onehot = to_categorical(labels_encoded, num_classes=num_classes)
    return labels_onehot, label_encoder

# Read the image data and labels
def load_data(image_dir, img_size=(64, 64)):
    images = []
    labels = []
    # Sort the file names so the loading order is reproducible
    for filename in sorted(os.listdir(image_dir)):
        if filename.endswith('.png'):
            img_path = os.path.join(image_dir, filename)
            img = preprocess_image(img_path, img_size)
            images.append(img)

            # The label is the part of the file name before the first '.'
            label = filename.split('.')[0]
            labels.append(label)

    images = np.array(images)
    labels = np.array(labels)

    # One-hot encode the labels
    labels_onehot, label_encoder = encode_labels(labels)

    return images, labels_onehot, label_encoder

# Load the dataset
image_dir = 'captcha_images'
X, y, label_encoder = load_data(image_dir)
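Here LabelEncoder maps each distinct label to an integer index, and to_categorical turns those indices into one-hot vectors, giving the arrays X and y used for training. Note that the label is the part of the file name before the first dot, so this setup assumes each image shows a single character and that its file name starts with that character. More than 36 distinct labels would not fit num_classes=36.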
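The encoding itself is simple enough to write out by hand. The sketch below is a NumPy illustration of the same idea, not the scikit-learn or Keras implementation. It assumes a fixed 36-character alphabet (ALPHABET and one_hot are names invented here), whereas LabelEncoder builds its class list from the sorted distinct labels it sees in the data.

```python
import numpy as np

# Assumed class order: digits first, then uppercase letters (36 classes)
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def one_hot(labels, alphabet=ALPHABET):
    """Return (integer indices, one-hot matrix) for single-character labels."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    encoded = np.array([index[ch] for ch in labels])
    onehot = np.zeros((len(labels), len(alphabet)), dtype=np.float32)
    onehot[np.arange(len(labels)), encoded] = 1.0   # one 1 per row
    return encoded, onehot

encoded, onehot = one_hot(["A", "7", "Z"])

# Decoding reverses the mapping: take the index of the 1 in each row
decoded = [ALPHABET[i] for i in onehot.argmax(axis=1)]
print(encoded, onehot.shape, decoded)
```

A fixed alphabet keeps the index of each character stable across datasets, which matters when a saved model is reused on data that does not contain every class.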

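Building the CNN Model
Next we build the convolutional neural network that will recognize the CAPTCHA characters. The model is defined with Keras, TensorFlow's high-level API.

(1) Define the CNN model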

from tensorflow.keras import layers, models

def build_cnn_model(input_shape=(64, 64, 1), num_classes=36):
    model = models.Sequential()

    # First convolutional layer
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(layers.MaxPooling2D((2, 2)))

    # Second convolutional layer
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))

    # Flatten the feature maps into a vector
    model.add(layers.Flatten())

    # Fully connected layer
    model.add(layers.Dense(128, activation='relu'))

    # Output layer: one probability per class
    model.add(layers.Dense(num_classes, activation='softmax'))

    # Compile the model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    return model

# Build the CNN model
model = build_cnn_model()

# Inspect the model architecture
model.summary()
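The model has two convolutional layers (Conv2D), each followed by a max-pooling layer (MaxPooling2D) that reduces the spatial size of the feature maps. The feature maps are then flattened and passed through a fully connected layer, and a softmax output layer predicts a probability for each character class.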
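The layer sizes can be checked by hand. The sketch below assumes what the model code relies on: 3x3 convolutions with the Keras default 'valid' padding and stride 1, and 2x2 pooling that floors odd sizes. The helper names conv_out and pool_out are made up for this illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping pooling (odd sizes are floored)."""
    return size // pool

size, channels = 64, 1      # 64x64 grayscale input
params = []
for filters in (32, 64):    # the two Conv2D layers
    params.append(3 * 3 * channels * filters + filters)   # weights + biases
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels          # length of the flattened vector
params.append(flat * 128 + 128)        # Dense(128)
params.append(128 * 36 + 36)           # Dense(36) output layer
total_params = sum(params)

print(size, flat, params, total_params)
```

Under these assumptions the 64x64 input shrinks to 14x14x64 before flattening, and almost all of the roughly 1.6 million parameters sit in the first dense layer. The totals printed by model.summary() should match.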

Training the CNN Model
We train with the Keras API in TensorFlow, using the categorical cross-entropy loss (categorical_crossentropy) and the Adam optimizer.
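For readers who want to see what the loss measures, here is a small NumPy sketch of softmax followed by categorical cross-entropy. It illustrates the formula only and is not the Keras implementation; the logits and labels are invented numbers, and eps is simply a small constant chosen here to avoid log(0).

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

# Two samples, three classes (illustrative values only)
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

# A model that guesses uniformly scores log(number of classes)
uniform_loss = categorical_crossentropy(y_true, np.full((2, 3), 1.0 / 3.0))
print(loss, uniform_loss)
```

The uniform-guess value is a handy baseline: with 36 classes, a loss that stays near log(36), about 3.58, suggests the model has not learned anything yet.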

(1) Train the model

# Split into training and validation sets
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))

# Plot accuracy over the course of training
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()
This code splits the dataset into training and validation sets, trains the model, and plots the training and validation accuracy for each epoch.
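Conceptually, the split is just a shuffle followed by holding out a fraction of the samples. The sketch below shows that idea in NumPy. It will not reproduce the exact indices that train_test_split picks, since scikit-learn uses its own random state handling; shuffled_split and the stand-in arrays are invented for illustration.

```python
import numpy as np

def shuffled_split(X, y, val_fraction=0.2, seed=42):
    """Shuffle the indices once, then hold out the last val_fraction for validation."""
    n = len(X)
    n_val = int(round(n * val_fraction))
    order = np.random.default_rng(seed).permutation(n)
    train_idx, val_idx = order[:n - n_val], order[n - n_val:]
    return X[train_idx], X[val_idx], y[train_idx], y[val_idx]

# Stand-in data: 10 blank "images" of shape (64, 64, 1) with integer labels
X_demo = np.zeros((10, 64, 64, 1), dtype=np.float32)
y_demo = np.arange(10)

X_tr, X_va, y_tr, y_va = shuffled_split(X_demo, y_demo)
print(X_tr.shape, X_va.shape)
```

Fixing the seed keeps the held-out samples the same from run to run, so accuracy curves from different experiments remain comparable.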

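Model Evaluation and Testing
Once training is complete, we evaluate the model on held-out data and check its accuracy. The code in this article does not set aside a separate test set, so the validation set is used here.

(1) Evaluate the model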

# Evaluate the model on the validation set
test_loss, test_acc = model.evaluate(X_val, y_val)
print(f"Validation loss: {test_loss:.4f}")
print(f"Validation accuracy: {test_acc:.4f}")
This code calls the evaluate method to compute the model's loss and accuracy on the validation set.
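The accuracy figure is the fraction of samples whose most probable predicted class matches the true class. A minimal NumPy sketch of that computation, using made-up probabilities rather than real model output (categorical_accuracy is a name chosen for this example):

```python
import numpy as np

def categorical_accuracy(y_true_onehot, y_prob):
    """Fraction of rows where the argmax of the prediction equals the true class."""
    true_idx = y_true_onehot.argmax(axis=1)
    pred_idx = y_prob.argmax(axis=1)
    return float((true_idx == pred_idx).mean())

# Four samples over three classes (illustrative values, not real results)
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=np.float32)
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, truth is class 2
                   [0.2, 0.5, 0.3]])

acc = categorical_accuracy(y_true, y_prob)
print(acc)
```

Because accuracy only looks at the top class, it says nothing about how confident the wrong predictions were; the loss value reported alongside it captures that part.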

Predicting on New Images
Once the model is trained, we can use it to predict new CAPTCHA images.

(1) Make a prediction

def predict_captcha(model, img_path, label_encoder):
    img = preprocess_image(img_path)

    # Add a batch dimension and run the prediction
    img = np.expand_dims(img, axis=0)  # shape becomes (1, 64, 64, 1)
    prediction = model.predict(img)

    # Take the most probable class and map it back to its label
    predicted_label_encoded = np.argmax(prediction, axis=1)
    predicted_label = label_encoder.inverse_transform(predicted_label_encoded)

    return predicted_label[0]

# Predict a new CAPTCHA
new_image_path = 'captcha_images/test1.png'
predicted_label = predict_captcha(model, new_image_path, label_encoder)
print(f'Predicted CAPTCHA: {predicted_label}')
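This code preprocesses the image and passes it to the model. argmax selects the class with the highest probability, which is then decoded back to the original character label.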
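argmax returns only the single best class. When debugging misreads it can be useful to look at the runner-up classes as well. The sketch below is one way to do that, assuming classes holds the labels in the same order as the model's outputs (with the code in this article that would be label_encoder.classes_). The function decode_prediction and the probability values are invented for illustration.

```python
import numpy as np

def decode_prediction(probs, classes, top_k=3):
    """Return the top_k (label, probability) pairs, most likely first."""
    probs = np.asarray(probs).ravel()
    order = np.argsort(probs)[::-1][:top_k]   # indices sorted by descending probability
    return [(classes[i], float(probs[i])) for i in order]

# Assumed class order: digits, then uppercase letters
classes = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# Made-up softmax output: mostly 'A', with '8' as a distant second
probs = np.full(36, 0.01)
probs[10] = 0.60   # index of 'A'
probs[8] = 0.06    # index of '8'

top = decode_prediction(probs, classes)
print(top[:2])
```

A small gap between the first and second probabilities is a hint that the image is ambiguous, which is often more informative than the predicted label alone.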

posted @ 2025-04-06 13:29 ttocr、com
