Image CAPTCHA Recognition with Java and Deeplearning4j

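CAPTCHA recognition is one application of image classification, commonly used in automated testing and web crawling. This article shows how to build a simple image CAPTCHA recognition system with Java and the deep learning framework Deeplearning4j.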

1. Set up the development environment
Create the project with Maven and add the following dependencies to pom.xml:


    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-core</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>


    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native-platform</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>

    <dependency>
        <groupId>org.datavec</groupId>
        <artifactId>datavec-api</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>

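2. Load the CAPTCHA data
Assume you already have an image folder captcha_samples/ whose files are named like 7K2B_1.png, where 7K2B is the CAPTCHA text.
Because the label is part of the file name, ParentPathLabelGenerator (which takes the label from the parent directory name) does not apply here. The label is instead extracted by a custom PathLabelGenerator, the CaptchaLabelGenerator used in step 3.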
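As a minimal sketch of the file-name convention described above, the following plain-Java helper pulls the CAPTCHA text out of a name such as 7K2B_1.png. The class name CaptchaFileNames and the method labelOf are hypothetical, and the code assumes the label is everything before the first underscore.

```java
import java.nio.file.Paths;

final class CaptchaFileNames {
    private CaptchaFileNames() {}

    /** Returns the CAPTCHA text encoded in a file name such as "7K2B_1.png". */
    static String labelOf(String path) {
        // Drop any directory part so "captcha_samples/7K2B_1.png" also works
        String fileName = Paths.get(path).getFileName().toString();
        int underscore = fileName.indexOf('_');
        if (underscore <= 0) {
            throw new IllegalArgumentException("No label prefix in file name: " + fileName);
        }
        return fileName.substring(0, underscore);
    }

    public static void main(String[] args) {
        System.out.println(labelOf("captcha_samples/7K2B_1.png")); // prints 7K2B
    }
}
```

A label generator for the record reader could delegate to a helper like this one when it is handed a file path.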
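    File parentDir = new File("captcha_samples");
    // Collect every file under captcha_samples/ that has a supported image extension
    FileSplit fileSplit = new FileSplit(parentDir, NativeImageLoader.ALLOWED_FORMATS);
    // RandomPathFilter (org.datavec.api.io.filters) shuffles the paths; BalancedPathFilter
    // requires a label generator and balances per label, which does not suit per-file labels
    RandomPathFilter pathFilter = new RandomPathFilter(new Random(), NativeImageLoader.ALLOWED_FORMATS);
    // 80% of the images for training, 20% for testing
    InputSplit[] inputSplits = fileSplit.sample(pathFilter, 0.8, 0.2);
    InputSplit trainData = inputSplits[0];
    InputSplit testData = inputSplits[1];

3. Image conversion and label encoding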
    int height = 60;
    int width = 160;
    int channels = 1;
    int outputNum = 36; // 0-9, A-Z
    int labelLength = 4;

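    // ImageRecordReader already scales each image to height x width; the transform makes the resize explicit
    ImageTransform transform = new ResizeImageTransform(width, height);

    ImageRecordReader recordReader = new ImageRecordReader(height, width, channels, new CaptchaLabelGenerator());
    recordReader.initialize(trainData, transform);

    DataSetIterator trainIter = new RecordReaderDataSetIterator.Builder(recordReader, 32)
            .classification(1, outputNum * labelLength)
            .build();
    // Scale pixel values from 0-255 to 0-1, matching the division by 255 applied at prediction time
    trainIter.setPreProcessor(new ImagePreProcessingScaler(0, 1));

The custom CaptchaLabelGenerator (not shown in this article) extracts the index of each character from the file name. Note that classification(1, n) builds a one-hot vector from a single class index per image, so it cannot by itself mark four characters. The labels have to reach the network as labelLength blocks of outputNum entries, with one active entry per block, which is the layout that an output of size outputNum * labelLength implies.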
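To make that label layout concrete, here is a small plain-Java sketch of one way to encode a four-character label as a vector of outputNum * labelLength = 144 entries, with one block of 36 entries per character position. The class CaptchaLabels and its encode method are hypothetical and are not part of Deeplearning4j; wiring such a vector into a DataSetIterator is deliberately left out.

```java
final class CaptchaLabels {
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final int LABEL_LENGTH = 4;

    private CaptchaLabels() {}

    /** Encodes e.g. "7K2B" as a 144-element vector with a single 1 per character position. */
    static float[] encode(String label) {
        if (label.length() != LABEL_LENGTH) {
            throw new IllegalArgumentException("Expected " + LABEL_LENGTH + " characters: " + label);
        }
        float[] vector = new float[ALPHABET.length() * LABEL_LENGTH];
        for (int pos = 0; pos < LABEL_LENGTH; pos++) {
            int classIndex = ALPHABET.indexOf(label.charAt(pos));
            if (classIndex < 0) {
                throw new IllegalArgumentException("Unsupported character: " + label.charAt(pos));
            }
            // Block `pos` covers entries [pos * 36, (pos + 1) * 36)
            vector[pos * ALPHABET.length() + classIndex] = 1f;
        }
        return vector;
    }

    public static void main(String[] args) {
        float[] v = encode("7K2B");
        // Active entries: 7 ('7'), 36 + 20 ('K'), 72 + 2 ('2'), 108 + 11 ('B')
        for (int i = 0; i < v.length; i++) {
            if (v[i] == 1f) System.out.print(i + " "); // prints 7 56 74 119
        }
        System.out.println();
    }
}
```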
4. Build the CNN model
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .updater(new Adam(0.001))
            .list()
            // Two convolution + max-pooling stages extract features from the image
            .layer(new ConvolutionLayer.Builder(3, 3).nIn(channels).nOut(32).activation(Activation.RELU).build())
            .layer(new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2, 2}).build())
            .layer(new ConvolutionLayer.Builder(3, 3).nOut(64).activation(Activation.RELU).build())
            .layer(new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2, 2}).build())
            .layer(new DenseLayer.Builder().nOut(256).activation(Activation.RELU).build())
            // One output per (position, character) pair: labelLength blocks of outputNum entries
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                    .activation(Activation.SOFTMAX)
                    .nOut(outputNum * labelLength).build())
            // ImageRecordReader yields 4D input [batch, channels, height, width]
            .setInputType(InputType.convolutional(height, width, channels))
            .build();
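To see how many values the dense layer receives from this configuration, the layer sizes can be worked out by hand. The sketch below is plain Java and assumes stride 1 with no padding for the 3x3 convolutions, stride 2 for the 2x2 max pooling, and output sizes rounded down; these are the defaults as far as this sketch assumes, and are worth checking against the Deeplearning4j version in use. The class CnnShapes is hypothetical.

```java
final class CnnShapes {
    private CnnShapes() {}

    /** Output size of a convolution or pooling window along one dimension, without padding. */
    static int outSize(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1; // integer division rounds down
    }

    /** Number of values fed into the dense layer for the given input height and width. */
    static int flattenedSize(int height, int width) {
        int h = height;
        int w = width;
        h = outSize(h, 3, 1); w = outSize(w, 3, 1); // conv 3x3:  60 x 160 -> 58 x 158
        h = outSize(h, 2, 2); w = outSize(w, 2, 2); // pool 2x2:  58 x 158 -> 29 x 79
        h = outSize(h, 3, 1); w = outSize(w, 3, 1); // conv 3x3:  29 x 79  -> 27 x 77
        h = outSize(h, 2, 2); w = outSize(w, 2, 2); // pool 2x2:  27 x 77  -> 13 x 38
        return h * w * 64; // 64 feature maps come out of the second convolution
    }

    public static void main(String[] args) {
        System.out.println(flattenedSize(60, 160)); // prints 31616
    }
}
```

Under these assumptions the dense layer maps 31616 inputs to 256 units, which accounts for most of the model's parameters.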

    MultiLayerNetwork model = new MultiLayerNetwork(conf);
    model.init();
    // Log the training score every 10 iterations
    model.setListeners(new ScoreIterationListener(10));

5. Train the model
    int epochs = 10;
    for (int i = 0; i < epochs; i++) {
        // Each call runs one full pass over the training data
        model.fit(trainIter);
    }

6. Test CAPTCHA recognition
    NativeImageLoader loader = new NativeImageLoader(height, width, channels);
    INDArray image = loader.asMatrix(new File("captcha_samples/7K2B_1.png"));
    image = image.divi(255); // scale pixel values to 0-1, as during training

    INDArray output = model.output(image); // shape [1, outputNum * labelLength]
    // View the scores as one row per character position and pick the best class in each row;
    // argMax(1) on the raw [1, 144] output would return only a single index
    int[] predictedIndices = output.reshape(labelLength, outputNum).argMax(1).toIntVector();

    String alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < labelLength; i++) {
        sb.append(alphabet.charAt(predictedIndices[i]));
    }

    System.out.println("Predicted: " + sb);
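The per-position decoding can also be sketched in plain Java, without INDArray, which makes it easy to unit test. The class CaptchaDecoder is hypothetical; it assumes the score vector is laid out as labelLength consecutive blocks of 36 entries, one block per character position.

```java
final class CaptchaDecoder {
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private CaptchaDecoder() {}

    /** Picks the highest-scoring character in each block of the score vector. */
    static String decode(float[] scores, int labelLength) {
        int numClasses = ALPHABET.length();
        if (scores.length != numClasses * labelLength) {
            throw new IllegalArgumentException("Expected " + numClasses * labelLength + " scores");
        }
        StringBuilder sb = new StringBuilder(labelLength);
        for (int pos = 0; pos < labelLength; pos++) {
            int offset = pos * numClasses;
            int best = 0;
            // Argmax within the block that belongs to this character position
            for (int c = 1; c < numClasses; c++) {
                if (scores[offset + c] > scores[offset + best]) {
                    best = c;
                }
            }
            sb.append(ALPHABET.charAt(best));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        float[] scores = new float[144];
        scores[7] = 0.9f;   // position 0 -> '7'
        scores[56] = 0.8f;  // position 1 -> 'K' (36 + 20)
        scores[74] = 0.7f;  // position 2 -> '2' (72 + 2)
        scores[119] = 0.6f; // position 3 -> 'B' (108 + 11)
        System.out.println(decode(scores, 4)); // prints 7K2B
    }
}
```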

posted @ 2025-05-03 13:41  ttocr、com