Image CAPTCHA Recognition with Rust and Tesseract

1. Preparation
Install Rust

curl https://sh.rustup.rs -sSf | sh
Install Tesseract OCR

Ubuntu

sudo apt install tesseract-ocr

macOS

brew install tesseract
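
Before writing any recognition code, it can help to confirm that the `tesseract` binary is actually reachable on the PATH. The following is a minimal sketch that runs `tesseract --version` from Rust. The function names `first_line` and `tesseract_version` are illustrative only, and the fallback to stderr is an assumption made because some older builds reportedly print the version banner there.

```rust
use std::process::Command;

/// Returns the first non-empty line of `text`, trimmed.
fn first_line(text: &str) -> Option<&str> {
    text.lines().map(str::trim).find(|l| !l.is_empty())
}

/// Runs `tesseract --version` and returns the first line of its banner,
/// or `None` if the binary cannot be started or prints nothing.
fn tesseract_version() -> Option<String> {
    let out = Command::new("tesseract").arg("--version").output().ok()?;
    // Check stdout first, then fall back to stderr (assumption: older builds).
    let stdout = String::from_utf8_lossy(&out.stdout);
    let stderr = String::from_utf8_lossy(&out.stderr);
    let line = first_line(&stdout)
        .or_else(|| first_line(&stderr))
        .map(|s| s.to_string());
    line
}

fn main() {
    match tesseract_version() {
        Some(v) => println!("found: {}", v),
        None => eprintln!("tesseract not found on PATH; install it first"),
    }
}
```

If the program reports that tesseract was not found, repeat the installation step for your platform before continuing.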
2. Create the Project

cargo new captcha_ocr
cd captcha_ocr
Edit Cargo.toml and add the dependency:

[dependencies]
regex = "1"
3. Write the Recognition Code
Edit src/main.rs:

use std::process::Command;
use std::fs;
use regex::Regex;

/// Runs the tesseract CLI on an image and returns the cleaned-up text.
fn recognize_captcha(image_path: &str) -> String {
    let output_base = "rust_output";
    let whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // tesseract writes its result to `<output_base>.txt`.
    let status = Command::new("tesseract")
        .arg(image_path)
        .arg(output_base)
        .arg("-l")
        .arg("eng")
        .arg("-c")
        .arg(format!("tessedit_char_whitelist={}", whitelist))
        .status()
        .expect("failed to execute tesseract");

    if !status.success() {
        return "recognition failed".to_string();
    }

    // Read the output file, then delete it so it does not pile up.
    let txt_file = format!("{}.txt", output_base);
    let raw_text = fs::read_to_string(&txt_file).unwrap_or_default();
    let _ = fs::remove_file(&txt_file);

    // Keep only uppercase letters and digits, dropping whitespace and noise.
    let re = Regex::new(r"[A-Z0-9]").unwrap();
    re.find_iter(&raw_text)
        .map(|m| m.as_str())
        .collect::<Vec<&str>>()
        .join("")
}

fn main() {
    let image_path = "captcha1.png"; // replace with the actual path
    let result = recognize_captcha(image_path);
    println!("Result: {}", result);
}
4. Run the Program

cargo run
Sample output:

Result: 8H9Z
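
The recognition function writes to a fixed `rust_output.txt`, so two runs in the same directory could overwrite each other's output. As one possible alternative, the tesseract CLI accepts `stdout` as the output base, and `--psm 7` asks it to treat the image as a single line of text. The sketch below assumes Tesseract 4 or later, where the flag is spelled `--psm`. The names `recognize_via_stdout` and `keep_captcha_chars` are hypothetical, and the regex is swapped for a plain character filter so the block needs no extra crates.

```rust
use std::process::Command;

/// Keeps only ASCII uppercase letters and digits, like the `[A-Z0-9]` regex.
fn keep_captcha_chars(raw: &str) -> String {
    raw.chars()
        .filter(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
        .collect()
}

/// Runs tesseract with `stdout` as the output base, so no temporary file
/// is created. Returns an error message instead of a sentinel string.
fn recognize_via_stdout(image_path: &str) -> Result<String, String> {
    let output = Command::new("tesseract")
        .arg(image_path)
        .arg("stdout")
        .args(["-l", "eng"])
        .args(["--psm", "7"]) // assumption: the CAPTCHA is one line of text
        .args(["-c", "tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"])
        .output()
        .map_err(|e| format!("failed to start tesseract: {}", e))?;

    if !output.status.success() {
        return Err(format!("tesseract failed: {}", output.status));
    }
    Ok(keep_captcha_chars(&String::from_utf8_lossy(&output.stdout)))
}

fn main() {
    match recognize_via_stdout("captcha1.png") {
        Ok(text) => println!("Result: {}", text),
        Err(e) => eprintln!("error: {}", e),
    }
}
```

Returning a `Result` also lets the caller tell a failed run apart from a CAPTCHA whose text happens to be empty.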
5. Extension: Batch Processing
Replace the main function with the following code:

use std::fs::read_dir;

fn batch_recognize(folder: &str) {
    let paths = read_dir(folder).unwrap();

    for entry in paths {
        let entry = entry.unwrap();
        let path = entry.path();
        if path.extension().and_then(|s| s.to_str()) == Some("png") {
            let filename = path.file_name().unwrap().to_string_lossy();
            let result = recognize_captcha(path.to_str().unwrap());
            println!("{} -> {}", filename, result);
        }
    }
}

fn main() {
    batch_recognize("captchas"); // the directory name can be changed
}
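
The batch loop panics on any I/O error and skips files whose extension is written as `.PNG`. A more forgiving variant, sketched here with the hypothetical helper `list_png_files`, collects the matching paths first, compares the extension case-insensitively, and sorts the paths so the output order is stable. The recognition call itself is left out so the block runs without Tesseract installed.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the sorted paths of all `.png` files (any letter case) in `folder`.
fn list_png_files(folder: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(folder)? {
        let path = entry?.path();
        let is_png = path
            .extension()
            .and_then(|ext| ext.to_str())
            .map_or(false, |ext| ext.eq_ignore_ascii_case("png"));
        if is_png && path.is_file() {
            files.push(path);
        }
    }
    files.sort();
    Ok(files)
}

fn main() {
    match list_png_files(Path::new("captchas")) {
        Ok(files) => {
            for path in files {
                // In the article's program, this is where
                // recognize_captcha(...) would be called.
                println!("{}", path.display());
            }
        }
        Err(e) => eprintln!("cannot read folder: {}", e),
    }
}
```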
6. Save the Results to a File
Add a function that writes the results to a CSV file:

use std::fs::File;
use std::io::Write;

fn save_results_to_csv(folder: &str, output_csv: &str) {
    let paths = read_dir(folder).unwrap();
    let mut file = File::create(output_csv).unwrap();
    writeln!(file, "filename,text").unwrap();

    for entry in paths {
        let entry = entry.unwrap();
        let path = entry.path();
        if path.extension().and_then(|s| s.to_str()) == Some("png") {
            let filename = path.file_name().unwrap().to_string_lossy();
            let result = recognize_captcha(path.to_str().unwrap());
            writeln!(file, "{},{}", filename, result).unwrap();
        }
    }
}

fn main() {
    save_results_to_csv("captchas", "results.csv");
}
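
Writing `filename,text` with a bare `writeln!` works for simple names, but a filename containing a comma or a double quote would corrupt the row. The following sketch applies the usual quoting convention described in RFC 4180: wrap the field in double quotes and double any embedded quotes. `csv_field` and `csv_row` are illustrative names and not part of any library.

```rust
/// Quotes a CSV field if it contains a comma, quote, or line break.
fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        // Double embedded quotes, then wrap the whole field in quotes.
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Builds one `filename,text` row with both fields escaped.
fn csv_row(filename: &str, text: &str) -> String {
    format!("{},{}", csv_field(filename), csv_field(text))
}

fn main() {
    println!("filename,text");
    println!("{}", csv_row("captcha1.png", "8H9Z"));
    println!("{}", csv_row("odd,name.png", "A1B2"));
}
```

Inside `save_results_to_csv`, the row could then be written as `writeln!(file, "{}", csv_row(&filename, &result))`.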

posted @ 2025-07-01 12:56  ttocr、com