Building a CAPTCHA Recognition Tool with Rust and Tesseract
1. Environment Setup
Install Rust
curl https://sh.rustup.rs -sSf | sh
Install Tesseract OCR
Ubuntu / Debian
sudo apt install tesseract-ocr
macOS
brew install tesseract
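Before moving on, it can help to confirm that the tesseract binary is actually reachable on the PATH. The following is a minimal sketch rather than part of the original tool: the helper name is_available is hypothetical, and it assumes the program accepts a --version flag (tesseract does).

```rust
use std::process::Command;

/// Returns true if `program --version` can be started and exits successfully.
/// `is_available` is a hypothetical helper, not part of the original tool.
fn is_available(program: &str) -> bool {
    Command::new(program)
        .arg("--version")
        .output() // capture output so nothing is printed to the terminal
        .map(|out| out.status.success())
        .unwrap_or(false) // the program could not be started at all
}
```

Calling is_available("tesseract") at the start of main would let the tool print an installation hint instead of panicking when Tesseract is missing.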
2. Create the Project
cargo new rust_captcha
cd rust_captcha
Edit Cargo.toml and add the dependency:
[dependencies]
regex = "1"
3. Write the Program
Edit src/main.rs:
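use std::env;
use std::fs;
use std::process::Command;

use regex::Regex;

// Runs the tesseract CLI on the image and returns the recognized text,
// restricted to uppercase letters and digits.
fn run_tesseract(image_path: &str) -> String {
    // Tesseract appends ".txt" to this base name when writing its result.
    let output_base = "output";
    let whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    let whitelist_arg = format!("tessedit_char_whitelist={}", whitelist);

    let status = Command::new("tesseract")
        .args([image_path, output_base, "-l", "eng", "-c", whitelist_arg.as_str()])
        .status()
        .expect("Failed to run tesseract");
    if !status.success() {
        return "recognition failed".to_string();
    }

    // Read the result, then delete the temporary output file.
    let txt_path = format!("{}.txt", output_base);
    let content = fs::read_to_string(&txt_path).unwrap_or_default();
    let _ = fs::remove_file(&txt_path);

    // Keep only characters in A-Z and 0-9; whitespace and OCR noise are dropped.
    let re = Regex::new(r"[A-Z0-9]").unwrap();
    let cleaned: String = re
        .find_iter(&content.to_uppercase())
        .map(|m| m.as_str())
        .collect();
    cleaned
}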
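fn main() {
    // Collect the command-line arguments; args[0] is the program name.
    let args: Vec<String> = env::args().collect();
    if args.len() < 2 {
        eprintln!("Usage: {} <image path>", args[0]);
        return;
    }
    let image_path = &args[1];
    let result = run_tesseract(image_path);
    println!("Result: {}", result);
}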
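The program above writes Tesseract's result to a fixed file name in the current directory, so two runs at the same time could overwrite each other. One possible variant, sketched here under assumptions, passes stdout as the output base so that Tesseract prints the text instead of writing a file, and replaces the regex with a plain character filter. The names WHITELIST, tesseract_args, clean_ocr_text, and recognize are hypothetical and do not appear in the original. The filter is ASCII-only, so it can differ slightly from the regex version on non-ASCII input.

```rust
use std::process::Command;

/// Characters the CAPTCHA is assumed to contain (same whitelist as the program above).
const WHITELIST: &str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

/// Builds the argument list for a Tesseract call that prints its result to stdout.
fn tesseract_args(image_path: &str) -> Vec<String> {
    vec![
        image_path.to_string(),
        "stdout".to_string(), // special output base: print text instead of writing a file
        "-l".to_string(),
        "eng".to_string(),
        "-c".to_string(),
        format!("tessedit_char_whitelist={}", WHITELIST),
    ]
}

/// Keeps only ASCII letters and digits and uppercases them.
fn clean_ocr_text(raw: &str) -> String {
    raw.chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .map(|c| c.to_ascii_uppercase())
        .collect()
}

/// Runs Tesseract and returns the cleaned text.
/// Returns None if the process could not be started or exited with an error.
fn recognize(image_path: &str) -> Option<String> {
    let output = Command::new("tesseract")
        .args(tesseract_args(image_path))
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    Some(clean_ocr_text(&String::from_utf8_lossy(&output.stdout)))
}
```

Returning Option<String> instead of a failure message keeps an error from being mistaken for recognized text. With this variant the regex dependency is no longer needed.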
4. Build and Run
Build:
cargo build --release
Run:
./target/release/rust_captcha ./