Recognizing CAPTCHA Images with Nim and Tesseract

1. Environment Setup
Install Nim:

# Ubuntu/macOS
curl https://nim-lang.org/choosenim/init.sh -sSf | sh

Install Tesseract OCR:

# macOS
brew install tesseract

# Ubuntu
sudo apt install tesseract-ocr

Install a C compiler (if one is missing), since Nim compiles to C:

# Ubuntu
sudo apt install gcc
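
Before moving on, it can help to confirm that the tools installed above are actually reachable. The following is a minimal sketch, not part of the original script: the helper name `missingTools` is hypothetical, and it relies on `findExe` from Nim's `os` module, which looks in the current directory and on PATH and returns an empty string when nothing is found.

```nim
import std/os

proc missingTools(tools: openArray[string]): seq[string] =
  ## Returns the names from `tools` that findExe cannot locate.
  ## (`missingTools` is a hypothetical helper name used for illustration.)
  result = @[]
  for tool in tools:
    # findExe returns "" when the executable is not found
    if findExe(tool).len == 0:
      result.add(tool)

when isMainModule:
  let missing = missingTools(["tesseract", "gcc"])
  if missing.len == 0:
    echo "All required tools were found."
  else:
    for tool in missing:
      echo "Not found: ", tool
```

If `tesseract` is reported as not found, the recognition script in the next section will fail when it tries to run the command.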
2. Writing the Recognition Script
Create the file captcha_ocr.nim:

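import os

proc cleanText(s: string): string =
  # Keep only uppercase letters and digits
  result = ""
  for c in s:
    if c in {'A'..'Z', '0'..'9'}:
      result.add(c)

proc recognizeCaptcha(imagePath: string) =
  let outputBase = "output"
  let whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

  # Build the Tesseract command line; quoteShell protects paths that contain spaces
  let cmd = "tesseract " & quoteShell(imagePath) & " " & outputBase &
            " -l eng -c tessedit_char_whitelist=" & whitelist
  # execShellCmd returns the exit code; non-zero means Tesseract failed
  if execShellCmd(cmd) != 0:
    echo "Recognition failed: tesseract exited with an error"
    return

  # Tesseract writes its result to <outputBase>.txt
  let outputFile = outputBase & ".txt"
  if fileExists(outputFile):
    let raw = readFile(outputFile)
    let cleaned = cleanText(raw)
    echo "Result: ", cleaned
    removeFile(outputFile)
  else:
    echo "Recognition failed: output file was not generated"

# Example call
recognizeCaptcha("captcha1.png")  # replace with the path to your image file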
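
The script above goes through a temporary `output.txt` file and prints its result. As an alternative sketch, Tesseract also accepts the literal word `stdout` as its output base, so the text can be captured directly and returned as a string. The names `tesseractArgs` and `recognizeText` are hypothetical and not part of the original script; `execProcess` from `osproc` is called with an argument list so that no shell quoting is needed. This version does not check the exit code, so treat it as a starting point only.

```nim
import std/[os, osproc]

const whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

proc cleanText(s: string): string =
  # Keep only uppercase letters and digits
  result = ""
  for c in s:
    if c in {'A'..'Z', '0'..'9'}:
      result.add(c)

proc tesseractArgs(imagePath: string): seq[string] =
  ## Argument list for `tesseract <image> stdout -l eng -c ...`.
  ## (Hypothetical helper; mirrors the flags used in the original script.)
  @[imagePath, "stdout", "-l", "eng",
    "-c", "tessedit_char_whitelist=" & whitelist]

proc recognizeText(imagePath: string): string =
  ## Runs Tesseract without a shell and returns the cleaned text.
  ## Only stdout is captured because poStdErrToStdOut is left out.
  let raw = execProcess("tesseract", args = tesseractArgs(imagePath),
                        options = {poUsePath})
  cleanText(raw)

when isMainModule:
  if paramCount() >= 1:
    echo recognizeText(paramStr(1))
  else:
    echo "usage: pass the path of a CAPTCHA image as the first argument"
```

Returning the string instead of printing it makes the function easier to reuse, for example when collecting results for many files.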
3. Running the Program
Compile and run:

nim c -r captcha_ocr.nim

Sample output:

Result: 3KZ8
4. Possible Extensions
You can extend the script in the following ways:

Iterate over the CAPTCHA images in a folder:

for file in walkFiles("captchas/*.png"):
  recognizeCaptcha(file)
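Save the results to a CSV file:

# Assumes recognizeCaptcha has been changed to return the cleaned text
# as a string instead of printing it.
let csv = open("results.csv", fmWrite)
csv.writeLine("filename,text")
for file in walkFiles("captchas/*.png"):
  let text = recognizeCaptcha(file)
  csv.writeLine(file & "," & text)
csv.close()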
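
One detail the CSV snippet glosses over is that a file name may itself contain a comma or a double quote, which would corrupt the row. The sketch below shows one way to quote fields in the usual CSV style (wrap in double quotes, double any embedded quotes). The helpers `csvField` and `csvRow` are hypothetical names introduced for illustration, and the sample file name is made up.

```nim
import std/strutils

proc csvField(s: string): string =
  ## Quotes a field only when it contains a comma, double quote, or line break.
  if s.contains({',', '"', '\n', '\r'}):
    "\"" & s.replace("\"", "\"\"") & "\""
  else:
    s

proc csvRow(fields: openArray[string]): string =
  ## Joins already-escaped fields with commas to form one CSV line.
  var parts: seq[string] = @[]
  for f in fields:
    parts.add(csvField(f))
  parts.join(",")

when isMainModule:
  echo csvRow(["filename", "text"])
  # Made-up sample row: the comma in the file name forces quoting
  echo csvRow(["captchas/a,b.png", "3KZ8"])
```

Each line produced by `csvRow` can then be written with `writeLine` in place of the manual string concatenation.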
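Combine the script with Nim image libraries such as nimPNG or nimMagick to preprocess images (grayscale conversion, thresholding, and so on) before recognition.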
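
To make the preprocessing idea concrete, here is a minimal sketch of grayscale conversion and fixed thresholding on a raw pixel buffer. It deliberately uses no image library: it assumes the image has already been decoded into tightly packed 8-bit RGB bytes (three per pixel) by whichever library you choose, and the names `toGray` and `binarize` as well as the default threshold of 128 are illustrative assumptions.

```nim
proc toGray(rgb: openArray[uint8]): seq[uint8] =
  ## Converts packed RGB bytes (3 per pixel) to one gray byte per pixel.
  doAssert rgb.len mod 3 == 0, "expected 3 bytes per pixel"
  result = newSeq[uint8](rgb.len div 3)
  for i in 0 ..< result.len:
    let r = rgb[3 * i].int
    let g = rgb[3 * i + 1].int
    let b = rgb[3 * i + 2].int
    # Integer approximation of 0.299 R + 0.587 G + 0.114 B
    result[i] = uint8((299 * r + 587 * g + 114 * b) div 1000)

proc binarize(gray: openArray[uint8], threshold: uint8 = 128): seq[uint8] =
  ## Maps each gray value to 255 (at or above threshold) or 0 (below).
  result = newSeq[uint8](gray.len)
  for i, v in gray:
    result[i] = if v >= threshold: 255'u8 else: 0'u8

when isMainModule:
  # Three pixels: white, black, pure red
  let pixels = [255'u8, 255, 255, 0, 0, 0, 255, 0, 0]
  echo binarize(toGray(pixels))
```

A fixed threshold is the simplest choice; CAPTCHAs with uneven backgrounds may need a threshold tuned per image set.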

posted @ 2025-06-29 18:57