Tinkering Notes [56]: Batch-Translating English Documents with kimi

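Abstract

On macOS, configure the libretranslate-en-to-zh skill for kimi-cli so that English technical documents can be batch-translated into Simplified Chinese through a local LibreTranslate service.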

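Statement

A human is the first author of this post, and 龙虾 (Lobster) is the corresponding author. This post contains AI-generated content.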

The libretranslate-en-to-zh skill

SKILL.md

<hr>
<p>name: libretranslate-en-to-zh</p>
<p>description: Translate English text or files to Simplified Chinese using a local LibreTranslate service. Use when the user asks to translate English technical documents, manuals, or plain text to Chinese (zh-Hans) via LibreTranslate, or when working with Cognex In-Sight documentation and similar industrial/technical English content that needs Chinese translation.</p>
<hr>
<h1 id="libretranslate-en-zh-hans">LibreTranslate EN → ZH-Hans</h1>
<p>Translate English content to Simplified Chinese via a local LibreTranslate instance.</p>
<h2 id="prerequisites">Prerequisites</h2>
<ul>
<li>LibreTranslate installed (e.g. <code>uv tool install libretranslate</code>)</li>
<li><code>en -&gt; zh-Hans</code> language pack present at <code>~/.local/share/argos-translate/packages/translate-en_zh-1_9/</code></li>
<li>Default endpoint: <code>http://127.0.0.1:5555</code> (on macOS, port 5000 is often already taken by the AirPlay Receiver, which identifies itself as AirTunes)</li>
</ul>
<h2 id="quick-start">Quick Start</h2>
<h3 id="translate-text-inline">Translate text inline</h3>
<pre><code class="lang-bash">python3 ~/.config/agents/skills/libretranslate-en-to-zh/scripts/translate.py \
  --text "System Requirements"
</code></pre>
<h3 id="translate-a-single-file">Translate a single file</h3>
<pre><code class="lang-bash">python3 ~/.config/agents/skills/libretranslate-en-to-zh/scripts/translate.py \
  --file input.txt --out output.txt
</code></pre>
<h3 id="batch-translate-a-directory">Batch translate a directory</h3>
<pre><code class="lang-bash">python3 ~/.config/agents/skills/libretranslate-en-to-zh/scripts/translate.py \
  --dir sections/en/ --outdir sections/zh/
</code></pre>
<h2 id="service-lifecycle">Service Lifecycle</h2>
<ol>
<li><strong>Check</strong> if LibreTranslate is running on <code>localhost:5555</code></li>
<li><strong>Start</strong> automatically if not running (background daemon, logs to <code>/tmp/libretranslate.log</code>)</li>
<li><strong>Wait</strong> up to 30s for readiness</li>
<li><strong>Translate</strong> via <code>/translate</code> API (<code>source=en</code>, <code>target=zh-Hans</code>)</li>
</ol>
<p>To stop the service manually:</p>
<pre><code class="lang-bash">pkill -f "libretranslate --host 127.0.0.1 --port 5555"
</code></pre>
<h2 id="post-translation-review-checklist">Post-Translation Review Checklist</h2>
<p>Machine translation of technical docs requires human review. Verify:</p>
<table>
<thead>
<tr>
<th>Category</th>
<th>Rule</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr>
<td>Product names</td>
<td>Keep English</td>
<td>In-Sight → <strong>In-Sight</strong> (not &quot;内视&quot;)</td>
</tr>
<tr>
<td>Function names</td>
<td>Keep English</td>
<td>ApplyAcquisitionSettings → <strong>ApplyAcquisitionSettings</strong></td>
</tr>
<tr>
<td>Model numbers</td>
<td>Keep original</td>
<td>In-Sight 3800 → <strong>In-Sight 3800</strong></td>
</tr>
<tr>
<td>Acronyms</td>
<td>Keep or add note</td>
<td>COM → <strong>COM</strong> (串口)</td>
</tr>
<tr>
<td>Units</td>
<td>Keep symbols</td>
<td>mm, ms, FPS</td>
</tr>
</tbody>
</table>
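<p>Part of this checklist can be automated. The sketch below is not part of the skill: it assumes a hand-maintained list of protected terms (the hypothetical <code>PROTECTED_TERMS</code>) and a hypothetical helper, <code>find_lost_terms</code>, which reports terms that appear in the English source but no longer appear verbatim in the translation, so a reviewer knows where to look first.</p>

```python
from __future__ import annotations

import re

# Hypothetical list of terms that should stay in English; adjust per document.
PROTECTED_TERMS = ["In-Sight", "ApplyAcquisitionSettings", "COM"]


def find_lost_terms(source: str, translated: str,
                    terms: list[str] = PROTECTED_TERMS) -> list[str]:
    """Return protected terms present in the source but missing from the translation."""
    lost = []
    for term in terms:
        # Match the term as a whole token so "COM" does not match "COMPUTER".
        pattern = re.compile(rf"(?<![A-Za-z0-9]){re.escape(term)}(?![A-Za-z0-9])")
        if pattern.search(source) and not pattern.search(translated):
            lost.append(term)
    return lost
```

<p>A non-empty result only flags candidates for human review; it does not judge whether the surrounding translation is correct.</p>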
<h2 id="troubleshooting">Troubleshooting</h2>
<table>
<thead>
<tr>
<th>Symptom</th>
<th>Cause</th>
<th>Fix</th>
</tr>
</thead>
<tbody>
<tr>
<td>Port 5000 refused / 403</td>
<td>macOS AirPlay Receiver (identifies itself as AirTunes) occupies 5000</td>
<td>Use <code>--port 5555</code></td>
</tr>
<tr>
<td><code>Chinese not available</code></td>
<td>Missing <code>en-&gt;zh</code> language pack</td>
<td>Download <code>translate-en_zh-1_9.argosmodel</code> and unzip to <code>~/.local/share/argos-translate/packages/</code></td>
</tr>
<tr>
<td><code>filelock Timeout</code></td>
<td>Stale lock file</td>
<td><code>rm ~/.local/share/argos-translate/minisbd/*.lock</code> then restart</td>
</tr>
<tr>
<td>Empty response</td>
<td>Service still loading</td>
<td>Wait 10-20s for model warmup</td>
</tr>
</tbody>
</table>
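<p>For the <code>Chinese not available</code> row, one way to confirm the cause is to look at what the service reports at <code>/languages</code>, the same endpoint the helper script polls for readiness. The sketch below is not part of the skill: <code>has_language_pair</code> and <code>fetch_languages</code> are hypothetical names, and the code assumes each entry in the response is an object with a <code>code</code> field and, on recent LibreTranslate versions, a <code>targets</code> list. If <code>targets</code> is absent, it falls back to checking that the target code is listed at all.</p>

```python
from __future__ import annotations

import http.client
import json


def has_language_pair(languages: list[dict], source: str = "en",
                      target: str = "zh-Hans") -> bool:
    """Check a parsed /languages response for a source -> target pair."""
    codes = {entry.get("code") for entry in languages}
    for entry in languages:
        if entry.get("code") != source:
            continue
        targets = entry.get("targets")
        if targets is None:
            # Assumed fallback for responses that carry no "targets" list.
            return target in codes
        return target in targets
    return False


def fetch_languages(host: str = "127.0.0.1", port: int = 5555) -> list[dict]:
    """Fetch and parse /languages from a running LibreTranslate instance."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/languages")
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

<p>With the service running, <code>has_language_pair(fetch_languages())</code> returning <code>False</code> would point to a missing language pack rather than a port or startup problem.</p>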
<h2 id="python-api-for-embedding-">Python API (for embedding)</h2>
<pre><code class="lang-python">from scripts.translate import translate_text, translate_file, translate_dir

# Single string
zh = translate_text("Release History")

# Single file
zh = translate_file("input.txt", "output.txt")

# Whole directory
translate_dir("sections/en/", "sections/zh/")
</code></pre>

scripts/translate.py

#!/usr/bin/env python3
"""
LibreTranslate EN->ZH helper script.

Usage:
    python translate.py --text "Hello world"
    python translate.py --file /path/to/input.txt
    python translate.py --file /path/to/input.txt --out /path/to/output.txt
    python translate.py --dir /path/to/en_dir/ --outdir /path/to/zh_dir/
"""

from __future__ import annotations
import argparse
import http.client
import json
import os
import pathlib
import subprocess
import sys
import time

DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 5555
SOURCE_LANG = "en"
TARGET_LANG = "zh-Hans"


def is_service_running(host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> bool:
    """Check if LibreTranslate is responding."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=2)
        conn.request("GET", "/languages")
        resp = conn.getresponse()
        conn.close()
        return resp.status == 200
    except Exception:
        return False


def ensure_service(host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> None:
    """Start LibreTranslate if not already running."""
    if is_service_running(host, port):
        return

    libretranslate = os.path.expanduser("~/.local/bin/libretranslate")
    if not os.path.exists(libretranslate):
        libretranslate = "libretranslate"

    print(f"[libretranslate-en-to-zh] Starting LibreTranslate on {host}:{port} ...")
    log_path = "/tmp/libretranslate.log"
    with open(log_path, "a") as logf:
        subprocess.Popen(
            [libretranslate, "--host", host, "--port", str(port)],
            stdout=logf,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )

    # Wait up to 30s for service to become ready
    for _ in range(30):
        time.sleep(1)
        if is_service_running(host, port):
            print(f"[libretranslate-en-to-zh] Service ready at http://{host}:{port}")
            return

    print(f"[libretranslate-en-to-zh] ERROR: Service failed to start. Check {log_path}")
    sys.exit(1)


def translate_text(text: str, host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> str:
    """Translate a single string."""
    ensure_service(host, port)

    payload = json.dumps({
        "q": text,
        "source": SOURCE_LANG,
        "target": TARGET_LANG,
        "format": "text",
    })
    headers = {"Content-Type": "application/json"}

    conn = http.client.HTTPConnection(host, port, timeout=60)
    try:
        conn.request("POST", "/translate", body=payload, headers=headers)
        resp = conn.getresponse()
        data = resp.read().decode("utf-8")
    finally:
        # Always release the socket, whether or not the request succeeded.
        conn.close()

    if resp.status != 200:
        raise RuntimeError(f"HTTP {resp.status}: {data}")

    result = json.loads(data)
    if "translatedText" not in result:
        raise RuntimeError(f"Unexpected response: {data}")
    return result["translatedText"]


def translate_file(input_path: str, output_path: str | None = None,
                   host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> str:
    """Translate the contents of a file."""
    p = pathlib.Path(input_path)
    if not p.exists():
        raise FileNotFoundError(f"File not found: {input_path}")

    text = p.read_text(encoding="utf-8")
    translated = translate_text(text, host, port)

    if output_path:
        out_p = pathlib.Path(output_path)
        out_p.parent.mkdir(parents=True, exist_ok=True)
        out_p.write_text(translated, encoding="utf-8")
        print(f"[libretranslate-en-to-zh] Written: {output_path}")

    return translated


def translate_dir(input_dir: str, output_dir: str,
                  host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> None:
    """Translate all .txt files in a directory, preserving structure."""
    src = pathlib.Path(input_dir)
    dst = pathlib.Path(output_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"Not a directory: {input_dir}")

    files = list(src.rglob("*.txt"))
    if not files:
        print("[libretranslate-en-to-zh] No .txt files found.")
        return

    ensure_service(host, port)

    failed = 0
    for f in files:
        rel = f.relative_to(src)
        out_file = dst / rel
        print(f"[libretranslate-en-to-zh] Translating: {f} -> {out_file}")
        try:
            translate_file(str(f), str(out_file), host, port)
        except Exception as exc:
            failed += 1
            print(f"[libretranslate-en-to-zh] FAILED: {f} ({exc})")

    # Report only the files that were actually translated, not every file found.
    print(f"[libretranslate-en-to-zh] Done. Translated {len(files) - failed} of {len(files)} file(s).")


def main() -> None:
    parser = argparse.ArgumentParser(description="LibreTranslate EN->ZH helper")
    parser.add_argument("--text", help="Text to translate")
    parser.add_argument("--file", help="Input file path")
    parser.add_argument("--out", help="Output file path (optional)")
    parser.add_argument("--dir", help="Input directory (translates all .txt files)")
    parser.add_argument("--outdir", help="Output directory (used with --dir)")
    parser.add_argument("--host", default=DEFAULT_HOST, help="LibreTranslate host")
    parser.add_argument("--port", type=int, default=DEFAULT_PORT, help="LibreTranslate port")
    args = parser.parse_args()

    if args.text:
        print(translate_text(args.text, args.host, args.port))
    elif args.file:
        result = translate_file(args.file, args.out, args.host, args.port)
        if not args.out:
            print(result)
    elif args.dir:
        if not args.outdir:
            parser.error("--outdir is required when using --dir")
        translate_dir(args.dir, args.outdir, args.host, args.port)
    else:
        parser.print_help()
        sys.exit(1)


if __name__ == "__main__":
    main()
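
Note that translate_file sends the whole file to /translate as a single request. For long documents it may be worth splitting the text first. The sketch below is one possible approach and is not part of the script above: chunk_paragraphs and its max_chars default are hypothetical, chosen only for illustration. Paragraphs separated by blank lines are packed greedily into chunks of at most max_chars characters, and a paragraph that is itself longer than the limit is kept whole rather than cut mid-sentence.

```python
from __future__ import annotations


def chunk_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks of at most max_chars."""
    chunks: list[str] = []
    parts: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        # Account for the "\n\n" separator when the chunk already has content.
        extra = len(para) + (2 if parts else 0)
        if parts and size + extra > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append("\n\n".join(parts))
            parts, size = [para], len(para)
        else:
            parts.append(para)
            size += extra
    if parts:
        chunks.append("\n\n".join(parts))
    return chunks
```

Under these assumptions, each chunk could be passed to translate_text in turn and the results joined with a blank line between them, which keeps the paragraph layout of the original.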

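Results

  1. Running python translate.py --text "Attention is all you need" prints "你只需要注意一点" (roughly "You only need to pay attention to one thing").
  2. Running python translate.py --file input.txt --out output.txt translates a single file.
  3. Running python translate.py --dir en/ --outdir zh/ batch-translates every .txt file under the directory.
  4. The service is managed automatically (check → start → wait until ready → translate), with no manual intervention needed.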
posted @ 2026-05-02 20:59  qsBye