batch wide find - replace [repost]

Last updated: 2008-09-09, Ver 2.4.6.0909

Introduction

bwfr
  - Batch string find and replace with multilingual support
  - Batch character set encoding conversion

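Pure Unicode rule-matching core, with genuine regular expression matching for text in any language.
Supports Advanced Regular Expressions (ARE).
Character set encoding conversion with compatibility checking. Supports both GNU libiconv (iconv.dll) and the character set conversion API built into Windows.
Multiple find/replace pairs can be specified in one run.
Environment variable expansion: system environment variables can be used in find/replace pairs.
Multiple file wildcards and file lists can be specified in one run.
Pipe mode, for working together with other commands; also a half-pipe mode that reads input from files but writes the result to standard output.
Subdirectories can be included.
Plain matching and regular expression matching, optionally case-insensitive and optionally across lines. Regex subexpressions can be used in the replacement.
Supports both POSIX extended regular expressions and Perl-style regular expression matching.
The replacement text can be formatted as all uppercase or all lowercase, which is handy for normalizing the case of environment variables and command-line arguments in batch files.
Supports DOS (Windows), Macintosh and Unix newline styles, either detected automatically (the default) or specified manually.
Statistics: lists the number of replacements in each file, the total number of replacements, and so on.
Runs on Win32 and in pure DOS (pure DOS requires the HX DOS Extender).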
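
The automatic newline detection mentioned in the list above could work roughly like this. This is only a Python sketch of one plausible approach (count each style and pick the most frequent one); `detect_newline` is a hypothetical name, and the source does not describe how bwfr actually decides.

```python
def detect_newline(text: str) -> str:
    """Guess the newline style of already-decoded text.

    Returns 'dos' (CR LF), 'unix' (LF), 'mac' (CR) or 'none'.
    """
    crlf = text.count("\r\n")
    # Subtract the CR LF pairs so their halves are not counted again.
    cr = text.count("\r") - crlf
    lf = text.count("\n") - crlf
    counts = {"dos": crlf, "unix": lf, "mac": cr}
    style = max(counts, key=counts.get)
    return style if counts[style] > 0 else "none"


print(detect_newline("line one\r\nline two\r\n"))
```

Working on decoded text rather than raw bytes keeps the sketch independent of the input encoding.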

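Change history

2008-09-09, Ver 2.4.6.0909

UPD: Improved the adaptability of the regex start-of-line anchor '^'.

2008-09-08, Ver 2.4.5.0908

FIX: Fixed an infinite loop that occurred when a regular expression produced a zero-length match (for example, replacing "[^.]*" with "z" in the content "aaa.bbb").
FIX: Fixed incorrect parsing of the regex start-of-line anchor (for example, replacing "^c" with "z" in the content "ccc").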
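
To see why a zero-length match can loop forever, consider a replace loop that resumes searching where the previous match ended: an empty match ends where it started, so the position never advances. The Python sketch below is not bwfr's code; `replace_all` is a hypothetical helper that shows one common remedy, stepping forward by one character after an empty match. The result it produces follows Python's conventions, and the source does not state what output bwfr itself gives for these inputs.

```python
import re


def replace_all(pattern: str, repl: str, text: str) -> str:
    """Replace every match of pattern, guarding against zero-length matches."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        out.append(text[pos:m.start()])  # unmatched text before the match
        out.append(repl)
        if m.end() == m.start():
            # Zero-length match: copy one character and step past it,
            # otherwise the next search would find the same empty match.
            if m.start() < len(text):
                out.append(text[m.start()])
            pos = m.end() + 1
        else:
            pos = m.end()
    out.append(text[pos:])  # remaining tail, if any
    return "".join(out)


print(replace_all(r"[^.]*", "z", "aaa.bbb"))
print(replace_all(r"^c", "z", "ccc"))
```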

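2008-09-07, Ver 2.4.4.0907

Fixed the meanings of the -r and -rnnl options, and of the -ric and -rnnlic options, being swapped with each other.

2008-09-06, Ver 2.4.3.0906

Fixed -encout not being determined automatically when -encin is utf-8 or ucs-2.

2008-08-16, Ver 2.4.1.713

Released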

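What "Unicode regular expression matching" means

bwfr internally uses an efficient regular expression engine built entirely on Unicode, so it can handle regex matching across all kinds of internationalized text.

Some examples:

The lowercase class correctly matches lowercase letters of every language, for example the full-width letters "ａｂｃｄ", the Greek letters "α, β, ω" and the Russian letters "ж, я, щ".
The uppercase class matches uppercase letters of every language, for example the full-width letters "ＡＢＣＤ", the Greek letters "Α, Β, Ω" and the Russian letters "Ж, Я, Щ".
The letter class matches letters of every language (Chinese, Japanese and Korean ideographs also belong to this class).
The punctuation class matches punctuation from every language, for example "、，。……『』".
All operations support wide characters; for example, the expression "[我你他她它]们" is handled correctly.

Everything else (digits, whitespace, word boundaries, and so on) follows the same pattern, giving full multilingual support. The character classification rules above follow the standard Unicode classification (Unicode General Category Values); for details see http://www.unicode.org/versions/Unicode4.0.0/ch04.pdf.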
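
The General Category values behind these classes can be inspected with Python's standard `unicodedata` module. The snippet below only illustrates the Unicode classification itself, using Python's own data tables and `re` module rather than bwfr's engine; `general_category` is a name introduced here for illustration.

```python
import re
import unicodedata


def general_category(ch: str) -> str:
    """Return the two-letter Unicode General Category of one character."""
    return unicodedata.category(ch)


# Ll = lowercase letter, Lu = uppercase letter,
# Lo = other letter (no case), Po = other punctuation.
for ch in "ａαжＡΩЩ我、":
    print(ch, general_category(ch))

# A bracket expression made of wide characters behaves like any other set.
print(bool(re.fullmatch("[我你他她它]们", "他们")))
```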

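As for the efficiency of Unicode regex operations, this engine is at least roughly twice as fast as every open-source non-Unicode matching engine I could find (because every character class match is a direct table lookup, a standard O(1) operation).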
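
The table-lookup idea can be sketched as follows. The source does not describe the engine's real table layout, so everything here is an assumption made for illustration: the class codes, the one-byte-per-code-point table limited to the Basic Multilingual Plane, and the names `CLASS_TABLE` and `char_class`.

```python
import unicodedata

# Hypothetical class codes, one byte each.
LOWER, UPPER, OTHER_LETTER, DIGIT, PUNCT, OTHER = range(6)


def _classify(cp: int) -> int:
    """Map one code point to a class code via its General Category."""
    cat = unicodedata.category(chr(cp))
    if cat == "Ll":
        return LOWER
    if cat == "Lu":
        return UPPER
    if cat.startswith("L"):
        return OTHER_LETTER
    if cat == "Nd":
        return DIGIT
    if cat.startswith("P"):
        return PUNCT
    return OTHER


# Built once: one entry per code point in the Basic Multilingual Plane.
CLASS_TABLE = bytes(_classify(cp) for cp in range(0x10000))


def char_class(ch: str) -> int:
    """Constant-time class test: a single index into the table."""
    cp = ord(ch)
    # Assumed fallback for characters outside the table.
    return CLASS_TABLE[cp] if cp < 0x10000 else _classify(cp)


print(char_class("ж") == LOWER, char_class("我") == OTHER_LETTER)
```

Once the table exists, testing a character against a class costs one array index regardless of how many characters the class contains.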

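That is only the efficiency of the matching engine, though. Because every file has to go through an encoding conversion both before and after the "match -> replace" step, in typical use wfr is somewhat slower than fr. wfr is not an upgraded version of fr; tasks that fr can handle are better done with fr.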

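About character set encodings

wfr lets you specify the character set encoding of three things separately:

Input encoding: the encoding of the content of the input files or pipe. Defaults to the current system's default code page.
Output encoding: the encoding written to the target files or pipe after the search -> replace operation. Defaults to the input encoding.
Argument encoding: the encoding of the contents of the file that holds the search and replace arguments. Defaults to the current system's default code page.

For example, "wfr *.txt *.htm -r -argfile:patterns.txt -encarg:big5 -encin:gbk -encout:utf-8 -s" replaces everything that matches the rules in patterns.txt, in every txt and htm file in the current directory and all of its subdirectories. The files are encoded in gbk, patterns.txt is encoded in big5, and after replacement the files are saved as utf-8.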
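
The role of the three encodings can be sketched in Python. This is a simplified illustration, not wfr's implementation: `convert` is a hypothetical function, it works on in-memory bytes instead of files, and it does plain string replacement where the command above uses regex matching (-r).

```python
def convert(data: bytes, rules: bytes,
            enc_in: str, enc_out: str, enc_arg: str) -> bytes:
    """Decode the input, apply 'find->replace' rules, encode the output."""
    text = data.decode(enc_in)                 # input encoding
    for line in rules.decode(enc_arg).splitlines():  # argument encoding
        find, sep, repl = line.partition("->")
        if sep and find:
            text = text.replace(find, repl)
    return text.encode(enc_out)                # output encoding


source = "你好，世界".encode("gbk")       # stands in for a gbk input file
patterns = "世界->地球".encode("big5")    # stands in for patterns.txt
result = convert(source, patterns, "gbk", "utf-8", "big5")
print(result.decode("utf-8"))
```

Decoding all three inputs to Unicode first is what lets a rule stored in one encoding match text stored in another.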

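At first glance, specifying the argument encoding may look useless, but it is not. It matters when searching and replacing in a foreign-language environment (for example, working with Korean text on a Chinese system), when using wfr under DOS (where the system default code page is always ASCII), and in similar situations.

As for the conversion library: if GNU libiconv (iconv.dll) is found on the system search path, it is used in preference; otherwise the character set conversion API built into the operating system is used. The reasons are:

Stricter and more stable results: iconv is stricter than the Windows character set conversion API, so it does not produce unexpected garbage or long runs of question marks.
More dependable support: whether Windows can complete a given conversion depends largely on whether the user has installed the relevant code page files. On any machine where libiconv is deployed, every encoding the library supports can be converted correctly.
Cross-platform support: libiconv is available on almost every known platform.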
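
The difference between a strict converter and a lenient one can be illustrated with Python's codec error handlers. This is an analogy only: it calls neither libiconv nor the Windows API, and `to_gbk` is a hypothetical helper.

```python
def to_gbk(text: str, strict: bool) -> bytes:
    """Encode text as GBK, either refusing or papering over bad characters."""
    return text.encode("gbk", errors="strict" if strict else "replace")


korean = "한국어"  # Hangul is not part of GBK

# Lenient mode silently substitutes '?' for each unrepresentable character.
print(to_gbk(korean, strict=False))

# Strict mode refuses the conversion instead of corrupting the text.
try:
    to_gbk(korean, strict=True)
except UnicodeEncodeError as exc:
    print("refused:", exc.reason)
```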

Brief help (Chinese help screen, translated)

C:\Documents and Settings\Administrator>
            #bwfr
            ===============================================================================
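            wide find and replace Ver 2.4.1.713 by BaiYang / 2004 - 2008, Freeware
            Batch replace version
            Homepage - http://baiy.cn
            ===============================================================================
            Batch search and replace of strings in files or pipes, with support for multiple
            charset encodings
            Usage: bwfr [filePattern1 filePattern2 ...] {FindOption:FindText} {-argfile:ReplaceRulesFile}
            [OtherOptions]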
            ===============================================================================
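            Find options:
            -f:      - match an exact string
            -fic:    - match an exact string (ignore case)
            -r:      - regular expression match
            -ric:    - regular expression match (ignore case)
            -rnnl:   - regular expression match across lines
            -rnnlic: - regular expression match across lines (ignore case)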
            ===============================================================================
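            Argument file options:
            Use an argument file to specify the "find/replace" pairs. Each line of the argument
            file is one find/replace pair, for example:
            1->a
            2->b
            3->c
            4->d
            ...
            -argfile:- path of the argument file
            -dlm:    - delimiter between the 'find' part and the 'replace' part of a
            find/replace pair. Default: "->"
            Note: with regular expression matching, the 'replace' part may use subexpression
            substitution (\0 to \9)
            Note: system environment variables can be used in the argument file, for example:
            Dir->%SystemRoot%
            Note: '\r' and '\n' stand for 'carriage return' and 'newline', for example:
            1->a\r\n
            To output '\r' and '\n' literally, use '\\r' and '\\n'.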
            ===============================================================================
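            Character encoding options:
            -encin:  - charset encoding of the input text (file or pipe).
            Default: the current operating system's default code page.
            -encout: - charset encoding of the output (result) text.
            Default: same as the value given by "-encin".
            -encarg: - charset encoding of the argument file's contents.
            Default: the current operating system's default code page.
            -unisign - if the output is a Unicode wide-character encoding (e.g. UCS-2, UTF-8/16),
            add a BOM signature at the start of the file so that Unicode-aware text
            editors recognize the file's encoding automatically.
            -listenc - list all charset encodings supported by bwfr.
            -listcmp - list the table of compatible charset encoding conversions accepted by bwfr.
            -force   - force the conversion, ignoring the encoding compatibility rules.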
            ===============================================================================
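            Other options:
            -s       - include subdirectories
            -exp     - enable the memory expansion algorithm; this doubles fr's memory usage but
            often improves speed dramatically. Try it if you run into performance problems.
            -stdin   - read the content to search from standard input and send the result to
            standard output (enabled automatically if no file pattern is given)
            -stdout  - read input from files as usual, but send the result to standard output
            (instead of writing it back to the files)
            -frc     - show the number of replacements made in each file
            -trc     - show the total number of replacements made across all files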
            ===============================================================================
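            POSIX and Perl style regular expressions:
            - "find - replace" fully supports the POSIX.2 extended standard and Perl-style
            regular expressions. The supported character classes and their equivalents are:
            POSIX           perl       Description
            --------------------------------------------------------------------
            [:alnum:]                  letters and digits
            [:alpha:]       \a         letters
            [:lower:]       \l         lowercase letters
            [:upper:]       \u         uppercase letters
            [:blank:]                  space and tab
            [:space:]       \s         whitespace characters
            [:cntrl:]                  control characters
            [:digit:]       \d         decimal digits
            [:xdigit:]      \x         hexadecimal digits
            [:graph:]                  printable characters (excluding whitespace)
            [:print:]       \p         printable characters (including whitespace)
            [:punct:]                  punctuation
            - Some special Perl character classes:
            perl   POSIX equivalent  Description
            --------------------------------------------------------------------
            \o     [0-7]             octal digit
            \O     [^0-7]            non-octal digit
            \w     [[:alnum:]_]      word character
            \W     [^[:alnum:]_]     non-word character
            \A     [^[:alpha:]]      non-letter
            \L     [^[:lower:]]      non-lowercase letter
            \U     [^[:upper:]]      non-uppercase letter
            \S     [^[:space:]]      non-whitespace character
            \D     [^[:digit:]]      non-decimal digit
            \X     [^[:xdigit:]]     non-hexadecimal digit
            \P     [^[:print:]]      non-printable character
            \<     [^[:alpha:]_]     start of word
            \>     [^[:alnum:]_]     end of word
            - Note: POSIX character classes must be used inside a set (between "[" and "]").
            Perl-style character classes, by contrast, are used outside set expressions.
            - In addition, the following aliases are defined to make some special characters
            easier to type on the command line:
            perl style    POSIX style     Description
            -----------------------------------------------------------------------
            \"            [:dq:]          double quote
            \'            [:sq:]          single quote
            \t            [:tb:]          tab
            \n            [:nl:]          newline (0x0A)
            \r            [:rt:]          carriage return (0x0D)
            \b            [:bs:]          backspace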
            ===============================================================================
            Switch prefixes and suffixes:
            * All command-line switches (options) are case-insensitive (e.g. "-fic:" and "-FIC:")
            * The switch prefix can be "-" or "/" (e.g. "/s" and "-s")
            * The switch suffix can be ":" or "=" (e.g. "/f:", "/f=", "-f:" and "-f=" are equivalent)
            ===============================================================================
            Usage example:
            bwfr *.txt *.htm -fic -argfile:patterns.txt
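
The argument file format described in the help text (one pair per line, a "->" delimiter, %NAME% environment variables, and the \r / \n escapes) can be read with a few lines of Python. This is a sketch under assumptions, not bwfr's parser: the function names are hypothetical, a line is split at the first occurrence of the delimiter, lines without a delimiter are skipped, and escapes are processed before environment variables are expanded. The help text does not specify any of those details.

```python
import os
import re


def expand_env(text, env=None):
    """Replace %NAME% with its value; unknown names are left untouched."""
    env = os.environ if env is None else env
    return re.sub(r"%(\w+)%",
                  lambda m: env.get(m.group(1), m.group(0)), text)


def unescape(text):
    """Turn \\r and \\n escapes into CR and LF; a doubled backslash keeps them literal."""
    def repl(m):
        token = m.group(1)
        if token.startswith("\\"):
            return token  # escaped form: emit a literal backslash + letter
        return "\r" if token == "r" else "\n"
    return re.sub(r"\\(\\[rn]|[rn])", repl, text)


def parse_rules(lines, dlm="->", env=None):
    """Return a list of (find, replace) tuples from argument-file lines."""
    rules = []
    for line in lines:
        find, sep, replace = line.rstrip("\r\n").partition(dlm)
        if not sep:
            continue  # assumed: lines without the delimiter are ignored
        rules.append((expand_env(unescape(find), env),
                      expand_env(unescape(replace), env)))
    return rules


demo = ["1->a", "Dir->%SystemRoot%", r"1->a\r\n"]
print(parse_rules(demo, env={"SystemRoot": r"C:\WINDOWS"}))
```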

Help screen

C:\Documents and Settings\Administrator>
            #bwfr
            ===============================================================================
            wide find and replace Ver 2.4.1.713 by BaiYang / 2004 - 2008, Freeware
            batch version
            homepage - http://baiy.cn
            ===============================================================================
            Find string in files or pipe, and replace it to another string With Multi-
            charset encoding support.
            USAGE: bwfr [filePattern1 filePattern2 ...] {FindOption} {/argfile:}
            [OtherOptions]
            ===============================================================================
            FIND OPTIONs:
            -f:      - find
            -fic:    - find ignore case
            -r:      - regular expresion find
            -ric:    - regular expresion find ignore case
            -rnnl:   - regular expresion find ignore newline
            -rnnlic: - regular expresion find ignore newline ignore case
            ===============================================================================
            ARGUMENTs FILE OPTIONs:
            You should specify find/replace pairs by an arguments file with an one pair per
            line semantic. for example:
            1->a
            2->b
            3->c
            4->d
            ...
            -argfile:- specify the arguments file path
            -dlm:    - specify the delimiter between 'find' part and 'replace' part in
            a find/replace pair. DEFAULT:"->"
            NOTE: If find option you specified is a regular expresion method, the replace
            part will support sub expressions (\0-\9) as well.
            NOTE: You can use environment variable in arguments file. for example:
            Dir->%SystemRoot%
            NOTE: You can use '\r' and '\n' to represent 'return' and 'newline'. for
            example:
            1->a\r\n
            To produce '\r' and '\n' literally, just type it as '\\r' and '\\n'.
            ===============================================================================
            CHARSET ENCODING OPTIONs:
            -encin:  - specify charset encoding for the input text (file or pipe).
            DEFAULT: use current system's default codepage.
            -encout: - specify charset encoding for the output text.
            DEFAULT: same as "-encin".
            -encarg: - specify charset encoding of the arguments file.
            DEFAULT: current system's default codepage.
            -unisign - if the output encoding is unicode (i.e. UCS-2, UTF-8/16, etc.),
            then add BOM signature to the file.
            -listenc - list all accepted charset encoding names.
            -listcmp - list all compatible encoding convertion combination.
            -force   - enforce the text encoding convertion specified by "-encin" and
            "-encout", even if it is not compatible.
            ===============================================================================
            OTHER OPTIONs:
            -s       - include sub folders
            -exp     - enable the memory expand algorithm, will double the memory usage,
            but MUCH quick in many case.
            -stdio   - get input from standard input device (keyboard and pipe),
            and put the results to standard output device.
            default when file pattern is omitted.
            -stdout  - get input from file(s) as normally, but put the results to
            standard output device.
            -frc     - show File Replacements Count
            -trc     - show Total Replacements Count
            ===============================================================================
            POSIX and perl style Regular Expression:
            - "find - replace" fully support POSIX.2 Extended and Perl style Regular
            Expresion. Here is a list of they character classes:
            POSIX           perl       Description
            --------------------------------------------------------------------
            [:alnum:]                  letters and digits
            [:alpha:]       \a         letters
            [:lower:]       \l         lowercase letters
            [:upper:]       \u         uppercase letters
            [:blank:]                  space and tab characters
            [:space:]       \s         whitespace characters
            [:cntrl:]                  control characters
            [:digit:]       \d         decimal digits
            [:xdigit:]      \x         hexadecimal digits
            [:graph:]                  printable characters excluding space
            [:print:]       \p         printable characters including space
            [:punct:]                  punctuation characters
            - And here a some special char classes in perl:
            perl   POSIX equivalent  Description
            --------------------------------------------------------------------
            \o     [0-7]             octal digit
            \O     [^0-7]            non-octal digit
            \w     [[:alnum:]_]      word character
            \W     [^[:alnum:]_]     non-word character
            \A     [^[:alpha:]]      non-alphabetic character
            \L     [^[:lower:]]      non-lowercase character
            \U     [^[:upper:]]      non-uppercase character
            \S     [^[:space:]]      non-whitespace character
            \D     [^[:digit:]]      non-digit
            \X     [^[:xdigit:]]     non-hex digit
            \P     [^[:print:]]      non-printable characters
            \<     [^[:alpha:]_]     begin of word
            \>     [^[:alnum:]_]     end of word
            - note: posix char class must working in the square brackets. contrary,
            perl's must stay outside of the brackets.
            - And several alias has been created to help input some special char:
            perl style    POSIX style     Description
            -----------------------------------------------------------------------
            \"            [:dq:]          double quotation
            \'            [:sq:]          single quotation
            \t            [:tb:]          table
            \n            [:nl:]          new line (0x0A)
            \r            [:rt:]          return (0x0D)
            \b            [:bs:]          backspace
            NOTE: the posix style alias also available on /t, /tu and /tl when using
            the regex ("/r" and "/ric") match.
            ===============================================================================
            SWITCH PREFIX and SUFFIX:
            * All switchs (options) are case-insensitive (i.e: "-fic:" or "-FIC:")
            * Switch Prefix can be either "-" or "/" (i.e: "/s" or "-s")
            * Switch Suffix can be either ":" or "=" (i.e: "/f:", "/f=", "-f:" or "-f=")
            ===============================================================================
            EXAMPLEs:
            bwfr *.txt *.htm -f -argfile:patterns.txt
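
The switch conventions listed in the help screen (case-insensitive names, a "-" or "/" prefix, a ":" or "=" suffix) amount to a small normalization step. The Python sketch below shows one way to do it; `parse_switch` is a hypothetical name and this is not how bwfr itself is implemented.

```python
def parse_switch(arg: str):
    """Split a switch such as '/F=abc' into ('f', 'abc').

    Returns None when the argument is not a switch (e.g. a file pattern),
    and (name, None) when the switch carries no value.
    """
    if len(arg) < 2 or arg[0] not in "-/":
        return None
    body = arg[1:]
    # The value starts after the first ':' or '=', whichever comes first.
    cut = min((i for i in (body.find(":"), body.find("=")) if i >= 0),
              default=-1)
    if cut < 0:
        return body.lower(), None
    return body[:cut].lower(), body[cut + 1:]


for arg in ["*.txt", "/S", "-FIC:Hello", "/argfile=patterns.txt"]:
    print(arg, "->", parse_switch(arg))
```

Only the switch name is lowercased; the value is kept as typed so that search text and file paths are not altered.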

Character set encodings and conversion compatibility

Character sets supported by wfr: the -listenc option prints the list of character set encodings that wfr supports, as follows:

C:\Documents and Settings\Administrator>
                        #wfr /listenc
                        Name                Codepage   Description
                        ===============================================================================
                        ANSI-Arabic         CP1256     Arabic - ANSI
                        ANSI-Baltic         CP1257     Baltic - ANSI
                        ANSI-CentralEuropean
                        CP1250     Central European - ANSI
                        ANSI-Cyrillic       CP1251     Cyrillic - ANSI
                        ANSI-Greek          CP1253     Greek - ANSI
                        ANSI-Hebrew         CP1255     Hebrew - ANSI
                        ANSI-LatinI         CP1252     Latin I - ANSI
                        ANSI-Thai           CP874      Thai - ANSI
                        ANSI-Turkish        CP1254     Turkish - ANSI
                        ARMSCII-8           CP1254     Armenian - ARMSCII
                        ASCII               CP437      English - ASCII (DOS OEM)
                        BIG5                CP950      Traditional Chinese - BIG5
                        BIG5-HKSCS          CP950      Traditional Chinese - BIG5-HKSCS
                        BIG5-HKSCS:1999     CP950      Traditional Chinese - BIG5-HKSCS:1999
                        BIG5-HKSCS:2001     CP950      Traditional Chinese - BIG5-HKSCS:2001
                        BIG5-HKSCS:2004     CP950      Traditional Chinese - BIG5-HKSCS:2004
                        EUC-CN              CP51936    Simplified Chinese - EUC
                        EUC-JP              CP51932    Japanese - EUC
                        EUC-KR              CP51949    Korean - EUC
                        EUC-TW              CP51950    Traditional Chinese - EUC
                        GB18030             CP54936    Simplified Chinese - GB18030
                        GB2312              CP20936    Simplified Chinese - GB2312
                        GBK                 CP936      Simplified Chinese - GBK
                        HZ                  CP52936    Simplified Chinese - HZ-GB2312
                        ISO-2022-CN         CP50227    Simplified Chinese - ISO-2022-CN
                        ISO-2022-CN-EXT     CP50227    Simplified Chinese - ISO-2022-CN-EXT
                        ISO-2022-JP         CP50220    Japanese - ISO-2022-JP
                        ISO-2022-JP-1       CP50221    Japanese - ISO-2022-JP-1
                        ISO-2022-JP-2       CP50222    Japanese - ISO-2022-JP-2
                        ISO-2022-KR         CP50225    Korean - ISO-2022-KR
                        ISO-646             CP20127    English - ASCII (ISO-646)
                        ISO-8859-1          CP28591    Latin 1 (West European) - ISO-8859-1
                        ISO-8859-10         CP28594    Latin 6 (Nordic) - ISO-8859-10
                        ISO-8859-11         CP874      Thai - ISO-8859-11
                        ISO-8859-13         CP28603    Latin 7 (Baltic Rim) - ISO-8859-13
                        ISO-8859-14         CP28591    Latin 8 (Celtic) - ISO-8859-14
                        ISO-8859-15         CP28605    Latin 9 (West European) - ISO-8859-15
                        ISO-8859-2          CP28592    Latin 2 (Central and East European) - ISO-8859-2
                        ISO-8859-3          CP28593    Latin 3 (South European) - ISO-8859-3
                        ISO-8859-4          CP28594    Latin 4 (North European / Baltic) - ISO-8859-4
                        ISO-8859-5          CP28595    Cyrillic - ISO-8859-5
                        ISO-8859-6          CP28596    Arabic - ISO-8859-6
                        ISO-8859-7          CP28597    Greek - ISO-8859-7
                        ISO-8859-8          CP28598    Hebrew - ISO-8859-8
                        ISO-8859-9          CP28599    Latin 5 (Turkish) - ISO-8859-9
                        JOHAB               CP1361     Korean - Johab
                        KOI8                CP20866    Russian - KOI8-R
                        KOI8-R              CP20866    Russian - KOI8-R
                        KOI8-U              CP21866    Ukrainian - KOI8-U
                        KSC                 CP949      Korean - Unified Hangeul Code
                        MacArabic           CP10004    Arabic - MAC
                        MacCentralEurope    CP10029    Central Europe - MAC
                        MacCroatian         CP10082    Croatian - MAC
                        MacCyrillic         CP10007    Cyrillic - MAC
                        MacGreek            CP10006    Greek - MAC
                        MacHebrew           CP10005    Hebrew - MAC
                        MacIceland          CP10079    Iceland - MAC
                        Macintosh           CP10029    Macintosh - MAC
                        MacRoman            CP10000    Roman - MAC
                        MacRomania          CP10010    Romania - MAC
                        MacThai             CP10021    Thai - MAC
                        MacTurkish          CP10081    Turkish - MAC
                        MacUkraine          CP10017    Ukraine - MAC
                        OEM-Arabic          CP864      Arabic - OEM
                        OEM-Baltic          CP775      Baltic - OEM
                        OEM-CanadianFrench  CP863      Canadian French - OEM
                        OEM-Cyrillic        CP855      Cyrillic (primarily Russian) - OEM
                        OEM-Greek           CP737      Greek (formerly 437G) - OEM
                        OEM-Hebrew          CP862      Hebrew - OEM
                        OEM-Icelandic       CP861      Icelandic - OEM
                        OEM-LatinI          CP850      Latin 1 (West European) - OEM
                        OEM-LatinII         CP852      Latin 2 (Central and East European) - OEM
                        OEM-ModernGreek     CP869      Modern Greek - OEM
                        OEM-MultilingualLatinI
                        CP850      Multilingual Latin 1 - OEM
                        OEM-MultlingualLatinI+EuroSymbol
                        CP858      Multlingual Latin I + Euro symbol - OEM
                        OEM-Nordic          CP865      Nordic - OEM
                        OEM-Portuguese      CP860      Portuguese - OEM
                        OEM-Russian         CP866      Russian - OEM
                        OEM-Turkish         CP857      Turkish - OEM
                        SHIFT_JIS           CP932      Japanese - SHIFT-JIS
                        TCVN                CP1258     Vietnamese - TCVN
                        TIS-620             CP874      Thai - TIS-620
                        UCS-2               CP1200     Unicode - UCS-2
                        UCS-2BE             CP1201     Unicode - UCS-2 Big-Endian
                        UCS-2LE             CP1200     Unicode - UCS-2 Little-Endian (BMP of ISO 10646)
                        UHC                 CP949      Korean - Unified Hangeul Code
                        UTF-16              CP1200     Unicode - UTF-16
                        UTF-16BE            CP1201     Unicode - UTF-16 Big-Endian
                        UTF-16LE            CP1200     Unicode - UTF-16 Little-Endian
                        UTF-7               CP65000    Unicode - UTF-7
                        UTF-8               CP65001    Unicode - UTF-8
                        VISCII              CP1258     Vietnamese - VISCII

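An encoding may be specified either by name or by code page, and the value is case-insensitive.

However, apart from the Unicode family of superset encodings, every encoding covers only a limited range of languages and symbols. We call a conversion that may lose information an incompatible conversion. In other words, any irreversible conversion is an incompatible conversion.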
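
The "irreversible means incompatible" definition can be checked for a concrete piece of text with a round trip. The Python sketch below tests one string against one target encoding; `is_reversible` is a hypothetical helper, and wfr's own check is rule-based (it consults a table of encoding pairs) rather than content-based like this.

```python
def is_reversible(text: str, encoding: str) -> bool:
    """True if text survives encoding to `encoding` and decoding back."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # At least one character has no representation in the target.
        return False


print(is_reversible("中文", "gbk"))
print(is_reversible("한국어", "gbk"))
print(is_reversible("한국어", "utf-8"))
```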

The "-listcmp" option lists the encoding conversion compatibility rules that wfr follows, as shown below:

C:\Documents and Settings\Administrator>
                        #wfr /listcmp
                        -- Unicode Charset --
                        From                                    To
                        ===============================================================================
                        <ANY>                                   GB18030
                        UCS-2
                        UCS-2BE
                        UCS-2LE
                        UTF-16
                        UTF-16BE
                        UTF-16LE
                        UTF-7
                        UTF-8
                        -- Standard English --
                        From                                    To
                        ===============================================================================
                        ASCII                                   <ANY>
                        ISO-646
                        -- West European --
                        From                                    To
                        ===============================================================================
                        ANSI-LatinI                             ANSI-LatinI
                        ISO-8859-1                              ISO-8859-1
                        ISO-8859-14                             ISO-8859-14
                        ISO-8859-15                             ISO-8859-15
                        ISO-8859-9                              ISO-8859-9
                        MacIceland                              MacIceland
                        MacRoman                                MacRoman
                        OEM-CanadianFrench                      OEM-CanadianFrench
                        OEM-Icelandic                           OEM-Icelandic
                        OEM-LatinI                              OEM-LatinI
                        OEM-MultilingualLatinI                  OEM-MultilingualLatinI
                        OEM-MultlingualLatinI+EuroSymbol        OEM-MultlingualLatinI+EuroSymbol
                        OEM-Portuguese                          OEM-Portuguese
                        -- Central and East European --
                        From                                    To
                        ===============================================================================
                        ANSI-CentralEuropean                    ANSI-CentralEuropean
                        ISO-8859-2                              ISO-8859-2
                        MacCentralEurope                        MacCentralEurope
                        MacCroatian                             MacCroatian
                        MacRomania                              MacRomania
                        OEM-LatinII                             OEM-LatinII
                        -- South European --
                        From                                    To
                        ===============================================================================
                        ANSI-Turkish                            ANSI-Turkish
                        ARMSCII-8                               ARMSCII-8
                        ISO-8859-3                              ISO-8859-3
                        ISO-8859-9                              ISO-8859-9
                        MacTurkish                              MacTurkish
                        OEM-Turkish                             OEM-Turkish
                        -- North European --
                        From                                    To
                        ===============================================================================
                        ANSI-Baltic                             ANSI-Baltic
                        ISO-8859-10                             ISO-8859-10
                        ISO-8859-13                             ISO-8859-13
                        ISO-8859-4                              ISO-8859-4
                        OEM-Baltic                              OEM-Baltic
                        OEM-Nordic                              OEM-Nordic
                        -- Cyrillic --
                        From                                    To
                        ===============================================================================
                        ISO-8859-5                              ISO-8859-5
                        KOI8                                    KOI8
                        KOI8-R                                  KOI8-R
                        KOI8-U                                  KOI8-U
                        MacCyrillic                             MacCyrillic
                        MacUkraine                              MacUkraine
                        OEM-Cyrillic                            OEM-Cyrillic
                        OEM-Russian                             OEM-Russian
                        -- Greek --
                        From                                    To
                        ===============================================================================
                        ANSI-Greek                              ANSI-Greek
                        ISO-8859-7                              ISO-8859-7
                        MacGreek                                MacGreek
                        OEM-Greek                               OEM-Greek
                        OEM-ModernGreek                         OEM-ModernGreek
                        -- Arabic --
                        From                                    To
                        ===============================================================================
                        ANSI-Arabic                             ANSI-Arabic
                        ISO-8859-6                              ISO-8859-6
                        MacArabic                               MacArabic
                        OEM-Arabic                              OEM-Arabic
                        -- Hebrew --
                        From                                    To
                        ===============================================================================
                        ANSI-Hebrew                             ANSI-Hebrew
                        ISO-8859-8                              ISO-8859-8
                        MacHebrew                               MacHebrew
                        OEM-Hebrew                              OEM-Hebrew
                        -- Simplified Chinese --
                        From                                    To
                        ===============================================================================
                        EUC-CN                                  EUC-CN
                        GB18030                                 GB18030
                        GB2312                                  GB2312
                        GBK                                     GBK
                        HZ                                      HZ
                        ISO-2022-CN                             ISO-2022-CN
                        ISO-2022-CN-EXT                         ISO-2022-CN-EXT
                        -- Traditional Chinese --
                        From                                    To
                        ===============================================================================
                        BIG5                                    BIG5
                        BIG5-HKSCS                              BIG5-HKSCS
                        BIG5-HKSCS:1999                         BIG5-HKSCS:1999
                        BIG5-HKSCS:2001                         BIG5-HKSCS:2001
                        BIG5-HKSCS:2004                         BIG5-HKSCS:2004
                        EUC-TW                                  EUC-TW
                        -- Korean --
                        From                                    To
                        ===============================================================================
                        EUC-KR                                  EUC-KR
                        ISO-2022-KR                             ISO-2022-KR
                        JOHAB                                   JOHAB
                        KSC                                     KSC
                        UHC                                     UHC
                        -- Japanese --
                        From                                    To
                        ===============================================================================
                        EUC-JP                                  EUC-JP
                        ISO-2022-JP                             ISO-2022-JP
                        ISO-2022-JP-1                           ISO-2022-JP-1
                        ISO-2022-JP-2                           ISO-2022-JP-2
                        SHIFT_JIS                               SHIFT_JIS
                        -- Thai --
                        From                                    To
                        ===============================================================================
                        ANSI-Thai                               ANSI-Thai
                        MacThai                                 MacThai
                        TIS-620                                 TIS-620
                        -- Vietnamese --
                        From                                    To
                        ===============================================================================
                        TCVN                                    TCVN
                        VISCII                                  VISCII
                        -- GB Special --
                        From                                    To
                        ===============================================================================
                        ANSI-Greek                              GB18030
                        EUC-CN                                  GB2312
                        EUC-JP                                  GBK
                        GB2312
                        ISO-2022-JP
                        ISO-2022-JP-1
                        ISO-2022-JP-2
                        ISO-8859-1
                        ISO-8859-5
                        ISO-8859-7
                        KOI8
                        KOI8-R
                        KOI8-U
                        MacCyrillic
                        MacGreek
                        MacUkraine
                        OEM-Cyrillic
                        OEM-Greek
                        OEM-ModernGreek
                        OEM-Russian
                        SHIFT_JIS
                        -- CJK Charset --
                        From                                    To
                        ===============================================================================
                        BIG5                                    GB18030
                        BIG5-HKSCS                              GBK
                        BIG5-HKSCS:1999
                        BIG5-HKSCS:2001
                        BIG5-HKSCS:2004
                        EUC-CN
                        EUC-KR
                        EUC-TW
                        GB18030
                        GB2312
                        GBK
                        HZ
                        ISO-2022-CN
                        ISO-2022-CN-EXT
                        ISO-2022-KR
                        JOHAB
                        KSC
                        UHC

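If you can really guarantee that the conversion you are performing is meaningful, you can use the "/force" option to make wfr skip the character set encoding compatibility check.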

Download

bwfr.rar iconv.rar (optional)

posted on 2008-11-02 18:32 starspace
