Re: Emacs grep Problem in Windows
Hi, Xah Lee,
Below are some results of my experiments on the "Emacs grep Problem in Windows". (As I didn't find your email address on http://xahlee.org, and I think others might also be interested in this topic, I am putting it here on my blog, although I rarely write blog articles in English :-).
Environments:
OS: Windows XP SP2 (Simplified Chinese, default language settings (i.e. Simplified Chinese as my default locale))
GNU Emacs 23.1 & 24.0.1.50
Cygwin 1.7 era (cygwin-1.7.7-1, bash-3.2.51-24, grep-2.6.3-1)
Steps
Before starting Emacs, I set the environment variable:
LC_CTYPE=C.UTF-8
Then run runemacs.exe -Q and put the following into the *scratch* buffer:
;;(set-language-environment "UTF-8")
(setq shell-file-name "e:/cygwin/bin/bash.exe")
(setq default-process-coding-system '(utf-8-dos . euc-cn-unix))
and evaluate it (M-x eval-buffer).
Then call M-x grep with:
/bin/grep -nH -e 我试试 /cygdrive/e/foobar.utf8
I got the following result:
-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 17:05:18
/bin/grep -nH -e 我试试 /cygdrive/e/foobar.utf8
/cygdrive/e/foobar.utf8:10:我试试
Grep finished (matches found) at Sat Apr 09 17:05:20
Some explanation
Why use bash? Could cmdproxy.exe work?
The error message you encountered, "warning: extra args ignored after…", came from cmdproxy.exe. I think cmdproxy.exe does not let those UTF-8 characters pass through.
It is OK to use dash.exe (on Cygwin and MSYS, sh.exe is the same as bash.exe) or some other shell (any better choice?).
Strangely enough, MSYS's grep does not work for me:
-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 18:09:13
e:/emacs/ergoemacs/msys/bin/grep -nH -e 我 e:/foobar.utf8
Grep finished with no matches found at Sat Apr 09 18:09:13
(but in M-x shell, it works)
Improvements needed
First, default-process-coding-system is not meant to be used like this, because it is the default for every subprocess: some processes expect to work in ANSI, some do not. Is process-coding-system-alist a proper option?
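To make the process-coding-system-alist question concrete, here is a minimal, untested sketch. It assumes that M-x grep launches its command through shell-file-name, so the program name Emacs matches against is the shell ("bash") and not grep itself. The coding-system pair is simply the one used in the steps above.

```elisp
;; Sketch only: scope the coding systems to processes whose program
;; name matches the regexp "bash", leaving the global default alone.
;; Assumption: M-x grep runs its command via `shell-file-name', so the
;; program name seen here is the shell, not grep.
(add-to-list 'process-coding-system-alist
             '("bash" . (utf-8-dos . euc-cn-unix)))
```

Every other command started through the same shell would get these coding systems too, so this narrows the setting but does not make it grep-specific.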
Second, sometimes we need to search in an ANSI file and sometimes in a UTF-8 file. Is there a convenient way, e.g. M-x utf8-grep/ansi-grep? Or maybe we could implement an M-x ucs-grep (adding an interactive argument that lets the user enter the charset name)?
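One possible shape for such a command is sketched below. The name my-ucs-grep is hypothetical, and I have not tested it on Windows. It only controls how the grep output is decoded, by binding coding-system-for-read around the call to grep. How the search pattern itself is encoded on the command line is a separate question that this sketch does not address.

```elisp
(require 'grep)

;; Hypothetical command, not part of Emacs: ask for a coding system,
;; then run grep with its output decoded in that coding system.
(defun my-ucs-grep (coding command)
  "Run grep COMMAND and decode its output using CODING."
  (interactive
   (progn
     ;; Make sure `grep-command' has its default value before prompting.
     (grep-compute-defaults)
     (list (read-coding-system "Decode grep output as: " 'utf-8)
           (read-shell-command "Run grep (like this): "
                               grep-command 'grep-history))))
  ;; `coding-system-for-read' overrides the decoding of process output
  ;; for processes started while it is bound.
  (let ((coding-system-for-read coding))
    (grep command)))
```

With this, M-x my-ucs-grep would prompt first for the charset (for example utf-8 or chinese-gbk) and then for the usual grep command line.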
Some notes
LC_CTYPE=C.UTF-8
Adding it as an environment variable is not the only way. It also works if you evaluate (setenv "LC_CTYPE" "C.UTF-8") in Emacs.
PATH=e:/cygwin/bin:$PATH
This is recommended before starting Emacs, so that you can type fewer keys in the following steps (the path to Cygwin can be omitted):
(setq shell-file-name "bash.exe")
grep -nH -e 我试试 /cygdrive/e/foobar.utf8
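If you prefer not to touch the system environment, the same effect can probably be had from inside Emacs. This is a sketch under the assumption that Cygwin lives in e:/cygwin, as in my setup. It changes both the PATH handed to subprocesses and exec-path, which Emacs itself uses to locate programs.

```elisp
;; Assumption: Cygwin is installed under e:/cygwin.
;; Prepend its bin directory to the PATH inherited by subprocesses.
(setenv "PATH" (concat "e:/cygwin/bin" path-separator (getenv "PATH")))
;; Let Emacs itself find bash.exe and grep.exe without a full path.
(add-to-list 'exec-path "e:/cygwin/bin")
(setq shell-file-name "bash.exe")
```

Using path-separator avoids hard-coding ";" or ":" in the PATH value.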
Something that might be worth mentioning: using MSYS's bash + Cygwin's grep
You can get the correct grep result if you use MSYS's sh/bash with Cygwin's grep. But it might be difficult to make sure that MSYS and Cygwin are not confused by the paths: they interpret / differently, and if you use a DOS path, Cygwin prints an annoying warning:
-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 17:28:40
e:/cygwin/bin/grep -nH -e 我试试 e:/foobar.utf8
cygwin warning:
MS-DOS style path detected: e:/foobar.utf8
Preferred POSIX equivalent is: /cygdrive/e/foobar.utf8
CYGWIN environment variable option "nodosfilewarning" turns off this warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
e:/foobar.utf8:10:我试试
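As the warning itself says, the "nodosfilewarning" option of the CYGWIN environment variable turns it off. A small sketch of setting it from Emacs, keeping whatever options CYGWIN already holds (the variable takes a space-separated list of options):

```elisp
;; Append "nodosfilewarning" to the CYGWIN environment variable
;; without discarding options that are already set.
(let ((old (getenv "CYGWIN")))
  (setenv "CYGWIN"
          (if (and old (not (string= old "")))
              (concat old " nodosfilewarning")
            "nodosfilewarning")))
```

This only affects processes started from Emacs after the form is evaluated.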
