Re: Emacs grep Problem in Windows

Hi, Xah Lee,

Below are some results from my experiments on "Emacs grep Problem in Windows". (I could not find your email address on http://xahlee.org, and I think others might also be interested in this topic, so I am posting it here on my blog, although I rarely write blog articles in English :-).

Environment:

OS: Windows XP SP2 (Simplified Chinese edition with default language settings, i.e. Simplified Chinese as the default locale)
GNU Emacs 23.1 & 24.0.1.50
Cygwin 1.7 (cygwin-1.7.7-1, bash-3.2.51-24, grep-2.6.3-1)

Steps

Before starting Emacs, I set the environment variable

LC_CTYPE=C.UTF-8

Then I ran runemacs.exe -Q and put the following into the *scratch* buffer:

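;;(set-language-environment "UTF-8")
(setq shell-file-name "e:/cygwin/bin/bash.exe")
;; The value is (DECODING . ENCODING): decode process output as UTF-8,
;; encode what Emacs sends to the process as EUC-CN.
(setq default-process-coding-system '(utf-8-dos . euc-cn-unix))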

and evaluated them (M-x eval-buffer).

Then I called M-x grep with:

/bin/grep -nH -e 我试试 /cygdrive/e/foobar.utf8

I got the following result:

-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 17:05:18

/bin/grep -nH -e 我试试 /cygdrive/e/foobar.utf8
/cygdrive/e/foobar.utf8:10:我试试

Grep finished (matches found) at Sat Apr 09 17:05:20

Some explanation

 

Why use bash? Could cmdproxy.exe work?

The error message you encountered, "warning: extra args ignored after…", came from cmdproxy.exe. I think cmdproxy.exe would not let those UTF-8 characters pass through.

It is also fine to use dash.exe (on Cygwin and MSYS, sh.exe is the same as bash.exe) or some other shell (any better choice?).

Strangely enough, MSYS's grep does not work for me:

-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 18:09:13

e:/emacs/ergoemacs/msys/bin/grep -nH -e 我 e:/foobar.utf8

Grep finished with no matches found at Sat Apr 09 18:09:13

(But it does work in M-x shell.)

Improvements needed

First, default-process-coding-system is not meant to be used like this, because some processes expect to work in ANSI while others do not. Is process-coding-system-alist a better option?
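
To make that question concrete, here is a minimal sketch of what a process-coding-system-alist entry could look like. It assumes that M-x grep starts its command through shell-file-name, so the pattern has to match the shell's program name ("bash" here) and not "grep". The coding pair is simply copied from my setup above. I have not verified this on every Emacs version.

```elisp
;; Sketch only: limit the coding pair to processes whose program name
;; matches "bash", instead of changing the global default.
;; Assumption: M-x grep runs its command line through shell-file-name.
(add-to-list 'process-coding-system-alist
             '("bash" . (utf-8-dos . euc-cn-unix)))
```

This is narrower than changing default-process-coding-system, but it still applies to everything started through bash, not only to grep.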

Second, sometimes we need to search in an ANSI file and sometimes in a UTF-8 file. Is there a convenient way to do this, e.g. M-x utf8-grep/ansi-grep? Or maybe we could implement an M-x ucs-grep (with an interactive argument that lets the user enter the charset name)?
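
A rough sketch of such a command follows. The name my-coding-grep is hypothetical and not part of Emacs. It only overrides how the grep output is decoded, by binding coding-system-for-read for the duration of the call, and it leaves the encoding of what Emacs sends to the process untouched. Whether overriding the decoding side alone is enough will depend on the shell and the grep build in use.

```elisp
(require 'grep)

;; Hypothetical command, for illustration only.
(defun my-coding-grep (coding command)
  "Run grep COMMAND, decoding its output with coding system CODING."
  (interactive
   (progn
     ;; Make sure `grep-command' has its default value.
     (grep-compute-defaults)
     (list (read-coding-system "Decode grep output as (default utf-8): "
                               'utf-8)
           (read-shell-command "Run grep (like this): "
                               grep-command 'grep-history))))
  ;; The binding is in effect only while the process is being started,
  ;; so other processes keep the globally configured coding systems.
  (let ((coding-system-for-read coding))
    (grep command)))
```

With this, M-x my-coding-grep asks for a coding system first (for example utf-8-dos or chinese-gbk) and then for the usual grep command line.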

Some notes

 

LC_CTYPE=C.UTF-8

Setting it as an environment variable before starting Emacs is not the only way. It also works if you evaluate (setenv "LC_CTYPE" "C.UTF-8") inside Emacs.

PATH=e:/cygwin/bin:$PATH

This is recommended before starting Emacs. You can then type less in the following steps (the path to Cygwin can be omitted):

(setq shell-file-name "bash.exe")
grep -nH -e 我试试 /cygdrive/e/foobar.utf8
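
If you would rather not change PATH outside Emacs, something similar can probably be done from inside Emacs. The sketch below assumes the same e:/cygwin/bin location as above. It updates both the PATH seen by child processes and exec-path, which Emacs itself uses to find bash.exe.

```elisp
;; Sketch: prepend the Cygwin bin directory inside Emacs.
(let ((cygwin-bin "e:/cygwin/bin"))   ; location assumed from the setup above
  ;; PATH is what child processes such as bash and grep will see.
  (setenv "PATH" (concat cygwin-bin path-separator (getenv "PATH")))
  ;; exec-path is what Emacs searches when it starts a program.
  (add-to-list 'exec-path cygwin-bin))
```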

Something worth mentioning: using MSYS's bash + Cygwin's grep

You can get the correct grep result if you use MSYS's sh/bash with Cygwin's grep. But it can be difficult to make sure that neither MSYS nor Cygwin gets confused by the path: they interpret / differently, and if you use a DOS path, Cygwin prints an annoying warning:

-*- mode: grep; default-directory: "d:/emacs/emacs-24/bin/" -*-
Grep started at Sat Apr 09 17:28:40

e:/cygwin/bin/grep -nH -e 我试试 e:/foobar.utf8
cygwin warning:
   MS-DOS style path detected: e:/foobar.utf8
   Preferred POSIX equivalent is: /cygdrive/e/foobar.utf8
   CYGWIN environment variable option "nodosfilewarning" turns off this warning.
   Consult the user's guide for more details about POSIX paths:
     http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
e:/foobar.utf8:10:我试试
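
As the warning itself says, the CYGWIN option "nodosfilewarning" turns it off. A minimal way to set it from inside Emacs might be the following. Note that it replaces any value the CYGWIN variable already had.

```elisp
;; Sketch: ask Cygwin programs started from Emacs to skip the
;; MS-DOS style path warning. This overwrites any existing CYGWIN value.
(setenv "CYGWIN" "nodosfilewarning")
```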






posted @ 2011-04-09 18:26  巴蛮子