Item 9: Master Emacs's regular expressions
The best way to do this is to get yourself the Friedl book, Mastering Regular Expressions. It's worth it. Every programmer should have a copy, no matter what language or editor you're using.
Emacs's regular expressions have some idiosyncrasies that everyone dislikes, but they're not insurmountable, and once you've learned them, they open up new horizons in editing power.
Two important regexp-related commands are isearch-forward-regexp and isearch-backward-regexp. These are by default bound to ESC C-s and ESC C-r, respectively. Those keys are lame, though. Any sequence that requires hitting the actual Escape key is lame, and Alt-Ctrl-s on my Compaq machine is invisible to Emacs; it brings up a system diagnostics dialog.
I have the isearch-*-regexp commands bound to Alt-r and Alt-s, since I use them so much. Alt-s isn't normally bound. The default Emacs binding for Alt-r is the command move-to-window-line, which you won't need, because you'll use Item 4 for moving around within the window.
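A minimal sketch of what such bindings might look like in an init file. The isearch-mode-map lines are an assumption on my part (they make the same key repeat the search once it is running); the original setup may have differed. Note that in recent Emacs versions M-s is a prefix key for other search commands, so a global binding like this one shadows that prefix.

```elisp
;; Sketch only: bind regexp incremental search to Alt-s / Alt-r.
(global-set-key (kbd "M-s") 'isearch-forward-regexp)
(global-set-key (kbd "M-r") 'isearch-backward-regexp)

;; Assumption: while a search is active, let the same keys jump to the
;; next / previous match instead of whatever isearch binds them to.
(define-key isearch-mode-map (kbd "M-s") 'isearch-repeat-forward)
(define-key isearch-mode-map (kbd "M-r") 'isearch-repeat-backward)
```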
Some modes insist on re-binding Alt-r and Alt-s, which is annoying, and I have a bunch of per-mode hacks to re-bind them, but I don't have all modes covered. If someone can suggest a way to bind Alt-r and Alt-s in such a way that they can't be overridden by any modes, please let me know. I'd be muchly appreciative.
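One commonly suggested approach, offered here as a sketch and not as the author's solution, is to put the bindings in the keymap of a global minor mode. Minor-mode keymaps are consulted before major-mode keymaps, so major modes can no longer shadow the keys. The names my-keys-minor-mode and my-keys-minor-mode-map are hypothetical. This does not cover every case: another minor mode can still win, and the bindings also take effect in the minibuffer.

```elisp
;; Hypothetical names throughout; a sketch, not a definitive fix.
(defvar my-keys-minor-mode-map
  (let ((map (make-sparse-keymap)))
    (define-key map (kbd "M-s") 'isearch-forward-regexp)
    (define-key map (kbd "M-r") 'isearch-backward-regexp)
    map)
  "Keymap holding bindings that major modes should not override.")

(define-minor-mode my-keys-minor-mode
  "Global minor mode whose keymap takes precedence over major-mode maps."
  :init-value t
  :global t
  :lighter " my-keys"
  :keymap my-keys-minor-mode-map)

;; Turn the mode on explicitly as well, in case the default is not enough.
(my-keys-minor-mode 1)
```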
The next two important regexp-related commands are replace-regexp and query-replace-regexp. They function identically, prompting for a regular expression and a replacement string, but query-replace-regexp prompts you to type y or n for each possible replacement.
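Both commands are meant for interactive use. In Lisp code the same effect is usually written as a loop over re-search-forward and replace-match. The helper below, my-replace-regexp-demo, is a hypothetical name used only for illustration; it runs the loop in a temporary buffer so it can be tried safely.

```elisp
;; Hypothetical helper: replace every match of REGEXP in TEXT.
(defun my-replace-regexp-demo (regexp replacement text)
  "Return TEXT with every match of REGEXP replaced by REPLACEMENT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The third argument t makes the search return nil, instead of
    ;; signaling an error, when there are no more matches.
    (while (re-search-forward regexp nil t)
      ;; Second argument t: insert REPLACEMENT without case conversion.
      (replace-match replacement t))
    (buffer-string)))

(my-replace-regexp-demo "colou?r" "color" "colour and color")
;; returns "color and color"
```

The demo assumes the regexp never matches the empty string; a pattern that can would need extra care to avoid looping forever.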
I use query-replace-regexp so frequently that I have an alias for it:
(defalias 'qrr 'query-replace-regexp)
That way I can type M-x qrr to invoke the function.
Other useful commands that take regexps are M-x list-matching-lines, which shows you all the lines in a buffer that match some regexp, and M-x apropos, which shows you all commands whose names match a given regexp.
The most frequently asked question about Emacs regexps is: "How do I insert a newline into a regexp or the replacement string?" Hitting the Enter key simply tells the command that you're done entering the regexp, and it starts doing the replacements. (This is a very good reason for preferring query-replace-regexp over replace-regexp until you're 100% confident that your regexps are right on the first try. I'm still not there yet.)
The answer is that you need to insert a ^J character (that is, Ctrl-j), which Emacs uses to represent newlines in functions and commands. At the point in the regexp or replacement where you need to insert a newline, hit Ctrl-q followed by Ctrl-j. Ctrl-q is Emacs's "quote" command: rather than executing the following keystroke, Emacs will insert the key into the current buffer or the minibuffer.
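In elisp code the same character is written "\n" inside a string, so no quoting keystroke is needed there. A small sketch, with my-split-after-commas as a hypothetical helper name:

```elisp
;; ^J (Ctrl-j) and "\n" are the same character, code 10.
(= ?\C-j ?\n)   ; evaluates to t

;; Hypothetical helper: put each comma-separated item on its own line
;; by using a newline in the replacement string.
(defun my-split-after-commas (text)
  "Return TEXT with a newline after each comma, dropping following spaces."
  (replace-regexp-in-string ", *" ",\n" text))

(my-split-after-commas "a, b, c")
;; returns a string with three lines: "a,", "b," and "c"
```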
Some other useful things to know about Emacs regular expressions:
* You need to double-escape ("\\") regexp metacharacters in strings in elisp code, but you single-escape them when entering regexps at the minibuffer.
* Emacs code does so much matching of parens that Emacs regexps reverse the paren character and metacharacter. In an Emacs regexp, "(" and ")" match actual parens, while "\(" and "\)" create matching groups.
* In the replacement string, use \1, \2, etc. for inserting match-groups from the regexp.
* If you enter a regexp and it doesn't work, undo the result if the command actually changed anything. Then type the command again, and when it prompts you, use the arrow keys to scroll up and down to find your regexp and replacement string, and modify them in place. This can save you a lot of typing and frustration.
* You can yank text into the middle of a regexp you're typing out with most regexp commands, but NOT in the isearch-*-regexp commands. Yank does weird things in those. I'd love to know how to fix this.
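The first three points can be seen together in one short sketch. The function name my-swap-call is hypothetical. At the minibuffer the same regexp would be typed as \([a-z]+\)(\([a-z]+\)) and the replacement as \2(\1); in an elisp string every backslash is doubled.

```elisp
;; Hypothetical helper: rewrite name(arg) as arg(name).
(defun my-swap-call (text)
  "Swap the name and the argument of each name(arg) form in TEXT."
  (replace-regexp-in-string
   ;; "\\(" and "\\)" open and close groups; the bare "(" and ")"
   ;; in the middle match literal parens.
   "\\([a-z]+\\)(\\([a-z]+\\))"
   ;; "\\2" and "\\1" insert the text matched by groups 2 and 1.
   "\\2(\\1)"
   text
   t))   ; t: keep the replacement's case exactly as written

(my-swap-call "foo(bar)")
;; returns "bar(foo)"
```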
Mastering regular expressions and the commands that use them is one of the most important components of becoming an Emacs Wizard.