grep man 有删减 百科

NAME
grep, egrep, fgrep, rgrep - print lines matching a pattern

SYNOPSIS
grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]

DESCRIPTION
grep searches the named input FILEs (or standard input if no files are named, or if a single hyphen-minus (-) is given as file name) for
lines containing a match to the given PATTERN. By default, grep prints the matching lines.

OPTIONS

Matching Control
-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern. This can be used to specify multiple search patterns, or to protect a pattern beginning with a hyphen
(-). (-e is specified by POSIX.)

-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by
POSIX.)

-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files. (-i is specified by POSIX.)

-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)

-w, --word-regexp
Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the
beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or
followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.

-x, --line-regexp
Select only those matches that exactly match the whole line. (-x is specified by POSIX.)

-y Obsolete synonym for -i.

General Output Control
-c, --count
Suppress normal output; instead print a count of matching lines for each input file. With the -v, --invert-match option (see below),
count non-matching lines. (-c is specified by POSIX.)

-L, --files-without-match
Suppress normal output; instead print the name of each input file from which no output would normally have been printed. The
scanning will stop on the first match.

-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning
will stop on the first match. (-l is specified by POSIX.)

-q, --quiet, --silent
Quiet; do not write anything to standard output. Exit immediately with zero status if any match is found, even if an error was
detected. Also see the -s or --no-messages option. (-q is specified by POSIX.)

-s, --no-messages
Suppress error messages about nonexistent or unreadable files. Portability note: unlike GNU grep, 7th Edition Unix grep did not
conform to POSIX, because it lacked -q and its -s option behaved like GNU grep's -q option. USG-style grep also lacked -q but its -s
option behaved like GNU grep. Portable shell scripts should avoid both -q and -s and should redirect standard and error output to
/dev/null instead. (-s is specified by POSIX.)

Output Line Prefix Control
-H, --with-filename
Print the file name for each match. This is the default when there is more than one file to search.

-h, --no-filename
Suppress the prefixing of file names on output. This is the default when there is only one file (or only standard input) to search.

-n, --line-number
Prefix each line of output with the 1-based line number within its input file. (-n is specified by POSIX.)

-T, --initial-tab
Make sure that the first character of actual line content lies on a tab stop, so that the alignment of tabs looks normal. This is
useful with options that prefix their output to the actual content: -H,-n, and -b. In order to improve the probability that lines
from a single file will all start at the same column, this also causes the line number and byte offset (if present) to be printed in
a minimum size field width.

Context Line Control
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines. Places a line containing a group separator (--) between contiguous groups
of matches. With the -o or --only-matching option, this has no effect and a warning is given.

-B NUM, --before-context=NUM
Print NUM lines of leading context before matching lines. Places a line containing a group separator (--) between contiguous groups
of matches. With the -o or --only-matching option, this has no effect and a warning is given.

-C NUM, -NUM, --context=NUM
Print NUM lines of output context. Places a line containing a group separator (--) between contiguous groups of matches. With the
-o or --only-matching option, this has no effect and a warning is given.


File and Directory Selection

--exclude=GLOB
Skip files whose base name matches GLOB (using wildcard matching). A file-name glob can use *, ?, and [...] as wildcards, and \ to
quote a wildcard or backslash character literally.

--exclude-from=FILE
Skip files whose base name matches any of the file-name globs read from FILE (using wildcard matching as described under --exclude).

--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive searches.

--include=GLOB
Search only files whose base name matches GLOB (using wildcard matching as described under --exclude).

-R, -r, --recursive
Read all files under each directory, recursively; this is equivalent to the -d recurse option.

REGULAR EXPRESSIONS

The period . matches any single character.

Character Classes and Bracket Expressions
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of
the list is the caret ^ then it matches any character not in the list. To include a literal ] place it first in the list.
Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.

Anchoring
The caret ^ and the dollar sign $ are meta-characters that respectively match the empty string at the beginning and end of a line.

The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the
edge of a word, and \B matches the empty string provided it's not at the edge of a word. The symbol \w is a synonym for [[:alnum:]] and \W
is a synonym for [^[:alnum:]].``\s'' Match whitespace, it is a synonym for `[[:space:]]'.``\S'' Match non-whitespace, it is a synonym
for `[^[:space:]]'. For example, `\brat\b' matches the separate word `rat', `\Brat\B' matches `crate' but not `furry rat'.

Repetition
A regular expression may be followed by one of several repetition operators:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{n,m} The preceding item is matched at least n times, but not more than m times.

Concatenation
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that
respectively match the concatenated expressions.

Alternation
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either alternate
expression.

Precedence
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole expression may be enclosed in
parentheses to override these precedence rules and form a subexpression.

Back References and Subexpressions
The back-reference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the
regular expression.

EXIT STATUS
The exit status is 0 if selected lines are found, and 1 if not found. If an error occurred the exit status is 2. (Note: POSIX error
handling code should check for '2' or greater.)

TeXinfo Documentation
The full documentation for grep is maintained as a TeXinfo manual. If the info and grep programs are properly installed at your site, the
command

info grep

should give you access to the complete manual.

 

 

http://baike.baidu.com/view/1057278.htm

GREP
编辑
grep (global search regular expression(RE) and print out the line,全面搜索正则表达式并把行打印出来)是一种强大的文本搜索工具,它能使用正则表达式搜索文本,并把匹配的行打印出来。Unix的grep家族包括grep、egrep和fgrep。

1
基本简介
egrep和fgrep的命令只跟grep有很小不同。egrep是grep的扩展,支持更多的re元字符,fgrep就是fixed grep或fast grep,它们把所有的字母都看作单词,也就是说,正则表达式中的元字符表示回其自身的字面意义,不再特殊。linux使用GNU版本的grep。它功能更强,可以通过-G、-E、-F命令行选项来使用egrep和fgrep的功能。
grep的工作方式是这样的,它在一个或多个文件中搜索字符串模板。如果模板包括空格,则必须被引用,模板后的所有字符串被看作文件名。搜索的结果被送到屏幕,不影响原文件内容。
grep可用于shell脚本,因为grep通过返回一个状态值来说明搜索的状态,如果模板搜索成功,则返回0,如果搜索不成功,则返回1,如果搜索的文件不存在,则返回2。我们利用这些返回值就可进行一些自动化的文本处理工作。
2
表达符集
^
锚定行的开始 如:'^grep'匹配所有以grep开头的行。
$
锚定行的结束 如:'grep$'匹配所有以grep结尾的行。
.
匹配一个非换行符('\n')的字符如:'gr.p'匹配gr后接一个任意字符,然后是p。
*
匹配零个或多个先前字符 如:' *grep' (注意*前有空格)匹配所有零个或多个空格后紧跟grep的行,需要用egrep 或者grep带上 -E 选项。 .*一起用代表任意字符。
[]
匹配一个指定范围内的字符,如'[Gg]rep'匹配Grep和grep。
[^]
匹配一个不在指定范围内的字符,如:'[^A-FH-Z]rep'匹配不包含A-F和H-Z的一个字母开头,紧跟rep的行。
\(..\)
标记匹配字符,如'\(love\)',love被标记为1。
\<
锚定单词的开始,如:'\<grep'匹配包含以grep开头的单词的行。
\>
锚定单词的结束,如'grep\>'匹配包含以grep结尾的单词的行。
x\{m\}
重复字符x,m次,如:'o\{5\}'匹配包含5个o的行。
x\{m,\}
重复字符x,至少m次,如:'o\{5,\}'匹配至少有5个o的行。
x\{m,n\}
重复字符x,至少m次,不多于n次,如:'o\{5,10\}'匹配5--10个o的行。
\w
匹配文字和数字字符,也就是[A-Za-z0-9],如:'G\w*p'匹配以G后跟零个或多个文字或数字字符,然后是p。
\W
\w的反置形式,匹配一个或多个非单词字符,如点号句号等。
\b
单词锁定符,如: '\bgrep\b'只匹配grep。[1]
用于egrep和 grep -E的元字符扩展集
+
匹配一个或多个先前的字符。如:'[a-z]+able',匹配一个或多个小写字母后跟able的串,如loveable,enable,disable等。
?
匹配零个或一个先前的字符。如:'gr?p'匹配gr后跟一个或没有字符,然后是p的行。
a|b|c
匹配a或b或c。如:grep|sed匹配grep或sed
()
分组符号,如:love(able|rs)ov+匹配loveable或lovers,匹配一个或多个ov。
x{m},x{m,},x{m,n}
作用同x\{m\},x\{m,\},x\{m,n\}
POSIX字符类
为了在不同国家的字符编码中保持一至,POSIX(The Portable Operating System Interface)增加了特殊的字符类,如[:alnum:]是A-Za-z0-9的另一个写法。要把它们放到[]号内才能成为正则表达式,如[A- Za-z0-9]或[[:alnum:]]。在linux下的grep除fgrep外,都支持POSIX的字符类。
[:alnum:]
文字数字字符
[:alpha:]
文字字符
[:digit:]
数字字符
[:graph:]
非空字符(非空格、控制字符)
[:lower:]
小写字符
[:cntrl:]
控制字符
[:print:]
非空字符(包括空格)
[:punct:]
标点符号
[:space:]
所有空白字符(新行,空格,制表符)
[:upper:]
大写字符
[:xdigit:]
十六进制数字(0-9,a-f,A-F)
3
命令选项
-?
同时显示匹配行上下的?行,如:grep -2 pattern filename同时显示匹配行的上下2行。
-a, --text
等价于匹配text,用于(Binary file (standard input) matches)报错
-b,--byte-offset
打印匹配行前面打印该行所在的块号码。
-c,--count
只打印匹配的行数,不显示匹配的内容。
-f File,--file=File
从文件中提取模板。空文件中包含0个模板,所以什么都不匹配。
-h,--no-filename
当搜索多个文件时,不显示匹配文件名前缀。
-i,--ignore-case
忽略大小写差别。
-o, --only-matching
只显示正则表达式匹配的部分。(show only the part of a line matching PATTERN)
-q,--quiet
取消显示,只返回退出状态。0则表示找到了匹配的行。
-l,--files-with-matches
打印匹配模板的文件清单。
-L,--files-without-match
打印不匹配模板的文件清单。
-n,--line-number
在匹配的行前面打印行号。
-s,--silent
不显示关于不存在或者无法读取文件的错误信息。
-v,--revert-match
反检索,只显示不匹配的行。
-w,--word-regexp
如果被\<和\>引用,就把表达式做为一个单词搜索。
-R, -r, --recursive
递归的读取目录下的所有文件,包括子目录。 比如grep -R 'pattern' test会在 test 及其子目录下的所有文件中,匹配 pattern。
-V,--version
显示软件版本信息。
4
实例
要用好grep这个工具,其实就是要写好正则表达式,所以这里不对grep的所有功能进行实例讲解,只列几个例子,讲解一个正则表达式的写法。
$ ls -l | grep '^a'
通过管道过滤ls -l输出的内容,只显示以a开头的行。
$ grep 'test' d*
显示所有以d开头的文件中包含test的行。
$ grep 'test' aa bb cc
显示在aa,bb,cc文件中匹配test的行。
$ grep '[a-z]\{5\}' aa
显示所有包含每个字符串有5个连续小写字符的字符串的行。
$ grep 'w\(es\)t.*\1' aa
如果west被匹配,则es就被存储到内存中,并标记为1,然后搜索任意个字符(.*),这些字符后面紧跟着另外一个es(\1),找到就显示该行。如果用egrep或grep -E,就不用"\"号进行转义,直接写成'w(es)t.*\1'就可以了。
5
注意
在某些机器上,要使用-E参数才能够进行逻辑匹配(详见下)
grep "a|b" (匹配包含字符样式为"a|b"的行)
grep -E "a|b" (匹配包含字符样式为"a"或"b"的行)
man grep里面关于-E参数的说明是
-E
Treats each pattern specified as an extended regular expression (ERE). A NULL value for the ERE matches every line.
Note: The grep command with the -E flag is the same as the egrep command, except that error and usage messages are different and the -s flag functions differently.
6
拓展命令
egrep 命令,搜索文件获得模式。
egrep 命令会在输入文件(缺省值为标准输入)中搜索与用 Pattern 参数指定的模式相匹配的行。这些模式是完整的正则表达式就像在 ed 命令中的那样(除了 \ (反斜杠)和 \\ (双反斜杠))。下列规则也应用于 egrep 命令:
* 一个正则表达式后面带一个 + (加号)会匹配一个或多个的正则表达式。
* 一个正则表达式后面带一个 ? (问号)会匹配零个或一个该正则表达式。
* 由 | (竖线)或者换行符隔开的多个正则表达式会匹配与任何一个正则表达式所匹配的字符串。
* 一个正则表达式可以被包括在“()”(括弧)中进行分组。
换行符将不会被正则表达式匹配。
运算符的优先顺序是 [, ], *, ?, +, 合并, | 和换行符。
注意: egrep 命令与 grep 命令带 -E 标志是一样的,除了错误消息和使用消息不同以及 -s 标志的功能不同之外。
egrep 命令会显示包含该匹配行的文件,如果您指定了多于一个 File 参数的话。
对 shell 有特殊含义的字符($, *, [, |, ^, (, ), \ ) 出现在 Pattern 参数中时必须带双引号。如果 Pattern 参数不是简单字符串,通常必须用单引号将整个模式括起来。在表达式中比如 [a-z],减号表示通过当前整理序列。整理序列可以定义等价的类以供在字符范围中使用。它使用了快速确定性的算法,有时需要外部空间。[2]
fgrep命令, 为文件搜索文字字符串。
fgrep命令搜索 File 参数指定的输入文件(缺省为标准输入)中的匹配模式的行。fgrep命令特别搜索 Pattern 参数,它们是固定的字符串。如果在 File 参数中指定一个以上的文件fgrep命令将显示包含匹配行的文件。
fgrep命令于 grep 和 egrep 命令不同,因为它搜索字符串而不是搜索匹配表达式的模式。fgrep命令使用快速的压缩算法。$, *, [, |, (, ) 和 \ 等字符串被fgrep命令按字面意思解释。这些字符并不解释为正则表达式,但它们在 grep 和 egrep 命令中解释为正则表达式。
因为这些字符对于 shell 有特定的含义,完整的字符串应该加上单引号(‘ ... ’)。
如果没有指定文件,fgrep命令假定标准输入。一般,找到的每行都复制到标准输出中去。如果不止一个输入文件,则在找到的每行前打印文件名。

 

posted @ 2014-01-05 09:17  baihuahua  阅读(502)  评论(0编辑  收藏  举报