Linux Command Series: grep

grep, egrep, fgrep - print lines matching a pattern

　　SYNOPSIS
　　　　grep [OPTIONS] PATTERN [FILE...]
　　　　grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]

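On Linux, grep is a powerful text-search tool: it searches text using regular expressions and prints the lines that match. The name is usually expanded as Global Regular Expression Print, after the ed command g/re/p (globally search for a regular expression and print).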

1. Common options:

　　-E, --extended-regexp: Interpret PATTERN as an extended regular expression. # enables extended regular expressions (ERE)

　　-i, --ignore-case: Ignore case distinctions in both the PATTERN and the input files. # ignores case

　　-v, --invert-match: Invert the sense of matching, to select non-matching lines. # inverted: prints only the lines that do not match and suppresses the ones that do

　　-n, --line-number: Prefix each line of output with the 1-based line number within its input file. # shows line numbers

　　-w, --word-regexp: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore. # the match must be a whole word, not part of a longer word; for example, if the text contains liker and you are searching only for like, -w prevents liker from matching

　　-c, --count: Suppress normal output; instead print a count of matching lines for each input file. # prints how many lines matched instead of the matching lines themselves; combined with -v (as -cv), it prints how many lines did not match

　　-o, --only-matching: Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line. # prints only the text matched by the pattern

　　-A NUM, --after-context=NUM: Print NUM lines of trailing context after matching lines. # prints each matching line and the NUM lines after it

　　-B NUM, --before-context=NUM: Print NUM lines of leading context before matching lines. # prints each matching line and the NUM lines before it

　　-C NUM, -NUM, --context=NUM: Print NUM lines of output context. # prints each matching line plus NUM lines before and NUM lines after it
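
The options above can be tried out with a short sketch like the following. The four-line sample text and the variable names are made up for illustration, and GNU grep is assumed.

```shell
# Hypothetical sample input, invented for this example.
sample='I like cats
The liker clicked
LIKE it or not
nothing here'

# -i ignores case; -n prefixes each matching line with its line number.
with_numbers=$(printf '%s\n' "$sample" | grep -in 'like')
echo "$with_numbers"

# -w matches "like" only as a whole word, so "liker" is skipped.
whole_word=$(printf '%s\n' "$sample" | grep -w 'like')
echo "$whole_word"

# -c counts matching lines; -cv counts the lines that do not match.
matched=$(printf '%s\n' "$sample" | grep -c 'like')
unmatched=$(printf '%s\n' "$sample" | grep -cv 'like')
echo "$matched matched, $unmatched did not"

# -o prints only the matched text, one match per output line.
only_match=$(printf '%s\n' "$sample" | grep -o 'lik[a-z]*')
echo "$only_match"

# -A 1 prints each matching line plus one line of trailing context.
after=$(printf '%s\n' "$sample" | grep -A 1 'liker')
echo "$after"
```

Reading from standard input keeps the sketch free of temporary files; the same options behave identically when file names are given instead.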

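2. Patterns:

　　(a) Basic regular expressions (BRE):

　　　　Matching characters:

　　　　　　. : any single character

　　　　　　[abc] : matches any one of the characters a, b, or c

　　　　　　[a-zA-Z] : matches any one character in the range a-z or A-Z

　　　　　　[^123] : matches any one character other than 1, 2, or 3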

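　　　　　　For commonly used character sets, the following POSIX character classes are defined (the range equivalents shown hold in the C locale):

　　　　　　　　[a-zA-Z] <=> [[:alpha:]]

　　　　　　　　[0-9] <=> [[:digit:]]

　　　　　　　　[a-zA-Z0-9] <=> [[:alnum:]]

　　　　　　　　whitespace (space, tab, newline, etc.) <=> [[:space:]]

　　　　　　　　[A-Z] <=> [[:upper:]]

　　　　　　　　[a-z] <=> [[:lower:]]

　　　　　　　　punctuation <=> [[:punct:]]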
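
As a rough illustration of the character classes listed above, the sketch below filters a few made-up lines. The sample text and variable names are assumptions for this example only.

```shell
# Hypothetical sample input, invented for this example.
sample='abc
ABC
123
a1!
 indented'

# Lines made up entirely of lowercase letters.
lower_only=$(printf '%s\n' "$sample" | grep '^[[:lower:]]*$')
echo "$lower_only"

# Lines containing at least one digit.
has_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')
echo "$has_digit"

# Lines containing a character that is neither a letter nor a digit.
non_alnum=$(printf '%s\n' "$sample" | grep '[^[:alnum:]]')
echo "$non_alnum"

# Lines that begin with a whitespace character.
leading_space=$(printf '%s\n' "$sample" | grep '^[[:space:]]')
echo "$leading_space"
```

Note that a class name such as [:digit:] is only valid inside a bracket expression, which is why the patterns use double brackets.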

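　　　　Repetition:

　　　　　　\{m,n\} : matches the preceding item at least m and at most n times

　　　　　　\? : matches the preceding item zero or one time, equivalent to \{0,1\}

　　　　　　* : matches the preceding item any number of times, equivalent to \{0,\}, so ".*" matches any sequence of characters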
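
The repetition operators can be compared side by side with a small sketch. The sample lines and variable names are hypothetical; GNU grep is assumed, since \? in a basic regular expression is a GNU extension.

```shell
# Hypothetical sample input: "a", then zero to four "b", then "c".
sample='ac
abc
abbc
abbbbc'

# \{2,3\}: "b" must appear two or three times between "a" and "c".
two_to_three=$(printf '%s\n' "$sample" | grep 'ab\{2,3\}c')
echo "$two_to_three"

# \?: "b" may appear zero times or once.
zero_or_one=$(printf '%s\n' "$sample" | grep 'ab\?c')
echo "$zero_or_one"

# *: "b" may appear any number of times, including none, so every line matches.
any_number=$(printf '%s\n' "$sample" | grep -c 'ab*c')
echo "$any_number"
```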

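　　　　Position anchors:

　　　　　　^ : anchors to the start of the line

　　　　　　$ : anchors to the end of the line. A common trick: "^$" matches empty lines

　　　　　　\b or \< : anchors to the start of a word. For example, "\blike" does not match alike, but does match liker

　　　　　　\b or \> : anchors to the end of a word. For example, "\blike\b" matches neither alike nor liker, only like

　　　　　　\B : the opposite of \b; matches only where there is no word boundary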
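
A minimal sketch of the anchors, using a made-up word list that includes one empty line. The sample text and variable names are assumptions; the word anchors \<, \>, and \B are used as GNU grep supports them.

```shell
# Hypothetical sample input; the fifth line is intentionally empty.
sample='like
alike
liker
I like it

unlikely'

# ^ anchors the match to the start of the line.
line_start=$(printf '%s\n' "$sample" | grep '^like')
echo "$line_start"

# \< requires "like" to begin a word: "alike" and "unlikely" are excluded.
word_start=$(printf '%s\n' "$sample" | grep '\<like')
echo "$word_start"

# \< and \> together require "like" to be a whole word.
whole_word=$(printf '%s\n' "$sample" | grep '\<like\>')
echo "$whole_word"

# \B on both sides requires "like" to sit strictly inside a longer word.
inside_word=$(printf '%s\n' "$sample" | grep '\Blike\B')
echo "$inside_word"

# "^$" matches empty lines; -c counts them.
empty_lines=$(printf '%s\n' "$sample" | grep -c '^$')
echo "$empty_lines"
```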

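　　　　Grouping and back-references:

　　　　　　\(string\) : treats string as a single unit so that it can be referenced later

　　　　　　\n : refers to the text matched by the n-th left parenthesis and its corresponding right parenthesis (n is a digit, as in \1)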
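
Grouping and back-references in basic regular expressions might be exercised as follows. The sample words and variable names are made up for illustration.

```shell
# Hypothetical sample input, invented for this example.
sample='abab
abba
hello
word'

# \(ab\)\{2\}: the group "ab" is repeated twice as a single unit.
repeated_group=$(printf '%s\n' "$sample" | grep '\(ab\)\{2\}')
echo "$repeated_group"

# \(.\)\1: any character immediately followed by the same character.
doubled=$(printf '%s\n' "$sample" | grep '\(.\)\1')
echo "$doubled"

# \(.\)\(.\)\2\1: two characters followed by the same two in reverse order.
mirrored=$(printf '%s\n' "$sample" | grep '\(.\)\(.\)\2\1')
echo "$mirrored"
```

A back-reference matches the exact text its group captured, not the group's pattern, which is why \(.\)\1 finds doubled letters rather than any two characters.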

3. Extended regular expressions (ERE):

　　Matching characters: same as in basic regular expressions

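　　Repetition:

　　　　* : same as in basic regular expressions

　　　　? : same as \? in basic regular expressions, but without the backslash

　　　　{m,n} : same as \{m,n\} in basic regular expressions, but without the backslashes

　　　　+ : matches the preceding item at least once, equivalent to {1,}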

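　　Position anchors: same as in basic regular expressions

　　Grouping and back-references:

　　　　(string) : same as \(string\) in basic regular expressions, but without the backslashes

　　　　\n : same as in basic regular expressions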

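　　Alternation:

　　　　a|b : matches a or b. Note that a stands for the entire expression to the left of |, and b for the entire expression to the right. For example, C|cat means C or cat, not Cat or cat; to match Cat or cat, write (C|c)at. Besides back-references, (string) is therefore also used for grouping.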
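
The difference between C|cat and (C|c)at can be checked with a sketch along these lines. The sample lines and variable names are hypothetical, and -x (match the whole line, not covered above) is used only to make the contrast visible.

```shell
# Hypothetical sample input, invented for this example.
sample='C
cat
Cat
dog
caat'

# C|cat: the whole line must be either "C" or "cat".
c_or_cat=$(printf '%s\n' "$sample" | grep -xE 'C|cat')
echo "$c_or_cat"

# (C|c)at: the alternation is limited to the first letter.
grouped=$(printf '%s\n' "$sample" | grep -xE '(C|c)at')
echo "$grouped"

# +: one or more "a" between "c" and "t".
one_or_more=$(printf '%s\n' "$sample" | grep -E 'ca+t')
echo "$one_or_more"

# {2}: exactly two "a", written without backslashes in ERE.
exactly_two=$(printf '%s\n' "$sample" | grep -E 'ca{2}t')
echo "$exactly_two"
```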

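Note 1: By default, regular expression matching is greedy, meaning it matches as much text as possible. For example, if a line contains the string abacb and the pattern is "a.*b", the match is the whole string abacb, not just ab or acb.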
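
The greedy behaviour in Note 1 can be observed with -o, which prints exactly what was matched. The second pattern is one possible workaround (a negated bracket expression), shown here as an illustration rather than a general rule; the variable names are made up.

```shell
# "a.*b" is greedy: it extends to the last "b" on the line.
greedy=$(printf '%s\n' 'abacb' | grep -o 'a.*b')
echo "$greedy"

# "a[^b]*b" cannot cross a "b", so each match stops at the first "b" it meets.
shortest=$(printf '%s\n' 'abacb' | grep -o 'a[^b]*b')
echo "$shortest"
```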

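Note 2: Regular expression metacharacters such as [ and * (and ( in extended regular expressions) have a special meaning. To search for a literal *, rather than having it interpreted as "repeat the preceding character any number of times", escape it as \*.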
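
A short sketch of escaping, using made-up sample lines and variable names. The last command uses -F (the fgrep behaviour named at the top), which treats the whole pattern as a fixed string so that no escaping is needed.

```shell
# Hypothetical sample input, invented for this example.
sample='2*3=6
2223
a.b
axb'

# Unescaped, "2*3" means "zero or more 2s followed by 3".
star_as_repeat=$(printf '%s\n' "$sample" | grep '2*3')
echo "$star_as_repeat"

# Escaped, "\*" matches a literal asterisk.
literal_star=$(printf '%s\n' "$sample" | grep '2\*3')
echo "$literal_star"

# Unescaped, "." matches any single character.
dot_as_any=$(printf '%s\n' "$sample" | grep 'a.b')
echo "$dot_as_any"

# -F treats the pattern as a fixed string, so "." is literal.
fixed_string=$(printf '%s\n' "$sample" | grep -F 'a.b')
echo "$fixed_string"
```

Single quotes around the pattern keep the shell from interpreting the backslash or the asterisk before grep sees them.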

posted on 2018-11-30 11:54 _Joshua
