Linux Bash Text Processing: sed, Part 1

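sed is one of the most capable text-processing tools on Linux. It is powerful and flexible, and well worth learning properly.

sed is a stream editor for filtering and transforming text.

Normally sed reads the line being processed into a temporary buffer (the pattern space), applies the given commands to it, and then writes the buffer to standard output. The -n option suppresses this automatic printing. By default none of this touches the input file; its contents stay unchanged.

If you really do want the result written back to the file, use the -i option, which edits the file in place. Use it with care!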
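As a cautious illustration of the difference, the sketch below works on a scratch file (the name demo.txt and the .bak suffix are made up for this example). It assumes GNU sed, where a suffix attached to -i keeps a backup copy of the original.

```shell
# Scratch directory so that no real file is modified (names are illustrative).
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/demo.txt"

# Without -i the result only goes to standard output; the file is unchanged.
preview=$(sed 's/alpha/ALPHA/' "$tmpdir/demo.txt")
unchanged=$(cat "$tmpdir/demo.txt")

# With -i.bak the file is edited in place and the original is kept as demo.txt.bak.
sed -i.bak 's/alpha/ALPHA/' "$tmpdir/demo.txt"
edited=$(cat "$tmpdir/demo.txt")
backup=$(cat "$tmpdir/demo.txt.bak")
```

Keeping a backup suffix is a cheap safety net while a script is still being tested.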

Invocation

  • Script given on the command line
sed -e 'command' input_file
  • Script read from a file
sed -f script_file input_file
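The two invocation styles are interchangeable. A small sketch, using made-up file names (input.txt, script.sed) in a scratch directory, runs the same two commands both ways:

```shell
# Scratch files; the names are only for illustration.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/input.txt"

# A script file holds one sed command per line.
printf '2d\ns/one/ONE/\n' > "$tmpdir/script.sed"

# The same two commands, once via -e and once via -f.
from_cli=$(sed -e '2d' -e 's/one/ONE/' "$tmpdir/input.txt")
from_file=$(sed -f "$tmpdir/script.sed" "$tmpdir/input.txt")
```

A script file tends to pay off once the commands no longer fit comfortably on one command line.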

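The examples below illustrate the meaning and usage of common sed options and commands. Unless stated otherwise, sed means sed (GNU sed) 4.2.2 throughout.

First, create the sample text.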

cv@cv: ~/myfiles$ touch test.txt
cv@cv: ~/myfiles$ man sed | head -n 30 | tail -n 28 > test.txt
cv@cv: ~/myfiles$ cat test.txt
 1 NAME
 2        sed - stream editor for filtering and transforming text
 3 
 4 SYNOPSIS
 5        sed [OPTION]... {script-only-if-no-other-script} [input-file]...
 6 
 7 DESCRIPTION
 8        Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
 9        editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
10        in a pipeline which particularly distinguishes it from other types of editors.
11 
12        -n, --quiet, --silent
13 
14               suppress automatic printing of pattern space
15 
16        -e script, --expression=script
17 
18               add the script to the commands to be executed
19 
20        -f script-file, --file=script-file
21 
22               add the contents of script-file to the commands to be executed
23 
24        --follow-symlinks
25 
26               follow symlinks when processing in place
27 
28        -i[SUFFIX], --in-place[=SUFFIX]

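The experiments below all operate on this text. A note on how things are shown:

Standalone entries such as d Delete pattern... describe what a command does and are quoted from the sed manual.

Items such as -e are command-line options. They do not appear inside the quoted script and must be placed before it.

Items such as d are sed commands and are only valid inside 'command'.

Deleting

d deletes the contents of the pattern space and starts the next cycle. number matches a single line by its line number, and regexp is a regular expression that selects the lines matching it.

^ is the regular-expression anchor for the start of a line. For example, ^# matches lines that begin with #.

Example 1 deletes line 2.

Example 2 deletes lines 2 through 10, inclusive.

Example 3 deletes empty lines. Note that it only removes truly empty lines, not lines that consist of spaces, tabs, or other whitespace.

Example 4 uses the dot, which matches any single character (including spaces and tabs), together with the exclamation mark, which negates the address. Together they select lines that contain no character at all, so this command also deletes every truly empty line and is equivalent to example 3.

Example 5 shows how to delete only some of the empty lines, here just those among the first four lines.

Example 6 replaces the leading whitespace of lines 6 through 10 with a single tab. Because * also matches the empty string, a line in that range with no leading whitespace gets a tab prepended as well.

Example 7 deletes everything from the first line matching Sed through the next later line matching pipeline. The ending pattern is only tested on lines after the starting line, so the range does not close on the Sed line itself even though that line also contains pipeline.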

d          Delete pattern space.  Start next cycle.
number     Match only the specified line number (which increments cumulatively across files, unless the -s option is specified on the command line).
/regexp/   Match lines matching the regular expression regexp.
cv@cv:~/myfiles$ sed '2d' test.txt  #example-1

cv@cv:~/myfiles$ sed '2,10d' test.txt  #example-2

cv@cv:~/myfiles$ sed '/^$/d' test.txt  #example-3

cv@cv:~/myfiles$ sed '/./!d' test.txt        #example-4

cv@cv:~/myfiles$ sed '1,4{/^$/d}' test.txt  #example-5

cv@cv:~/myfiles$ sed '6,10s/^[[:space:]]*/\t/g' test.txt  #example-6

cv@cv:~/myfiles$ sed '/Sed/,/pipeline/d' test.txt    #example-7
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       -e script, --expression=script
              add the script to the commands to be executed
       -f script-file, --file=script-file
              add the contents of script-file to the commands to be executed
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]
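Examples 3 and 4 leave whitespace-only lines in place. If those should go too, one common variant (not part of the original examples) is to allow any amount of whitespace between the anchors. A minimal sketch on a made-up four-line input:

```shell
# Four lines: "a", an empty line, a line holding a space and a tab, and "b".
input=$(printf 'a\n\n \t\nb')

# /^$/d removes only the truly empty line; the whitespace-only line survives.
only_empty=$(printf '%s\n' "$input" | sed '/^$/d')

# Allowing [[:space:]]* between ^ and $ removes whitespace-only lines as well.
also_blank=$(printf '%s\n' "$input" | sed '/^[[:space:]]*$/d')
```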

With this in hand, we can apply an edit to the file itself and look at the result.

-i[SUFFIX], --in-place[=SUFFIX]    edit files in place (makes backup if SUFFIX supplied)
cv@cv:~/myfiles$ sed -i '1,6{/^$/d};12,${/^$/d}' test.txt
cv@cv:~/myfiles$ cat test.txt
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
       in a pipeline which particularly distinguishes it from other types of editors.

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       -e script, --expression=script
              add the script to the commands to be executed
       -f script-file, --file=script-file
              add the contents of script-file to the commands to be executed
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]

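Printing

By default sed prints the pattern space at the end of every cycle, so every input line appears on standard output. If a print command also applies to a line, for instance because the line matches a given pattern, that line is printed a second time.

The -n option turns off the automatic printing, and the p command prints the pattern space explicitly. Used together, they print each selected line exactly once.

Example 1 prints lines 3 through 5.

Example 2 prints the line numbers of all matching lines.

Example 3 prints the contents of all matching lines.

Example 4 prints both the line number and the contents.

Example 5 prints from line 2 through the first line that matches the pattern.

Example 6 prints every line that begins with three uppercase letters. ^ anchors the match to the start of the line, [A-Z] matches one uppercase letter, and \{3\} requires three of them in a row, so together they mean "begins with three uppercase letters".

Example 7 prints every line that is at least 60 characters long.

Example 8 is the opposite: it prints every line shorter than 60 characters.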

-e script, --expression=script    add the script to the commands to be executed
-n, --quiet, --silent    suppress automatic printing of pattern space
=    Print the current line number.
p    Print the current pattern space.
cv@cv:~/myfiles$ sed -n '3,5p' test.txt  #example-1
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION

cv@cv:~/myfiles$ sed -n '/[sS]ed/=' test.txt  #example-2

cv@cv:~/myfiles$ sed -n '/[sS]ed/p' test.txt  #example-3

cv@cv:~/myfiles$ sed -n -e '/[sS]ed/=' -e '/[sS]ed/p' test.txt  #example-4
2
       sed - stream editor for filtering and transforming text
4
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
6
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
7
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text

cv@cv:~/myfiles$ sed -n '2,/quiet/p' test.txt  #example-5
    sed - stream editor for filtering and transforming text
SYNOPSIS
    sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION
    Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an
    editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text
    in a pipeline which particularly distinguishes it from other types of editors.

    -n, --quiet, --silent

cv@cv:~/myfiles$ sed -n '/^[A-Z]\{3\}/p' test.txt      #example-6
NAME
SYNOPSIS
DESCRIPTION

cv@cv:~/myfiles$ sed -n '/^.\{60\}/p' test.txt    #example-7
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
       in a pipeline which particularly distinguishes it from other types of editors.
              add the contents of script-file to the commands to be executed

cv@cv:~/myfiles$ sed -n '/^.\{60\}/!p' test.txt    #example-8
NAME
SYNOPSIS
DESCRIPTION

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       -e script, --expression=script
              add the script to the commands to be executed
       -f script-file, --file=script-file
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]
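The length tests in examples 7 and 8 are easier to check on a tiny input. The sketch below scales the threshold down from 60 to 5 characters; the sample lines are made up.

```shell
lines=$(printf 'abc\nabcde\nabcdefg')

# ^.\{5\} needs five characters from the start, so it selects lines of length >= 5.
at_least_5=$(printf '%s\n' "$lines" | sed -n '/^.\{5\}/p')

# Negating the address selects the lines shorter than 5.
shorter_than_5=$(printf '%s\n' "$lines" | sed -n '/^.\{5\}/!p')

# "More than 5" therefore needs a count of 6.
more_than_5=$(printf '%s\n' "$lines" | sed -n '/^.\{6\}/p')
```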

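You can also select odd or even lines, or lines at any fixed interval. first is the starting line and step is the stride, as in the examples below.

Example 1 starts at line 1 and prints every second line, that is, all odd-numbered lines.

Example 2 makes it clearer which lines are printed by adding = to show the line numbers.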

first~step    Match  every step th line starting with line first.  For example, ``sed -n 1~2p'' will print
              all the odd-numbered lines in the input stream, and the address 2~5 will match every fifth line,
              starting with the second. first can be zero; in this case, sed operates as if it were equal to
              step. (This is an extension.)
cv@cv:~/myfiles$ sed -n '1~2p' test.txt      #example-1

cv@cv:~/myfiles$ sed -n '1~2=;1~2p' test.txt   #example-2
1
NAME
3
SYNOPSIS
5
DESCRIPTION
7
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
9

11
              suppress automatic printing of pattern space
13
              add the script to the commands to be executed
15
              add the contents of script-file to the commands to be executed
17
              follow symlinks when processing in place
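The first~step address is a GNU extension, as the manual excerpt says. A quick sketch on the numbers 1 to 10 (generated with seq, and flattened onto one line with paste purely for readability) shows the three cases the manual mentions:

```shell
# Odd-numbered lines: start at 1, step 2.
odd=$(seq 10 | sed -n '1~2p' | paste -sd ' ' -)

# Every fifth line starting with the second.
from_two=$(seq 10 | sed -n '2~5p' | paste -sd ' ' -)

# first = 0 behaves as if it were equal to step: lines 3, 6, 9.
multiples_of_3=$(seq 10 | sed -n '0~3p' | paste -sd ' ' -)
```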

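Adding one or more lines

Example 1 appends "my name is lee" after every line.

Example 2 appends it only after line 2.

Example 3 appends it only after the last line, like o in vim, which opens a line below.

Example 4 inserts it before the last line, like O in vim, which opens a line above.

Example 5 appends a line after every line that matches the pattern.

Examples 6 and 7 append an empty line after each line, which double-spaces the text; two consecutive G commands append two empty lines. This works because the hold space is empty. Since an extra empty line after the last line would only clutter the output, the $! prefix excludes the last line.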

a text    Append text, which has each embedded newline preceded by a backslash.
i text    Insert text, which has each embedded newline preceded by a backslash.
$         Match the last line.
G         Copy/append hold space to pattern space.
cv@cv: ~/myfiles$ sed 'a my name is lee' test.txt   #example-1

cv@cv: ~/myfiles$ sed '2a my name is lee' test.txt  #example-2

cv@cv: ~/myfiles$ sed '$a my name is lee' test.txt  #example-3

cv@cv: ~/myfiles$ sed '$i my name is lee' test.txt  #example-4

cv@cv:~/myfiles$ sed '/quiet/a\ inserted line' test.txt    #example-5
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
       in a pipeline which particularly distinguishes it from other types of editors.

       -n, --quiet, --silent
 inserted line

cv@cv:~/myfiles$ sed '$!G' test.txt        #example-6
cv@cv:~/myfiles$ sed '$!G;$!G' test.txt     #example-7
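Since the outputs of most of these commands are not shown, here is a compact sketch on a three-line input from seq. It assumes GNU sed, which accepts the one-line a text and i text forms used throughout this section.

```shell
# a appends after the addressed line.
appended=$(seq 3 | sed '2a inserted')

# i inserts before the addressed line.
inserted=$(seq 3 | sed '1i inserted')

# $!G appends the (empty) hold space to every line except the last,
# which double-spaces the text without a trailing empty line.
double_spaced=$(seq 3 | sed '$!G')
```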

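Replacing

See the examples below.

Example 1 replaces line 2 with "my name is lee".

Example 2 replaces lines 2 through the last line with a single "my name is lee"; in other words, it deletes the selected lines and inserts the given line once.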

c text    Replace the selected lines with text, which has each embedded newline preceded by a backslash.
cv@cv:~/myfiles$ sed '2c my name is lee' test.txt      #example-1
cv@cv:~/myfiles$ sed '2,$c my name is lee' test.txt     #example-2
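To make the range behaviour concrete, the sketch below contrasts c on a range with a per-line substitution. It assumes GNU sed in its default (non-POSIX) mode, which emits the replacement text once for the whole range.

```shell
# c on a range: the three selected lines collapse into one replacement line.
whole_range=$(seq 5 | sed '2,4c replaced')

# For comparison, s/.*/.../ rewrites each selected line separately.
per_line=$(seq 5 | sed '2,4s/.*/replaced/')
```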

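Besides replacing whole lines as above, sed offers a far more powerful pattern-based substitution.

Format: [address[,address]]s/pattern-find/replacement-pattern/[g,p,w,n]

  g  replaces every match in the pattern space. By default only the first match is replaced.

  p  prints the pattern space if a substitution was made; printing was explained above.

  w  is followed by a file name and writes the pattern space to that file whenever a substitution was made.

  n  stands for a number in [1, 512] and replaces only the n-th match in the pattern space.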
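The effect of the flags is easiest to see on a single made-up string. The last line of the sketch combines a number with g, which GNU sed treats as "replace the n-th match and every one after it"; other implementations may behave differently.

```shell
text='a-a-a-a'

first_only=$(echo "$text" | sed 's/a/X/')     # default: first match only
every=$(echo "$text" | sed 's/a/X/g')         # g: every match
second_only=$(echo "$text" | sed 's/a/X/2')   # number: only the 2nd match
from_second=$(echo "$text" | sed 's/a/X/2g')  # GNU sed: 2nd match onwards
```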

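Example 1 replaces sed or Sed with SED, but only the first match on each line, which is the default.

Example 2 adds the g flag. Compared with the result above, every match on every line is replaced, not just the first one. Note that used on line 6 is left unchanged by example 1.

Example 3 uses the p flag to print the changed lines. On its own this is not enough, because automatic printing is still on and every changed line shows up twice, which is why p is usually combined with the -n option.

Example 4 does exactly that with -n (--quiet or --silent) and shows only the changed lines.

Example 5 replaces sed or Sed with SED on lines 2 through 4. Here the p and gp forms give identical output, but only because each of those lines contains at most one match.

Example 6 separates multiple commands with ';' and uses # as the delimiter in place of the usual '/'.

Example 7 writes the changed lines to a given file.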

s/regexp/replacement/    Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement.
                         The replacement may contain the special character & to refer to that portion of the pattern space which matched,
                         and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
cv@cv:~/myfiles$ sed 's/[sS]ed/\U&/' test.txt   #example-1

cv@cv:~/myfiles$ sed 's/[sS]ed/\U&/g' test.txt  #example-2

cv@cv:~/myfiles$ sed 's/[sS]ed/\U&/p' test.txt  #example-3

cv@cv:~/myfiles$ sed -n 's/[sS]ed/\U&/p' test.txt    #example-4

cv@cv:~/myfiles$ sed -n '2,4s/[sS]ed/\U&/p' test.txt  #example-5
cv@cv:~/myfiles$ sed -n '2,4s/[sS]ed/\U&/gp' test.txt

cv@cv:~/myfiles$ sed -n '2,4s/[sS]ed/\U&/gp;6,7s#editor#EdItOr#gp' test.txt  #example-6

cv@cv:~/myfiles$ touch output    # example-7
cv@cv:~/myfiles$ sed 's/[sS]ed/SED/gw output'  test.txt
NAME
       SED - stream editor for filtering and transforming text
SYNOPSIS
       SED [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION
       SED  is  a  stream  editor.  A stream editor is uSED to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), SED works by making only one pass over the input(s), and is consequently more efficient.  But it is SED's ability to filter text
       in a pipeline which particularly distinguishes it from other types of editors.

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       -e script, --expression=script
              add the script to the commands to be executed
       -f script-file, --file=script-file
              add the contents of script-file to the commands to be executed
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]

cv@cv:~/myfiles$ cat output
       SED - stream editor for filtering and transforming text
       SED [OPTION]... {script-only-if-no-other-script} [input-file]...
       SED  is  a  stream  editor.  A stream editor is uSED to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), SED works by making only one pass over the input(s), and is consequently more efficient.  But it is SED's ability to filter text
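An alternative delimiter is most useful when the pattern itself contains slashes, for example a path. The sketch below uses made-up strings; \U is a GNU extension, as in the examples above.

```shell
# With # as the delimiter, the slashes in the path need no escaping.
moved=$(echo '/usr/local/bin' | sed 's#/usr/local#/opt#')

# The same substitution with / as the delimiter must escape every slash.
moved_escaped=$(echo '/usr/local/bin' | sed 's/\/usr\/local/\/opt/')

# \U& upper-cases the matched text; without g only the first match changes.
upper_first=$(echo 'hello world' | sed 's/[a-z][a-z]*/\U&/')
```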

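Example 1 uses the I flag to make the match case-insensitive, which can replace the [] expressions in the examples above.

Example 2 uses a regular expression as the address: on every line that matches editor, sed is replaced with SED.

Example 3 does the opposite: it acts on lines that do not match editor, and replaces sed with SED on those that contain it.

Example 4 looks for a line containing sed, then uses n to replace the pattern space with the next input line, runs the substitution on that new line, and prints it if something was replaced before moving on.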

cv@cv:~/myfiles$ sed -n 's/sed/SED/Ip' test.txt    #example-1
       SED - stream editor for filtering and transforming text
       SED [OPTION]... {script-only-if-no-other-script} [input-file]...
       SED  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), SED works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text

cv@cv:~/myfiles$ sed -n '/editor/s/sed/SED/p' test.txt    #example-2
       SED - stream editor for filtering and transforming text
       Sed  is  a  stream  editor.  A stream editor is uSED to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), SED works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text

cv@cv:~/myfiles$ sed -n '/editor/!s/sed/SED/p' test.txt    #example-3
       SED [OPTION]... {script-only-if-no-other-script} [input-file]...

cv@cv:~/myfiles$ sed -n '/sed/{n;s/sed/SED/p;}' test.txt    #example-4
        editor which permits scripted edits (such as ed), SED works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text

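A line matched by one pattern can also be used to replace lines matched by another. Here

 h  copies the pattern space to the hold space, much like copying to the clipboard on Windows;

 g  copies the hold space back into the pattern space, overwriting the matched line. This g command is not the same thing as the g flag of the s command tested above, so take care not to confuse them.

The processing flow is roughly as follows:

  1. Open the input stream;
  2. Read one line from the stream into the pattern space;
  3. Run the commands on the pattern space and hold space;
  4. Print the contents of the pattern space (unless -n is given);
  5. Clear the pattern space;
  6. Check whether the end of input (EOF) has been reached;
  7. If not, read the next line and go back to step 2;
  8. If so, leave the loop and finish.

Example 1 finds the line containing NAME and uses it to replace every line containing pipeline.

Example 2 can be traced with the flow above.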

h H    Copy/append pattern space to hold space.
g G    Copy/append hold space to pattern space.
cv@cv:~/myfiles$ sed -e '/NAME/h' -e '/pipeline/g' test.txt  #example-1
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION
NAME
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
NAME

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       -e script, --expression=script
              add the script to the commands to be executed
       -f script-file, --file=script-file
              add the contents of script-file to the commands to be executed
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]
cv@cv:~/myfiles$ seq 5 | sed 'H;g'    #example-2

1

1
2

1
2
3

1
2
3
4

1
2
3
4
5
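One way to convince yourself of the output of example 2 is to replay the cycle by hand. The function below is a hypothetical helper written for this post, not part of sed; it mimics the script 'H;g' with two shell variables standing in for the pattern space and the hold space.

```shell
# Hypothetical helper: replays the sed cycle by hand for the script 'H;g'.
emulate_H_g() {
  nl='
'
  hold=''                            # the hold space starts out empty
  while IFS= read -r pattern; do     # step 2: read one line into the pattern space
    hold="${hold}${nl}${pattern}"    # H: append a newline plus the pattern space to the hold space
    pattern="$hold"                  # g: overwrite the pattern space with the hold space
    printf '%s\n' "$pattern"         # step 4: automatic print of the pattern space
  done                               # steps 6 to 8: loop until end of input
}

by_hand=$(seq 3 | emulate_H_g)
by_sed=$(seq 3 | sed 'H;g')
```

Because H always appends a newline first, the hold space begins with an empty line, which is where the empty line at the top of every group in the output comes from.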

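Transliterating

y/source/dest/    Transliterate the characters in the pattern space which appear in source to the corresponding character in dest.

y maps each character in source to the character at the same position in dest, strictly one character to one character. It is not used often; when you do use it, watch which lines it applies to. See the example below.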

cv@cv:~/myfiles$ sed 'y/Ssed/XXYZ/' test.txt
NAME
       XYZ - XtrYam YZitor for filtYring anZ tranXforming tYxt
XYNOPXIX
       XYZ [OPTION]... {Xcript-only-if-no-othYr-Xcript} [input-filY]...
DEXCRIPTION
       XYZ  iX  a  XtrYam  YZitor.  A XtrYam YZitor iX uXYZ to pYrform baXic tYxt tranXformationX on an input XtrYam (a filY or input from a pipYlinY).  WhilY in XomY wayX Ximilar to an
       YZitor which pYrmitX XcriptYZ YZitX (Xuch aX YZ), XYZ workX by making only onY paXX ovYr thY input(X), anZ iX conXYquYntly morY YfficiYnt.  But it iX XYZ'X ability to filtYr tYxt
       in a pipYlinY which particularly ZiXtinguiXhYX it from othYr typYX of YZitorX.

       -n, --quiYt, --XilYnt
              XupprYXX automatic printing of pattYrn XpacY
       -Y Xcript, --YxprYXXion=Xcript
              aZZ thY Xcript to thY commanZX to bY YxYcutYZ
       -f Xcript-filY, --filY=Xcript-filY
              aZZ thY contYntX of Xcript-filY to thY commanZX to bY YxYcutYZ
       --follow-XymlinkX
              follow XymlinkX whYn procYXXing in placY
       -i[XUFFIX], --in-placY[=XUFFIX]
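A smaller sketch of the same idea, on a made-up string: each vowel is mapped to its uppercase form, position by position. Note that source and dest must have the same length.

```shell
# a->A, e->E, i->I, o->O, u->U; all other characters are left alone.
vowels_up=$(echo 'stream editor' | sed 'y/aeiou/AEIOU/')

# An address limits the effect to the selected lines only.
second_line_only=$(printf 'sed\nsed\n' | sed '2y/sed/SED/')
```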

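\b  matches a word boundary, that is, the start or end of a word. Given the word hello, \bhe matches because h is at the left boundary, and llo\b matches because o is at the right boundary.

Likewise, \B matches anywhere that is not a word boundary. \Bell, ell\B and \Bell\B all match inside hello, whereas \Bhe and llo\B do not, because h and o sit on the word boundaries.

In example 1 the \U& in the replacement converts the matched text to uppercase. \btw matches tw at the start of a word.

In example 2, tw\b matches tw at the end of a word.

Example 3 requires a boundary on both sides, so it matches only the standalone word tw. Words such as twtw do not match, because there is no boundary right after the first tw or right before the last one.

In example 4, \Btw matches tw that is not at the start of a word; it may be in the middle or at the end.

In example 5, tw\B matches tw that is not at the end of a word.

Example 6 matches tw only strictly inside a word, neither at its start nor at its end.

Example 7 matches tw at the start of a word that continues past it, so the standalone word tw does not qualify.

Example 8 matches tw at the end of a word that has other word characters before it.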

\b    matches the empty string at the edge of a word
\B    matches the empty string provided it's not at the edge of a word
cv@cv:~/myfiles$ echo "one two three btw is the abbr of by the way whether twher is meaningful? SHA random code twfdoetw tw wsr239wfgrte seeyoutwtommorrow" >> test.txt

cv@cv:~/myfiles$ sed -n 's/\btw/\U&/gp' test.txt    #example-1
one TWo three btw is the abbr of by the way whether TWher is meaningful? SHA random code TWfdoetw TW wsr239wfgrte seeyoutwtommorrow

cv@cv:~/myfiles$ sed -n 's/tw\b/\U&/gp' test.txt    #example-2
one two three bTW is the abbr of by the way whether twher is meaningful? SHA random code twfdoeTW TW wsr239wfgrte seeyoutwtommorrow

cv@cv:~/myfiles$ sed -n 's/\btw\b/\U&/gp' test.txt   #example-3
one two three btw is the abbr of by the way whether twher is meaningful? SHA random code twfdoetw TW wsr239wfgrte seeyoutwtommorrow

cv@cv:~/myfiles$ sed -n 's/\Btw/\U&/gp' test.txt    #example-4
one two three bTW is the abbr of by the way whether twher is meaningful? SHA random code twfdoeTW tw wsr239wfgrte seeyouTWtommorrow

cv@cv:~/myfiles$ sed -n 's/tw\B/\U&/gp' test.txt    #example-5
one TWo three btw is the abbr of by the way whether TWher is meaningful? SHA random code TWfdoetw tw wsr239wfgrte seeyouTWtommorrow

cv@cv:~/myfiles$ sed -n 's/\Btw\B/\U&/gp' test.txt   #example-6
one two three btw is the abbr of by the way whether twher is meaningful? SHA random code twfdoetw tw wsr239wfgrte seeyouTWtommorrow

cv@cv:~/myfiles$ sed -n 's/\btw\B/\U&/gp' test.txt   #example-7
one TWo three btw is the abbr of by the way whether TWher is meaningful? SHA random code TWfdoetw tw wsr239wfgrte seeyoutwtommorrow

cv@cv:~/myfiles$ sed -n 's/\Btw\b/\U&/gp' test.txt   #example-8
one two three bTW is the abbr of by the way whether twher is meaningful? SHA random code twfdoeTW tw wsr239wfgrte seeyoutwtommorrow

cv@cv:~/myfiles$ sed -i '$d' test.txt          # delete the last line
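The boundary rules can be checked on a shorter made-up word list, with brackets marking each match. The sketch assumes GNU sed, where \b and \B are available.

```shell
words='tw two btw atwb twtw'

# Boundary on both sides: only the standalone word tw.
whole_word=$(echo "$words" | sed 's/\btw\b/[&]/g')

# tw at the start of a word.
word_start=$(echo "$words" | sed 's/\btw/[&]/g')

# tw strictly inside a word.
inside=$(echo "$words" | sed 's/\Btw\B/[&]/g')

# tw at the start of a word that continues past it.
start_not_end=$(echo "$words" | sed 's/\btw\B/[&]/g')
```

The last result shows that the first tw of twtw does satisfy \btw\B, while the standalone tw does not.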

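The & used here stands for the text that was just matched; it is valid in the replacement part of s.

Example 1 finds the matches on lines 1 through 4 and puts an equals sign on each side of the matched string.

In example 2 the pattern s inside the escaped parentheses is saved as group 1, and the replacement refers to it as \1. The command turns the ed of sed into ad while keeping the first letter.

In example 3 the \| inside the parentheses means alternation: either alternative may match, and the rest works as in example 2.

Example 4 likewise replaces sed or Sed with SED on lines 1 through 6; here too \| means alternation.

Example 5 deletes every line that contains either pattern; the parentheses group the alternatives.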

cv@cv:~/myfiles$ sed -n '1,4s/sed/=&=/p' test.txt    #example-1
       =sed= - stream editor for filtering and transforming text
       =sed= [OPTION]... {script-only-if-no-other-script} [input-file]...

cv@cv:~/myfiles$ sed -n '1,6s/\(s\)ed/\1ad/p' test.txt    #example-2
       sad - stream editor for filtering and transforming text
       sad [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is usad to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an

cv@cv:~/myfiles$ sed -n '1,6s/\(s\|S\)ed/\1ad/p' test.txt    #example-3
       sad - stream editor for filtering and transforming text
       sad [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sad  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an

cv@cv:~/myfiles$ sed -n '1,6s/sed\|Sed/SED/p' test.txt    #example-4
       SED - stream editor for filtering and transforming text
       SED [OPTION]... {script-only-if-no-other-script} [input-file]...
       SED  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an

cv@cv:~/myfiles$ sed '/\(file\|script\)/d' test.txt      #example-5
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
DESCRIPTION
       in a pipeline which particularly distinguishes it from other types of editors.

       -n, --quiet, --silent
              suppress automatic printing of pattern space
       --follow-symlinks
              follow symlinks when processing in place
       -i[SUFFIX], --in-place[=SUFFIX]
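A few more uses of &, groups, and alternation on made-up inputs may help. The sketch sticks to the basic-regex syntax used above; \| is a GNU extension in that syntax.

```shell
# Two groups, emitted in reverse order.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

# & stands for whatever the pattern matched, here the run of digits.
wrapped=$(echo 'port 8080' | sed 's/[0-9][0-9]*/[&]/')

# Alternation in an address: print lines containing cat or dog.
either=$(printf 'cat\ndog\nbird\n' | sed -n '/\(cat\|dog\)/p')
```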

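Joining two lines

Joining lines calls for the N command.

According to the manual excerpt below, N appends the next input line to the pattern space. You can think of it as reading two lines as one, with an embedded \n between them. sed normally works line by line; N tells it to pull the next line into the pattern space as well, without printing what is already there.

The n command, by contrast, prints the pattern space if automatic printing is on and then replaces it with the next input line. The commands after n are applied to the newly read line.

Uppercase P prints the pattern space up to the first embedded newline, that is, its first line.

Lowercase p prints the whole pattern space. Keep the two apart.

Example 1 uses N to read 1 and 2 into the pattern space together, and P then prints the first line, 1. Next it reads 3 and 4 and prints 3. Finally it reads 5, so why is nothing printed? When there is no next line to read, N makes sed exit without running the remaining commands, so the P is never executed. GNU sed would normally print the pattern space before exiting, but -n suppresses that, and 5 does not appear.

Example 2 differs in that it uses $!N, which skips N on the last line and goes straight on to P. P prints the first line of the pattern space, which is the 5 just read, so 5 is printed.

In example 3, when the last line 5 is read, N cannot fetch a next line and sed exits without running the substitution or the p. Because -n is given, the unprocessed line is not printed. Without -n, GNU sed would print it before exiting and 5 would still reach the screen.

In example 4, N is skipped on the last line, and the substitution fails because the pattern space contains no newline. The p command still runs, however, so 5 is printed.

Note that in examples 3 and 4 the lowercase p is a separate command after the ';' that ends the substitution, so it prints unconditionally. Written directly after the substitution as a flag, it prints only when a replacement was made, which gives the results of examples 5 and 6, where 5 is missing.

Example 7 joins lines 1 and 2, and lines 3 and 4, of the text into one line each, separated by a space.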

n    Read the next line of input into the pattern space.
N    Append the next line of input into the pattern space.
P    Print up to the first embedded newline of the current pattern space.(Capital)
p    Print the current pattern space.(Lowercase)
cv@cv:~/myfiles$ seq 5 | sed -n 'N;P'      #example-1
1
3

cv@cv:~/myfiles$ seq 5 | sed -n '$!N;P'        #example-2
1
3
5

cv@cv:~/myfiles$ seq 5 | sed -n 'N;s/\n/ /;p'  #example-3
1 2
3 4

cv@cv:~/myfiles$ seq 5 | sed -n '$!N;s/\n/ /;p' #example-4
1 2
3 4
5

cv@cv:~/myfiles$ seq 5 | sed -n 'N;s/\n/ /p'    #example-5
1 2
3 4

cv@cv:~/myfiles$ seq 5 | sed -n '$!N;s/\n/ /p'  #example-6
1 2
3 4

cv@cv:~/myfiles$ sed -n -e '$!N;1,4s/\n[[:space:]]*/ /p;' test.txt   #example-7
NAME sed - stream editor for filtering and transforming text
SYNOPSIS sed [OPTION]... {script-only-if-no-other-script} [input-file]...
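The difference between n and N, and what happens on an odd number of lines, can be sketched with seq. The third result relies on GNU sed's default behaviour of printing the pattern space when N finds no next line; with POSIXLY_CORRECT set, or in other implementations, the last line may be dropped instead.

```shell
# N appends the next line; the embedded newline is then replaced by "+".
pairs=$(seq 6 | sed 'N;s/\n/+/')

# n replaces the pattern space with the next line; with -n only p prints,
# so this prints the even-numbered lines.
evens=$(seq 6 | sed -n 'n;p')

# Odd line count, no -n: GNU sed prints the leftover line when N fails.
odd_tail=$(seq 3 | sed 'N;s/\n/+/')

# $!N gives the same result without depending on that behaviour.
odd_tail_safe=$(seq 3 | sed '$!N;s/\n/+/')
```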

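Other features and commands

The manual explains addresses, that is, which lines a command applies to, as follows. With no address, the command runs on every input line. With one address, it runs only on the lines matching that address. With two addresses, it runs on the inclusive range between them.

It also lists three things to note. First, the two addresses are separated by a comma. Second, the line matched by addr1 is always processed, even if addr2 selects an earlier line; and if addr2 is never found, so the end condition is never met, the range runs to the end of the input. Third, when addr2 is a regular expression, it is not tested against the line that addr1 matched: the search for the end of the range starts on the following line, so such a range does not end on its own first line.

In addition, a ! after the address makes the command skip the addressed lines and run on all the others.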

Addresses    
    Sed commands can be given with no addresses, in which case the command will be executed for all input lines;
    with one address, in which case the command will only be executed for input lines which match that address;
    or with two addresses, in which case the command will be executed for all input lines which match the inclusive range of lines starting from the first address and continuing to the second address.
    Three things to note about address ranges:
        >> the syntax is addr1,addr2 (i.e., the addresses are separated by a comma);
        >> the line which addr1 matched will always be accepted, even if addr2 selects an earlier line;
        >> and if addr2 is a regexp, it will not be tested against the line that addr1 matched.
    After the address (or address-range), and before the command, a ! may be inserted, which specifies that the command shall only be executed if the address (or address-range) does not match.

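The first form below matches the line at addr1 and the N lines after it, N+1 lines in total.

The second form matches from addr1 up to and including the next line whose line number is a multiple of N.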

addr1,+N    Will match addr1 and the N lines following addr1.
addr1,~N    Will match addr1 and the lines following addr1 until the next line whose input line number is a multiple of N.

As the examples below show, braces group a series of commands.

{    Begin a block of commands (end with }).
}    The closing bracket of a { } block.

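Example 1 prints every line containing sed or Sed.

Example 2 prints the lines containing sed or Sed among lines 2 through 6, inclusive.

Example 3 prints the matching lines among line 2 and the four lines after it.

Example 4 prints the matching lines from line 2 through the next line whose number is a multiple of 4, that is, lines 2 through 4.

Example 5 prints the matching lines from line 2 through the next line whose number is a multiple of 6, that is, lines 2 through 6.

Example 6 prints the matching lines outside lines 2 through 4.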

cv@cv:~/myfiles$ sed -n '/[sS]ed/p' test.txt    #example-1
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
cv@cv:~/myfiles$ sed -n '2,6{/[sS]ed/p}' test.txt    #example-2
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
cv@cv:~/myfiles$ sed -n '2,+4{/[sS]ed/p}' test.txt    #example-3
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
cv@cv:~/myfiles$ sed -n '2,~4{/[sS]ed/p}' test.txt    #example-4
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
cv@cv:~/myfiles$ sed -n '2,~6{/[sS]ed/p}' test.txt    #example-5
       sed - stream editor for filtering and transforming text
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
cv@cv:~/myfiles$ sed -n '2,4!{/[sS]ed/p}' test.txt    #example-6
       Sed  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
       editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient.  But it is sed's ability to filter text
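The address rules quoted above can be checked on numbered input. The sketch uses seq and a made-up four-line input; addr1,+N and addr1,~N are GNU extensions, and paste only flattens the output onto one line.

```shell
plus=$(seq 10 | sed -n '2,+3p' | paste -sd ' ' -)     # line 2 and the 3 lines after it
tilde=$(seq 10 | sed -n '2,~4p' | paste -sd ' ' -)    # line 2 up to the next multiple of 4
negated=$(seq 6 | sed -n '2,4!p' | paste -sd ' ' -)   # everything outside lines 2 to 4

# addr2 selects an earlier line: the addr1 line is still processed, alone.
backwards=$(seq 5 | sed -n '4,2p')

# addr2 is a regexp: it is not tested against the line addr1 matched,
# so the range runs from the first "a" through the second one.
regex_range=$(printf 'a\nb\na\nc\n' | sed -n '/a/,/a/p' | paste -sd ' ' -)
```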

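q  quits: when sed reaches it, the program stops and processes no more input.

Example 1 quits after line 5. With a line number in front, q outputs the first lines of a file, which is very handy for peeking at the start of a large file.

Example 2 looks for a line matching Sed, replaces Sed on it, prints the result, and then quits.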

q [exit-code]   Immediately quit the sed script without processing any more input, except that
                if auto-print is not disabled the current pattern space will be  printed.
                The exit code argument is a GNU extension.
cv@cv:~/myfiles$ sed '5q' test.txt    #example-1
NAME
       sed - stream editor for filtering and transforming text
SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
DESCRIPTION

cv@cv:~/myfiles$ sed -n '/Sed/{s/Sed/SAD/p;q}' test.txt    #example-2
       SAD  is  a  stream  editor.  A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline).  While in some ways similar to an
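A short sketch of q on numbered input. The exit-code argument is a GNU extension, as the manual excerpt notes; the value 7 is arbitrary and chosen only for this illustration.

```shell
# Quit after line 3; the automatic print still outputs line 3 itself.
first_three=$(seq 10 | sed 3q | paste -sd ' ' -)

# Print the first line matching the pattern, then stop reading input.
first_match=$(seq 10 | sed -n '/[3-9]/{p;q}')

# GNU extension: q may carry an exit code, visible in the shell afterwards.
status=0
seq 5 | sed -n '2q7' || status=$?
```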

The remaining material is left for the next post.


posted @ 2019-11-06 00:47  coffee_tea_or_me