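I have been reading about the Linux sed command recently. It has a pattern space and a hold space, and I found the editing commands h, x, and G hard to understand, so I dug into them. Here is what I worked out: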


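sed has two buffers:

    The one usually described: sed copies each line of the input into a buffer, processes that copy, and then prints it. This buffer is the pattern buffer (pattern space);

    The other is the hold buffer (hold space).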


Before the example, let's look at how sed works:

How sed Works

sed maintains two data buffers: 

    the active pattern space, and the auxiliary hold space. Both are initially empty.


sed operates by performing the following cycle on each line of input: 

    first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. 

    Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.

    When the end of the script is reached, unless the -n option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.


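Summarized, the cycle goes roughly like this (the trailing-newline part just says that sed strips the newline at the end of each line when reading it and adds it back when printing):

1. Read one line from the input stream into the pattern space;

2. Run each command whose address matches the line (a command with no address runs on every line);

3. Unless the -n option is given, print the contents of the pattern space to the output stream;

4. Repeat steps 1-3 for the next line.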
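The steps above can be imitated with an ordinary shell loop. This is only a rough sketch, assuming a POSIX shell; the function name sed_cycle_sketch and the script '/jerry/s/^/> /' are made up for illustration and are not how sed is implemented.

```shell
# Rough emulation of sed's cycle for the script '/jerry/s/^/> /'.
sed_cycle_sketch() {
    while IFS= read -r pattern_space; do     # 1. read a line; read drops the trailing newline
        case $pattern_space in               # 2. run the command only if the address matches
            *jerry*) pattern_space="> $pattern_space" ;;
        esac
        printf '%s\n' "$pattern_space"       # 3. auto-print, adding the newline back (no -n)
    done                                     # 4. start the next cycle
}

# Both commands should print the same two lines.
printf '1 tom\n2 jerry\n' | sed_cycle_sketch
printf '1 tom\n2 jerry\n' | sed '/jerry/s/^/> /'
```

The variable pattern_space plays the role of sed's pattern space; there is no hold space in this sketch because the script never uses one.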


man sed describes h/H, g/G, and x as follows:

h/H: copy/append pattern space to hold space.

g/G: copy/append hold space to pattern space.

x: exchange the contents of the hold and pattern spaces.

(h and g overwrite the destination buffer; H and G add a newline to it and then append.)
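A few one-liners make these commands easier to see in isolation. They are a small sketch using made-up input (the letters a, b, c fed in with printf), not part of the original example.

```shell
# G with an empty hold space appends only a newline,
# so every line is followed by a blank line.
printf 'a\nb\n' | sed 'G'

# h copies the line into the hold space and G appends it back,
# so every line is printed twice.
printf 'a\nb\n' | sed 'h;G'

# x swaps the two buffers, so each cycle prints the previous line:
# an empty line first, then a, then b (c is left in the hold space).
printf 'a\nb\nc\n' | sed 'x'

# 1!G;h;$p builds the reversed text in the hold space and
# prints it once, on the last line: c, b, a.
printf 'a\nb\nc\n' | sed -n '1!G;h;$p'
```

In the last command, G is skipped on line 1 so that the initially empty hold space does not add a stray blank line.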


Now for the example:

#cat name.txt

    1 tom

    2 jerry

    3 selina

    4 green

    5 lily

    6 lilei

#sed -e '/tom/h' -e '/green/x' -e '$G' name.txt


Here is the line-by-line walkthrough:


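For a line that matches an address, the first row shows the buffers when the address is tested and the second row shows them after the command has run. OUTPUT is what the cycle prints; \n stands for a newline inside a buffer.

    INPUT LINE   COMMAND   PATTERN SPACE      HOLD SPACE   OUTPUT
    1 tom        /tom/     1 tom              (empty)      1 tom
                 h         1 tom              1 tom
    2 jerry                2 jerry            1 tom        2 jerry
    3 selina               3 selina           1 tom        3 selina
    4 green      /green/   4 green            1 tom        1 tom
                 x         1 tom              4 green
    5 lily                 5 lily             4 green      5 lily
    6 lilei      $         6 lilei            4 green      6 lilei\n4 green
                 G         6 lilei\n4 green   4 green
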
So the command prints:

#sed -e '/tom/h' -e '/green/x' -e '$G' name.txt

1 tom

2 jerry

3 selina

1 tom

5 lily

6 lilei

4 green
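To reproduce this result, the following sketch recreates the sample file and runs the same command. The temporary directory and the extra l command are additions for illustration: with -n, a final l prints the pattern space at the end of each cycle in an unambiguous form, so the newline that G inserts shows up as \n.

```shell
# Recreate the sample file in a temporary directory (location chosen for this sketch).
tmpdir=$(mktemp -d)
cat > "$tmpdir/name.txt" <<'EOF'
1 tom
2 jerry
3 selina
4 green
5 lily
6 lilei
EOF

# The command from the walkthrough.
sed -e '/tom/h' -e '/green/x' -e '$G' "$tmpdir/name.txt"

# Same script, but auto-print is off and l shows the pattern space
# at the end of every cycle; each line ends with $ and an embedded
# newline is written as \n.
sed -n -e '/tom/h' -e '/green/x' -e '$G' -e 'l' "$tmpdir/name.txt"

rm -r "$tmpdir"
```

The last line of the second command should read `6 lilei\n4 green$`, which matches the final PATTERN SPACE state of the walkthrough.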



Finally, a look at the -n option and the p command:

  • -n (--quiet, --silent)

    By default, sed prints out the pattern space at the end of each cycle through the script. These options disable this automatic printing, and sed only produces output when explicitly told to via the p command.

  • p

    Print out the pattern space (to the standard output). This command is usually only used in conjunction with the -n command-line option.

In other words, -n suppresses the automatic printing of the pattern space, while p prints the pattern space explicitly whether or not -n is given.


#sed -n '/selina/p' name.txt

3 selina


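With -n, none of the six lines is printed automatically. Lines 1, 2, 4, 5, and 6 do not match /selina/, so nothing is printed for them; line 3 matches, so p runs and prints it explicitly.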
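One way to check how -n and p interact is to run the same p command with and without -n. This is a small sketch on a shortened, made-up copy of the sample data held in the shell variable names.

```shell
names='1 tom
2 jerry
3 selina'

# With -n: auto-print is off, so only the explicit p produces output
# and "3 selina" appears once.
printf '%s\n' "$names" | sed -n '/selina/p'

# Without -n: every line is auto-printed, and p prints the matching
# line one extra time, so "3 selina" appears twice.
printf '%s\n' "$names" | sed '/selina/p'
```

The duplicate line in the second run shows that p is an additional, explicit print rather than a switch that turns -n off.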