The Three Musketeers of Linux Text Processing: Basic Usage of the sed Command


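Preface

This preface appears only once, in the sed article, but it applies to both sed and gawk.
To process data of any kind in shell scripts, you need to be familiar with sed and gawk on Linux. These two tools can greatly simplify data-processing tasks.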


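Introduction to sed

sed is known as a stream editor, which is quite different from an ordinary interactive text editor.
In an interactive text editor (such as vim), you use keyboard commands to insert, delete, or replace text interactively. A stream editor instead edits a data stream according to a set of rules prepared in advance.
The stream editor reads one line of data, matches it against the supplied commands, applies all of them to that line, then reads the next line and repeats the process. Once it has processed every line in the stream, it exits.

Advantages of sed: 1. It processes data very quickly, making a single pass over the input. 2. By default it writes its result to stdout and leaves the original file unmodified.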
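A minimal sketch of the second point, assuming a throwaway file created with mktemp (the file and its contents are made up for illustration): sed prints the edited text to stdout, and the file on disk stays as it was.

```shell
# Create a scratch file (hypothetical content, not one of the article's data files).
tmpfile=$(mktemp)
printf 'The lazy dog.\n' > "$tmpfile"

# sed writes the edited line to stdout...
edited=$(sed 's/dog/cat/' "$tmpfile")
# ...while the file itself still holds the original text.
original=$(cat "$tmpfile")

echo "stdout: $edited"      # stdout: The lazy cat.
echo "file:   $original"    # file:   The lazy dog.
rm -f "$tmpfile"
```

To keep the edited text, redirect stdout to a different file.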

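Ways to invoke sed (using the substitute command s as the example):
1. A one-line script on the command line
Separate multiple commands with semicolons. The -e option is optional when there is only one script string; it is needed only when several script fragments are passed, each introduced by its own -e.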

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed -e 's/dog/cat/; s/brown/pink/' data1.txt 
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
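As a small sketch of the two spellings (the variable names here are only for illustration), a single script string with semicolons and a pair of -e fragments should produce the same result:

```shell
line='The quick brown fox jumps over the lazy dog.'

# One script string, commands separated by a semicolon (no -e needed).
with_semicolon=$(printf '%s\n' "$line" | sed 's/dog/cat/; s/brown/pink/')

# Two script fragments, each introduced by its own -e.
with_e=$(printf '%s\n' "$line" | sed -e 's/dog/cat/' -e 's/brown/pink/')

echo "$with_semicolon"   # The quick pink fox jumps over the lazy cat.
echo "$with_e"           # The quick pink fox jumps over the lazy cat.
```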

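2. A multi-line script on the command line
This relies on the shell's secondary prompt (PS2) so that each command can go on its own line. Remember that the command ends on the line containing the closing single quote.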

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed -e '
> s/dog/cat/
> s/brown/pink/
> ' data1.txt
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.

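3. Reading editor commands from a file
To avoid confusing sed script files with bash shell scripts, it is a good idea to give sed script files the .sed extension (a convention, not a requirement).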

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ cat script1.sed 
s/dog/cat/
s/brown/pink/
[flower@study chapter19]$ sed -f script1.sed data1.txt 
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy cat.
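A script file can also carry comments, which helps once it grows beyond a couple of commands. The sketch below assumes a scratch directory from mktemp and a made-up file name swap.sed:

```shell
workdir=$(mktemp -d)

# A sed script file; lines starting with # are comments.
cat > "$workdir/swap.sed" <<'EOF'
# replace the animal, then the colour
s/dog/cat/
s/brown/pink/
EOF

result=$(printf 'The quick brown fox jumps over the lazy dog.\n' | sed -f "$workdir/swap.sed")
echo "$result"   # The quick pink fox jumps over the lazy cat.
rm -rf "$workdir"
```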

Basic sed editor commands

The substitute command s

By default, the substitute command replaces only the first match in each line.

[flower@study chapter19]$ echo "This test is a test." | sed 's/test/file/'
This file is a test.

Note that only the first "test" was replaced; the second was left alone.
To change this default behavior, add a substitution flag after the substitute command string.


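Substitution flags

Four substitution flags are covered here:

  1. A number, indicating which occurrence of the match in a line the new text should replace
  2. g, indicating that the new text should replace every match in the line
  3. p, indicating that the substituted line should be printed (normally used together with sed's -n option; otherwise the line is output twice, once by sed itself and once by p)
  4. w file, which writes the result of the substitution to a file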
[flower@study chapter19]$ cat data4.txt 
This is a test of the test script.
This is the second trial of the test script.
[flower@study chapter19]$ sed 's/test/trial/2' data4.txt 
This is a test of the trial script.
This is the second trial of the test script.
[flower@study chapter19]$ sed 's/test/trial/g' data4.txt 
This is a trial of the trial script.
This is the second trial of the trial script.
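The number and g flags can be compared side by side on one line of input. The sketch below assumes GNU sed for the last command: combining a number with g (here 2g, meaning "the second match and every later one") is a GNU behavior that POSIX leaves unspecified. The sample text is made up.

```shell
text='test one, test two, test three'

# Replace only the 2nd match.
second_only=$(printf '%s\n' "$text" | sed 's/test/trial/2')
# Replace every match.
all_matches=$(printf '%s\n' "$text" | sed 's/test/trial/g')
# GNU sed only: replace from the 2nd match onward.
from_second=$(printf '%s\n' "$text" | sed 's/test/trial/2g')

echo "$second_only"   # test one, trial two, test three
echo "$all_matches"   # trial one, trial two, trial three
echo "$from_second"   # test one, trial two, trial three
```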

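-n suppresses sed's normal output, while the p flag prints the substituted line. Used together, they output only the lines that the substitute command actually modified.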

[flower@study chapter19]$ cat data5.txt 
This is a test line
This is a different line
[flower@study chapter19]$ sed 's/test/trial/p' data5.txt 
This is a trial line
This is a trial line
This is a different line
[flower@study chapter19]$ sed -n 's/test/trial/p' data5.txt 
This is a trial line

sed's normal output still goes to stdout; the w flag additionally saves only the lines on which the substitution took place to the specified file.

[flower@study chapter19]$ sed 's/test/trial/w test.txt' data5.txt 
This is a trial line
This is a different line
[flower@study chapter19]$ cat test.txt 
This is a trial line

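Changing the substitution delimiter

Because the forward slash is the delimiter of the substitute command, any forward slash that appears in the pattern or the replacement text must be escaped with a backslash.
This quickly becomes confusing and error-prone (it works, but it is painful to read):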

[flower@study chapter19]$ sed 's/\/bin\/bash/\/bin\/csh/' /etc/passwd

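To solve this, sed lets you choose another character as the delimiter of the substitute command. It does not have to be a colon; use whichever character you like, as long as it does not appear in the pattern or the replacement text.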

[flower@study chapter19]$ sed 's:/bin/bash:/bin/csh:' /etc/passwd
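To avoid depending on the real /etc/passwd, here is a sketch on a single sample entry (copied from the flower line shown elsewhere in this article). The three spellings should give the same result; only the readability differs.

```shell
entry='flower:x:1005:1005::/home/flower:/bin/bash'

# Alternative delimiters: no escaping of the slashes in the paths.
with_pipe=$(printf '%s\n' "$entry" | sed 's|/bin/bash|/bin/csh|')
with_hash=$(printf '%s\n' "$entry" | sed 's#/bin/bash#/bin/csh#')
# Default delimiter: every slash in the paths needs a backslash.
escaped=$(printf '%s\n' "$entry" | sed 's/\/bin\/bash/\/bin\/csh/')

echo "$with_pipe"   # flower:x:1005:1005::/home/flower:/bin/csh
```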

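Line addressing

By default, the commands given to sed apply to every line of text. To apply a command only to one specific line or to certain lines, use line addressing.
sed supports two forms of line addressing:

  1. A line number or a numeric range of lines
  2. A text pattern that matches within a line
    (Text patterns are regular expressions. By default sed uses basic regular expressions, a smaller syntax than the extended regular expressions gawk uses.)

1. Numeric line addressing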

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '2s/dog/cat/' data1.txt  # a single line
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '2,3s/dog/cat/' data1.txt  # a range of lines
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '2,$s/dog/cat/' data1.txt  # from a given line to the last line
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.

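2. Matching a text pattern within a line
Change the login shell (the last field) on the flower user's line of /etc/passwd to /bin/tcsh, without modifying the /etc/passwd file itself.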

[flower@study chapter19]$ sed -n '/flower/s:/bin/bash:/bin/tcsh:p' /etc/passwd
flower:x:1005:1005::/home/flower:/bin/tcsh
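One thing to watch with pattern addresses is that /flower/ matches any line containing that text. The sketch below uses a made-up passwd-style sample (the sunflower account is hypothetical) to show how anchoring the pattern to the start of the line narrows the match:

```shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
flower:x:1005:1005::/home/flower:/bin/bash
sunflower:x:1006:1006::/home/sunflower:/bin/bash'

# Unanchored: also matches the sunflower line.
loose=$(printf '%s\n' "$passwd_sample" | sed -n '/flower/s:/bin/bash$:/bin/tcsh:p')
# Anchored to the user-name field: matches only the flower account.
strict=$(printf '%s\n' "$passwd_sample" | sed -n '/^flower:/s:/bin/bash$:/bin/tcsh:p')

echo "$loose"    # two lines: flower and sunflower
echo "$strict"   # flower:x:1005:1005::/home/flower:/bin/tcsh
```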

Command groups

Start with the problem:

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '2s/dog/cat/; s/brown/pink/' data1.txt 
The quick pink fox jumps over the lazy dog.
The quick pink fox jumps over the lazy cat.
The quick pink fox jumps over the lazy dog.
The quick pink fox jumps over the lazy dog.

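The intent was to change dog to cat and brown to pink on the second line only.
However, the line address applies only to the first command; the second command still runs on every line by default. Written this way, the two commands are independent.
So how do you make several commands apply together to just the addressed lines?

Wrap the commands that should share the address in braces {}: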

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '2{s/dog/cat/; s/brown/pink/}' data1.txt 
The quick brown fox jumps over the lazy dog.
The quick pink fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
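A group works with any address, not only a single line number. As a sketch, the same two substitutions applied to the range 2,3, written with one command per line inside the braces:

```shell
input='The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.'

# Both substitutions apply to lines 2 and 3 only.
result=$(printf '%s\n' "$input" | sed '2,3{
s/dog/cat/
s/brown/pink/
}')

echo "$result"
```

Lines 1 and 4 should come out unchanged, while lines 2 and 3 read "The quick pink fox jumps over the lazy cat."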

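The delete command d

The delete command d removes specific lines from the text stream. Combined with line addressing, it deletes every line the address matches.

Be careful with this command: if you forget to include an address, every line in the stream is deleted.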

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '2d' data6.txt  # delete line 2
This is line number 1.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '2,3d' data6.txt  # delete lines 2 through 3
This is line number 1.
This is the 4th line.
[flower@study chapter19]$ sed '2,$d' data6.txt  # delete from line 2 to the last line
This is line number 1.
[flower@study chapter19]$ sed '/number 1/d' data6.txt  # delete every line matching the pattern
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '/number/d' data6.txt  # every line containing "number" is deleted
This is the 3rd line.
This is the 4th line.

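Note: the substitute command by default replaces only the first match within a line, whereas the delete command removes every line in the stream that matches.

One more caveat about the delete command d: using two text patterns to delete a range of lines has some special cases.
Suppose I want to delete the lines from the one matching 1 through the one matching 3, and keep everything else: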

[flower@study chapter19]$ cat data7.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
This is line number 1 again; we want to keep it.
This is more text we want to keep.
Last line in the file; we want to keep it.
[flower@study chapter19]$ sed '/1/,/3/d' data7.txt  # something seems to have gone wrong
This is the 4th line.

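The problems are:

  1. After one range has been deleted, deletion is triggered again whenever a later line matches the start pattern. Here line 5 contains a 1, so deletion restarts there.
  2. If the stop pattern never appears afterward, sed deletes the line matching the start pattern and everything after it, to the end of the stream. No later line contains a 3, so lines 5 through 7 are all removed.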
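Both effects can be reproduced on a small made-up input. The sketch also shows one possible way around them, which is to use a numeric range when the line numbers are known in advance:

```shell
input='line 1
line 2
line 3
line 4
line 1 again
more text
last line'

# Pattern range: restarts at "line 1 again" and, with no later 3, runs to the end.
by_pattern=$(printf '%s\n' "$input" | sed '/1/,/3/d')

# Numeric range: deletes exactly lines 1 through 3.
by_number=$(printf '%s\n' "$input" | sed '1,3d')

echo "$by_pattern"   # line 4
echo "$by_number"    # line 4, line 1 again, more text, last line (one per line)
```

Another option is to make the start pattern specific enough that it cannot match later lines.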

The insert and append commands i and a

Insert (i): adds a new line before the specified line
Append (a): adds a new line after the specified line

[flower@study chapter19]$ echo "Test Line 2" | sed 'i\Test Line 1'
Test Line 1
Test Line 2
[flower@study chapter19]$ echo "Test Line 2" | sed 'a\Test Line 1'
Test Line 2
Test Line 1

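To insert or append data inside the data stream, use an address to tell sed where the data should appear.
An i or a command normally takes a single address, either a line number or a text pattern. POSIX allows only one address for these commands; GNU sed also accepts a line range as an extension, in which case the text is added for every line in the range.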

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '3i\This is an inserted line.' data6.txt  # insert before line 3
This is line number 1.
This is line number 2.
This is an inserted line.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '$a\The end!' data6.txt  # append after the last line
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
The end!
[flower@study chapter19]$ sed '/number 1/i\The begin!' data6.txt  # address by pattern match
The begin!
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.

The one special point: to insert or append several lines of text, end every line of the new text except the last with a backslash \

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '/number 2/a\
> This is an inserted line.\
> This is another inserted line.
> ' data6.txt  # append two lines after the line matching /number 2/
This is line number 1.
This is line number 2.
This is an inserted line.
This is another inserted line.
This is the 3rd line.
This is the 4th line.
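The one-line spelling used earlier (text on the same line as i\ or a\) is accepted by GNU sed; the traditional form puts the text on the line after the backslash. A sketch of that form on made-up input, for both a single line and multiple lines:

```shell
# Traditional form: the text starts on the line after "i\".
portable=$(printf 'line 1\nline 2\n' | sed '1i\
HEADER')

# Multi-line text: every line but the last ends with a backslash.
two_lines=$(printf 'line 1\nline 2\n' | sed '$a\
FOOTER 1\
FOOTER 2')

echo "$portable"    # HEADER, line 1, line 2 (one per line)
echo "$two_lines"   # line 1, line 2, FOOTER 1, FOOTER 2 (one per line)
```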

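The change command c

The change (c) command replaces the entire content of a line in the data stream. It works the same way as the insert and append commands, and is normally given a single line as its address.

Specifying a line range does not raise an error, but sed then replaces all the lines in the range with the single new line, rather than changing each line individually.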

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '2c\This is a changed line of test.' data6.txt  # change line 2
This is line number 1.
This is a changed line of test.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '/number 2/c\This is a changed line of test.' data6.txt  # change the line matching a text pattern
This is line number 1.
This is a changed line of test.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '2,3c\This is a changed line of test.' data6.txt  # note this case: the whole range is replaced by one line
This is line number 1.
This is a changed line of test.
This is the 4th line.
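A typical use of c with a pattern address is overwriting one setting in a configuration-style file. The key names and values below are invented for the sketch:

```shell
config='host=localhost
port=8080
debug=false'

# Replace the whole line that starts with "port=".
result=$(printf '%s\n' "$config" | sed '/^port=/c\
port=9090')

echo "$result"   # host=localhost, port=9090, debug=false (one per line)
```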

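The transform command y

The transform (y) command is the only sed editor command that operates on individual characters: it maps each character in the first set to the character at the same position in the second set, so both sets must have the same length.
The transform command is global. It converts every occurrence of the listed characters in the line, wherever they appear.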

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed 'y/1234/5678/' data6.txt  # map 1 to 5, 2 to 6, 3 to 7, 4 to 8
This is line number 5.
This is line number 6.
This is the 7rd line.
This is the 8th line.
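A common use of y is case conversion. The sketch below spells out both alphabets in full, because y treats every character literally and has no a-z range shorthand (unlike tr); the second command shows what happens if a range is written anyway.

```shell
# Lowercase to uppercase, with both character sets written out in full.
upper=$(printf 'this is the 3rd line.\n' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# "a-z" is just the three characters a, -, z: only a and z are changed.
not_a_range=$(printf 'a-b-z\n' | sed 'y/a-z/A-Z/')

echo "$upper"         # THIS IS THE 3RD LINE.
echo "$not_a_range"   # A-b-Z
```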

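The print commands p, =, and l

All three of these commands print information from the data stream:

  1. The print (p) command prints a line of text
  2. The equals sign (=) command prints the line number
  3. The list (l) command lists a line, making nonprinting characters visible

Printing lines: p
Like the p flag of the substitute command s, the print command prints the current line as sed holds it at that moment.
It is most often combined with sed's -n option so that only the selected lines are printed.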

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed -n '2p' data6.txt  # with a line address, print only line 2
This is line number 2.
[flower@study chapter19]$ sed '2p' data6.txt  # without -n, sed itself outputs every line and p outputs line 2 a second time
This is line number 1.
This is line number 2.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed -n '/3/{
> p
> s/line/test/p
> }' data6.txt  # a nice example: prints only the matching line, before and after the change
This is the 3rd line.
This is the 3rd test.
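With -n, the p command also works as a simple way to pull a slice out of a stream. A sketch on made-up input, printing a range of lines and the last line:

```shell
input='one
two
three
four'

# Lines 2 through 3 only.
middle=$(printf '%s\n' "$input" | sed -n '2,3p')
# The last line only ($ addresses the final line).
last=$(printf '%s\n' "$input" | sed -n '$p')

echo "$middle"   # two, three (one per line)
echo "$last"     # four
```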

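Printing line numbers: =
The equals sign command prints the line number of the current line within the data stream.
Line numbers are determined by newline characters: each time a newline appears in the data stream, sed considers one line of text to have ended.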

[flower@study chapter19]$ cat data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ sed '=' data1.txt  # print the number of each line; note that it appears on its own line, above the text
1
The quick brown fox jumps over the lazy dog.
2
The quick brown fox jumps over the lazy dog.
3
The quick brown fox jumps over the lazy dog.
4
The quick brown fox jumps over the lazy dog.
[flower@study chapter19]$ cat data7.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
This is line number 1 again; we want to keep it.
This is more text we want to keep.
Last line in the file; we want to keep it.
[flower@study chapter19]$ sed -n '/text/{
> =
> p
> }' data7.txt  # find the line matching a text pattern and print its line number
6
This is more text we want to keep.
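Two small idioms follow from combining = with an address and -n, sketched here on made-up input: addressing the last line gives a line count, and a pattern address lists the numbers of all matching lines.

```shell
input='alpha
beta
alpha again'

# Line number of the last line, i.e. the total number of lines.
total=$(printf '%s\n' "$input" | sed -n '$=')
# Line numbers of every line containing "alpha".
matches=$(printf '%s\n' "$input" | sed -n '/alpha/=')

echo "$total"     # 3
echo "$matches"   # 1 and 3 (one per line)
```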

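Listing lines: l
The list command prints both the text and the nonprinting characters in the data stream.
A nonprinting character is shown either as a backslash followed by its octal value or, for the common ones, using the standard C-style escape, such as \t for a tab.
If the data stream contains an escape (ESC) character, the list command shows it in octal where needed.

[flower@study chapter19]$ cat data10.txt 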
This	line	contains	tabs.
This line does contain tabs.
[flower@study chapter19]$ sed -n 'l' data10.txt 
This\tline\tcontains\ttabs.$
This line does contain tabs.$

Each tab is displayed as \t at its position, and the dollar sign at the end of each line marks the newline.
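This makes l handy for spotting whitespace problems that are invisible in normal output. The sketch below feeds sed a made-up line containing a tab, a trailing space, and a carriage return (as in a file with Windows line endings); the expected output assumes GNU sed's C-style escapes.

```shell
# Input: "key<TAB>value<SPACE><CR><LF>"
visible=$(printf 'key\tvalue \r\n' | sed -n 'l')

echo "$visible"   # key\tvalue \r$
```

The space before \r and the \r itself both show up before the closing $.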


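The file-handling commands w and r

The substitute command s includes a file-handling flag (w). Some regular sed editor commands let you work with files as well, without having to substitute any text.

1. Writing to a file: w

The write (w) command writes lines to a file. The file can be given as a relative or an absolute path.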

[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed -n '2,3w test.txt' data6.txt  # address lines 2 through 3 and write them to a new file
[flower@study chapter19]$ cat test.txt
This is line number 2.
This is the 3rd line.
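The w command accepts a pattern address as well, which gives a quick way to split matching lines out of a stream. A sketch with made-up log lines and a scratch directory from mktemp (errors.txt is a hypothetical file name):

```shell
workdir=$(mktemp -d)
log='INFO start
ERROR disk full
INFO retry
ERROR timeout'

# -n suppresses stdout; w saves only the lines starting with ERROR.
printf '%s\n' "$log" | sed -n "/^ERROR/w $workdir/errors.txt"

errors=$(cat "$workdir/errors.txt")
echo "$errors"   # ERROR disk full, ERROR timeout (one per line)
rm -rf "$workdir"
```

The file name runs to the end of the script line, so the w command should be the last thing on its line.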

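2. Reading data from a file: r

The read (r) command inserts the data from a separate file into the data stream.
The read command is normally given a single line number or a text pattern as its address, and sed inserts the file's contents after the addressed line. POSIX allows only one address for r; GNU sed also accepts a line range as an extension.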

[flower@study chapter19]$ cat data13.txt 
This is an added line.
This is a second added line.
[flower@study chapter19]$ cat data6.txt 
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
[flower@study chapter19]$ sed '3r data13.txt' data6.txt  # insert the data file's contents after line 3 of the current file
This is line number 1.
This is line number 2.
This is the 3rd line.
This is an added line.
This is a second added line.
This is the 4th line.
[flower@study chapter19]$ sed '$r data13.txt' data6.txt  # insert the data file's contents after the last line of the current file
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
This is an added line.
This is a second added line.

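sed inserts every line of the data file into the data stream.

Another neat use of the read command (r) is to combine it with the delete command (d), replacing placeholder text in one file with the data from another file.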

[flower@study chapter19]$ cat notice.std 
Would the following people:
LIST
please report to the ship captain.
[flower@study chapter19]$ cat data12.txt 
Tom A 14
Jack A 21
Pick B 19
Peter C 17
[flower@study chapter19]$ sed '/LIST/{
> r data12.txt
> d
> }' notice.std  # replace the LIST line of notice.std with the contents of data12.txt
Would the following people:
Tom A 14
Jack A 21
Pick B 19
Peter C 17
please report to the ship captain.
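The same placeholder technique can be wrapped up in a self-contained sketch. The file names template.txt and names.txt and their contents are made up, and the script is double-quoted so that the shell expands the scratch directory path:

```shell
workdir=$(mktemp -d)
printf 'Dear team,\nLIST\nRegards\n' > "$workdir/template.txt"
printf 'Tom\nJack\n' > "$workdir/names.txt"

# On the LIST line: queue the names file for output, then delete the placeholder.
result=$(sed "/LIST/{
r $workdir/names.txt
d
}" "$workdir/template.txt")

echo "$result"   # Dear team, Tom, Jack, Regards (one per line)
rm -rf "$workdir"
```

The order matters: r has to come before d, because d ends processing of the current line immediately.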