Using sed and gawk on Linux

SedGawk

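The sed editor

The sed editor is called a stream editor, the opposite of an ordinary interactive text editor. In an interactive text editor, you use keyboard commands to insert, delete, or replace text in the data interactively. A stream editor instead edits a data stream according to a set of rules that you supply before the editor processes the data.
The sed editor manipulates the data in a stream based on commands that are either entered on the command line or stored in a command text file.
sed does the following:
  • Reads one line of data at a time from the input
  • Matches that data against the supplied editor commands
  • Changes the data in the stream as the commands specify
  • Writes the new data to STDOUT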

sed options script file
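The read, match, modify, output cycle described above can be sketched with a throwaway file. The file name sample.txt and its contents are made up for illustration. The sketch also shows that, used this way, sed writes its result to STDOUT and leaves the input file untouched.

```shell
# Create a small sample file in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'one dog\ntwo dogs\n' > "$tmpdir/sample.txt"

# sed reads each line, applies the script, and writes the result to STDOUT.
edited=$(sed 's/dog/cat/' "$tmpdir/sample.txt")
echo "$edited"

# The input file itself still holds the original text.
original=$(cat "$tmpdir/sample.txt")
echo "$original"

rm -rf "$tmpdir"
```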

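sed command options
Option      Description
-e script   Adds the commands in script to the commands run while processing the input
-f file     Adds the commands in file to the commands run while processing the input
-n          Suppresses the automatic output of each line; use the print command to produce output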
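As a small sketch of how these options behave, the following uses an arbitrary sample sentence. The first command passes two -e options, and the second shows that -n on its own produces no output at all.

```shell
# Two -e options: both substitutions are applied to every input line.
both=$(echo "The quick brown fox" | sed -e 's/brown/green/' -e 's/fox/elephant/')
echo "$both"

# -n suppresses the automatic printing, so a plain substitution prints nothing.
silent=$(echo "The quick brown fox" | sed -n 's/brown/green/')
echo "[$silent]"
```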
  • Defining an editor command on the command line
$ echo "This is a test" | sed 's/test/big test/'
This is a big test
$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
$
$ sed 's/dog/cat/' data1.txt
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
$
  • Using multiple editor commands on the command line
$ sed 's/brown/green/; s/dog/cat/' data1.txt
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
$ sed -e '
> s/brown/green/
> s/fox/elephant/
> s/dog/cat/' data1.txt
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
  • Reading editor commands from a file
$ cat script1.sed
s/brown/green/
s/fox/elephant/
s/dog/cat/
$
$ sed -f script1.sed data1.txt
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
Substitution flags
By default, the substitute command replaces only the first occurrence of the pattern in each line.
$ cat data4.txt
This is a test of the test script.
This is the second test of the test script.
$
$ sed 's/test/trial/' data4.txt
This is a trial of the test script.
This is the second trial of the test script.
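A number: the new text replaces only that occurrence of the pattern in the line
g: the new text replaces every occurrence of the pattern
p: the line is printed when a substitution was made (usually combined with the -n option)
w file: the result of the substitution is written to file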
$ sed 's/test/trial/2' data4.txt
This is a test of the trial script.
This is the second test of the trial script.
$
$ sed 's/test/trial/g' data4.txt
This is a trial of the trial script.
This is the second trial of the trial script.
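The p and w flags are listed above without a demonstration, so here is a minimal sketch of both. The file names data.txt and changed.txt and the two sample lines are assumptions made for this example.

```shell
# Hypothetical sample data: only the first line contains the word "test".
tmpdir=$(mktemp -d)
printf 'This is a test line.\nThis is a different line.\n' > "$tmpdir/data.txt"

# -n suppresses normal output; the p flag prints only the lines
# on which a substitution actually happened.
printed=$(sed -n 's/test/trial/p' "$tmpdir/data.txt")
echo "$printed"

# The w flag saves the substituted lines to a file. Normal output still
# goes to STDOUT, so it is discarded here.
sed "s/test/trial/w $tmpdir/changed.txt" "$tmpdir/data.txt" > /dev/null
written=$(cat "$tmpdir/changed.txt")
echo "$written"

rm -rf "$tmpdir"
```

Both variables end up holding only the modified line, because the second line never matched the pattern.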
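Replacing characters

To substitute text that contains forward slashes (/), escape each slash with a backslash:

$ sed 's/\/bin\/bash/\/bin\/csh/' /etc/passwd

Using a different delimiter such as ! avoids the escaping:

$ sed 's!/bin/bash!/bin/csh!' /etc/passwd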
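To check that the escaped and unescaped forms are equivalent without touching /etc/passwd, the sketch below runs them against a single made-up passwd-style line. The | delimiter is another common choice and is included only as an illustration.

```shell
# A made-up passwd-style record used as input.
line='root:x:0:0:root:/root:/bin/bash'

# Default delimiter: every slash in the pattern must be escaped.
escaped=$(echo "$line" | sed 's/\/bin\/bash/\/bin\/csh/')

# Alternative delimiters: no escaping needed.
with_bang=$(echo "$line" | sed 's!/bin/bash!/bin/csh!')
with_pipe=$(echo "$line" | sed 's|/bin/bash|/bin/csh|')

echo "$escaped"
echo "$with_bang"
echo "$with_pipe"
```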
Addressing lines by number
$ sed '2s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.

$ sed '2,3s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The dollar sign ($) stands for the last line, so the address 2,$ covers every line from line 2 to the end of the data
$ sed '2,$s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
Using text pattern filters
$ grep Samanths /etc/passwd

$ sed '/Samanths/s/bash/csh/' /etc/passwd
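Because the result depends on which accounts exist in /etc/passwd, a self-contained sketch may be easier to follow. The file passwd.sample and the users alice and bob are invented for this example, and the pattern is anchored with ^ so that it matches only the user-name field.

```shell
# Hypothetical passwd-style data with two users.
tmpdir=$(mktemp -d)
cat > "$tmpdir/passwd.sample" <<'EOF'
alice:x:1001:1001::/home/alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/bash
EOF

# Only lines matching /^bob:/ are passed to the substitute command.
filtered=$(sed '/^bob:/s/bash/csh/' "$tmpdir/passwd.sample")
echo "$filtered"

rm -rf "$tmpdir"
```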
Grouping commands
$ sed '2{
> s/fox/elephant/
> s/dog/cat/
> }' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown elephant jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
$ sed '3,${
> s/brown/green/
> s/lazy/active/
> }' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
Deleting lines
$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.

$ sed 'd' data1.txt
$
$ cat data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
$
$ sed '3d' data6.txt
This is line number 1.
This is line number 2.
This is line number 4.

$ sed '2,3d' data6.txt
This is line number 1.
This is line number 4.

$ sed '3,$d' data6.txt
This is line number 1.
This is line number 2.

$ sed '/number 1/d' data6.txt
This is line number 2.
This is line number 3.
This is line number 4.

$ sed '/1/,/3/d' data6.txt
This is line number 4.
$ cat data7.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.

$ sed '/1/,/3/d' data7.txt
This is line number 4.
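The second line containing 1 turns the delete command on again. Because the stop pattern never appears after it, sed deletes all the remaining lines.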
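The pitfall can be reproduced with a shortened stand-in for data7.txt (the contents below are invented). The sketch compares the pattern range with a numeric range, which is one way to avoid the surprise when the line numbers are known in advance.

```shell
# Hypothetical data in the spirit of data7.txt.
tmpdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\nline 4\nline 1 again\nkeep me\nlast line\n' > "$tmpdir/data.txt"

# Pattern range: restarts at "line 1 again" and never finds another 3,
# so everything after line 4 is deleted as well.
pattern_range=$(sed '/1/,/3/d' "$tmpdir/data.txt")
echo "$pattern_range"

# Numeric range: deletes exactly lines 1 through 3 and nothing else.
numeric_range=$(sed '1,3d' "$tmpdir/data.txt")
echo "$numeric_range"

rm -rf "$tmpdir"
```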
Inserting and appending
$ echo "Test Line 2" | sed 'i\Test Line 1'
Test Line 1
Test Line 2

$ echo "Test Line 2" | sed 'a\Test Line 1'
Test Line 2
Test Line 1
Inserting data inside the data stream
$ sed '3i\
> This is an inserted line.' data6.txt
This is line number 1.
This is line number 2.
This is an inserted line.
This is line number 3.
This is line number 4.

$ sed '3a\
> This is an appended line.' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an appended line.
This is line number 4.
Appending text to the end of the data stream
$ sed '$a\
> This is a new line of text.' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is a new line of text.
Inserting multiple lines of text before the first line
$ sed '1i\
> This is one line of new text.\
> This is another line of new text.' data6.txt
This is one line of new text.
This is another line of new text.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
Changing lines
$ sed '3c\
> This is a changed line of text.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.

$ sed '/number 3/c\
> This is a changed line of text.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line of text.
This is line number 4.
$ cat data8.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is yet another line.
This is the last line in the file.

$ sed '/number 1/c\
> This is a changed line of text.' data8.txt
This is a changed line of text.
This is line number 2.
This is line number 3.
This is line number 4.
This is a changed line of text.
This is yet another line.
This is the last line in the file.
The transform command
The transform command (y) is the only sed editor command that operates on single characters.
$ sed 'y/123/789/' data8.txt
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 4.
This is line number 7 again.
This is yet another line.
This is the last line in the file.
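Two points are easy to miss in the output above: y maps characters one-to-one by position, and it converts every occurrence in the line without needing a g flag. A minimal sketch with made-up input strings:

```shell
# 1 -> 4, 2 -> 5, 3 -> 6; both occurrences of 1 are converted.
digits=$(echo "This 1 is a test of 1 try." | sed 'y/123/456/')
echo "$digits"

# The same mechanism maps letters: e -> E, l -> L, o -> O.
letters=$(echo "hello world" | sed 'y/elo/ELO/')
echo "$letters"
```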
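sed provides three commands for printing information from the data stream:
  • The p command prints a text line
  • The equal sign (=) command prints line numbers
  • The l (lowercase L) command lists a line
Printing lines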
$ echo "this is a test" | sed 'p'
this is a test
this is a test
$ cat data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
$
$ sed -n '/number 3/p' data6.txt
This is line number 3.
$ sed -n '2,3p' data6.txt
This is line number 2.
This is line number 3.
$ sed -n '/3/{
> p
> s/line/test/p
> }' data6.txt
This is line number 3.
This is test number 3.
Printing line numbers
$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.

$ sed '=' data1.txt
1
The quick brown fox jumps over the lazy dog.
2
The quick brown fox jumps over the lazy dog.
3
The quick brown fox jumps over the lazy dog.
4
The quick brown fox jumps over the lazy dog.
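Combined with -n and an address, the = command becomes a quick way to find where a pattern occurs or how many lines a file has. The sketch below recreates the four-line data6.txt in a temporary directory so that it runs on its own.

```shell
# Recreate a four-line file like data6.txt.
tmpdir=$(mktemp -d)
printf 'This is line number 1.\nThis is line number 2.\nThis is line number 3.\nThis is line number 4.\n' > "$tmpdir/data6.txt"

# Print only the number of the line that matches the pattern.
match_line=$(sed -n '/number 3/=' "$tmpdir/data6.txt")
echo "$match_line"

# The $ address selects the last line, so this prints the line count.
line_count=$(sed -n '$=' "$tmpdir/data6.txt")
echo "$line_count"

rm -rf "$tmpdir"
```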
$ sed -n '/number 4/{
> =
> p
> }' data6.txt
4
This is line number 4.
Listing lines
$ cat data9.txt
This    line    contains    tabs.
$ sed -n 'l' data9.txt
This\tline\tcontains\ttabs.$
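Tabs are hard to tell apart from spaces in a listing like the one above, so this sketch builds the input with printf to guarantee real tab characters. The l command then shows each tab as \t and marks the end of the line with $.

```shell
# printf turns each \t in the format string into a real tab character.
listed=$(printf 'This\tline\tcontains\ttabs.\n' | sed -n 'l')
echo "$listed"
```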
Writing to a file
$ sed '1,2w test.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
$ cat  test.txt
This is line number 1.
This is line number 2.
$ cat data11.txt
Blum, R         Browncoat
McGuiness, A    Alliance
Bresnahan, C    Browncoat
Harken, C       Alliance

$ sed -n '/Browncoat/w Browncoats.txt' data11.txt
$ cat Browncoats.txt
Blum, R         Browncoat
Bresnahan, C    Browncoat
Reading data from a file
$ cat data12.txt
This is an added line.
This is the second added line.
$
$ sed '3r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an added line.
This is the second added line.
This is line number 4.
$ sed '/number 2/r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is an added line.
This is the second added line.
This is line number 3.
This is line number 4.
Appending to the end of the data stream
$ sed '$r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is an added line.
This is the second added line.
Using the read command together with the delete command
$ cat notice.std
Would the following people:
LIST
please report to the ship's captain.
$
$ sed '/LIST/{
> r data11.txt
> d
> }' notice.std
Would the following people:
Blum, R         Browncoat
McGuiness, A    Alliance
Bresnahan, C    Browncoat
Harken, C       Alliance
please report to the ship's captain.

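The gawk program

Although the sed editor is a handy tool for modifying text files automatically, it has its limitations. gawk provides a more programming-like environment for modifying and reorganizing the data in a file.
The gawk program is the GNU version of the original Unix awk program. It takes stream editing a step further by providing a programming language instead of just editor commands.
With gawk you can:
  • Define variables to store data
  • Use arithmetic and string operators to operate on data
  • Use structured programming concepts (such as if-then statements and loops) to add logic to your data processing
  • Generate formatted reports by extracting data elements from the data file and rearranging or reformatting them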
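The capabilities in this list can be sketched in one small program. The fruit inventory is invented data, and the AWK variable is a convenience assumed here so the sketch falls back to plain awk when gawk is not installed.

```shell
# Prefer gawk, fall back to awk (an assumption for portability of this sketch).
AWK=$(command -v gawk || command -v awk)

report=$(printf 'apples 3\npears 12\nplums 7\n' | "$AWK" '
{
    total += $2                 # variable plus arithmetic
    if ($2 > 5)                 # structured logic
        big = big " " $1        # string concatenation
}
END {
    # formatted report built from the extracted fields
    printf "total=%d big:%s\n", total, big
}')
echo "$report"
```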
1. gawk command format

gawk options program file

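Option         Description
-F fs          Specifies the field separator that delimits data fields in a line
-f file        Reads the program from the specified file
-v var=value   Defines a variable and its default value for use in the gawk program
-mf N          Specifies the maximum number of fields to process in the data file
-mr N          Specifies the maximum record size in the data file
-W keyword     Specifies the compatibility mode or warning level for gawk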
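As a brief sketch of the two options used most often, the command below combines -F and -v. The variable name label and the sample record are made up, and AWK is an assumed helper that falls back to plain awk if gawk is missing.

```shell
# Prefer gawk, fall back to awk (an assumption for portability of this sketch).
AWK=$(command -v gawk || command -v awk)

# -F: splits fields on colons; -v defines the variable before the program runs.
labeled=$(echo "alice:x:1001" | "$AWK" -F: -v label="user" '{print label "=" $1}')
echo "$labeled"
```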
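2. Reading the program script from the command line
$ gawk '{print "Hello World!"}'
This command does not terminate on its own. Because no file is named, gawk waits for input on STDIN and prints Hello World! for each line you enter. Press Ctrl+D to generate an EOF character, which ends the program and returns you to the command-line prompt.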
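Supplying the input through a pipe shows the same behavior without an interactive session: the program runs once per input line and stops when the stream ends. The two input lines are arbitrary, and AWK is an assumed helper that falls back to plain awk.

```shell
# Prefer gawk, fall back to awk (an assumption for portability of this sketch).
AWK=$(command -v gawk || command -v awk)

# Two input lines produce two greetings; end of input ends the program.
greetings=$(printf 'first\nsecond\n' | "$AWK" '{print "Hello World!"}')
echo "$greetings"
```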
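3. Using data field variables
  • $0 represents the entire line of text
  • $1 represents the first data field in the line
  • $n represents the nth data field in the line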
$ cat data2.txt
One line of test text.
Two lines of test text.
Three lines of test text.
$
$ gawk '{print $1}' data2.txt
One
Two
Three
$ gawk -F: '{print $1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
mail
operator
games
[...]
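Field variables can also be combined and reordered. The sketch below works on a single made-up passwd-style record instead of the real /etc/passwd, and uses the assumed AWK helper to fall back to plain awk.

```shell
# Prefer gawk, fall back to awk (an assumption for portability of this sketch).
AWK=$(command -v gawk || command -v awk)
record='root:x:0:0:root:/root:/bin/bash'

# $0 is the whole record, unchanged.
whole=$(echo "$record" | "$AWK" -F: '{print $0}')
echo "$whole"

# Fields may be printed in any order; the comma inserts a space between them.
picked=$(echo "$record" | "$AWK" -F: '{print $7, $1}')
echo "$picked"
```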
4. Using multiple commands in the program script
$ echo "My name is Rich" | gawk '{$4="Christine";print $0}'
My name is Christine
$ gawk '{
> $4="Christine"
> print $0}'
5. Reading the program from a file
$ cat script2.gawk
{print $1 "'s home directory is " $6}
$ gawk -F: -f script2.gawk /etc/passwd
6. Running scripts before processing data
$ cat data3.txt
Line 1
Line 2
Line 3
$
$ gawk 'BEGIN {print "The data3 File Contents:"}
> {print $0}' data3.txt
The data3 File Contents:
Line 1
Line 2
Line 3
$
7. Running scripts after processing data
$ gawk 'BEGIN {print "The data3 File Contents:"}
> {print $0}
> END {print "End of File"}' data3.txt
$ cat script4.gawk
BEGIN {
print "The latest list of users and shells"
print " UserId \t Shell"
print "---------- \t ----------"
FS=":"
}
{
    print $1 "      \t     "$7
}
END {
    print "This concludes the listing"
}
$
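A script of this shape can be tried safely against sample data instead of the real /etc/passwd. The following is a condensed sketch in the same spirit, not the script above: the users alice and bob, the counter n, and the summary line are all invented, and AWK is an assumed helper that falls back to plain awk.

```shell
# Prefer gawk, fall back to awk (an assumption for portability of this sketch).
AWK=$(command -v gawk || command -v awk)

# Hypothetical passwd-style data.
tmpdir=$(mktemp -d)
cat > "$tmpdir/passwd.sample" <<'EOF'
alice:x:1001:1001::/home/alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/csh
EOF

listing=$("$AWK" '
BEGIN { FS = ":"; print "UserId\tShell" }   # runs once, before any input is read
      { print $1 "\t" $7; n++ }             # runs for every input line
END   { print n " users listed" }           # runs once, after the last line
' "$tmpdir/passwd.sample")
echo "$listing"

rm -rf "$tmpdir"
```

Setting FS inside BEGIN has the same effect as passing -F: on the command line, which keeps the separator together with the program that depends on it.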