shell: A First Look at sed and gawk

Latest recommended article published on 2024-07-21 17:33:34

NickDeCodes

Category: shell. Tags: linux, bash, server

Article link: https://blog.csdn.net/NickDeCodes/article/details/133052430

A First Look at sed and gawk

Text processing
- The sed editor
- The gawk editor
Basic sed editor commands
Hands-on example
Summary

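Text processing

The sed editor

The sed editor is called a stream editor, and it works very differently from an ordinary interactive text editor. In an interactive text editor such as Vim, you use keyboard commands to insert, delete, or replace text interactively.

The sed editor processes the data in a stream according to commands that are either entered on the command line or stored in a command text file. For each line of input, the sed editor does the following:

- Reads one line of data from the input.
- Matches that data against the supplied editor commands.
- Changes the data in the stream as specified by the commands.
- Writes the new data to STDOUT.

The sed command has the following format: sed options script file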

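Option	Description
-e commands	Adds additional sed commands to run while processing the input
-f file	Adds the commands specified in file to the commands run while processing the input
-n	Suppresses the automatic output for each line; use the p (print) command to produce output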
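
The options above can be tried together in a small sketch. The sample file contents and the temporary directory are assumptions made for illustration; only the -e and -n options and the p flag come from the table.

```shell
# Build a small sample file in a throwaway directory (contents are illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/data1.txt" <<'EOF'
The quick brown fox jumps over the lazy dog.
The quick brown fox sleeps.
EOF

# -e: supply two substitution commands; both run on every line.
both=$(sed -e 's/brown/red/' -e 's/dog/cat/' "$tmpdir/data1.txt")
printf '%s\n' "$both"

# -n: suppress automatic printing; the p flag prints only the lines
# where the substitution actually happened.
only_changed=$(sed -n 's/dog/cat/p' "$tmpdir/data1.txt")
printf '%s\n' "$only_changed"

rm -rf "$tmpdir"
```

Without -n, the second command would print both lines, and the changed line would appear twice because the p flag prints it in addition to the automatic output.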

Defining an editor command on the command line

$ echo "This is a test" | sed 's/test/big test/' 
This is a big test

Using multiple editor commands on the command line

$ sed -e 's/brown/red/; s/dog/cat/' data1.txt 
The quick red fox jumps over the lazy cat. 
The quick red fox jumps over the lazy cat. 
The quick red fox jumps over the lazy cat. 
The quick red fox jumps over the lazy cat.

Reading editor commands from a file

$ cat script1.sed 
s/brown/green/ 
s/fox/toad/ 
s/dog/cat/
$
$ sed -f script1.sed data1.txt
The quick green toad jumps over the lazy cat. 
The quick green toad jumps over the lazy cat. 
The quick green toad jumps over the lazy cat. 
The quick green toad jumps over the lazy cat. 
$

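The gawk editor

gawk is the GNU version of the original Unix awk program. It takes stream editing a level beyond sed by providing a programming language rather than just editor commands. Within the gawk programming language, you can:

- Define variables to store data.
- Use arithmetic and string operators to operate on data.
- Use structured programming concepts, such as if-then statements and loops, to add logic to your data processing.
- Extract data from a file, rearrange it, and generate formatted reports.

The basic format of the gawk command is: gawk options program file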

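Option	Description
-F fs	Specifies the field separator that delimits data fields in a line
-f file	Reads the gawk script code from the specified file
-v var=value	Defines a variable and its initial value for use in the gawk script
-L [keyword]	Sets the lint warning level, reporting dubious or non-portable constructs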
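
A minimal sketch of the -F and -v options follows. The sample data, the variable name shell, and the fallback to plain awk when gawk is not installed are all assumptions for illustration; the program uses only BEGIN, a pattern, and END, which standard awk also supports.

```shell
# Prefer gawk; fall back to awk if gawk is not installed (a portability assumption).
AWK=$(command -v gawk || command -v awk)

# Hypothetical colon-separated sample data: name, uid, login shell.
data='rich:1002:/bin/bash
christine:1001:/bin/bash
sshd:126:/usr/sbin/nologin'

# -F sets the field separator; -v passes a shell value into the program.
report=$(printf '%s\n' "$data" | "$AWK" -F: -v shell=/bin/bash '
    BEGIN { print "Users with shell " shell ":" }
    $3 == shell { count++; print "  " $1 " (uid " $2 ")" }
    END { print count " match(es)" }
')
printf '%s\n' "$report"
```

The BEGIN block runs before any input is read and the END block after the last line, which is what makes the header and the final count possible in a single pass.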

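Basic sed editor commands

Using addresses

By default, a sed command applies to every line of input. To restrict a command to specific lines, use line addressing. The sed editor supports two forms:

- A numeric range of lines.
- A text pattern that matches text within a line.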

$ cat data1.txt
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy dog. 
$
$ sed '2s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy cat. 
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy dog. 
$
$ sed '2,3s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy cat. 
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
$
$ sed '2,$s/dog/cat/' data1.txt 
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
$

Using text pattern filters

$ grep /bin/bash /etc/passwd 
root:x:0:0:root:/root:/bin/bash 
christine:x:1001:1001::/home/christine:/bin/bash 
rich:x:1002:1002::/home/rich:/bin/bash
$
$ sed '/rich/s/bash/csh/' /etc/passwd 
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
[...] 
christine:x:1001:1001::/home/christine:/bin/bash 
sshd:x:126:65534::/run/sshd:/usr/sbin/nologin 
rich:x:1002:1002::/home/rich:/bin/csh
$
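
The same pattern-address idea can be tried without touching the real /etc/passwd. The sketch below uses a hypothetical three-line sample; the anchored variant (^rich:) is an added suggestion, not something the original example does.

```shell
# Hypothetical passwd-style sample so the example does not depend on /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
christine:x:1001:1001::/home/christine:/bin/bash
rich:x:1002:1002::/home/rich:/bin/bash'

# Only lines matching /rich/ are edited; all other lines pass through unchanged.
changed=$(printf '%s\n' "$sample" | sed '/rich/s/bash/csh/')
printf '%s\n' "$changed"

# Anchoring the pattern to the user-name field avoids matching "rich"
# elsewhere on a line; -n with the p flag prints just the edited line.
anchored=$(printf '%s\n' "$sample" | sed -n '/^rich:/s/bash/csh/p')
printf '%s\n' "$anchored"
```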

Grouping commands

$ sed '2{
> s/fox/toad/
> s/dog/cat/
> }' data1.txt
The quick brown fox jumps over the lazy dog. 
The quick brown toad jumps over the lazy cat. 
The quick brown fox jumps over the lazy dog. 
The quick brown fox jumps over the lazy dog. 
$

Deleting lines

Text substitution is not the only sed editor command. To delete specific lines from the text stream, use the delete (d) command.

$ cat data6.txt
This is line number 1. 
This is line number 2.
This is the 3rd line. 
This is the 4th line. 
$
$ sed '3d' data6.txt 
This is line number 1.
This is line number 2. 
This is the 4th line. 
$

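Remember that the sed editor does not modify the original file. The deleted lines disappear only from the sed editor's output; the original file still contains them.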
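
One way to check this, sketched under the assumption of a throwaway copy of data6.txt: delete lines with d, save the output to a second file (the name trimmed.txt is made up here), and compare line counts.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/data6.txt" <<'EOF'
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
EOF

# Delete lines 2 through 3 from the output and save the result to a new file.
sed '2,3d' "$tmpdir/data6.txt" > "$tmpdir/trimmed.txt"

# A text pattern works as an address for d as well.
no_number=$(sed '/number/d' "$tmpdir/data6.txt")
printf '%s\n' "$no_number"

# grep -c '' counts lines: the source file is untouched, the copy is shorter.
original_lines=$(grep -c '' "$tmpdir/data6.txt")
trimmed_lines=$(grep -c '' "$tmpdir/trimmed.txt")
echo "original: $original_lines, trimmed: $trimmed_lines"

rm -rf "$tmpdir"
```

Redirecting to a separate file, rather than back onto the input, keeps the original intact until the result has been checked.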

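Inserting and appending text

As you would expect, the sed editor can insert and append text lines to the data stream, just like other editors. The difference between the two operations can be confusing:

- The insert (i) command adds a new line before the specified line.
- The append (a) command adds a new line after the specified line.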

$ cat data6.txt
This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.
$
$ sed '3i\
> This is an inserted line.
> ' data6.txt
This is line number 1. 
This is line number 2. 
This is an inserted line. 
This is the 3rd line. 
This is the 4th line.
$
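
For comparison, here is a sketch of the append (a) command on the same four lines. The data is held in a shell variable instead of data6.txt purely to keep the example self-contained.

```shell
data='This is line number 1.
This is line number 2.
This is the 3rd line.
This is the 4th line.'

# a appends after the addressed line (here, line 3).
appended=$(printf '%s\n' "$data" | sed '3a\
This is an appended line.')
printf '%s\n' "$appended"

# $ addresses the last line, so this adds a new final line.
at_end=$(printf '%s\n' "$data" | sed '$a\
This is a new last line.')
printf '%s\n' "$at_end"
```

With 3i the new text would become line 3; with 3a it becomes line 4, after "This is the 3rd line."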

Changing lines

$ sed '2c\
> This is a changed line of text.
> ' data6.txt
This is line number 1.
This is a changed line of text.
This is the 3rd line.
This is the 4th line.
$

Transforming characters

The transform (y) command is the only sed editor command that operates on single characters. Its format is:

[address]y/inchars/outchars/

$ cat data9.txt
This is line 1.
This is line 2.
This is line 3.
This is line 4.
This is line 5.
This is line 1 again.
This is line 3 again.
This is the last file line.
$
$ sed 'y/123/789/' data9.txt
This is line 7.
This is line 8.
This is line 9.
This is line 4.
This is line 5.
This is line 7 again.
This is line 9 again.
This is the last file line.
$
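
The mapping is positional: the first character of inchars becomes the first character of outchars, and so on, so the two lists must be the same length. A short sketch with made-up input strings:

```shell
# Each character in inchars maps to the character at the same position in outchars.
mapped=$(echo "Test #1 of try #1." | sed 'y/123/456/')
echo "$mapped"

# Unlike s, y replaces every occurrence on the line; no g flag is needed.
upper=$(echo "hello sed" | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')
echo "$upper"
```

Both occurrences of 1 in the first input are replaced, which shows that y always applies to the whole line.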

Hands-on example

#!/bin/bash
# Change the shebang used for a directory of scripts

################## Function Declarations ##########################

function errorOrExit {
	echo
	echo $message1
	echo $message2
	echo "Exiting script..."
	exit
}

function modifyScripts {
	echo
	read -p "Directory name in which to store new scripts? " newScriptDir
	
	echo "Modifying the scripts started at $(date +%N) nanoseconds"
	
	count=0
	for filename in $(grep -l "/bin/sh" "$scriptDir"/*.sh)
	do
		newFilename=$(basename "$filename")
		# Replace line 1 with the new shebang and save a copy
		sed '1c\#!/bin/bash' "$filename" > "$newScriptDir/$newFilename"
		count=$((count + 1))
	done
	echo "$count modifications completed at $(date +%N) nanoseconds"
}

################# Check for Script Directory ######################
if [ -z "$1" ]
then 
	message1="The name of the directory containing scripts to check"
	message2="is missing. Please provide the name as a parameter."
	errorOrExit
else
	scriptDir=$1
fi 

################ Create Shebang Report ############################

# Print the name of each script whose first line mentions /bin/sh
sed -sn '1{
/\/bin\/sh/F
}' "$scriptDir"/*.sh | 
gawk 'BEGIN {print ""
print "The following scripts have /bin/sh as their shebang:"
print "==================================================="}
{print $0}
END {print ""
print "End of Report"}'

################## Change Scripts? #################################

echo
read -p "Do you wish to modify these scripts' shebang? (Y/n)? " answer

case $answer in
Y | y)
	modifyScripts
	;;
N | n)
	message1="No scripts will be modified."
	message2="Run this script later to modify, if desired."
	errorOrExit
	;;
*)
	message1="Did not answer Y or n."
	message2="No scripts will be modified."
	errorOrExit
	;;
esac
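
To see the two sed stages in isolation, the following sketch runs them against a throwaway directory. The directory layout and the script names a.sh and b.sh are hypothetical, and the F command and -s option assume GNU sed. The report stage here prints a file name only when line 1 of that file matches /bin/sh.

```shell
# Throwaway directories with two hypothetical scripts; names are illustrative only.
workdir=$(mktemp -d)
mkdir "$workdir/old" "$workdir/new"
printf '#!/bin/sh\necho one\n' > "$workdir/old/a.sh"
printf '#!/bin/bash\necho two\n' > "$workdir/old/b.sh"

# Report stage: -s treats each file separately, so address 1 means the first
# line of every file; F (GNU sed) prints the current file name.
listed=$(sed -sn '1{
/\/bin\/sh/F
}' "$workdir"/old/*.sh)
echo "$listed"

# Modify stage: replace line 1 of each listed file and write the copy elsewhere.
for f in $listed; do
    sed '1c\
#!/bin/bash' "$f" > "$workdir/new/$(basename "$f")"
done

first_line=$(head -n 1 "$workdir/new/a.sh")
copied=$(ls "$workdir/new")
echo "$first_line"

rm -rf "$workdir"
```

Only a.sh is listed and copied, because b.sh already starts with #!/bin/bash. The unquoted $listed in the loop relies on the temporary paths containing no spaces.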