@How to delete duplicate lines?@

紫颖

Category: 点点滴滴; tags: file, shell

Article link: https://blog.csdn.net/zhuying_linux/article/details/7106668

Suppose we have a file named file and want to delete the duplicate lines in it. What methods are available?

The contents of file are as follows:

my friends, xiaoying
my teacher, xiaoniu
my teacher, xiaoniu
my fuqin, father
my sister, wushiying
my sister, wushiying
my friends, xiaoying
my teacher, xiaoniu
my fuqin, father
my sister, wushiying
my friends, xiaoying
my fuqin, father

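Method 1: awk '{if ($0!=line) print;line=$0}' file

On its own, this command only drops a line that is identical to the line immediately before it, so the input has to be sorted first:

cat file |sort |awk '{if ($0!=line) print;line=$0}'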

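How it works:

awk reads its input one line at a time. line is an awk variable; as in the shell, it does not need to be declared, and it is empty until a value is assigned to it. On the first line of the sorted input, line is therefore empty and not equal to $0 (which is "my friends, xiaoying"), so the line is printed, and line is then set to $0. awk then reads the next line; now $0 and line are the same (both "my friends, xiaoying"), so nothing is printed. When "my fuqin, father" is read, $0 differs from line (still "my friends, xiaoying"), so it is printed. The remaining lines are handled the same way.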
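To make the effect of sorting concrete, here is a minimal sketch that runs the same awk program on unsorted and on sorted input. It assumes a shortened copy of the sample data written to a temporary directory; the variable names tmpdir, unsorted and sorted are made up for this illustration.

```shell
# Recreate a shortened version of the sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'my friends, xiaoying' \
  'my teacher, xiaoniu' \
  'my teacher, xiaoniu' \
  'my fuqin, father' \
  'my friends, xiaoying' > "$tmpdir/file"

# Unsorted input: only a duplicate directly below its twin is dropped,
# so the second "my friends, xiaoying" survives.
unsorted=$(awk '{ if ($0 != line) print; line = $0 }' "$tmpdir/file")

# Sorted input: equal lines are adjacent, so every duplicate is dropped.
sorted=$(sort "$tmpdir/file" | awk '{ if ($0 != line) print; line = $0 }')

printf 'unsorted input:\n%s\n\nsorted input:\n%s\n' "$unsorted" "$sorted"
rm -rf "$tmpdir"
```

One caveat with this approach: because line starts out empty, an empty first input line would compare equal to it and would not be printed.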

Method 2 (the simplest):

[root@sor-sys zy]# cat file| sort | uniq
my friends, xiaoying
my fuqin, father
my sister, wushiying
my teacher, xiaoniu
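As a small sketch of the same idea, the pipeline can be written without cat, and for plain text like this sample sort -u should give the same result in a single command. The sample data and the variable names below are assumptions made for illustration only.

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'my teacher, xiaoniu' \
  'my friends, xiaoying' \
  'my teacher, xiaoniu' \
  'my fuqin, father' \
  'my friends, xiaoying' > "$tmpdir/file"

# sort makes equal lines adjacent; uniq then keeps one line per run.
via_uniq=$(sort "$tmpdir/file" | uniq)

# sort -u sorts and drops duplicates in one step.
via_sort_u=$(sort -u "$tmpdir/file")

printf '%s\n' "$via_uniq"
rm -rf "$tmpdir"
```

Like the awk method, uniq only compares neighbouring lines, which is why the sort step cannot be left out.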

Method 3:

The contents of the file rmdup.sed are as follows:

#n rmdup.sed - ReMove DUPlicate consecutive lines

# read next line into pattern space (if not the last line)
$!N

# check if pattern space consists of two identical lines
s/^\(.*\)\n\1$/&/
# if yes, goto label RmLn, which will remove the first line in pattern space
t RmLn
# if not, print the first line (and remove it)
P

# garbage handling which simply deletes the first line in the pattern space
: RmLn
D

[root@sor-sys zy]# cat file|sort |sed -f rmdup.sed
my friends, xiaoying
my fuqin, father
my sister, wushiying
my teacher, xiaoniu
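The sed approach can be tried end to end with the sketch below. It assumes the substitution captures the first line in a group, \(.*\), and matches it again after the newline with \1; the script is written to a temporary rmdup.sed through a here-document, and the sample data and variable names are hypothetical.

```shell
tmpdir=$(mktemp -d)

# Write the sed script; the quoted 'EOF' keeps $ and \ from being expanded.
cat > "$tmpdir/rmdup.sed" <<'EOF'
# append the next input line to the pattern space (unless on the last line)
$!N
# succeed only if the pattern space is two identical lines
s/^\(.*\)\n\1$/&/
# on success, jump to RmLn and skip printing
t RmLn
# otherwise print the first line of the pattern space
P
:RmLn
# delete the first line and restart the cycle with what remains
D
EOF

# Hypothetical sample data with repeated lines in unsorted order.
printf '%s\n' \
  'my sister, wushiying' \
  'my friends, xiaoying' \
  'my sister, wushiying' \
  'my friends, xiaoying' \
  'my sister, wushiying' > "$tmpdir/file"

# Sort first so that duplicates are consecutive, then run the script.
deduped=$(sort "$tmpdir/file" | sed -f "$tmpdir/rmdup.sed")

printf '%s\n' "$deduped"
rm -rf "$tmpdir"
```

The s command replaces the match with itself (&), so it changes nothing; it is there only to set the flag that t tests.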
