Linux Commands: uniq
NAME
uniq - report or omit repeated lines
SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
DESCRIPTION
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).
With no options, matching lines are merged to the first occurrence.
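The word "adjacent" in the description matters: uniq compares each line only with the line before it. The sketch below uses made-up input to show that a repeated line separated from its twin is kept unless the input is sorted first.

```shell
# uniq only compares neighbouring lines, so the trailing "a" survives.
unsorted=$(printf 'a\na\nb\na\n' | uniq)

# Sorting first makes all equal lines adjacent, so each value appears once.
sorted=$(printf 'a\na\nb\na\n' | sort | uniq)

printf '%s\n' "$unsorted"
echo '---'
printf '%s\n' "$sorted"
```

This is why uniq usually appears after sort in a pipeline.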
-c, --count
prefix lines by the number of occurrences
-d, --repeated
only print duplicate lines, one for each group
-D, --all-repeated[=METHOD]
print all duplicate lines; groups can be delimited with an empty line. METHOD={none(default),prepend,separate}
-f, --skip-fields=N
avoid comparing the first N fields
--group[=METHOD]
show all items, separating groups with an empty line. METHOD={separate(default),prepend,append,both}
-i, --ignore-case
ignore differences in case when comparing
-s, --skip-chars=N
avoid comparing the first N characters
-u, --unique
only print unique lines
-z, --zero-terminated
end lines with a 0 (NUL) byte, not a newline
-w, --check-chars=N
compare no more than N characters in lines
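To make the options above concrete, here is a small sketch that runs several of them on made-up input. It assumes GNU coreutils uniq (other implementations may lack some options, for example -w); the variable names and sample lines are illustrative only.

```shell
# Sample input; equal lines are already adjacent, so no sort is needed here.
fruits=$(printf '%s\n' apple apple Apple banana cherry cherry)

dups=$(printf '%s\n' "$fruits" | uniq -d)      # one line per repeated group
singles=$(printf '%s\n' "$fruits" | uniq -u)   # lines that are not repeated
nocase=$(printf '%s\n' "$fruits" | uniq -i)    # "Apple" joins the "apple" group

# -s 2 skips the leading "N," so only the letter is compared.
skipchars=$(printf '%s\n' 1,a 2,a 3,b | uniq -s 2)
# -f 1 skips the first blank-separated field.
skipfields=$(printf '%s\n' '1 x' '2 x' '3 y' | uniq -f 1)
# -w 3 compares only the first three characters.
prefix=$(printf '%s\n' foo1 foo2 bar1 | uniq -w 3)

printf '%s\n' "$dups" "$singles" "$nocase" "$skipchars" "$skipfields" "$prefix"
```

In every case uniq prints the first line of each group, which is why the -s, -f, and -w examples keep the first of the lines they treat as equal.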
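Practical example
Given a file txt1.txt with the following contents, print how many times each value of the second field occurs, using ',' as the field separator.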
1,a
2,b
3,c
4,a
5,d
6,a
7,f
8,g
9,f
10,C
Sample solutions:
Method 1:
awk -F ',' '{print $2}' txt1.txt | sort | uniq -c
Method 2:
awk -F ',' '{++result[$2]} END {for (item in result) print item, result[item]}' txt1.txt
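The two methods produce the same counts in different shapes. uniq -c prints the count first, with padding, while awk visits the keys of an array in an unspecified order. The sketch below normalizes both so they can be compared directly. The temporary file, the variable names, and the use of LC_ALL=C for a predictable sort order are assumptions made for illustration.

```shell
# Recreate the sample data in a temporary file (the path is arbitrary).
file=$(mktemp)
printf '%s\n' 1,a 2,b 3,c 4,a 5,d 6,a 7,f 8,g 9,f 10,C > "$file"

# Method 1, with uniq's "count value" output rewritten as "value count".
m1=$(awk -F ',' '{print $2}' "$file" | LC_ALL=C sort | uniq -c | awk '{print $2, $1}')

# Method 2, with the unordered awk output sorted afterwards.
m2=$(awk -F ',' '{++result[$2]} END {for (item in result) print item, result[item]}' "$file" | LC_ALL=C sort)

printf '%s\n' "$m1"
rm -f "$file"
```

Both variables should hold the same seven lines. "C" and "c" are counted separately because both methods are case-sensitive.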
Further thought:
How would you do this case-insensitively?
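One possible answer, offered as a sketch rather than the only approach, is to lowercase the second field before counting, so that "c" and "C" land in the same group. The temporary file and the variable name folded are illustrative.

```shell
# Recreate the sample data in a temporary file (the path is arbitrary).
file=$(mktemp)
printf '%s\n' 1,a 2,b 3,c 4,a 5,d 6,a 7,f 8,g 9,f 10,C > "$file"

# Lowercase the second field, then count; the final awk prints "value count".
folded=$(awk -F ',' '{print tolower($2)}' "$file" | sort | uniq -c | awk '{print $2, $1}')

printf '%s\n' "$folded"
rm -f "$file"
```

An alternative is sort -f followed by uniq -ci, which keeps the original spelling. In that case, which spelling represents each group depends on the order sort leaves the lines in.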