Linux_Linux_uniq command

高达一号

Category column: Shell_工具脚本

Article link: https://blog.csdn.net/u010003835/article/details/106807558

uniq is a command that comes up constantly in day-to-day work. This article looks at what it does.

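Note: uniq is mainly used for deduplication and for counting occurrences. One caveat: uniq only compares adjacent lines, so the input generally needs to be sorted first (for example with sort) before repeated lines are detected.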

English description

[root@cdh-manager linux_cmd_test]# uniq --help
Usage: uniq [OPTION]... [INPUT [OUTPUT]]
Filter adjacent matching lines from INPUT (or standard input),
writing to OUTPUT (or standard output).

With no options, matching lines are merged to the first occurrence.

Mandatory arguments to long options are mandatory for short options too.
  -c, --count           prefix lines by the number of occurrences
  -d, --repeated        only print duplicate lines, one for each group
  -D, --all-repeated[=METHOD]  print all duplicate lines
                          groups can be delimited with an empty line
                          METHOD={none(default),prepend,separate}
  -f, --skip-fields=N   avoid comparing the first N fields
      --group[=METHOD]  show all items, separating groups with an empty line
                          METHOD={separate(default),prepend,append,both}
  -i, --ignore-case     ignore differences in case when comparing
  -s, --skip-chars=N    avoid comparing the first N characters
  -u, --unique          only print unique lines
  -z, --zero-terminated  end lines with 0 byte, not newline
  -w, --check-chars=N   compare no more than N characters in lines
      --help     display this help and exit
      --version  output version information and exit

A field is a run of blanks (usually spaces and/or TABs), then non-blank
characters.  Fields are skipped before chars.

Note: 'uniq' does not detect repeated lines unless they are adjacent.
You may want to sort the input first, or use 'sort -u' without 'uniq'.
Also, comparisons honor the rules specified by 'LC_COLLATE'.

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'uniq invocation'
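
A few of the comparison options listed above are easier to grasp with a tiny input. The sketch below uses made-up sample lines (they are not from this article) and assumes GNU uniq, as in the help output above; -i and -w in particular are not guaranteed on every non-GNU implementation.

```shell
# Hypothetical sample data, chosen only to illustrate the options.

# -i: ignore case when comparing adjacent lines; the first line of each group is kept.
ignore_case=$(printf 'Apple\napple\nAPPLE\nbanana\n' | uniq -i)
printf '%s\n' "$ignore_case"

# -f 1: skip the first blank-separated field (here a time-like column) before comparing.
skip_field=$(printf '10:01 login\n10:02 login\n10:03 logout\n' | uniq -f 1)
printf '%s\n' "$skip_field"

# -w 3: compare no more than the first 3 characters of each line.
check_chars=$(printf 'abc1\nabc2\nabd3\n' | uniq -w 3)
printf '%s\n' "$check_chars"
```

In each case uniq prints the first line of a group of lines it considers equal, so the skipped or ignored part of the later lines is lost from the output.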

Overview

The Linux uniq command detects and removes repeated lines in a text file. Because it only compares adjacent lines, it is usually combined with the sort command.

Syntax

uniq [-cdu][-f<fields>][-s<chars>][-w<chars>][--help][--version][input file][output file]
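
The two positional arguments are easy to misread: the second one is an output file, not a second input. A minimal sketch, using a temporary directory and the hypothetical file names in.txt and out.txt:

```shell
# Work in a throwaway directory so no existing file is touched.
workdir=$(mktemp -d)

# Input with adjacent duplicate lines (same content as test4.txt in the examples).
printf 's\nws\nws\nsc\nsc\nwd\nwd\nwd\n' > "$workdir/in.txt"

# First argument is read, second argument is written (and overwritten if it exists).
uniq "$workdir/in.txt" "$workdir/out.txt"

result=$(cat "$workdir/out.txt")
printf '%s\n' "$result"

rm -r "$workdir"
```

Since the output file is overwritten, running uniq a.txt b.txt in the hope of processing both files would replace the contents of b.txt instead.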

Examples

Create two test files, test3.txt and test4.txt.

test3.txt

[root@cdh-manager linux_cmd_test]# cat test3.txt
ds
a
sd
a
sd
adc
a
adc
zsw
a
edx
ex

test4.txt

[root@cdh-manager linux_cmd_test]# cat test4.txt
s
ws
ws
sc
sc
wd
wd
wd

Example 1: basic deduplication

First, try to deduplicate the unsorted file test3.txt (with -c, so the counts are visible).

[root@cdh-manager linux_cmd_test]# uniq -c test3.txt
      1 ds
      1 a
      1 sd
      1 a
      1 sd
      1 adc
      1 a
      1 adc
      1 zsw
      1 a
      1 edx
      1 ex

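Nothing was deduplicated: every count is 1, because none of the repeated lines are adjacent.

Now deduplicate test4.txt, in which the repeated lines are already adjacent.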

[root@cdh-manager linux_cmd_test]# uniq test4.txt
s
ws
sc
wd

This time the duplicates were removed.

So how should test3.txt be deduplicated? Sort it with sort first, then pass the result to uniq.

[root@cdh-manager linux_cmd_test]# sort test3.txt | uniq 
a
adc
ds
edx
ex
sd
zsw
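
As the help text notes, sort -u can be used without uniq when only the deduplicated lines are needed. The sketch below recreates test3.txt in a temporary directory and compares both forms; LC_ALL=C is an assumption added here to make the sort order predictable, it is not part of the original commands.

```shell
workdir=$(mktemp -d)

# Same content as test3.txt above.
printf 'ds\na\nsd\na\nsd\nadc\na\nadc\nzsw\na\nedx\nex\n' > "$workdir/test3.txt"

# Two ways to get the distinct lines in sorted order.
via_pipe=$(LC_ALL=C sort "$workdir/test3.txt" | uniq)
via_flag=$(LC_ALL=C sort -u "$workdir/test3.txt")

if [ "$via_pipe" = "$via_flag" ]; then
    echo "sort | uniq and sort -u agree"
fi
printf '%s\n' "$via_flag"

rm -r "$workdir"
```

The pipe form is still needed whenever a uniq option such as -c or -d is wanted, since sort -u only drops the duplicates.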

Example 2: counting after deduplication

Sometimes you need to know how many times each record occurs. The -c option does that.

[root@cdh-manager linux_cmd_test]# sort test3.txt | uniq -c | sort -r
      4 a
      2 sd
      2 adc
      1 zsw
      1 ex
      1 edx
      1 ds
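
The final sort -r above compares whole lines as text, so the ranking relies on how uniq -c pads the counts and on the locale. A possibly more robust variant is sort -rn, which compares the leading count as a number. The sketch below uses made-up data and a hypothetical helper named repeat_line to produce counts with different digit lengths.

```shell
# repeat_line WORD N: print WORD on its own line N times (hypothetical helper).
repeat_line() {
    i=0
    while [ "$i" -lt "$2" ]; do
        printf '%s\n' "$1"
        i=$((i + 1))
    done
}

# "x" occurs 2 times and "y" occurs 10 times; the input is already grouped.
# awk only normalizes the padding that uniq -c puts in front of each count.
ranked=$( { repeat_line x 2; repeat_line y 10; } | uniq -c | sort -rn | awk '{ print $1, $2 }' )
printf '%s\n' "$ranked"
```

With -n the line counted 10 times ranks above the line counted 2 times regardless of how the counts are padded.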

Often only the repeated lines are of interest. In that case, combine -c with the -d option.

[root@cdh-manager linux_cmd_test]# sort test3.txt | uniq -d -c
      4 a
      2 adc
      2 sd
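
The opposite question, which lines occur exactly once, is answered by the -u option from the help text. A short sketch on the same test3.txt content, recreated in a temporary directory; LC_ALL=C is an assumption added to keep the sort order predictable.

```shell
workdir=$(mktemp -d)

# Same content as test3.txt above.
printf 'ds\na\nsd\na\nsd\nadc\na\nadc\nzsw\na\nedx\nex\n' > "$workdir/test3.txt"

# -d: one copy of each line that occurs more than once.
repeated=$(LC_ALL=C sort "$workdir/test3.txt" | uniq -d)

# -u: only the lines that occur exactly once.
once=$(LC_ALL=C sort "$workdir/test3.txt" | uniq -u)

printf 'repeated:\n%s\n' "$repeated"
printf 'once:\n%s\n' "$once"

rm -r "$workdir"
```

Together the two lists cover every distinct line of the file, so -d and -u split the plain uniq output into two parts.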
