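The sort command: sorts the contents of text files. sort works on text files and orders them line by line.
Command format:
sort [-bcdfimMnru][-o<output file>][-t<separator>][-k<key>][+<start field>-<end field>][--help][--version][file]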
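Common options:
-b Ignore leading blank characters on each line.
-c Check whether the file is already sorted, without sorting it.
-d Dictionary order: consider only letters, digits, and blanks, and ignore all other characters.
-f Treat lowercase letters as uppercase when sorting.
-i Consider only printable characters (ASCII 040 to 176, octal) and ignore all others.
-m Merge several files that are already sorted.
-M Sort by month, comparing the first three letters as month abbreviations.
-n Sort by numeric value.
-o<output file> Write the sorted result to the given file.
-u Remove duplicates. With -c, check for strict ordering (no two equal lines); without -c, output only one line from each group of equal lines. Running sort and then uniq gives a similar result.
-r Reverse the order (descending).
-t<separator> Field separator used when sorting. For example, -t. splits fields on dots, much like awk -F or cut -d.
-k Specify the sort key: which field, or which character within a field. Usually combined with -t.
+<start field>-<end field> Obsolete syntax: sort on the given fields, from the start field up to (but not including) the end field. Fields are counted from 0 (+0 is the first column) and are separated by spaces or tabs by default. Prefer -k.
--help Show help.
--version Show version information.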
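A few of the options above can be tried without any input file. The following is a minimal sketch, assuming GNU coreutils sort and made-up sample data (none of it comes from the files used later); LC_ALL=C is set so that month names and ordering do not depend on the local locale.

```shell
# Hypothetical sample data, fed through a pipe instead of a file.

# -M: order lines by month abbreviation.
months=$(printf 'Mar\nJan\nFeb\n' | LC_ALL=C sort -M)
echo "$months"

# -c: sort nothing, only report through the exit status
# (0 when the input is already sorted, non-zero otherwise).
if printf '1\n2\n3\n' | LC_ALL=C sort -c 2>/dev/null; then
    first_check=sorted
else
    first_check=unsorted
fi
if printf '1\n3\n2\n' | LC_ALL=C sort -c 2>/dev/null; then
    second_check=sorted
else
    second_check=unsorted
fi
echo "$first_check $second_check"
```

Because -c communicates through the exit status, it fits naturally into an if statement in a script.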
Examples:
1. Removing duplicates
[fuyun@bigdata-training datas]$ cat sort1.txt
a 5
c 6
a 1
b 1
c 3
192.168.43.117
192.168.43.119
192.168.43.118
192.168.43.118
192.168.43.117
192.168.43.117
192.168.43.119
192.168.43.110
[fuyun@bigdata-training datas]$ sort -u sort1.txt
192.168.43.110
192.168.43.117
192.168.43.118
192.168.43.119
a 1
a 5
b 1
c 3
c 6
[fuyun@bigdata-training datas]$ sort --uniq sort1.txt
192.168.43.110
192.168.43.117
192.168.43.118
192.168.43.119
a 1
a 5
b 1
c 3
c 6
[fuyun@bigdata-training datas]$ uniq sort1.txt #Note: uniq alone only removes adjacent duplicate lines; sort the input first to get the same result as sort -u
a 5
c 6
a 1
b 1
c 3
192.168.43.117
192.168.43.119
192.168.43.118
192.168.43.117
192.168.43.119
192.168.43.110
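The uniq output above still contains repeated lines because uniq only collapses duplicates that are next to each other. A small sketch with hypothetical input (not one of the article's files) makes the difference visible, and also shows uniq -c, which sort -u cannot replace:

```shell
# Hypothetical input in which the duplicates are never adjacent.
input=$(printf 'b\na\nb\na\n')

# uniq alone: nothing is adjacent, so nothing is removed.
only_uniq=$(printf '%s\n' "$input" | uniq)

# Sorting first makes equal lines adjacent, so uniq can collapse them.
sort_then_uniq=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq)

# sort -u does both steps in one command.
sort_u=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

# uniq -c additionally counts occurrences; awk normalizes the padding.
counts=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$only_uniq" "$sort_then_uniq" "$sort_u" "$counts"
```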
2. Removing duplicates and sorting in descending (reverse) order:
[fuyun@bigdata-training datas]$ sort -ur sort1.txt
c 6
c 3
b 1
a 5
a 1
192.168.43.119
192.168.43.118
192.168.43.117
192.168.43.110
3. Sorting numerically:
Note: ascending order is the default.
[fuyun@bigdata-training datas]$ sort -n sort1.txt
a 1
a 5
b 1
c 3
c 6
192.168.43.110
192.168.43.117
192.168.43.117
192.168.43.117
192.168.43.118
192.168.43.118
192.168.43.119
192.168.43.119
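The effect of -n is easier to see on plain numbers than on the mixed file above. A minimal sketch with made-up values, using LC_ALL=C so the character-by-character comparison is predictable:

```shell
# Hypothetical numbers of different lengths.
nums=$(printf '10\n9\n2\n')

# Default sort compares characters, so "10" comes before "2".
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort)

# -n compares numeric values instead.
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n)

printf 'lexical:\n%s\nnumeric:\n%s\n' "$lexical" "$numeric"
```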
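4. Sorting in reverse order (-r alone reverses the default character comparison; use -nr for numeric descending order)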
[fuyun@bigdata-training datas]$ sort -r sort1.txt
c 6
c 3
b 1
a 5
a 1
192.168.43.119
192.168.43.119
192.168.43.118
192.168.43.118
192.168.43.117
192.168.43.117
192.168.43.117
192.168.43.110
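To see how -r on its own differs from -nr, here is a short sketch on hypothetical numbers (LC_ALL=C keeps the character comparison predictable):

```shell
# Hypothetical numbers of different lengths.
nums=$(printf '10\n9\n2\n')

# -r reverses the character-based order, which is 10, 2, 9 when ascending.
reverse_lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort -r)

# -nr reverses the numeric order.
reverse_numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -nr)

printf '%s\n---\n%s\n' "$reverse_lexical" "$reverse_numeric"
```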
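5. Sorting with a field separator: sort by a chosen column
-k, --key=POS1[,POS2]
start a key at POS1 (origin 1), end it at POS2 (default end of line)
- By default the whole line is the sort key.
- -t sets the field separator; -k1 sorts on the key that starts at the first field produced by that separator.
- -k 1,1 The comma separates the start and end positions: the key starts at field 1 and ends at field 1.
- -k 1.1,3.3 The dot separates the field number from the character offset: the key starts at the first character of field 1 and ends at the third character of field 3.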
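The key syntax above can be sketched on a small colon-separated data set. The records and variable names below are hypothetical and only serve the illustration; GNU coreutils sort is assumed.

```shell
# Hypothetical colon-separated records: name:age:city
records=$(printf 'carol:30:nyc\nalice:25:sf\nbob:25:la\n')

# First key: field 2 only, compared numerically.
# Second key: field 1 only, used to break ties on age.
by_age_then_name=$(printf '%s\n' "$records" | LC_ALL=C sort -t: -k2,2n -k1,1)

# Key limited to the second character of field 1 (a, l, o).
by_second_char=$(printf '%s\n' "$records" | LC_ALL=C sort -t: -k1.2,1.2)

printf '%s\n---\n%s\n' "$by_age_then_name" "$by_second_char"
```

Giving both a start and an end position (for example -k2,2 rather than -k2) keeps the key from running on to the end of the line, which matters as soon as a second key is added.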
[fuyun@bigdata-training datas]$ cat sort.txt
a 5
c 6
a 1
b 1
c 3
a 192.168.43.117
f 192.168.43.119
b 192.168.43.118
z 192.168.43.118
s 192.168.43.117
k 192.168.43.117
c 192.168.43.119
o 192.168.43.110
[fuyun@bigdata-training datas]$ sort -t" " -k2 sort.txt
a 1
b 1
o 192.168.43.110
a 192.168.43.117
k 192.168.43.117
s 192.168.43.117
b 192.168.43.118
z 192.168.43.118
c 192.168.43.119
f 192.168.43.119
c 3
a 5
c 6
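When -t is omitted, fields are separated by runs of blanks (spaces or tabs), so for this file the command above can be shortened to sort -k2 sort.txt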
[fuyun@bigdata-training datas]$ sort -k2 sort.txt
a 1
b 1
o 192.168.43.110
a 192.168.43.117
k 192.168.43.117
s 192.168.43.117
b 192.168.43.118
z 192.168.43.118
c 192.168.43.119
f 192.168.43.119
c 3
a 5
c 6
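The two commands agree here because every line of sort.txt uses exactly one space between columns. With repeated spaces they can differ: -t" " treats each space as a separator (so empty fields appear), while the default splitting keeps the leading blanks as part of the field unless b is added to the key. A minimal sketch, assuming GNU coreutils sort and hypothetical input:

```shell
# Hypothetical input: the first line has two spaces between its columns.
lines=$(printf 'a  2\nb 1\n')

# With -t' ', field 2 of "a  2" is the empty string between the two spaces,
# and an empty key sorts before "1".
explicit_space=$(printf '%s\n' "$lines" | LC_ALL=C sort -t' ' -k2,2)

# Default splitting with b on the key: leading blanks are skipped,
# so the keys compared are "2" and "1".
default_blanks=$(printf '%s\n' "$lines" | LC_ALL=C sort -k2b,2)

printf '%s\n---\n%s\n' "$explicit_space" "$default_blanks"
```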
[fuyun@bigdata-training datas]$ cat sort2.txt
192.168.0.1 00:OF:AF:45:4C:78
192.168.0.71 00:OF:1AF:45:4C:76
192.168.0.16 00:OF:KF:55:S6:25
192.168.0.99 00:LF:9F:R5:IC:27
192.168.0.91 00:OF:H6:45:A1:67
192.168.0.65 00:O1:W3:45:49:94
192.168.0.89 00:OF:A8:33:V5:90
192.168.0.31 00:90:32:J9:1L:14
192.168.0.19 00:OF:76:29:30:DF
192.168.0.177 00:OF:12:09:P9:41
192.168.0.121 00:YF:A2:U7:4O:RT
192.168.0.253 00:OF:SD:40:J3:19
192.168.0.51 00:II:V5:39:47:OI
192.168.0.46 00:OF:A3:81:D3:1Y
192.168.0.7 00:OI:W1:IW:H7:B1
192.168.0.189 00:OF:S5:00:12:70
192.168.0.155 00:OY:TF:4Q:46:8M
Sort on field 4 only (the key starts at field 4 and ends at field 4), numerically, in descending order
[fuyun@bigdata-training datas]$ sort -t. -k4,4nr sort2.txt
192.168.0.253 00:OF:SD:40:J3:19
192.168.0.189 00:OF:S5:00:12:70
192.168.0.177 00:OF:12:09:P9:41
192.168.0.155 00:OY:TF:4Q:46:8M
192.168.0.121 00:YF:A2:U7:4O:RT
192.168.0.99 00:LF:9F:R5:IC:27
192.168.0.91 00:OF:H6:45:A1:67
192.168.0.89 00:OF:A8:33:V5:90
192.168.0.71 00:OF:1AF:45:4C:76
192.168.0.65 00:O1:W3:45:49:94
192.168.0.51 00:II:V5:39:47:OI
192.168.0.46 00:OF:A3:81:D3:1Y
192.168.0.31 00:90:32:J9:1L:14
192.168.0.19 00:OF:76:29:30:DF
192.168.0.16 00:OF:KF:55:S6:25
192.168.0.7 00:OI:W1:IW:H7:B1
192.168.0.1 00:OF:AF:45:4C:78
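Sort from the first character of field 3 to the first character of field 4, numerically, in descending order. The key spans the separator, so for 192.168.0.91 it is 0.9 and is compared as a decimal number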
[fuyun@bigdata-training datas]$ sort -t. -k3.1,4.1nr sort2.txt
192.168.0.91 00:OF:H6:45:A1:67
192.168.0.99 00:LF:9F:R5:IC:27
192.168.0.89 00:OF:A8:33:V5:90
192.168.0.7 00:OI:W1:IW:H7:B1
192.168.0.71 00:OF:1AF:45:4C:76
192.168.0.65 00:O1:W3:45:49:94
192.168.0.51 00:II:V5:39:47:OI
192.168.0.46 00:OF:A3:81:D3:1Y
192.168.0.31 00:90:32:J9:1L:14
192.168.0.253 00:OF:SD:40:J3:19
192.168.0.1 00:OF:AF:45:4C:78
192.168.0.121 00:YF:A2:U7:4O:RT
192.168.0.155 00:OY:TF:4Q:46:8M
192.168.0.16 00:OF:KF:55:S6:25
192.168.0.177 00:OF:12:09:P9:41
192.168.0.189 00:OF:S5:00:12:70
192.168.0.19 00:OF:76:29:30:DF
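The same -t. and -k building blocks can be combined to order IPv4 addresses by all four octets. This is a sketch on hypothetical addresses rather than on sort2.txt, assuming GNU coreutils sort; each key is restricted to a single field and compared numerically.

```shell
# Hypothetical IPv4 addresses, one per line.
ips=$(printf '10.0.0.2\n9.1.1.1\n10.0.0.10\n192.168.1.1\n')

# Plain character comparison puts 10.0.0.10 before 10.0.0.2 and 9.x last.
lexical=$(printf '%s\n' "$ips" | LC_ALL=C sort)

# One numeric key per octet, applied left to right.
by_octet=$(printf '%s\n' "$ips" | LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n -k4,4n)

printf '%s\n---\n%s\n' "$lexical" "$by_octet"
```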
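Merge two files and write the result to an output file. Note that -m assumes every input file is already sorted by the same key, and that -t and -k apply to all input files rather than only to the file named before them. sort.txt and sort2.txt are not sorted here, so the output below is merely interleaved, not sorted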
[fuyun@bigdata-training datas]$ sort -k2 sort.txt -t. -k4,4nr sort2.txt -m
a 5
c 6
a 1
b 1
c 3
192.168.0.1 00:OF:AF:45:4C:78
192.168.0.71 00:OF:1AF:45:4C:76
192.168.0.16 00:OF:KF:55:S6:25
192.168.0.99 00:LF:9F:R5:IC:27
192.168.0.91 00:OF:H6:45:A1:67
192.168.0.65 00:O1:W3:45:49:94
192.168.0.89 00:OF:A8:33:V5:90
192.168.0.31 00:90:32:J9:1L:14
192.168.0.19 00:OF:76:29:30:DF
192.168.0.177 00:OF:12:09:P9:41
192.168.0.121 00:YF:A2:U7:4O:RT
192.168.0.253 00:OF:SD:40:J3:19
192.168.0.51 00:II:V5:39:47:OI
192.168.0.46 00:OF:A3:81:D3:1Y
192.168.0.7 00:OI:W1:IW:H7:B1
192.168.0.189 00:OF:S5:00:12:70
192.168.0.155 00:OY:TF:4Q:46:8M
a 192.168.43.117
f 192.168.43.119
b 192.168.43.118
z 192.168.43.118
s 192.168.43.117
k 192.168.43.117
c 192.168.43.119
o 192.168.43.110
[fuyun@bigdata-training datas]$ sort -k2 sort.txt -t. -k4,4nr sort2.txt -m -o sortAll.t
[fuyun@bigdata-training datas]$ cat sortAll.txt
a 5
c 6
a 1
b 1
c 3
192.168.0.1 00:OF:AF:45:4C:78
192.168.0.71 00:OF:1AF:45:4C:76
192.168.0.16 00:OF:KF:55:S6:25
192.168.0.99 00:LF:9F:R5:IC:27
192.168.0.91 00:OF:H6:45:A1:67
192.168.0.65 00:O1:W3:45:49:94
192.168.0.89 00:OF:A8:33:V5:90
192.168.0.31 00:90:32:J9:1L:14
192.168.0.19 00:OF:76:29:30:DF
192.168.0.177 00:OF:12:09:P9:41
192.168.0.121 00:YF:A2:U7:4O:RT
192.168.0.253 00:OF:SD:40:J3:19
192.168.0.51 00:II:V5:39:47:OI
192.168.0.46 00:OF:A3:81:D3:1Y
192.168.0.7 00:OI:W1:IW:H7:B1
192.168.0.189 00:OF:S5:00:12:70
192.168.0.155 00:OY:TF:4Q:46:8M
a 192.168.43.117
f 192.168.43.119
b 192.168.43.118
z 192.168.43.118
s 192.168.43.117
k 192.168.43.117
c 192.168.43.119
o 192.168.43.110
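For -m to produce sorted output, each input has to be sorted beforehand with the same options that are used for the merge. A minimal sketch of that workflow, using hypothetical data and temporary files created with mktemp (the file names are illustrative only):

```shell
# Work in a throwaway directory; all names below are hypothetical.
tmpdir=$(mktemp -d)

# Step 1: sort each input separately, with the same options.
printf '3\n1\n5\n' | LC_ALL=C sort -n > "$tmpdir/a.sorted"
printf '4\n2\n6\n' | LC_ALL=C sort -n > "$tmpdir/b.sorted"

# Step 2: merge the pre-sorted files with those same options
# and write the result to a file with -o.
LC_ALL=C sort -n -m -o "$tmpdir/all.txt" "$tmpdir/a.sorted" "$tmpdir/b.sorted"

merged=$(cat "$tmpdir/all.txt")
echo "$merged"

rm -rf "$tmpdir"
```

Merging pre-sorted files is cheaper than sorting their concatenation from scratch, which is the main reason to reach for -m.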