Sorting with sort in Shell Scripts

隔壁老登

Article link: https://blog.csdn.net/weixin_45547818/article/details/136855303

sort --help
Ordering options:

  -b, --ignore-leading-blanks  ignore leading blanks
  -d, --dictionary-order      consider only blanks and alphanumeric characters
  -f, --ignore-case           fold lower case to upper case characters
  -g, --general-numeric-sort  compare according to general numerical value
  -i, --ignore-nonprinting    consider only printable characters
  -M, --month-sort            compare (unknown) < 'JAN' < ... < 'DEC'
  -h, --human-numeric-sort    compare human readable numbers (e.g., 2K 1G)
  -n, --numeric-sort          compare according to string numerical value
  -R, --random-sort           shuffle, but group identical keys.  See shuf(1)
      --random-source=FILE    get random bytes from FILE
  -r, --reverse               reverse the result of comparisons
      --sort=WORD             sort according to WORD:
                                general-numeric -g, human-numeric -h, month -M,
                                numeric -n, random -R, version -V
  -V, --version-sort          natural sort of (version) numbers within text
Other options:

      --batch-size=NMERGE   merge at most NMERGE inputs at once;
                            for more use temp files
  -c, --check, --check=diagnose-first  check for sorted input; do not sort
  -C, --check=quiet, --check=silent  like -c, but do not report first bad line
      --compress-program=PROG  compress temporaries with PROG;
                              decompress them with PROG -d
      --debug               annotate the part of the line used to sort,
                              and warn about questionable usage to stderr
      --files0-from=F       read input from the files specified by
                            NUL-terminated names in file F;
                            If F is - then read names from standard input
  -k, --key=KEYDEF          sort via a key; KEYDEF gives location and type
  -m, --merge               merge already sorted files; do not sort
  -o, --output=FILE         write result to FILE instead of standard output
  -s, --stable              stabilize sort by disabling last-resort comparison
  -S, --buffer-size=SIZE    use SIZE for main memory buffer
  -t, --field-separator=SEP  use SEP instead of non-blank to blank transition
  -T, --temporary-directory=DIR  use DIR for temporaries, not $TMPDIR or /tmp;
                              multiple options specify multiple directories
      --parallel=N          change the number of sorts run concurrently to N
  -u, --unique              with -c, check for strict ordering;
                              without -c, output only the first of an equal run
  -z, --zero-terminated     line delimiter is NUL, not newline
      --help     display this help and exit
      --version  output version information and exit
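
Of the ordering options listed above, -h is handy when sizes carry unit suffixes instead of being plain byte counts. The following is a minimal sketch, assuming a sort that supports -h as in the help output above; the values follow the help text's own "2K 1G" example, and 512M is an added sample value.

```shell
# Sample sizes with unit suffixes (2K and 1G come from the help text; 512M is illustrative).
human_sorted=$(printf '%s\n' 1G 2K 512M | sort -h)

# Adding -r reverses the comparison, so the largest size comes first.
human_desc=$(printf '%s\n' 1G 2K 512M | sort -h -r)

printf 'ascending:\n%s\n' "$human_sorted"
printf 'descending:\n%s\n' "$human_desc"
```

With plain -n, the suffixes would be ignored and only the leading digits compared, which is why -h is the safer choice for this kind of input.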

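Use case:
When disk space runs low and historical data has to be cleaned up, sort the HDFS files by date in ascending order or by file size in descending order,
so that the oldest data and the largest files are cleaned up first.
Common options: -n sorts by numeric value; -r reverses the order (descending).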
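
To see why -n matters, here is a small sketch comparing the default comparison with -n and -n -r. The three numbers are made-up sample values, not data from the scenario above.

```shell
# Hypothetical sample values.
sizes=$(printf '%s\n' 9 100 25)

# Default comparison is character by character, so "100" sorts before "25".
lexical=$(printf '%s\n' "$sizes" | LC_ALL=C sort)

# -n compares by numeric value.
numeric=$(printf '%s\n' "$sizes" | sort -n)

# -n -r: numeric comparison, largest first.
numeric_desc=$(printf '%s\n' "$sizes" | sort -n -r)

echo "default: $(echo $lexical)"
echo "-n:      $(echo $numeric)"
echo "-n -r:   $(echo $numeric_desc)"
```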

Examples:

# List the 100 entries under this path that use the most disk space
hadoop fs -du  /user/hive/warehouse/copy/ | sort -r -n | head -n 100
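
The same pipeline can be tried without a Hadoop cluster. The sketch below assumes the listing has the size in the first column followed by a path; the three rows and the sub-paths a, b and c are hypothetical stand-ins, not real `hadoop fs -du` output.

```shell
# Hypothetical stand-in for the listing: size in bytes, then path.
du_output='1024 /user/hive/warehouse/copy/a
987654 /user/hive/warehouse/copy/b
20480 /user/hive/warehouse/copy/c'

# Sort on the first field only, numerically, largest first; keep the top 2.
largest=$(printf '%s\n' "$du_output" | sort -k 1,1nr | head -n 2)

printf '%s\n' "$largest"
```

Writing the key as -k 1,1nr limits the comparison to the size column, which keeps the result predictable even if later columns happen to contain digits.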

# List the 100 entries with the earliest dates (demonstrated here on a small sample file)
cat t1.txt
hello 2024-01-02 world
hello 2024-01-04 scala
hello 2024-01-03 shell

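# Sort by column 2 (-k 2 makes the key run from field 2 to the end of the line; use -k 2,2 to compare column 2 only)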
sort -k 2 t1.txt
hello 2024-01-02 world
hello 2024-01-03 shell
hello 2024-01-04 scala
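
To pick only the earliest-dated records, the date sort can be combined with head. This is a minimal sketch that recreates the sample data in a temporary file (the temporary file and the variable names are illustrative) and restricts the key to field 2 with -k 2,2.

```shell
# Recreate the sample data in a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  'hello 2024-01-02 world' \
  'hello 2024-01-04 scala' \
  'hello 2024-01-03 shell' > "$tmp"

# Key is field 2 only; dates written as YYYY-MM-DD sort chronologically as plain text.
oldest=$(sort -k 2,2 "$tmp" | head -n 2)

# The r modifier on the key puts the newest date first.
newest=$(sort -k 2,2r "$tmp" | head -n 1)

printf 'oldest two:\n%s\n' "$oldest"
printf 'newest:\n%s\n' "$newest"
rm -f "$tmp"
```

For the cleanup scenario, replacing `head -n 2` with `head -n 100` would give the 100 earliest-dated records.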

# A field separator can also be specified with -t
sort -t ' ' -k 3 t1.txt
hello 2024-01-04 scala
hello 2024-01-03 shell
hello 2024-01-02 world
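
The -t and -k options can be combined with several keys, which matches the cleanup scenario of oldest date first and, within the same date, largest file first. The sketch below assumes comma-separated records of the form path,date,size; the records themselves are hypothetical.

```shell
# Hypothetical comma-separated records: path,date,size_in_bytes
records='/data/a,2024-01-04,300
/data/b,2024-01-02,1200
/data/c,2024-01-02,50'

# Primary key: field 2 (date, ascending).
# Tie-breaker: field 3 (size, numeric, descending).
ordered=$(printf '%s\n' "$records" | sort -t ',' -k 2,2 -k 3,3nr)

printf '%s\n' "$ordered"
```

Each -k names one key, and later keys are consulted only when the earlier ones compare equal.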