192. Word Frequency



Count how often each word appears in a file
Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

    words.txt contains only lowercase characters and space ' ' characters.
    Each word must consist of lowercase characters only.
    Words are separated by one or more whitespace characters.

For example, assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1


#!/bin/bash
# Count every whitespace-separated field, then print "word count" pairs
# through sort: numeric (-n), descending (-r), keyed on column 2 (-k2).
awk '{ for (i = 1; i <= NF; ++i) ++arr[$i] }
     END { for (k in arr) print k " " arr[k] | "sort -r -n -k2" }' words.txt
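
As a cross-check, the same counts can be produced without an awk array, using a tr | sort | uniq -c pipeline. This is an alternative sketch, not the approach used above; the function name word_freq and the temporary sample file are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints "word count" lines for the file given as $1,
# sorted by descending count.
word_freq() {
    tr -s ' ' '\n' < "$1" |     # one word per line; -s squeezes repeated separators
        grep -v '^$' |          # drop an empty line caused by leading spaces
        sort |                  # group identical words together for uniq
        uniq -c |               # prefix each distinct word with its count
        sort -k1,1nr -k2,2 |    # count descending, ties broken alphabetically
        awk '{ print $2, $1 }'  # swap columns to "word count"
}

# Sample input from the problem statement, written to a temporary file.
sample=$(mktemp)
printf 'the day is sunny the the\nthe sunny is is\n' > "$sample"
word_freq "$sample"
rm -f "$sample"
```

Unlike the awk version, the final sort here also breaks ties between equal counts alphabetically, so the output order is deterministic.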



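The output is piped to sort: -r sorts from largest to smallest, -n compares numerically, and -k2 sorts on the second column. To sort by the word (the key) instead, change -k2 to -k1 and drop -n, since the words are not numbers.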

       -n, --numeric-sort
              compare according to string numerical value

       -r, --reverse
              reverse the result of comparisons

       -k, --key=POS1[,POS2]
              start a key at POS1 (origin 1), end it at POS2 (default end of line).  See POS syntax below
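
To see the effect of these flags in isolation, the sketch below feeds the four "word count" pairs from the example to sort with two different keys. The sample data comes from the expected output above; the variable names pairs, by_count, and by_word are assumptions made for illustration.

```shell
#!/bin/bash
# The "word count" pairs from the example, in arbitrary order.
pairs=$(printf 'day 1\nthe 4\nsunny 2\nis 3\n')

# By count, largest first: numeric (-n), reversed (-r), key starts at field 2.
by_count=$(printf '%s\n' "$pairs" | sort -r -n -k2)

# By word, alphabetically: key restricted to field 1, and no -n
# because the words are not numbers.
by_word=$(printf '%s\n' "$pairs" | sort -k1,1)

printf 'by count:\n%s\n\nby word:\n%s\n' "$by_count" "$by_word"
```

Writing the key as -k1,1 stops it at the end of the first field; a bare -k1 would extend the key to the end of the line, as the POS2 default in the excerpt above describes.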

