Implementing SQL-style GROUP BY with awk arrays
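Text processing sometimes calls for something like the GROUP BY clause in SQL. In a relational database this is easy: the GROUP BY statement below, for example, returns the total income for each month.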
inst105@db2a:~$ db2 "select * from test"
MONTH INCOME
---------- -----------
Jan 100
March 200
April 210
Jan 200
April 220
Jan 300
6 record(s) selected.
inst105@db2a:~$ db2 "select MONTH, SUM(INCOME) as INCOME from test group by MONTH"
MONTH INCOME
---------- -----------
April 430
Jan 600
March 200
3 record(s) selected.
What if the data lives in a plain text file instead? awk's associative arrays can do the same job, as in this example:
inst105@db2a:~$ cat income.txt
Jan 100
March 200
April 210
Jan 200
April 220
Jan 300
inst105@db2a:~$ awk '{i[$1]+=$2}
END { for (mon in i)
printf("%s\t\t%s\n",mon,i[mon])
}' income.txt
Jan 600
March 200
April 430
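The awk output above comes back in a different order from the SQL result because the traversal order of `for (mon in i)` is unspecified in awk. Here is a minimal sketch of one way to get a stable order, by piping the result through `sort`. The function name `group_sum` and the array name `total` are introduced here for illustration and do not come from the original example.

```shell
#!/bin/sh
# group_sum [FILE...]: sum column 2 for each distinct value of column 1.
# Reads the named files, or standard input when no file is given.
# (group_sum and total are illustrative names, not from the original example.)
group_sum() {
    awk '
        { total[$1] += $2 }        # accumulate a running sum per key
        END {
            for (key in total)     # traversal order is unspecified in awk
                printf "%s\t%s\n", key, total[key]
        }
    ' "$@" | sort                  # sort by key for a deterministic order
}

# Run it on the same sample data as income.txt.
group_sum <<'EOF'
Jan 100
March 200
April 210
Jan 200
April 220
Jan 300
EOF
```

With this sample data the script prints April 430, Jan 600, March 200, one pair per line, the same order as the db2 query result.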
Reference: The AWK Programming Language