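myselect applies SQL syntax to the statistical analysis of log files. The log file being analyzed is treated as a database and each log line as a database record, which makes it more convenient to use than tools such as awk.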
$ myselect -h
usage:
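myselect 'sql sentence'; run statistical analysis with SQL
myselect -s 'log line'; split a log line on spaces and number the fields
myselect -n 'log line' 'sql sentence'; parse a log line with SQL
myselect -p 'sql sentence'; show the SQL syntax parse result
myselect -c 'sql sentence'; show the SQL calculation process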
Take the following nginx log line:
198.52.103.14 - - [29/Jun/2014:00:17:11 +0800] "GET /q/1403060495509100 HTTP/1.1" 200 26788 "http://wenda.so.com/q/1403060495509100" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727)" 221 0.532
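As a rough illustration of how the $N field references relate to this line, here is a minimal Python sketch. It assumes plain whitespace splitting with fields numbered from 1, which is consistent with $1 being the client IP; the function name number_fields is hypothetical, and this is not myselect's own implementation or the actual output format of myselect -s.

```python
# Sketch only: assumes whitespace splitting and 1-based field numbering.
# number_fields is a hypothetical helper, not part of myselect.
line = (
    '198.52.103.14 - - [29/Jun/2014:00:17:11 +0800] '
    '"GET /q/1403060495509100 HTTP/1.1" 200 26788 '
    '"http://wenda.so.com/q/1403060495509100" '
    '"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; '
    '.NET CLR 2.0.50727)" 221 0.532'
)


def number_fields(log_line):
    """Map '$1', '$2', ... to the whitespace-separated fields of a log line."""
    return {f"${i}": field for i, field in enumerate(log_line.split(), start=1)}


fields = number_fields(line)
for name, value in fields.items():
    print(name, value)
```

Under these assumptions $1 is the client IP, $7 the request path and $9 the HTTP status code. Quoted values that contain spaces, such as the user agent, end up spread across several fields.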
To find which source IPs appear most often, the myselect query is:
$ myselect 'select count($1),$1 from accesstest.log group by $1 order by count($1) desc limit 10'
14 111.13.65.251
13 10.141.88.248
12 10.141.88.239
10 10.141.88.250
9 121.226.135.115
8 10.141.88.241
8 10.141.88.249
8 222.74.246.190
7 211.149.165.150
6 61.174.51.174
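For comparison, the same group-by-and-count can be sketched in a few lines of Python. This only illustrates what the query computes, not how myselect evaluates it; the function name top_sources and the sample lines are made up for the example, and $1 is again assumed to be the first whitespace-separated field.

```python
from collections import Counter


def top_sources(lines, limit=10):
    """Count the first whitespace-separated field of each line, most frequent first."""
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return counts.most_common(limit)


# Made-up sample lines, for illustration only.
sample = [
    "10.0.0.1 - - first request",
    "10.0.0.2 - - second request",
    "10.0.0.1 - - third request",
]

for ip, n in top_sources(sample):
    print(n, ip)
```

To run it against a real log, pass an open file object instead of the sample list, for example top_sources(open("accesstest.log")).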
Source: http://www.oschina.net/project/lang/22/php?tag=147&os=0&sort=time