1. Analyzing data with text-processing commands

    Suppose an API access log exported from production contains several hundred lines, each recorded in the following format:

    /data1/www/logs/archives/170524/170524.v6.weibo.com_10.72.13.113.0.cn.gz:v6.weibo.com 123.125.104.20 0.016s - [24/May/2017:14:04:37 +0800] "POST /aj/video/playstatistics?ajwvr=6&cuid=2008282113&lang=zh-cn&ip=60.255.47.150&curl=http%3A%2F%2Fd.weibo.com%2F%3Ftopnav%3D1%26amp%3Bmod%3Dlogo%26amp%3Bwvr%3D6&ua=Mozilla%2F5.0%20%28Windows%20NT%205.1%29%20AppleWebKit%2F537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome%2F49.0.2623.221%20Safari%2F537.36%20SE%202.X%20MetaSr%201.0&wvr=v5 HTTP/1.1" 200 71 "http://zhaoren.weibo.com" - "SUP=- SUBP=-" "REQUEST_ID=1000659645207911167" "Weibo.com Swift framework HttpRequest class" "REQ_UID=2008282113"

    To deduplicate the log by IP address and count how many times each IP appears, run:

    cat play.log | awk -F ' ' '{print $2}' | sort -k 1 -n -r | uniq -c > rizhi.log

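    Explanation: awk splits each line on spaces and prints the second field, which is the client IP. sort then orders those values by the first column (-n for numeric comparison, -r for descending order), which places identical IPs on adjacent lines. This matters because uniq only merges adjacent duplicates; uniq -c collapses each run of identical lines and prefixes it with the number of occurrences. The result is written to rizhi.log and looks like this: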


       3 223.166.87.59
       1 60.12.35.5
       1 1.189.96.233
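
The pipeline above orders its output by IP address rather than by how often each IP appears. As a minimal sketch, assuming the same log layout (client IP in the second space-separated field), a second sort on the count column lists the most frequent IPs first. The function name count_ips and the shortened three-line sample file are hypothetical and exist only so the example runs on its own.

```shell
# Count requests per client IP and list the most frequent IPs first.
# Assumption: the IP is the second space-separated field, as in the sample line above.
count_ips() {
    # sort groups identical IPs, uniq -c counts each group,
    # the final sort ranks by count (descending), then by IP as a tie-breaker
    awk '{print $2}' "$1" | sort | uniq -c | sort -k1,1nr -k2,2
}

# Hypothetical, shortened sample lines (not real log data)
sample=$(mktemp)
printf '%s\n' \
    'a.gz:v6.weibo.com 223.166.87.59 0.016s -' \
    'a.gz:v6.weibo.com 60.12.35.5 0.020s -' \
    'a.gz:v6.weibo.com 223.166.87.59 0.011s -' > "$sample"

result=$(count_ips "$sample")
echo "$result"
rm -f "$sample"
```

On a real log, the same function could be called as count_ips play.log > rizhi.log. Adding | head -n 10 to the end of the call would keep only the ten busiest IPs.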