nginx log formats vary widely. The examples in this article assume the following log format:
http {
# …
log_format main '[] $remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
# …
}
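The log_format directive specifies the format of the log file. Everything that starts with $ is a variable, and the variables have the following meanings:
$remote_addr and $http_x_forwarded_for: record the client's IP address;
$remote_user: records the client user name;
$time_local: records the access time and time zone;
$request: records the request method, URL and HTTP protocol;
$status: records the response status code; 200 means success;
$body_bytes_sent: records the size of the response body sent to the client;
$http_referer: records the page that linked to the request;
$http_user_agent: records information about the client's browser.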
Sample log entries:
[] 100.116.108.148 - - [13/Jul/2017:00:05:19 +0800] "POST /message/check HTTP/1.0" 200 89 "https://www.example.com/message/add" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "36.57.161.201"
[] 100.109.253.3 - - [13/Jul/2017:00:12:16 +0800] "GET /statisticDaily/index HTTP/1.0" 200 37374 "https://www.example.com/statisticDaily/index" "Mozilla/5.0 (Linux; Android 5.1.1; vivo Xplay5A Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.49 Mobile MQQBrowser/6.2 TBS/043305 Safari/537.36 MicroMessenger/6.5.10.1080 NetType/WIFI Language/zh_CN" "223.87.234.226"
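To see how the variables map onto a log line, the first sample entry can be split on the double-quote character with awk. This is only a sketch for this particular log format; the variable names `line`, `request`, `status_bytes` and `forwarded` are chosen here for illustration.

```shell
# First sample entry from the excerpt above, stored in a shell variable.
line='[] 100.116.108.148 - - [13/Jul/2017:00:05:19 +0800] "POST /message/check HTTP/1.0" 200 89 "https://www.example.com/message/add" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "36.57.161.201"'

# Splitting on '"' gives: $2 = request line, $3 = status and bytes,
# $4 = referer, $6 = user agent, $(NF-1) = X-Forwarded-For.
request=$(printf '%s\n' "$line" | awk -F '"' '{print $2}')
status_bytes=$(printf '%s\n' "$line" | awk -F '"' '{print $3}')
forwarded=$(printf '%s\n' "$line" | awk -F '"' '{print $(NF-1)}')

echo "request:      $request"
echo "status/bytes: $status_bytes"
echo "forwarded:    $forwarded"
```

The last field after the closing quote is empty, which is why the forwarded address sits at `$(NF-1)` rather than `$NF`.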
Top 100 most-requested URLs and their counts
grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{print $2}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, request path
186405 /
148257 /home
132921 /ucenter/index
80749 /login
60431 /captcha
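One caveat with the pipeline above: `grep -E "POST|GET"` matches those words anywhere in the line, including the referer or user agent. A possible variant, sketched here under the assumption that the log uses the format shown earlier, checks the method inside the request field and counts in a single awk pass. The function name `top_urls` is hypothetical, and its output has no leading padding, unlike `uniq -c`.

```shell
# top_urls is a hypothetical helper: it reads log lines on stdin and prints
# "count path" for GET and POST requests, highest count first.
# Optional argument: number of rows to keep (defaults to 100).
top_urls() {
  awk -F '"' '
    {
      split($2, req, " ")   # req[1] = method, req[2] = path, req[3] = protocol
      if (req[1] == "GET" || req[1] == "POST") count[req[2]]++
    }
    END { for (url in count) print count[url], url }
  ' | sort -k1,1nr -k2,2 | head -n "${1:-100}"
}

# Usage (path taken from the commands above):
#   top_urls 100 < /data/logs/nginx/2017/07/13/manage.access.log
```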
Top 100 URLs with a non-200 status code and their counts
grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{if ($4!=200) {print $4,$1,$2}}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, status code, request method, request path
52573 302 GET /
16730 302 GET /submitlogin
16477 404 GET /apple-touch-icon-precomposed.png
15427 404 GET /apple-touch-icon.png
14408 302 GET /home
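The status filter relies on field positions after the second awk stage, which are easy to get wrong. The sketch below replays both stages on the first sample entry to show where the method, path and status end up; the variable names are illustrative only.

```shell
# First sample entry from the log excerpt earlier in the article.
line='[] 100.116.108.148 - - [13/Jul/2017:00:05:19 +0800] "POST /message/check HTTP/1.0" 200 89 "https://www.example.com/message/add" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "36.57.161.201"'

# Stage 1: split on '"' and keep the request line ($2) plus status/bytes ($3).
stage1=$(printf '%s\n' "$line" | awk -F '"' '{print $2,$3}')

# Stage 2: default whitespace split, so
# $1 = method, $2 = path, $3 = protocol, $4 = status, $5 = bytes.
fields=$(printf '%s\n' "$stage1" | awk '{print "method=" $1, "path=" $2, "status=" $4}')
echo "$fields"
```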
Top 100 URLs with an error status (400 and above) and their counts
grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{if ($4>=400) {print $4,$1,$2}}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, status code, request method, request path
16401 404 GET /apple-touch-icon-precomposed.png
15483 404 GET /apple-touch-icon.png
6512 404 GET /apple-touch-icon-120x120-precomposed.png
5743 404 GET /apple-touch-icon-120x120.png
4118 499 POST /statisticTrade/rechargeDetail
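When filtering on status in awk, it matters whether the threshold is written as the number 400 or the quoted string "400": a string constant forces a character-by-character text comparison. For three-digit status codes both give the same answer, but the sketch below shows how they diverge on other values, using 89 (the $body_bytes_sent value from the first sample entry) purely as a demonstration input.

```shell
# Numeric constant: 89 >= 400 is compared as numbers.
numeric=$(echo 89 | awk '{ if ($1 >= 400) print "match"; else print "no match" }')

# String constant: "89" >= "400" is compared as text, and "8" sorts after "4".
textual=$(echo 89 | awk '{ if ($1 >= "400") print "match"; else print "no match" }')

echo "numeric comparison: $numeric"
echo "text comparison:    $textual"
```

Writing the threshold without quotes keeps the filter numeric regardless of how many digits the field has.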
Top 100 IPs by request count
grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $(NF-1)}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, IP
408982 111.127.132.32
252175 120.41.162.180
170169 61.148.196.162
168990 59.173.42.117
103752 123.116.99.75
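The command above counts the whole $http_x_forwarded_for value. When a request passes through more than one proxy, that header can hold a comma-separated list such as "client, proxy1", so the same client may be counted under several different strings. A minimal sketch that keeps only the first address follows; `first_forwarded_ip` is a hypothetical helper name, and the approach assumes the leftmost entry is the client, which holds only if the upstream proxies are trusted.

```shell
# first_forwarded_ip is a hypothetical helper: it reads log lines on stdin
# and prints the first address of the X-Forwarded-For field for each line.
first_forwarded_ip() {
  awk -F '"' '{
    split($(NF-1), hops, ",")     # hops[1] = leftmost address in the list
    gsub(/[ \t]/, "", hops[1])    # drop any surrounding blanks
    print hops[1]
  }'
}

# Usage (path taken from the commands above):
#   first_forwarded_ip < /data/logs/nginx/2017/07/13/manage.access.log \
#     | sort | uniq -c | sort -k1nr | head -100
```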
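uniq -c prefixes each distinct line with the number of times it occurs. It only merges adjacent duplicates, which is why sort runs before it.
sort -k1nr: -k selects the column to sort by, 1 means the first column, n sorts numerically rather than as text, and r reverses the order.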
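Both points can be checked on a few lines of input. The paths and counts in this sketch are made up for illustration and do not come from the log above.

```shell
# uniq -c merges adjacent duplicates only, so unsorted input yields extra groups.
unsorted_groups=$(printf '%s\n' /home / /home | uniq -c | awk 'END {print NR}')
sorted_groups=$(printf '%s\n' /home / /home | sort | uniq -c | awk 'END {print NR}')
echo "groups without sort: $unsorted_groups, with sort: $sorted_groups"

# With n, the first column is compared as numbers (9 < 10 < 100).
numeric_top=$(printf '%s\n' '9 /a' '10 /b' '100 /c' | sort -k1nr | head -n 1)

# Without n, the key is compared as text, so "9" ranks above "100".
text_top=$(printf '%s\n' '9 /a' '10 /b' '100 /c' | LC_ALL=C sort -k1r | head -n 1)
echo "numeric top: $numeric_top, text top: $text_top"
```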