1. Nginx log example
Nginx log format configuration (the stock main format from nginx.conf, with $upstream_response_time appended):
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" $upstream_response_time';
A sample Nginx log record:
183.3.226.234 - - [16/Jul/2018:17:23:33 +0800] "POST /app/user/exchangeRC?token=undefined HTTP/1.1" 200 124 "http://test.com/dist/myroomcard.html" "Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/6.2 TBS/044113 Mobile Safari/537.36 MicroMessenger/6.6.7.1320(0x26060739) NetType/4G Language/zh_CN" "117.136.40.160" 0.070
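To make the mapping between the log_format variables and the record concrete, here is a minimal Python sketch that parses such a line with a hand-written regular expression. The names LOG_PATTERN, parse_line and SAMPLE are illustrative only, and the user agent in the sample is shortened for readability; this is not how Logstash itself parses the line.

```python
import re

# One named group per nginx variable in the log_format above.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)" '
    r'"(?P<http_x_forwarded_for>[^"]*)" '
    # Kept as free text: the value is not always a single number
    # (for example "-" when no upstream was contacted).
    r'(?P<upstream_response_time>.+)$'
)


def parse_line(line):
    """Return a dict of nginx variables, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# The sample record from above, with the user agent shortened.
SAMPLE = (
    '183.3.226.234 - - [16/Jul/2018:17:23:33 +0800] '
    '"POST /app/user/exchangeRC?token=undefined HTTP/1.1" 200 124 '
    '"http://test.com/dist/myroomcard.html" '
    '"Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv)" '
    '"117.136.40.160" 0.070'
)

fields = parse_line(SAMPLE)
print(fields["remote_addr"], fields["status"], fields["upstream_response_time"])
```

Every value comes back as a string, which is the same situation the grok match runs into later on.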
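2. Matching with a grok expression
For the record above, we first write a grok expression (a named regular expression) to match it:
PS: the grokdebug site, http://grokdebug.herokuapp.com/, lets you test grok expressions online. (Its front-end assets hosted on Google APIs are blocked in mainland China, so you need to proxy those resources locally or use a VPN to use it.)
The grok expression is: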
%{COMBINEDAPACHELOG} %{QS:http_x_forwarded_for} %{NUMBER:upstream_time}
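%{COMBINEDAPACHELOG}: matches the standard combined access log format used by Apache and Nginx.
%{QS:http_x_forwarded_for}: matches the quoted X-Forwarded-For value, which carries the original client IP. Behind a load balancer, $remote_addr is the load balancer's IP, so the real source IP has to come from this field.
%{NUMBER:upstream_time}: the upstream response time ($upstream_response_time), here the execution time of the PHP backend, in seconds.
The match result is as follows: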
{
"COMBINEDAPACHELOG": [
[
"183.3.226.234 - - [16/Jul/2018:17:23:33 +0800] \"POST /app/user/exchangeRC?token=undefined HTTP/1.1\" 200 124 \"http://test.com/dist/myroomcard.html\" \"Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/6.2 TBS/044113 Mobile Safari/537.36 MicroMessenger/6.6.7.1320(0x26060739) NetType/4G Language/zh_CN\""
]
],
"COMMONAPACHELOG": [
[
"183.3.226.234 - - [16/Jul/2018:17:23:33 +0800] \"POST /app/user/exchangeRC?token=undefined HTTP/1.1\" 200 124"
]
],
"clientip": [
[
"183.3.226.234"
]
],
"HOSTNAME": [
[
"183.3.226.234"
]
],
"IP": [
[
null
]
],
"IPV6": [
[
null
]
],
"IPV4": [
[
null
]
],
"ident": [
[
"-"
]
],
"USERNAME": [
[
"-",
"-"
]
],
"auth": [
[
"-"
]
],
"timestamp": [
[
"16/Jul/2018:17:23:33 +0800"
]
],
"MONTHDAY": [
[
"16"
]
],
"MONTH": [
[
"Jul"
]
],
"YEAR": [
[
"2018"
]
],
"TIME": [
[
"17:23:33"
]
],
"HOUR": [
[
"17"
]
],
"MINUTE": [
[
"23"
]
],
"SECOND": [
[
"33"
]
],
"INT": [
[
"+0800"
]
],
"verb": [
[
"POST"
]
],
"request": [
[
"/app/user/exchangeRC?token=undefined"
]
],
"httpversion": [
[
"1.1"
]
],
"BASE10NUM": [
[
"1.1",
"200",
"124",
"0.070"
]
],
"rawrequest": [
[
null
]
],
"response": [
[
"200"
]
],
"bytes": [
[
"124"
]
],
"referrer": [
[
"\"http://test.com/dist/myroomcard.html\""
]
],
"QUOTEDSTRING": [
[
"\"http://test.com/dist/myroomcard.html\"",
"\"Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/6.2 TBS/044113 Mobile Safari/537.36 MicroMessenger/6.6.7.1320(0x26060739) NetType/4G Language/zh_CN\"",
"\"117.136.40.160\""
]
],
"agent": [
[
"\"Mozilla/5.0 (Linux; Android 7.1.1; OPPO R11 Build/NMF26X; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.132 MQQBrowser/6.2 TBS/044113 Mobile Safari/537.36 MicroMessenger/6.6.7.1320(0x26060739) NetType/4G Language/zh_CN\""
]
],
"http_x_forwarded_for": [
[
"\"117.136.40.160\""
]
],
"upstream_time": [
[
"0.070"
]
]
}
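Note that in the result above, http_x_forwarded_for is captured together with its surrounding double quotes, because QS matches the whole quoted string. The following Python sketch illustrates why: it expands %{PATTERN:field} tokens into named capture groups. The PATTERNS table holds simplified stand-ins written for this example, not the real Logstash pattern definitions, and grok_to_regex is a hypothetical helper.

```python
import re

# Simplified stand-ins for two grok patterns (assumption: the real
# definitions are more permissive, e.g. other quote styles and escapes).
PATTERNS = {
    "QS": r'"[^"]*"',
    "NUMBER": r'[+-]?\d+(?:\.\d+)?',
}

# Matches tokens of the form %{PATTERN:field}.
GROK_TOKEN = re.compile(r'%\{(\w+):(\w+)\}')


def grok_to_regex(expression):
    """Replace each %{PATTERN:field} token with a named capture group."""
    def expand(token):
        pattern_name, field = token.group(1), token.group(2)
        return '(?P<%s>%s)' % (field, PATTERNS[pattern_name])
    return re.compile(GROK_TOKEN.sub(expand, expression))


tail_regex = grok_to_regex('%{QS:http_x_forwarded_for} %{NUMBER:upstream_time}')
fields = tail_regex.search('"117.136.40.160" 0.070').groupdict()

# The quotes are part of the captured value; strip them if a bare IP is needed.
forwarded_ip = fields["http_x_forwarded_for"].strip('"')
print(fields["http_x_forwarded_for"], forwarded_ip, fields["upstream_time"])
```

If the quotes are unwanted in Elasticsearch, they have to be removed in a later step, since the capture itself always includes them.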
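3. Selecting the fields we need
From the match result above, we can obtain the following fields:
Field name | Description |
clientip | Source IP of the request |
timestamp | Request time |
verb | Request method |
request | Request URL |
response | HTTP response code |
The remaining fields are not listed one by one here.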
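As a small illustration of picking these fields out of a debugger-style result, where each name maps to a list of capture lists, here is a Python sketch. WANTED_FIELDS and pick_fields are names made up for this example, and match_result is a hand-copied excerpt of the output above.

```python
# Fields from the table above.
WANTED_FIELDS = ["clientip", "timestamp", "verb", "request", "response"]


def pick_fields(match_result, wanted=WANTED_FIELDS):
    """Flatten {"name": [[value, ...]]} into {"name": value} for wanted names."""
    flat = {}
    for name in wanted:
        groups = match_result.get(name)
        # Skip names that are missing or whose first capture is null.
        if groups and groups[0] and groups[0][0] is not None:
            flat[name] = groups[0][0]
    return flat


# Excerpt of the match result shown earlier.
match_result = {
    "clientip": [["183.3.226.234"]],
    "timestamp": [["16/Jul/2018:17:23:33 +0800"]],
    "verb": [["POST"]],
    "request": [["/app/user/exchangeRC?token=undefined"]],
    "response": [["200"]],
    "rawrequest": [[None]],
    "USERNAME": [["-", "-"]],
}

selected = pick_fields(match_result)
print(selected)
```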
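4. Follow-up adjustments
With the grok expression we now have the data we want, but some of the data types are wrong. For example, upstream_time is sent to Elasticsearch as a string, and we need to convert it to a float.
This requires the date and mutate filter plugins. The final config file is as follows: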
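input {
  file {
    path => "/data0/log_receiver/nginx/access.log"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG} %{QS:http_x_forwarded_for} %{NUMBER:upstream_time}" }
  }
  # Assumed typical date filter usage (the text names the plugin, but the
  # original config omitted it): use the request time as @timestamp.
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  # Send upstream_time to Elasticsearch as a float rather than a string.
  mutate {
    convert => { "upstream_time" => "float" }
  }
}
output {
  elasticsearch {
    hosts => "logview-es.yunshanpp.com:80"
    index => "xinghuo-nginx-%{+YYYY.MM.dd}"
  }
}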
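To show what these conversions amount to, here is a rough Python sketch of the two steps the date and mutate plugins are used for: parsing the request time into a UTC timestamp and turning upstream_time into a float. convert_types is a hypothetical helper, not Logstash code, and the index name is derived on the assumption that the date part of the index pattern is taken from the event timestamp in UTC.

```python
from datetime import datetime, timezone


def convert_types(event):
    """Return a copy of the event with a typed timestamp and upstream_time."""
    converted = dict(event)
    # String to float, so the value can be averaged or range-queried.
    converted["upstream_time"] = float(event["upstream_time"])
    # Parse the nginx $time_local format (month names assume an English locale).
    local_time = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    converted["@timestamp"] = local_time.astimezone(timezone.utc)
    return converted


event = {"timestamp": "16/Jul/2018:17:23:33 +0800", "upstream_time": "0.070"}
converted = convert_types(event)

# Daily index name built from the UTC date of the event.
index_name = "xinghuo-nginx-" + converted["@timestamp"].strftime("%Y.%m.%d")
print(converted["upstream_time"], converted["@timestamp"].isoformat(), index_name)
```

Because the date is taken in UTC, a request logged shortly after midnight at +0800 would fall into the previous day's index.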