Parsing nginx logs with the Elastic Stack (collection and parsing separated)
Requirement: read the nginx access log in real time, collect it, buffer it, and finally load it into an Elasticsearch platform for querying and display. Because the nginx log source host and the Elasticsearch platform sit in different internal network segments, the usual single-Logstash setup (read, parse, and output directly to Elasticsearch) was not used. Instead, a Logstash instance on the log source host acts as a collector and forwards events to a Redis cluster on the gateway between the segments; a second Logstash instance inside the internal network then reads from Redis, parses the events, and outputs them to the Elasticsearch platform.
Collection
The log-collecting Logstash role is called the shipper: it only reads and forwards events, without parsing their format, using the following configuration:
input {
  file {
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
    codec => plain {
      charset => "UTF-8"
    }
    type => "nginx"
  }
}
output {
  redis {
    host => "192.168.100.42"
    port => 7021
    data_type => "list"
    key => "logstash-key"
  }
  stdout {
    codec => rubydebug
  }
}
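The shipper does no parsing: with the default json codec on the redis output, each event is serialized as a JSON envelope (the raw line in `message`, plus fields such as `type` and `path`) and pushed onto the tail of the list. A minimal Python sketch of that envelope, using a plain list as a stand-in for the Redis list "logstash-key":

```python
import json

def ship(raw_line, buffer):
    """Simulate the shipper: wrap the raw log line in a JSON
    envelope (no parsing) and push it onto the tail of the list."""
    event = {
        "message": raw_line,          # untouched nginx access line
        "type": "nginx",              # set by the file input above
        "path": "/var/log/nginx/access.log",
    }
    buffer.append(json.dumps(event))  # RPUSH appends to the tail

# stand-in for the Redis list "logstash-key"
queue = []
ship('192.168.1.10 - - [05/Jan/2017:10:00:00 +0800] "GET / HTTP/1.1" 200 612', queue)

# the parsing side pops from the head (LPOP) and decodes the JSON
event = json.loads(queue.pop(0))
print(event["type"])  # → nginx
```

The real envelope carries a few more Logstash-generated fields (`@timestamp`, `@version`, `host`), but the key point is that `message` is still the unparsed line; all grok work happens on the consumer side.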
# Note: connect with redis-cli to inspect the list-type key. llen shows the
# current backlog; lpop pops (and removes) the head element, so use it only
# for spot checks:
llen "logstash-key"
lpop "logstash-key"
Parsing
The parsing-side Logstash reads the list from the Redis cluster, parses each event against the nginx log template, and outputs it to the Elasticsearch platform.
input {
  redis {
    host => "192.168.100.42"
    port => 7021
    type => "redis_input"
    codec => "json"
    data_type => "list"
    key => "logstash-key"    # must match the redis key used by the shipper
  }
}
filter {
  grok {
    patterns_dir => "/usr/local/logstash-5.1.1/patt"
    match => {
      "message" => "%{NGINXACCESS}"
    }
    overwrite => ["message"]
  }
  date {
    match => [ "localtime", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
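The date filter rewrites the event's @timestamp from the HTTPDATE string that grok captured (e.g. 05/Jan/2017:10:00:00 +0800). The same conversion in Python, as a sanity check of the format string:

```python
from datetime import datetime

# The Joda-Time pattern "dd/MMM/yyyy:HH:mm:ss Z" used by the date
# filter corresponds to strptime's "%d/%b/%Y:%H:%M:%S %z".
ts = datetime.strptime("05/Jan/2017:10:00:00 +0800", "%d/%b/%Y:%H:%M:%S %z")
print(ts.isoformat())  # → 2017-01-05T10:00:00+08:00
```

Without this filter, @timestamp would be the time the event was processed, not the time the request was logged, which skews time-range queries in Elasticsearch.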
output {
  elasticsearch {
    hosts => ["192.168.100.41:9200"]
    index => "shopweb_nginx"
  }
  stdout {
    codec => rubydebug
  }
}
## grok template
CACHE_STAT \w+|-
RE_TIME %{NUMBER}|-
the_URI %{URI}|-
NGINXACCESS %{IP:client} - - \[%{HTTPDATE:localtime}\] \"%{WORD:method} %{URIPATHPARAM:uri_parm} HTTP/%{NUMBER:ver}\" %{NUMBER:status:int} %{NUMBER:body_bytes_sent:int} \"%{the_URI:referer}\" %{NUMBER:bytes_sent:int} %{NUMBER:request_length:int} \"%{GREEDYDATA:agent}\" \"-\" \"%{CACHE_STAT:cache_status}\" %{RE_TIME:request_time} %{RE_TIME:up_response_time}
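To check field extraction without a running pipeline, the grok pattern can be hand-translated into a Python regex and matched against a sample line. This is an approximation, not the exact expansion (grok's URI and NUMBER sub-patterns are broader than the character classes used here), with field names mirroring the grok captures:

```python
import re

# Hand-translated approximation of the NGINXACCESS grok pattern;
# group names mirror the grok field names.
NGINXACCESS = re.compile(
    r'(?P<client>\d{1,3}(?:\.\d{1,3}){3}) - - '
    r'\[(?P<localtime>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<uri_parm>\S+) HTTP/(?P<ver>[\d.]+)" '
    r'(?P<status>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" '
    r'(?P<bytes_sent>\d+) (?P<request_length>\d+) '
    r'"(?P<agent>[^"]*)" "-" '
    r'"(?P<cache_status>\w+|-)" '
    r'(?P<request_time>[\d.]+|-) (?P<up_response_time>[\d.]+|-)'
)

line = ('192.168.1.10 - - [05/Jan/2017:10:00:00 +0800] '
        '"GET /index.html?a=1 HTTP/1.1" 200 612 "http://example.com/" '
        '830 420 "Mozilla/5.0" "-" "HIT" 0.005 0.004')

m = NGINXACCESS.match(line)
print(m.group("status"), m.group("cache_status"))  # → 200 HIT
```

The trailing `|-` alternations (CACHE_STAT, RE_TIME) matter because nginx writes "-" for requests with no cache status or no upstream response time; without them grok would emit a _grokparsefailure tag on such lines.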