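This chapter introduces the grok filter plugin and uses it to parse httpd logs
The grok plugin:
A plugin for parsing all kinds of unstructured log data
grok uses regular expressions to turn unstructured data into structured data
It works by group matching (named capture groups), so the regular expression has to be written for the specific data format
The expressions are hard to write, but the approach is very general and can be applied to almost any kind of data
Parsing the httpd log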
# vim /etc/httpd/conf/httpd.conf //check the log format used by access_log
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "logs/access_log" combined
# cat /var/log/httpd/access_log //pick any entry and compare it against the format
192.168.1.254 - - [27/Dec/2018:09:15:35 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "http://192.168.1.100/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
References: http://httpd.apache.org, 金步国
Step 1.
# vim /etc/logstash/logstash.conf
input{
stdin{ codec => "json" }
file{
path => ["/tmp/a.log","/var/tmp/b.log"]
#sincedb_path => "/var/lib/logstash/since.db"
sincedb_path => "/dev/null" # lab use only: no read position is saved, so nothing has to be deleted before re-reading the file
start_position => "beginning"
type => "testlog"
}
tcp{
host => "0.0.0.0"
mode => "server"
port => 8888
type => "tcplog"
}
udp{
host => "0.0.0.0"
port => 8888
type => "udplog"
}
syslog {
port => "514"
type => "syslog"
}
}
filter{
grok{
match => { "message" => "(?<ip>[0-9.]+) (?<ident>\S+) (?<user>\S+) \[(?<time>.+)\] \"(?<method>[A-Z]+) (?<url>\S+) (?<ver>\S+)\" (?<rc>\d+) (?<size>\d+) \"(?<ref>\S+)\" \"(?<agent>[^\"]+).*" }
}
}
output{
stdout{ codec => rubydebug }
}
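To see what the named groups in the grok expression above capture, here is a minimal Python sketch that applies an equivalent regular expression to the sample log line. It is only an illustration, not how Logstash runs grok internally: Python writes named groups as `(?P<name>...)` instead of `(?<name>...)`, and the names `ACCESS_RE`, `SAMPLE` and `parse_line` are invented for this example.

```python
import re

# Python spelling of the grok expression; named groups use (?P<name>...).
ACCESS_RE = re.compile(
    r'(?P<ip>[0-9.]+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>.+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+) (?P<ver>\S+)" '
    r'(?P<rc>\d+) (?P<size>\d+) "(?P<ref>\S+)" "(?P<agent>[^"]+).*'
)

# The sample entry from access_log shown earlier in these notes.
SAMPLE = (
    '192.168.1.254 - - [27/Dec/2018:09:15:35 +0800] '
    '"GET /favicon.ico HTTP/1.1" 404 209 "http://192.168.1.100/" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"'
)


def parse_line(line):
    """Return a dict of the captured fields, or None if the line does not match."""
    m = ACCESS_RE.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for key, value in parse_line(SAMPLE).items():
        print(key, "=>", value)
```

One limitation is visible in the pattern itself: `size` only accepts digits, so an entry where `%b` logs `-` (no response body) does not match at all. This is one reason to prefer the predefined patterns over a hand-written expression.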
Copy an entry from /var/log/httpd/access_log into /tmp/a.log on the logstash host
# vim /tmp/a.log
192.168.1.254 - - [27/Dec/2018:09:15:35 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "http://192.168.1.100/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
# logstash -f /etc/logstash/logstash.conf
# cd \
/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.5/patterns/
# vim grok-patterns //the file that holds the predefined regex macros
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent} //search for COMBINEDAPACHELOG
# vim /etc/logstash/logstash.conf
...
filter{
grok{
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
...
# logstash -f /etc/logstash/logstash.conf
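A macro such as `%{COMBINEDAPACHELOG}` can be read as a named shortcut for a regular expression, and a macro may itself be built from other macros. The Python sketch below shows that idea with a deliberately tiny, hypothetical pattern table. The entries in `PATTERNS` are simplified stand-ins, not the definitions shipped in grok-patterns, and `expand` is a helper invented for this example.

```python
import re

# Hypothetical, simplified pattern table (NOT the real grok-patterns definitions).
PATTERNS = {
    "IP": r"[0-9.]+",
    "WORD": r"\w+",
    "INT": r"\d+",
    "NOTSPACE": r"\S+",
    # A pattern may reference other patterns, as COMBINEDAPACHELOG does.
    "REQUEST": r"%{WORD:method} %{NOTSPACE:url}",
}

# Matches %{NAME} and %{NAME:field}.
MACRO = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(expr):
    """Recursively replace %{NAME} / %{NAME:field} with plain regex text."""
    def repl(m):
        name, field = m.group(1), m.group(2)
        body = expand(PATTERNS[name])
        # A field name becomes a named group; otherwise just group the body.
        return "(?P<%s>%s)" % (field, body) if field else "(?:%s)" % body
    return MACRO.sub(repl, expr)


regex = expand(r'%{IP:client} "%{REQUEST}" %{INT:status}')
match = re.match(regex, '192.168.1.254 "GET /favicon.ico" 404')
print(regex)
print(match.groupdict())
```

The expanded expression is an ordinary regex with named groups, which is the same shape as the hand-written expression used at the start of this step.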
Step 2. Install Apache, collect the Apache server's logs with filebeat, and store them in elasticsearch
web and logstash host: 10.211.55.10
Install filebeat on the host where Apache is installed
# yum -y install filebeat
# vim /etc/filebeat/filebeat.yml
paths:
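- /var/log/httpd/access_log # path of the log; the "- " (dash plus space) is YAML list syntax
document_type: apachelog # document type
#elasticsearch: # comment this line out
#hosts: ["localhost:9200"] # comment this line out
logstash: # uncomment this line
hosts: ["192.168.1.20:5044"] # uncomment; set this to the IP of the logstash host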
# grep -Pv "^\s*(#|$)" /etc/filebeat/filebeat.yml
# systemctl restart filebeat
# vim /etc/logstash/logstash.conf
input{
stdin{ codec => "json" }
beats{
port => 5044
}
file{
path => ["/tmp/a.log","/var/tmp/b.log"]
sincedb_path => "/var/lib/logstash/since.db"
#sincedb_path => "/dev/null"
start_position => "beginning"
type => "testlog"
}
tcp{
host => "0.0.0.0"
mode => "server"
port => 8888
type => "tcplog"
}
udp{
host => "0.0.0.0"
port => 8888
type => "udplog"
}
syslog {
port => "514"
type => "syslog"
}
}
filter{
grok{
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
output{
stdout{ codec => rubydebug }
if [type] == "apachelog"{
elasticsearch{
hosts => ["10.211.55.9:9200"]
index => "weblog-%{+YYYY.MM.dd}"
}}
}
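The `%{+YYYY.MM.dd}` suffix in the index name makes Logstash write events into one index per day. The date comes from the event's `@timestamp`, rendered in UTC. Since this configuration has no `date` filter, that timestamp should be the time Logstash received the event rather than the time inside the Apache log line. A minimal Python sketch of the naming rule under that assumption (`daily_index` is a hypothetical helper, not a Logstash API):

```python
from datetime import datetime, timedelta, timezone


def daily_index(timestamp, prefix="weblog-"):
    """Build the index name for an event timestamp, using the UTC calendar day."""
    return prefix + timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")


# The +0800 offset seen in the sample access_log entry.
cst = timezone(timedelta(hours=8))

# 09:15:35 +0800 is 01:15:35 UTC on the same day.
print(daily_index(datetime(2018, 12, 27, 9, 15, 35, tzinfo=cst)))  # weblog-2018.12.27
# 07:00:00 +0800 is 23:00:00 UTC on the previous day.
print(daily_index(datetime(2018, 12, 27, 7, 0, 0, tzinfo=cst)))    # weblog-2018.12.26
```

Events received before 08:00 local time on a +0800 host therefore land in the previous day's index, which is worth remembering when looking for them in the head plugin or in Kibana.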
Open another terminal on the logstash host and check that port 5044 is listening
# netstat -antup | grep 5044
Generate some traffic against the web host:
# firefox http://10.211.55.10/
# for i in {1..30}
> do
> curl http://10.211.55.10/ -so /dev/null
> done
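Step 3.
Open http://10.211.55.9:9200/_plugin/head/
Open http://10.211.55.11:5601 and create a new index pattern for the weblog-* indices
The access records should now be visible
Step 4. Configure a chart: number of visits to the web host 10.211.55.10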