Getting Started with Logstash
- Java version
- Installing Logstash
- Running and testing
- Parsing logs with Logstash
- Stitching multiple input and output plugins together
Java version
Logstash requires one of the following Java versions:
- Java 8
- Java 11
- Java 14
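To check which Java runtime Logstash will pick up, you can inspect the version on your PATH (and JAVA_HOME, if you have set it):
java -version
echo $JAVA_HOME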
Installing Logstash
Download Logstash
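As a sketch, assuming the macOS x86_64 tarball of 7.10.0 from Elastic's artifact server (adjust the version and platform to match your system):
curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.10.0-darwin-x86_64.tar.gz
tar -xzf logstash-7.10.0-darwin-x86_64.tar.gz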
Run Logstash
Test your Logstash installation
On macOS, first remove the quarantine attribute, then start Logstash:
xattr -d -r com.apple.quarantine logstash-7.10.0
cd logstash-7.10.0
bin/logstash -e 'input { stdin { } } output { stdout {} }'
You may hit this error: Logstash - java.lang.IllegalStateException: Logstash stopped processing because of an error: (ArgumentError) invalid byte sequence in US-ASCII
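A common cause is a non-UTF-8 shell locale; exporting a UTF-8 locale before starting Logstash usually resolves it (a hedged workaround, not from the official docs):
export LC_ALL=en_US.UTF-8
bin/logstash -e 'input { stdin { } } output { stdout {} }'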
The -e flag specifies the configuration directly on the command line.
After Logstash starts, you should see:
The stdin plugin is now waiting for input:
...
Type hello world at the console prompt; the output is:
hello world
{
"message" => "hello word!",
"host" => "...",
"@version" => "1",
"@timestamp" => 2020-11-15T06:19:56.476Z
}
Press CTRL-D to exit Logstash.
Parsing Logs with Logstash
A Logstash pipeline has one or more input, filter, and output plugins.
In this section, you will create a Logstash pipeline that uses Filebeat to take Apache web logs as input, parses those logs to extract named fields, and writes the parsed data to an Elasticsearch cluster.
Configure Filebeat to send logs to Logstash
After installing Filebeat, open filebeat.yml and replace its contents with the following lines:
filebeat.inputs:
- type: log
  paths:
    - /path/to/file/logstash-tutorial.log
output.logstash:
  hosts: ["localhost:5044"]
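The tutorial's sample Apache log can be fetched and unpacked first; the URL below is the one used in the official Logstash tutorial (verify it, and point the paths entry in filebeat.yml at wherever you unpack it):
curl -L -O https://download.elastic.co/demos/logstash/gettingstarted/logstash-tutorial.log.gz
gunzip logstash-tutorial.log.gz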
Configure Logstash for Filebeat input
# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
}
Create first-pipeline.conf in the logstash-7.10.0/config directory and copy the skeleton above into it.
Configure the Logstash instance to use the Beats input plugin:
beats {
    port => "5044"
}
Configure Logstash to write the output to the console:
stdout { codec => rubydebug }
first-pipeline.conf should now look like this:
input {
    beats {
        port => "5044"
    }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
    stdout { codec => rubydebug }
}
To verify the configuration, run from the terminal:
bin/logstash -f first-pipeline.conf --config.test_and_exit
Once the configuration file passes the test, start Logstash with the following command:
bin/logstash -f first-pipeline.conf --config.reload.automatic
--config.reload.automatic enables automatic configuration reloading, so you do not have to stop and restart Logstash each time you modify the configuration file.
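If Filebeat is not already running, start it in a separate terminal so events begin flowing into the pipeline (the same command is used again later in this tutorial):
sudo ./filebeat -e -c filebeat.yml -d "publish"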
When everything is running normally, the console shows a series of events like the following:
{
"@version" => "1",
"agent" => {
"hostname" => "...",
"name" => "...",
"ephemeral_id" => "...",
"version" => "7.10.0",
"id" => "...",
"type" => "filebeat"
}
...
}
Parsing web logs with the grok filter plugin
You now have a working pipeline that reads log lines from Filebeat.
Next, use the grok filter plugin to parse the unstructured log data into structured, queryable fields. A sample line from the Apache log:
... - - [04/Jan/2015:05:13:42 +0000] "..."
The data is parsed with the %{COMBINEDAPACHELOG} grok pattern, which maps each Apache log line to the following fields:
| Information | Field name |
| --- | --- |
| IP address | clientip |
| User identity | ident |
| User authentication | auth |
| Timestamp | timestamp |
| HTTP verb | verb |
| Request body | request |
| HTTP version | httpversion |
| HTTP status code | response |
| Bytes served | bytes |
| Referrer URL | referrer |
| User agent | agent |
Edit the first-pipeline.conf file and add the following filter block:
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
Go to the terminal window where Filebeat is running and press Ctrl+C to shut it down. Then delete the Filebeat registry file so the log is re-read from the beginning. Run:
sudo rm data/registry
Restart Filebeat:
sudo ./filebeat -e -c filebeat.yml -d "publish"
After Logstash applies the grok pattern, the events have the following JSON representation:
{
"auth" => "-",
"verb" => "GET",
"httpversion" => "1.1",
"agent" => {
"id" => "...",
"name" => "...",
"ephemeral_id" => "...",
"type" => "filebeat",
"version" => "7.10.0",
"hostname" => "..."
}
...
}
Enriching your data with the geoip filter plugin
In addition to parsing log data for better search, filter plugins can derive supplementary information from existing data. For example, the geoip plugin looks up IP addresses, derives geographic location information from the addresses, and adds that location information to the logs.
Add the geoip filter plugin to the filter section of first-pipeline.conf:
geoip {
    source => "clientip"
}
The geoip section goes after the grok section; first-pipeline.conf now looks like this:
input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    geoip {
        source => "clientip"
    }
}
output {
    stdout { codec => rubydebug }
}
Shut down Filebeat, run sudo rm data/registry, and restart Filebeat; the Logstash terminal then prints:
{
"@version" => "1",
"agent" => {
"hostname" => "...",
"name" => "...",
"ephemeral_id" => "...",
"version" => "7.10.0",
"id" => "...",
"type" => "filebeat"
}
...
}
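The events should now also carry a geoip field. As an illustrative sketch only (values elided, not verbatim output), the block the geoip plugin typically adds looks roughly like:
"geoip" => {
    "ip" => "...",
    "country_name" => "...",
    "region_name" => "...",
    "city_name" => "...",
    "location" => { ... }
}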
Sending your data to Elasticsearch
The web logs are now broken down into structured fields, and the data is ready to be indexed into Elasticsearch.
The Logstash pipeline can index the data into an Elasticsearch cluster. Edit the first-pipeline.conf file and replace the entire output section with the following:
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}
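For reference, the complete first-pipeline.conf at this point combines all of the pieces shown above:
input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    geoip {
        source => "clientip"
    }
}
output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}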
Shut down Filebeat, delete the data/registry file, and restart Filebeat with the following command:
sudo ./filebeat -e -c filebeat.yml -d "publish"
Testing the pipeline
To see the Logstash indices in Elasticsearch, run either of the following:
curl 'localhost:9200/_cat/indices?v'
curl -XGET 'localhost:9200/_cat/indices?v&pretty'
The terminal returns:
green open .apm-custom-link L3cwUisUSB2DNkdKRSSr-g 1 0 0 0 208b 208b
green open .kibana_task_manager_1 1CKN9HjpSD-vZsC7lktMVw 1 0 5 1387 307.7kb 307.7kb
yellow open logstash-2020.11.15-000001 _CPW_2hoRvyiKh0RMVRc1A 1 1 6981 0 1.7mb 1.7mb
green open .apm-agent-configuration z8U3_zZKRCqL4TDyuwhkTg 1 0 0 0 208b 208b
yellow open metricbeat-7.10.0-2020.11.13-000001 oae67QKSSUKRLjvEqmYimw 1 1 37190 0 10.5mb 10.5mb
green open .kibana-event-log-7.10.0-000001 lRRg53WpTEKoYeCsAyS39g 1 0 1 0 5.6kb 5.6kb
yellow open filebeat-7.10.0-2020.11.14-000001 ND47B_aeQMWgE4xnvylEjA 1 1 12097 0 1.7mb 1.7mb
green open .async-search oeEIHAJKSj2I79vEkUwadw 1 0 0 0 3.3kb 3.3kb
green open .kibana_1 9RBHuGmCSBiR4z4WDlR7eQ 1 0 3465 1550 11.7mb 11.7mb
Now that the Logstash pipeline is configured to index the data into an Elasticsearch cluster, you can query Elasticsearch:
curl -XGET 'localhost:9200/logstash-2020.11.15-000001/_search?pretty&q=response=200'
The terminal returns:
{
"took" : 27,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
}
...
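You can also query on the fields derived by the geoip filter; for example, a city-name query (Buffalo here is just a placeholder, substitute a city that appears in your data):
curl -XGET 'localhost:9200/logstash-2020.11.15-000001/_search?pretty&q=geoip.city_name=Buffalo'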
You can also visualize the data in Kibana by searching the Filebeat data there.
You have now successfully created a pipeline that uses Filebeat to take Apache web logs as input, parses those logs into named fields, and writes the parsed data to an Elasticsearch cluster.
Troubleshooting: [high disk watermark [90%] exceeded on node], shards will be relocated away from this node
Fix for: Oops! SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]:
curl -H 'Content-Type: application/json' -XPUT 'localhost:9200/*/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'
Fix for: ElasticSearch ClusterBlockException[blocked by: FORBIDDEN/12/index read-only / allow delete (api)]:
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
https://discuss.elastic.co/t/high-disk-watermark-90-exceeded-on-node-shards-will-be-relocated-away-from-this-node/99473
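As the thread above suggests, the disk watermark thresholds can also be raised temporarily while you free up disk space; a hedged sketch (the percentages are examples, adjust them to your cluster):
curl -H 'Content-Type: application/json' -XPUT 'localhost:9200/_cluster/settings' -d '
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}'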