How to start Logstash
# cd into the bin directory of the extracted Logstash archive
PS C:\Users\hs> cd D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin
# logstash -f specifies the path to the config file
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash -f D:\lihua\iot\iot-engine\code\hx-iot-engine-starter\src\main\resources\logstash.conf
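Before starting the pipeline for real, it can help to validate the config file first. A minimal sketch using Logstash's standard CLI flags (same paths as above):

# check the config for syntax errors without starting the pipeline
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash -f D:\lihua\iot\iot-engine\code\hx-iot-engine-starter\src\main\resources\logstash.conf --config.test_and_exit
# start normally, but reload the pipeline whenever the config file changes
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash -f D:\lihua\iot\iot-engine\code\hx-iot-engine-starter\src\main\resources\logstash.conf --config.reload.automatic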
1. Terminology
(1) @metadata
A special field for storing content that you do not want to include in output events. For example, the @metadata field can be used to create temporary fields for use in conditionals.
Example:
filter {
  mutate { add_field => { "show" => "This data will be in the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "This data will not be in the output" } }
}
Logstash console output:
{
    "@timestamp" => 2016-06-30T02:46:48.565Z,
    # fields under @metadata never flow into the output; they are temporary
    # fields whose lifetime is limited to the filter stage
    "@metadata" => {
        "test" => "Hello",
        "no_show" => "This data will not be in the output"
    },
    "@version" => "1",
    "host" => "example.com",
    "show" => "This data will be in the output",
    "message" => "asdf"
}
Feel free to use the @metadata field whenever you need a temporary field but do not want it to appear in the final output.
Note: in mutate { add_field => { "[@metadata][test]" => "Hello" } }, the field's name is [@metadata][test]; when referencing it you cannot write [test], you must write the full [@metadata][test].
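As a further illustration, a value computed in the filter stage can steer the output stage without ever being indexed itself. A minimal sketch (the field name [@metadata][target_index] is made up for illustration; the host is the one used later in this article):

filter {
  # compute the destination index once, in a field that will never be indexed itself
  mutate { add_field => { "[@metadata][target_index]" => "logs-%{type}-%{+YYYY.MM}" } }
}
output {
  elasticsearch {
    hosts => ["192.168.1.83:9200"]
    # sprintf references work on @metadata just like on regular fields
    index => "%{[@metadata][target_index]}"
  }
}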
(2) field
An event attribute. For example, each event in an Apache access log has attributes such as a status code (200, 404), a request path ("/", "index.html"), an HTTP verb (GET, POST), a client IP address, and so on. Logstash uses the term "field" to refer to these attributes.
How fields are used in practice:
Creating (declaring) a field: add_field
- Value type is hash
- Default value is {}
- Purpose: adds fields to the event
Example:
input {
  file { add_field => { "show" => "This data will be in the output" } }
}
filter {
  mutate { add_field => { "show" => "this field will flow into the output" } }
  mutate { add_field => { "[@metadata][test]" => "Hello" } }
  mutate { add_field => { "[@metadata][no_show]" => "this field is temporary and will not flow into the output" } }
}
Notes:
1. All three plugin types (input, filter, output) can create fields, as long as the plugin in question offers the add_field configuration option.
2. If the field being created already exists, the field is converted to an array and the new value is inserted as an element, as the sketch below shows:
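A minimal sketch of this behavior (the field name foo is made up for illustration):

filter {
  mutate { add_field => { "foo" => "first" } }
  # "foo" already exists, so this second add_field turns it into an array
  mutate { add_field => { "foo" => "second" } }
}

The rubydebug console output would then show roughly:

"foo" => [
    [0] "first",
    [1] "second"
]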
Referencing (using) a field
Field references are usually wrapped in square brackets ([]), e.g. [fieldname]. When referring to a top-level field, you can omit the [] and just use the field name. To refer to a nested field, specify the full path to that field: [top-level field][nested field]
- Referencing a field in conditionals
See the official documentation for details.
In conditionals, reference a field as [fieldname]:
filter {
  # if the value of field foo is contained in field foobar
  if [foo] in [foobar] {
    # append the element "field in field" to the tags array
    mutate { add_tag => "field in field" }
  }
  if [foo] in "foo" {
    mutate { add_tag => "field in string" }
  }
  if "hello" in [greeting] {
    mutate { add_tag => "string in field" }
  }
  if [foo] in ["hello", "world", "foo"] {
    mutate { add_tag => "field in list" }
  }
  if [missing] in [alsomissing] {
    mutate { add_tag => "shouldnotexist" }
  }
  if !("foo" in ["hello", "world"]) {
    mutate { add_tag => "shouldexist" }
  }
}
Note: [foo] on its own tests whether the field foo exists, for example:
output {
  # if [loglevel] exists and its value is "ERROR"
  if [loglevel] and [loglevel] == "ERROR" {
    pagerduty { ... }
  }
}
- Referencing a field in string output
In string output, reference a field as %{fieldname}:
See the official documentation.
output {
  elasticsearch {
    hosts => ["192.168.1.83:9200"]
    index => "jmqttlogs-%{type}-%{logger}-%{loglevel}-%{+YYYY.MM}"
  }
  stdout { codec => rubydebug }
}
(3) field reference
A reference to an event field. Such a reference may appear in the output block or the filter block of a Logstash config file. Field references are usually wrapped in square brackets ([]), e.g. [fieldname]. When referring to a top-level field you can omit the [] and just use the field name. To refer to a nested field, specify the full path to that field: [top-level field][nested field]
You can think of this concept as essentially the same thing as a field.
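A minimal sketch of a nested reference, assuming the event carries an object field such as user => { "name" => ... } (the field names and the value "alice" are made up for illustration):

filter {
  # conditionals use the full bracket path, one pair of brackets per level
  if [user][name] == "alice" {
    # inside a sprintf string the full bracket path is required as well
    mutate { add_field => { "greeting" => "Hello %{[user][name]}" } }
  }
}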
(4) input plugin
A Logstash plugin that reads event data from a particular source. Input plugins are the first stage of the Logstash event-processing pipeline. Popular input plugins include file, syslog, redis, and beats.
input {
  # file input plugin
  file {
    # options provided by the plugin; see the official docs for the full list
    path => ["/jmqttlogs/*.log","/jmqttlogs/"]
    type => "test"
    exclude => ["brokerLog.log","remotingLog.log"]
  }
  # beats input plugin
  beats {
    # options provided by the plugin; see the official docs for the full list
    port => 5044
  }
  # tcp input plugin
  tcp {
    # options provided by the plugin; see the official docs for the full list
    port => 12345
    codec => json
  }
}
Logstash provides the following input plugins:
Official documentation
| Plugin | Description |
| --- | --- |
| azure_event_hubs | Receives events from Azure Event Hubs |
| beats | Receives events from the Elastic Beats framework |
| cloudwatch | Pulls events from the Amazon Web Services CloudWatch API |
| couchdb_changes | Streams events from CouchDB's _changes URI |
| dead_letter_queue | Reads events from Logstash's dead letter queue |
| elastic_agent | Receives events from the Elastic Agent framework (shares the logstash-input-beats codebase) |
| elasticsearch | Reads query results from an Elasticsearch cluster |
| exec | Captures the output of a shell command as an event |
| file | Streams events from files |
| ganglia | Reads Ganglia packets over UDP |
| gelf | Reads GELF-format messages from Graylog2 as events |
| generator | Generates random log events for test purposes |
| github | Reads events from a GitHub webhook |
| google_cloud_storage | Extracts events from files in a Google Cloud Storage bucket |
| google_pubsub | Consumes events from a Google Cloud PubSub service |
| graphite | Reads metrics from the graphite tool |
| heartbeat | Generates heartbeat events for testing |
| http | Receives events over HTTP or HTTPS |
| http_poller | Decodes the output of an HTTP API into events |
| imap | Reads mail from an IMAP server |
| irc | Reads events from an IRC server |
| java_generator | Generates synthetic log events |
| java_stdin | Reads events from standard input |
| jdbc | Creates events from JDBC data |
| jms | Reads events from a JMS broker |
| jmx | Retrieves metrics from remote Java applications over JMX |
| kafka | Reads events from a Kafka topic |
| kinesis | Receives events through an AWS Kinesis stream |
| log4j | Reads events over a TCP socket from a Log4j SocketAppender object |
| lumberjack | Receives events using the Lumberjack protocol |
| meetup | Captures the output of command line tools as an event |
| pipe | Streams events from a long-running command pipe |
| puppet_facter | Receives facts from a Puppet server |
| rabbitmq | Pulls events from a RabbitMQ exchange |
| redis | Reads events from a Redis instance |
| relp | Receives RELP events over a TCP socket |
| rss | Captures the output of command line tools as an event |
| s3 | Streams events from files in an S3 bucket |
| s3-sns-sqs | Reads logs from AWS S3 buckets using SQS |
| salesforce | Creates events based on a Salesforce SOQL query |
| snmp | Polls network devices using Simple Network Management Protocol (SNMP) |
| snmptrap | Creates events based on SNMP trap messages |
| sqlite | Creates events based on rows in an SQLite database |
| sqs | Pulls events from an Amazon Web Services Simple Queue Service queue |
| stdin | Reads events from standard input |
| stomp | Creates events received with the STOMP protocol |
| syslog | Reads syslog messages as events |
| tcp | Reads events from a TCP socket |
| twitter | Reads events from the Twitter Streaming API |
| udp | Reads events over UDP |
| unix | Reads events over a UNIX socket |
| varnishlog | Reads from the varnish cache shared memory log |
| websocket | Reads events from a websocket |
| wmi | Creates events based on the results of a WMI query |
| xmpp | Receives events over the XMPP/Jabber protocol |
Note: although the input, filter, and output plugins here are called plugins, they do not need to be installed separately; Logstash already bundles them.
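You can confirm which plugins are bundled with your installation using the logstash-plugin tool that ships in the same bin directory (the --group option filters by plugin type):

PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash-plugin list
PS D:\lihua\ELK\logstash-7.15.1-windows-x86_64\logstash-7.15.1\bin> .\logstash-plugin list --group input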
(5) filter plugin
A Logstash plugin that performs intermediate processing on events. Typically, filters act on the event data after it has been ingested by an input, mutating, enriching, and/or modifying it according to configured rules. Filters are often applied conditionally depending on characteristics of the event, as the sketch below shows. Popular filter plugins include grok, mutate, drop, clone, and geoip. The filter stage is optional.
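A minimal sketch of conditional filtering (the type value "test" follows the input example above; the DEBUG check is made up for illustration):

filter {
  # only parse events produced by the input that set type => "test"
  if [type] == "test" {
    grok { match => { "message" => "%{LOGLEVEL:loglevel}" } }
  }
  # discard debug noise entirely
  if [loglevel] == "DEBUG" {
    drop { }
  }
}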
Logstash provides the following filter plugins:
Official documentation
| Plugin | Description |
| --- | --- |
| age | Calculates the age of an event by subtracting the event timestamp from the current timestamp |
| aggregate | Aggregates information from several events originating with a single task |
| alter | Performs general alterations to fields that the mutate filter does not handle |
| bytes | Parses string representations of computer storage sizes, such as "123 MB" or "5.6gb", into their numeric value in bytes |
| cidr | Checks IP addresses against a list of network blocks |
| cipher | Applies or removes a cipher to an event |
| clone | Duplicates events |
| csv | Parses comma-separated value data into individual fields |
| date | Parses dates from fields to use as the Logstash timestamp for an event |
| de_dot | Computationally expensive filter that removes dots from a field name |
| dissect | Extracts unstructured event data into fields using delimiters |
| dns | Performs a standard or reverse DNS lookup |
| drop | Drops all events |
| elapsed | Calculates the elapsed time between a pair of events |
| elasticsearch | Copies fields from previous log events in Elasticsearch to current events |
| environment | Stores environment variables as metadata sub-fields |
| extractnumbers | Extracts numbers from a string |
| fingerprint | Fingerprints fields by replacing values with a consistent hash |
| geoip | Adds geographical information about an IP address |
| grok | Parses unstructured event data into fields |
| http | Provides integration with external web services/REST APIs |
| i18n | Removes special characters from a field |
| java_uuid | Generates a UUID and adds it to each processed event |
| jdbc_static | Enriches events with data pre-loaded from a remote database |
| jdbc_streaming | Enriches events with your database data |
| json | Parses JSON events |
| json_encode | Serializes a field to JSON |
| kv | Parses key-value pairs |
| memcached | Provides integration with external data in Memcached |
| metricize | Takes complex events containing a number of metrics and splits these up into multiple events, each holding a single metric |
| metrics | Aggregates metrics |
| mutate | Performs mutations on fields |
| prune | Prunes event data based on a list of fields to blacklist or whitelist |
| range | Checks that specified fields stay within given size or length limits |
| ruby | Executes arbitrary Ruby code |
| sleep | Sleeps for a specified time span |
| split | Splits multi-line messages, strings, or arrays into distinct events |
| syslog_pri | Parses the PRI (priority) field of a syslog message |
| threats_classifier | Enriches security logs with information about the attacker's intent |
| throttle | Throttles the number of events |
| tld | Replaces the contents of the default message field with whatever you specify in the configuration |
| translate | Replaces field contents based on a hash or YAML file |
| truncate | Truncates fields longer than a given length |
| urldecode | Decodes URL-encoded fields |
| useragent | Parses user agent strings into fields |
| uuid | Adds a UUID to events |
| wurfl_device_detection | Enriches logs with device information such as brand, model, OS |
| xml | Parses XML into fields |
(6) output plugin
A Logstash plugin that writes event data to a particular destination. Outputs are the final stage of the event pipeline. Popular output plugins include elasticsearch, file, graphite, and statsd.
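Since the output block also supports conditionals, events can be routed to different destinations. A minimal sketch (the error log path is made up for illustration; the host is the one used elsewhere in this article):

output {
  if [loglevel] == "ERROR" {
    # additionally keep errors in a local file
    file { path => "D:/lihua/logs/errors.log" }
  } else {
    elasticsearch { hosts => ["192.168.1.83:9200"] }
  }
}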
Logstash provides the following output plugins:
Official documentation
| Plugin | Description |
| --- | --- |
| app_search | Sends events to the Elastic App Search solution |
| boundary | Sends annotations to Boundary based on Logstash events |
| circonus | Sends annotations to Circonus based on Logstash events |
| cloudwatch | Aggregates and sends metric data to AWS CloudWatch |
| csv | Writes events to disk in a delimited format |
| datadog | Sends events to DataDogHQ based on Logstash events |
| datadog_metrics | Sends metrics to DataDogHQ based on Logstash events |
| dynatrace | Sends events to Dynatrace based on Logstash events |
| elastic_app_search | Sends events to the Elastic App Search solution |
| elastic_workplace_search | Sends events to the Elastic Workplace Search solution |
| elasticsearch | Stores logs in Elasticsearch |
| email | Sends email to a specified address when output is received |
| exec | Runs a command for a matching event |
| file | Writes events to files on disk |
| ganglia | Writes metrics to Ganglia's gmond |
| gelf | Generates GELF formatted output for Graylog2 |
| google_bigquery | Writes events to Google BigQuery |
| google_cloud_storage | Uploads log events to Google Cloud Storage |
| google_pubsub | Uploads log events to Google Cloud Pubsub |
| graphite | Writes metrics to Graphite |
| graphtastic | Sends metric data on Windows |
| http | Sends events to a generic HTTP or HTTPS endpoint |
| influxdb | Writes metrics to InfluxDB |
| irc | Writes events to IRC |
| java_stdout | Prints events to the STDOUT of the shell |
| juggernaut | Pushes messages to the Juggernaut websockets server |
| kafka | Writes events to a Kafka topic |
| librato | Sends metrics, annotations, and alerts to Librato based on Logstash events |
| loggly | Ships logs to Loggly |
| lumberjack | Sends events using the lumberjack protocol |
| metriccatcher | Writes metrics to MetricCatcher |
| mongodb | Writes events to MongoDB |
| nagios | Sends passive check results to Nagios |
| nagios_nsca | Sends passive check results to Nagios using the NSCA protocol |
| opentsdb | Writes metrics to OpenTSDB |
| pagerduty | Sends notifications based on preconfigured services and escalation policies |
| pipe | Pipes events to another program's standard input |
| rabbitmq | Pushes events to a RabbitMQ exchange |
| redis | Sends events to a Redis queue using the RPUSH command |
| redmine | Creates tickets using the Redmine API |
| riak | Writes events to the Riak distributed key/value store |
| riemann | Sends metrics to Riemann |
| s3 | Sends Logstash events to the Amazon Simple Storage Service |
| sink | Discards any events received |
| sns | Sends events to Amazon's Simple Notification Service |
| solr_http | Stores and indexes logs in Solr |
| sqs | Pushes events to an Amazon Web Services Simple Queue Service queue |
| statsd | Sends metrics using the statsd network daemon |
| stdout | Prints events to the standard output |
| stomp | Writes events using the STOMP protocol |
| syslog | Sends events to a syslog server |
| tcp | Writes events over a TCP socket |
| timber | Sends events to the Timber.io logging service |
| udp | Sends events over UDP |
| webhdfs | Sends Logstash events to HDFS using the webhdfs REST API |
| websocket | Publishes messages to a websocket |
| workplace_search | Sends events to the Elastic Workplace Search solution |
| xmpp | Posts events over XMPP |
| zabbix | Sends events to a Zabbix server |
(7) Other terms
For the remaining terms, see the official documentation.
2. A complete Logstash configuration example
Note: strip the comments before using this config file; in my setup it failed to load with them in place (non-ASCII characters in comments are a likely culprit). The comments below are only for explanation.
# input section
input {
  # file input plugin: the data source is a file, here .log log files
  file {
    # File paths must be absolute, not relative. One detail: if you need to
    # exclude files, you must also supply a directory path. This option is required.
    path => ["D:/lihua/javacode/jmqtt/iot-jmqtt/code/jmqttlogs/*.log","D:/lihua/javacode/jmqtt/iot-jmqtt/code/jmqttlogs/"]
    # files to exclude; works together with path
    exclude => ["brokerLog.log","remotingLog.log"]
    # type is just a field, not required; its value has no special meaning and can be set freely
    type => "test"
  }
}
# filter section: processes (parses) the data coming from input
filter {
  # the log-parsing plugin; its usage is covered in detail later
  grok {
    # the match rule; it can be built with the official online Grok debugger
    match => { "message" => "(?<timestamp>%{TIMESTAMP_ISO8601}) \[%{LOGLEVEL:loglevel}\] (?<logger>[A-Za-z0-9$_.]+) – %{GREEDYDATA:messagebody}$" }
  }
  # json parsing plugin: finds and parses JSON embedded in the log
  json {
    # which field to parse; after parsing, the JSON's properties become fields
    source => "messagebody"
  }
  # data transformation plugin, commonly used to transform field values, e.g. lowercasing
  mutate {
    # lowercase the given fields; note: Elasticsearch index names must not contain uppercase letters
    lowercase => [ "logger","loglevel" ]
    # remove unneeded fields; the counterpart of add_field, and like it, most plugins offer both options
    remove_field => ["path","timestamp"]
  }
}
# output section
output {
  # elasticsearch output plugin: stores the logs in Elasticsearch
  elasticsearch {
    # the Elasticsearch address
    hosts => ["192.168.1.83:9200"]
    # the target index; it is created if it does not exist. Index names must not contain uppercase letters.
    index => "jmqttlogs-%{type}-%{logger}-%{loglevel}-%{+YYYY.MM}"
  }
  # console output plugin: with this configured, the running Logstash console prints debug output
  stdout { codec => rubydebug }
}
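To see the whole pipeline in action, consider a hypothetical log line that the grok pattern above would match (the line and its values are made up for illustration):

2021-11-01 10:15:30.123 [INFO] org.jmqtt.broker.BrokerStartup – {"deviceId":"dev01"}

After grok, json, and mutate run, the event would carry roughly these fields: loglevel => "info" and logger => "org.jmqtt.broker.brokerstartup" (lowercased), messagebody => the raw JSON string, deviceId => "dev01" (lifted out of the JSON by the json filter), with path and timestamp removed. The elasticsearch output would then write it to an index named something like jmqttlogs-test-org.jmqtt.broker.brokerstartup-info-2021.11; note that the %{+YYYY.MM} part is rendered from the event's @timestamp, i.e. the ingestion time, since no date filter is configured here.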