EFK + Kafka logging pipeline: Filebeat ---> Kafka (ZK) ---> Logstash ---> ES ---> Kibana
Note: keep the versions of the EFK family components (Filebeat, Logstash, ES, Kibana) consistent with each other.
----------------------------------------------------------------------------Basic Definitions------------------------------------------------------------------------------
1. Filebeat
Filebeat consists of two main components: inputs and harvesters. The two work together to tail files and send log data to the configured output (Kafka in this setup).
1.1 harvester
A harvester is responsible for reading the content of a single file. It reads each file line by line and sends the content to the output. One harvester is started for each file. The harvester takes care of opening and closing the file, which means the file descriptor stays open while the harvester is running.
If a file is deleted or renamed while its harvester is still reading it, Filebeat keeps reading the file. The consequence is that the disk space cannot be released until the harvester for that file is closed. By default, Filebeat keeps the file open until close_inactive is reached.
1.2 input
An input is responsible for managing the harvesters and finding all sources to read from. If the input type is log, the input finds all files on the drive that match the configured glob paths and starts a harvester for each of them. Each input runs in its own Go routine.
The following example configures Filebeat to read lines from all files that match the given glob patterns:
filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
    - /var/path2/*.log
2. Kafka (with bundled ZooKeeper)
Kafka website: http://kafka.apache.org/
Kafka downloads: http://kafka.apache.org/downloads
Kafka quickstart: http://kafka.apache.org/quickstart
Recent Kafka releases ship with a bundled ZooKeeper.
Kafka is a highly scalable, fault-tolerant messaging system.
Topic
In Kafka, messages are divided into categories by a category attribute; such a category is called a Topic.
A Topic is essentially a label attached to messages and is a logical concept. A topic is comparable to a table in a database or a folder in a file system.
Partition
The messages of a Topic are split into one or more Partitions. A Partition is a physical concept: on disk it corresponds to one or more directories, and every partition is a commit log. Messages are written to a partition append-only and read back sequentially, in order.
Note: because a topic usually consists of several partitions, ordering cannot be guaranteed across the whole Topic, but it is guaranteed within a single Partition; messages are always appended to the tail of a partition. Kafka uses partitions to provide redundancy and scalability.
Partitions can be placed on different servers, i.e. a single topic can span multiple servers, which gives far more throughput than a single server could provide.
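As a concrete illustration, the command-line tools that ship with Kafka can create and inspect a partitioned, replicated topic. This is only a sketch against the example brokers used later in this document; the partition and replica counts are illustrative:
# Create a topic with 16 partitions, each replicated on 2 of the 3 brokers
bin/kafka-topics.sh --create \
  --zookeeper 10.50.255.125:2181,10.50.255.117:2181,10.50.255.107:2181 \
  --replication-factor 2 --partitions 16 --topic docker-log
# Show which broker leads each partition and where its replicas live
bin/kafka-topics.sh --describe --zookeeper 10.50.255.125:2181 --topic docker-log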
Segment
A Partition is further subdivided into Segments; the Segment files are of equal (maximum) size.
Broker
A Kafka cluster contains one or more servers; each server in a Kafka cluster is called a Broker. A Broker receives messages from producers, assigns offsets to them, and commits the messages to disk.
A Broker also serves consumers: it answers requests to read partitions and returns the messages that have been committed to disk.
Brokers are the building blocks of a cluster. In every cluster, one Broker also acts as the cluster controller, elected from the live members of the cluster.
Every member of the cluster can become the controller. The controller is responsible for administrative work, including assigning partitions to Brokers and monitoring Brokers.
Within the cluster, a partition is owned by one Broker, which acts as its leader, but the partition may also be assigned to additional (non-leader) Brokers, in which case the partition is replicated.
This replication mechanism provides message redundancy for the partition: if a Broker fails, one of the other replicas takes over as the new leader.
Producer
A producer publishes messages: it writes messages for a given Topic to the appropriate Partition.
By default a producer balances messages evenly across all partitions of a topic and does not care which particular partition a message is written to. In some cases, however, a producer writes messages directly to a specific partition.
Consumer
A consumer reads messages. A single consumer can consume messages from multiple Topics, and within a consumer group each Partition of a topic is consumed by exactly one consumer of that group.
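To see consumer-group behaviour in practice, start two console consumers with the same --group against the example topic; each partition will then be delivered to only one of them. A hedged sketch (the group name is arbitrary):
# Run this in two terminals: because they share a group, partitions are split between them
bin/kafka-console-consumer.sh --bootstrap-server 10.50.255.125:9092 --topic docker-log --group demo-group
# Inspect how partitions and offsets are currently assigned inside the group
bin/kafka-consumer-groups.sh --bootstrap-server 10.50.255.125:9092 --describe --group demo-group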
3. Logstash
4. ES
5. Kibana
-----------------------------------------------------------------------EFK System Setup and Configuration-------------------------------------------------------------------------
1. Install Filebeat (Docker hosts)
Ask the ops team to install it across the board; developers cannot reach every machine. (Background: all services run in Docker, logs are written to disk as files, and Filebeat has to collect the logs of every service.)
Configuration (a quick sanity check of it is sketched right after the file):
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data0/logs/*/*/*.log
  ignore_older: 10m
  multiline.pattern: ^(\d{4})-(\d{2})-(\d{2})
  multiline.negate: true
  multiline.match: after
- type: log
  enabled: true
  ignore_older: 10m
  paths:
    - /data0/logs/*/*/dotlog*
    - /data0/logs/*/*/*/dotlog*
#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 15
  #index.codec: best_compression
  #_source.enabled: false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  host: "elk.fangdd.net:80"
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify an additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
#host: "localhost:5601"
#============================= Elastic Cloud ==================================
# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output ------------------------------
#setup.template.name: "docker-"
#setup.template.pattern: "docker-*"
#
#output.elasticsearch:
# # Array of hosts to connect to.
# hosts: ["elastic.esf.fdd:80"]
# worker: 20
# index: "docker-%{+yyyy-MM-dd}"
# pipeline: log_pipeline
# bulk_max_size: 51200
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
#----------------------------- kafka output --------------------------------
output.kafka:
  enabled: true
  hosts: ["10.50.255.125:9092","10.50.255.117:9092","10.50.255.107:9092"]
  topic: "docker-log"
#----------------------------- Logstash output --------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
#============================== Xpack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: true
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch: true
# TODO: when Filebeat ships to Kafka, is it still necessary to configure X-Pack monitoring here?
xpack.monitoring:
  enabled: true
  elasticsearch:
    hosts: ["elastic.esf.fdd:80"]
2. Install Kafka
2.1 Download: wget http://apache.cs.utah.edu/kafka/2.3.0/kafka_2.11-2.3.0.tgz, or download locally and scp it to the server
2.2 Extract: tar -zxvf kafka_2.11-2.3.0.tgz
2.3 Rename the extracted directory: mv kafka_2.11-2.3.0 kafka (optional, just tidier)
2.4 Enter the directory: cd kafka
2.5 Prepare the ZK and Kafka configuration (two files under config/: zookeeper.properties and server.properties) --- note that the directories used for data and logs must be created beforehand
Under the kafka directory, create a zookeeperDir folder (the name comes from the configuration below). Inside zookeeperDir, create a myid file containing this ZK node's unique ID; the ID must differ on every machine (0, 1, 2, ...). A sketch of these steps follows.
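A minimal sketch of that preparation, assuming the dataDir configured below (/data0/java/kafka/zookeeperDir) and that this machine is node 0 (use 1, 2, ... on the other hosts):
# Create the ZooKeeper data directory referenced by zookeeper.properties
mkdir -p /data0/java/kafka/zookeeperDir
# Write this node's unique ID into the myid file (0 here, 1 and 2 on the other machines)
echo 0 > /data0/java/kafka/zookeeperDir/myid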
ZK configuration: zookeeper.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# the directory where the snapshot is stored.
dataDir=/data0/java/kafka/zookeeperDir
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=1024
tickTime=2000
initLimit=20
syncLimit=10
server.0=10.50.255.125:2888:3888
server.1=10.50.255.117:2888:3888
server.2=10.50.255.107:2888:3888
Kafka configuration: server.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=2
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3
# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/tmp/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=16
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 (such as 3) is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=10.50.255.125:2181,10.50.255.117:2181,10.50.255.107:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
Notes on the Kafka configuration: the log directory /tmp/kafka-logs must be created by hand to match log.dirs in server.properties; also note that num.partitions is set to 16, which works best when it equals the number of Logstash consumer threads.
Start up (from the kafka directory; a verification sketch follows):
Start ZK first: bin/zookeeper-server-start.sh config/zookeeper.properties &
Then start Kafka: bin/kafka-server-start.sh config/server.properties &
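Once both processes are up, the cluster can be checked from any node. A hedged sketch; the ZooKeeper stat probe assumes the four-letter-word commands have not been disabled:
# Confirm ZooKeeper is serving requests
echo stat | nc 10.50.255.125 2181
# List the topics known to the cluster (docker-log appears once Filebeat starts shipping,
# since topic auto-creation is enabled by default)
bin/kafka-topics.sh --list --zookeeper 10.50.255.125:2181
# Round trip: lines typed into the producer should show up in the consumer
bin/kafka-console-producer.sh --broker-list 10.50.255.125:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server 10.50.255.125:9092 --topic test --from-beginning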
Important: the Kafka setup above is mainly for testing ---> do not use the bundled ZK in production.
(Running the bundled ZK saves some network overhead, but at the cost of availability: if the Kafka process on a node dies, it can drag the co-located ZK down with it.)
3. Deploy Logstash
3.1 Download: wget https://artifacts.elastic.co/downloads/logstash/logstash-6.0.0.tar.gz
3.2 Extract: tar -xzvf logstash-6.0.0.tar.gz
3.3 Create and edit a configuration file
This guide creates a configuration file named kafka-logstash-es.conf and a data directory logstash/logstash-log-test (the directory matches the startup parameters below); a quick syntax check of the file is sketched after the configuration.
input {
  kafka {
    bootstrap_servers => "10.50.255.125:9092,10.50.255.117:9092,10.50.255.107:9092"
    topics => "docker-log"
    auto_offset_reset => "latest"
    consumer_threads => 16
    decorate_events => false
  }
}
output {
  elasticsearch {
    hosts => ["elastic.esf.fdd:80"]
    index => "docker-%{+yyyy-MM-dd}"
    document_type => "log"
    timeout => 300
  }
}
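The pipeline file can be validated before starting Logstash for real; a sketch using Logstash's config test flag:
# Parse and validate kafka-logstash-es.conf, then exit without starting the pipeline
bin/logstash -f config/kafka-logstash-es.conf --config.test_and_exit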
3.4 Start
bin/logstash -f config/kafka-logstash-es.conf --config.reload.automatic --path.data=/home/java/logstash-6.0.0/logstash-log-test/
Note: --path.data=/home/java/logstash-6.0.0/logstash-log-test/ tells Logstash where to keep its data.
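Once events are flowing, the daily index should appear in Elasticsearch. A quick check, assuming the ES endpoint used above is reachable from the current host:
# List the indices written by the Logstash output (one per day)
curl "http://elastic.esf.fdd:80/_cat/indices/docker-*?v"
# Peek at a few documents to confirm log lines are arriving
curl "http://elastic.esf.fdd:80/docker-*/_search?size=3&pretty"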
4. Deploy ES
5. Deploy Kibana
------------------------------------------------------------------------EFK Component Details------------------------------------------------------------------------
1. How Filebeat keeps file state
Filebeat keeps the state of each file and frequently flushes that state to the registry file on disk. The state is used to remember the last offset a harvester read from and to make sure every log line is sent to the output. If the output, such as Elasticsearch or Logstash, is unreachable, Filebeat keeps track of the last lines sent and continues reading the files as soon as the output becomes available again. While Filebeat is running, the state of each file is also held in memory. When Filebeat restarts, the state is rebuilt from the registry file and every harvester continues at its last known position.
For each input, Filebeat keeps the state of every file it finds. Because files can be renamed or moved, the file name and path are not enough to identify a file; for each file Filebeat stores a unique identifier to detect whether it has been read before.
If your use case produces a large number of new files every day, you may find that the registry file grows too large.
(Aside: Filebeat keeps the state of each file and persists it to the registry file on disk. When Filebeat is restarted, that state is used to resume reading each file at its previous position. If large numbers of new files are generated every day, the registry file can become too large. To shrink it, two configuration options are available: clean_removed and clean_inactive. Use clean_inactive for old files that you no longer touch and that are being ignored; use clean_removed if the old files are deleted from disk.)
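For reference, in Filebeat 6.x the registry is a single JSON file under the Beat's data path (/var/lib/filebeat/registry for package installs; the exact path depends on how the container mounts it). A hedged sketch for inspecting it, assuming python is available on the host:
# Registry path for a package install of Filebeat 6.x; adjust to your image layout
REGISTRY=/var/lib/filebeat/registry
# Each entry records the source path, a unique device/inode identifier and the last read offset
python -m json.tool < "$REGISTRY" | head -n 40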
2. How does Filebeat guarantee at-least-once delivery?
Filebeat guarantees that events are delivered to the configured output at least once, with no data loss. It can do this because it stores the delivery state of each event in the registry file.
If the configured output is blocked and has not acknowledged all events, Filebeat keeps trying to send them until the output confirms receipt.
If Filebeat shuts down while it is sending events, it does not wait for the output to acknowledge all of them before exiting. When Filebeat restarts, any events that were sent to the output but not acknowledged before the shutdown are sent again. This guarantees that every event is sent at least once, but duplicates may end up in the output. You can configure Filebeat to wait a specific amount of time before shutting down by setting the shutdown_timeout option.
3. How to avoid the Logstash single point of failure (Logstash has no cluster concept)
"Have Logstash consume from Kafka. Multiple Logstash instances subscribe to the same topic using the same consumer group, so each message is consumed by only one Logstash node; if a node dies, the other nodes keep consuming the topic. This removes the Logstash single point of failure."
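With the logstash-input-kafka plugin the instances share the work as long as they use the same group_id (the plugin's default group is reportedly "logstash"; set group_id explicitly in the kafka input block if in doubt). Partition assignment and per-partition lag can then be checked from any broker, for example:
# Show which Logstash instance owns each partition and how far it lags behind
bin/kafka-consumer-groups.sh --bootstrap-server 10.50.255.125:9092 --describe --group logstash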
------------------------------------------------------------------------Notes---------------------------------------------------------------------------
References:
Filebeat concepts (detailed):
https://www.cnblogs.com/cjsblog/p/9495024.html
Kafka concepts (detailed):
https://mlog.club/article/64550
Kafka setup:
https://blog.csdn.net/qq_36431213/article/details/99363190
https://www.jianshu.com/p/898ad61c59fd
https://blog.csdn.net/qq_34834325/article/details/78743490
Logstash:
https://www.jianshu.com/p/72a1a5d04f12
http://kevoo.org/2019/06/13/logstash-配置从-kafka-读取数据,输入到-es/
https://www.cnblogs.com/jiashengmei/p/8857053.html
https://www.maiyewang.com/2019/03/11/filebeat-kafka-logstash-elasticsearch-kibana日志收集系统搭建/
https://my.oschina.net/u/3707537/blog/1840798
https://www.jianshu.com/p/af60e3c52f5f