ELK + Filebeat + Kafka Installation and Deployment
This article walks through installing and deploying ElasticSearch, Logstash, Kibana, Filebeat, and Kafka to build a log collection pipeline. All components are installed from binary packages.
System Overview
Service Distribution
Service | IP Address |
---|---|
Kibana | 172.16.181.176 |
ElasticSearch | 172.16.181.176-178 |
Logstash | 172.16.181.161-162 |
Kafka | 172.16.181.150 |
Filebeat | All servers whose logs need to be collected |
Service Architecture
Installing and Deploying Kafka
Notes
This deployment uses a single-node Kafka. It can be upgraded to a cluster later as requirements grow.
Download and Install Kafka
cd /data/elk
wget https://archive.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
tar -xf kafka_2.12-2.0.0.tgz
cd kafka_2.12-2.0.0
Filebeat 7.5.0 supports Kafka versions only up to 2.1.0.[1]
Kafka 2.0.0 ships in two builds, 2.11 and 2.12; they differ only in the Scala version they are built against. If you simply run Kafka (rather than develop against it in Scala), either build works.
Edit zookeeper.properties
# the directory where the snapshot is stored.
dataDir=/data/elk/kafka_2.12-2.0.0/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
Edit server.properties
broker.id=0
#Change the listen address and port
listeners=PLAINTEXT://172.16.181.150:9092
advertised.listeners=PLAINTEXT://172.16.181.150:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
#Change the directory where Kafka stores its log (message) data
log.dirs=/data/elk/kafka_2.12-2.0.0/logs/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
Start the Zookeeper and Kafka Services
cd /data/elk/kafka_2.12-2.0.0/bin
# Start Zookeeper in the background first
bash zookeeper-server-start.sh -daemon /data/elk/kafka_2.12-2.0.0/config/zookeeper.properties
# Start Kafka in the foreground (handy on the first run to watch the startup log)
bash kafka-server-start.sh /data/elk/kafka_2.12-2.0.0/config/server.properties
# Or start Kafka in the background as a daemon
bash kafka-server-start.sh -daemon /data/elk/kafka_2.12-2.0.0/config/server.properties
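To confirm that both services came up, a quick port check can be run (a minimal sketch; ports 2181 and 9092 correspond to the clientPort and listeners settings above):
# Zookeeper should be listening on 2181 and Kafka on 9092
ss -ntlp | grep -E ':(2181|9092)'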
Create a Topic
cd /data/elk/kafka_2.12-2.0.0/bin
bash -x kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic logs
Basic Kafka Operations
cd /data/elk/kafka_2.12-2.0.0/bin
#List all topics
bash kafka-topics.sh --zookeeper localhost:2181 --list
#Describe a specific topic
bash kafka-topics.sh --zookeeper localhost:2181 --describe --topic logs
#List all consumer groups
bash kafka-consumer-groups.sh --bootstrap-server 172.16.181.150:9092 --list
#Describe a specific consumer group
bash kafka-consumer-groups.sh --bootstrap-server 172.16.181.150:9092 --describe --group logstash
## TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
## topic name, partition ID, messages consumed so far, total messages, messages not yet consumed, consumer ID, client IP, client ID
#Increase the number of partitions for a topic (partitions can only be added, never removed)
bash kafka-topics.sh --alter --zookeeper localhost:2181 --topic logs --partitions 2
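Before wiring Filebeat and Logstash into the pipeline, the topic can be smoke-tested with the console tools shipped with Kafka (a minimal sketch; stop the consumer with Ctrl+C):
cd /data/elk/kafka_2.12-2.0.0/bin
# Publish a test message to the logs topic
echo "hello kafka" | bash kafka-console-producer.sh --broker-list 172.16.181.150:9092 --topic logs
# Read it back from the beginning of the topic
bash kafka-console-consumer.sh --bootstrap-server 172.16.181.150:9092 --topic logs --from-beginning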
Installing and Deploying ElasticSearch
Create and Switch to the elk User
useradd elk -d /data/elk
chown -R elk:elk /data/elk   # make sure the elk user owns the deployment directory
su - elk
Download and Extract the ElasticSearch Package
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.0-linux-x86_64.tar.gz
tar -xf elasticsearch-7.5.0-linux-x86_64.tar.gz
cd elasticsearch-7.5.0
Edit elasticsearch.yml
#ElasticSearch cluster name
cluster.name: elk
#ElasticSearch node name
node.name: es-1
#Whether this node is master-eligible: true means it can be elected master, false means it cannot
node.master: true
#Whether this node is a data node: true means it stores data, false means it does not
node.data: true
#Whether this node is an ingest node (runs ingest pipelines): true enables it, false disables it;
#a dedicated ingest/coordinating-only node should also set node.master and node.data to false.
node.ingest: false
#search.remote.connect was renamed to cluster.remote.connect in 7.0
cluster.remote.connect: false
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
path.data: /data/elk/elasticsearch-7.5.0/data
path.logs: /data/elk/elasticsearch-7.5.0/logs
network.host: 0.0.0.0
http.port: 9200
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: "Authorization,X-Requested-With,Content-Length,Content-Type"
#Cluster discovery settings for ElasticSearch before 7.0
#discovery.zen.ping.unicast.hosts: ["172.16.181.176"]
#discovery.zen.minimum_master_nodes: 1
#Cluster discovery settings for ElasticSearch 7.0 and later
discovery.seed_hosts: ["172.16.181.176"]
cluster.initial_master_nodes: ["172.16.181.176"]
transport.tcp.port: 9300
transport.tcp.compress: true
#gateway.expected_nodes: 1
#gateway.expected_master_nodes: 1
#gateway.expected_data_nodes: 1
#gateway.recover_after_time: 5m
#gateway.recover_after_master_nodes: 1
#gateway.recover_after_data_nodes: 2
#gateway.recover_after_nodes: 2
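Because network.host is set to a non-loopback address, ElasticSearch 7 enforces its bootstrap checks at startup. On a stock Linux host the kernel and ulimit settings below usually have to be raised first (run as root; the values are the commonly required minimums):
# mmap count required by ElasticSearch
sysctl -w vm.max_map_count=262144
echo 'vm.max_map_count=262144' >> /etc/sysctl.conf
# file descriptor and thread limits for the elk user
cat >> /etc/security/limits.conf <<EOF
elk soft nofile 65535
elk hard nofile 65535
elk soft nproc 4096
elk hard nproc 4096
EOF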
Edit jvm.options
# *** Only the following settings are changed; everything else keeps its default value ***
#Allocate about half of the system memory, but preferably no more than 32 GB
-Xms8g
-Xmx8g
## GC configuration
#Switch from the default CMS collector to G1; the CMS-specific options are commented out as well
#-XX:+UseConcMarkSweepGC
-XX:+UseG1GC
#-XX:CMSInitiatingOccupancyFraction=75
#-XX:+UseCMSInitiatingOccupancyOnly
Write the ElasticSearch Management Script esManage.sh
#!/bin/bash
whois=`whoami`
startES(){
if [ $whois == 'elk' ] ;then
export JAVA_HOME=/data/elk/elasticsearch-7.5.0/jdk/
export PATH=$JAVA_HOME/bin:$PATH
cd /data/elk/elasticsearch-7.5.0/bin/
./elasticsearch -d
Pid=`ps -ef| grep 'elasticsearch-7.5.0/config'| grep -v grep | awk '{print $2}'`
echo "ElasticSearch started, PID is $Pid"
elif [ $whois == 'root' ] ;then
su elk<<!
export JAVA_HOME=/data/elk/elasticsearch-7.5.0/jdk/
export PATH=$JAVA_HOME/bin:$PATH
cd /data/elk/elasticsearch-7.5.0/bin/
./elasticsearch -d
exit
!
Pid=`ps -ef| grep 'elasticsearch-7.5.0/config'| grep -v grep | awk '{print $2}'`
echo "ElasticSearch started, PID is $Pid"
else
echo 'Please use the root or elk user to start!'
fi
}
stopES(){
PID=`ps -ef| grep 'elasticsearch-7.5.0/config'| grep -v grep | awk '{print $2}'`
echo $PID | xargs kill -9 && echo 'ElasticSearch stopped!'
}
restartES(){
Pid=`ps -ef| grep 'elasticsearch-7.5.0/config'| grep -v grep | awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo 'ElasticSearch is not running. Now it is starting ! '
startES
else
stopES
startES
fi
}
statusES(){
Pid=`ps -ef| grep 'elasticsearch-7.5.0/config'| grep -v grep | awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo "ElasticSearch is not running!"
else
echo "ElasticSearch is running, Pid is $Pid"
fi
}
case $1 in
start)
startES
;;
stop)
stopES
;;
restart)
restartES
;;
status)
statusES
;;
*)
echo 'Please use "bash esManage.sh start|stop|restart|status" !'
;;
esac
Start ElasticSearch
cd /data/elk/elasticsearch-7.5.0
bash esManage.sh start
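Once the node has started, cluster health can be checked over the HTTP API (a quick verification; the address matches the ES node above):
# Expect "status" : "green" (or "yellow" until replicas are assigned)
curl -s http://172.16.181.176:9200/_cluster/health?pretty
# List the nodes that have joined the cluster
curl -s http://172.16.181.176:9200/_cat/nodes?v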
Installing and Deploying Kibana
Download and Install Kibana
cd /data/elk
wget https://artifacts.elastic.co/downloads/kibana/kibana-7.5.0-linux-x86_64.tar.gz
tar -xf kibana-7.5.0-linux-x86_64.tar.gz
cd kibana-7.5.0-linux-x86_64
Edit kibana.yml
server.host: "172.16.181.176"
elasticsearch.hosts: ["http://172.16.181.176:9200"]
i18n.locale: "zh-CN"
Write the Kibana Management Script kibanaManage.sh
#!/bin/bash
whois=`whoami`
startKibana(){
if [ $whois == 'elk' ] ;then
mkdir -p /data/elk/kibana-7.5.0-linux-x86_64/logs
cd /data/elk/kibana-7.5.0-linux-x86_64/bin/
nohup ./kibana >> ../logs/kibana.log 2>&1 &
sleep 3 # give Kibana a moment to finish starting
Pid=`ss -ntlp |grep 5601 |grep -v grep |awk -F[,=] '{print $3}'`
echo "Kibana started, PID is $Pid"
elif [ $whois == 'root' ] ;then
su elk<<!
mkdir -p /data/elk/kibana-7.5.0-linux-x86_64/logs
cd /data/elk/kibana-7.5.0-linux-x86_64/bin/
nohup ./kibana >> ../logs/kibana.log 2>&1 &
sleep 3 # give Kibana a moment to finish starting
exit
!
Pid=`ss -ntlp|grep 5601 |grep -v grep |awk -F[,=] '{print $3}'`
echo "Kibana started, PID is $Pid"
else
echo 'Please use the root or elk user to start!'
fi
}
stopKibana(){
ss -ntlp |grep 5601 |grep -v grep |awk -F[,=] '{print $3}' |xargs kill -9 && \
echo 'Kibana stopped!'
}
restartKibana(){
Pid=`ss -ntlp |grep 5601 |grep -v grep |awk -F[,=] '{print $3}'`
if [ 1$Pid == 1 ] ; then
echo 'Kibana is not running! Now it is starting! '
startKibana
else
stopKibana
startKibana
fi
}
statusKibana(){
Pid=`ss -ntlp |grep 5601 |grep -v grep |awk -F[,=] '{print $3}'`
if [ 1$Pid == 1 ]; then
echo 'Kibana is not running !'
else
echo "Kibana is running, pid is $Pid"
fi
}
case $1 in
start)
startKibana
;;
stop)
stopKibana
;;
restart)
restartKibana
;;
status)
statusKibana
;;
*)
echo 'Please use "bash kibanaManage.sh start|stop|restart|status" !'
;;
esac
Start Kibana
cd /data/elk/kibana-7.5.0-linux-x86_64
bash kibanaManage.sh start
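Kibana takes a short while to come up; it can be checked against its status API (a quick verification using the address from kibana.yml):
# Reports overall state "green" once Kibana is connected to ElasticSearch
curl -s http://172.16.181.176:5601/api/status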
Installing and Deploying Logstash
Download and Install Logstash
cd /data/elk
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.5.0.tar.gz
tar -xf logstash-7.5.0.tar.gz
cd logstash-7.5.0
Create config/logstash.conf
input{ # Input section
kafka{ # Consume data from Kafka
bootstrap_servers => ["172.16.181.150:9092"] # Kafka broker IP and port
topics_pattern => "logs" # Match topics by regular expression
group_id => "logstash" # All Logstash instances use the same group_id
client_id => "logstash-1" # Each Logstash instance uses a different client_id
codec => json
consumer_threads => 1 # Number of consumer threads; summed across all Logstash instances it should ideally equal the number of topic partitions
decorate_events => true # Add Kafka metadata (topic, message size, etc.) to each event in a field named "kafka"
auto_offset_reset => "latest" # Start from the latest offset when no committed offset exists
}
}
filter {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:logdate}"]
}
date {
match => ["logdate", "yyyy-MM-dd HH:mm:ss.SSS"]
target => "@timestamp"
}
}
output { # Output section
elasticsearch {
hosts => ["172.16.181.176:9200"]
index => "%{[fields][log_topics]}-%{[agent][name]}-%{+YYYY.MM.dd}" # Filebeat 7 exposes the beat name as agent.name
#template_overwrite => true
#manage_template => false
}
}
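Before starting the service, the pipeline syntax can be validated with Logstash's built-in config check (a quick sketch; it prints "Configuration OK" on success):
cd /data/elk/logstash-7.5.0
bin/logstash -f config/logstash.conf --config.test_and_exit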
Write the Logstash Management Script logManage.sh
#!/bin/bash
whois=`whoami`
startLogstash(){
if [ $whois == 'elk' ] ;then
cd /data/elk/logstash-7.5.0/bin/
nohup ./logstash -f /data/elk/logstash-7.5.0/config/logstash.conf &
sleep 1
Pid=`ps -ef| grep 'logstash-7.5.0'| grep -v grep | awk '{print $2}'`
echo "Logstash started, PID is $Pid"
elif [ $whois == 'root' ] ;then
su elk<<!
cd /data/elk/logstash-7.5.0/bin/
nohup ./logstash -f /data/elk/logstash-7.5.0/config/logstash.conf &
exit
!
sleep 1
Pid=`ps -ef| grep 'logstash-7.5.0'| grep -v grep | awk '{print $2}'`
echo "Logstash started, PID is $Pid"
else
echo 'Please use the root or elk user to start!'
fi
}
stopLogstash(){
Pid=`ps -ef| grep 'logstash-7.5.0'| grep -v grep | awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo 'Logstash is not running!'
else
echo $Pid | xargs kill -9 && echo 'Logstash stopped!'
fi
}
restartLogstash(){
Pid=`ps -ef|grep 'logstash-7.5.0' |grep -v grep |awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo 'Logstash is not running! Now it is starting!'
startLogstash
else
stopLogstash
startLogstash
fi
}
statusLogstash(){
Pid=`ps -ef |grep 'logstash-7.5.0' |grep -v grep |awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo 'Logstash is not running!'
else
ps -ef |grep 'logstash-7.5.0' |grep -v grep
fi
}
case $1 in
start)
startLogstash
;;
stop)
stopLogstash
;;
restart)
restartLogstash
;;
status)
statusLogstash
;;
*)
echo 'Please use "bash logManage.sh start|stop|restart|status" !'
;;
esac
Start Logstash
cd /data/elk/logstash-7.5.0
bash logManage.sh start
Installing and Deploying Filebeat
Download and Install Filebeat
cd /data/elk
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.5.0-linux-x86_64.tar.gz
tar -xf filebeat-7.5.0-linux-x86_64.tar.gz
cd filebeat-7.5.0-linux-x86_64
Edit the Configuration File filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data/java/reading/logs/reading.log
  fields:
    log_topics: reading
  encoding: utf-8
  # Lines that do not start with a timestamp are merged into the previous line
  multiline.pattern: '^\d{4}\-\d{2}\-\d{2}\s\d+\:\d+\:\d+\.\d+'
  multiline.negate: true
  multiline.match: after
  # Harvester housekeeping options for this log input
  close_inactive: 1m
  close_timeout: 3h
  clean_inactive: 24h
  ignore_older: 20h

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 2

# Beat name, exposed to Logstash and ElasticSearch as agent.name
name: "172.16.181.140"

output.kafka:
  enabled: true
  hosts: ["172.16.181.150:9092"]
  topic: logs
  compression: gzip
  max_message_bytes: 100000

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
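Filebeat ships with built-in checks that can be used to validate this file and the Kafka output before starting the service (a quick sketch, run from the Filebeat directory):
cd /data/elk/filebeat-7.5.0-linux-x86_64
# Validate the configuration file
./filebeat test config -c filebeat.yml
# Verify that the Kafka output is reachable
./filebeat test output -c filebeat.yml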
Write the Filebeat Management Script beatManage.sh
#!/bin/bash
startBeat(){
nohup /data/elk/filebeat-7.5.0-linux-x86_64/filebeat -e -c /data/elk/filebeat-7.5.0-linux-x86_64/filebeat.yml &
sleep 1
Pid=`ps -ef|grep 'filebeat-7.5.0' |grep -v grep |awk '{print $2}'`
echo "Filebeat started, Pid: $Pid"
}
stopBeat(){
Pid=`ps -ef | grep 'filebeat-7.5.0' | grep -v grep | awk '{print $2}'`
echo $Pid | xargs kill -9 && echo 'Filebeat stopped!'
}
restartBeat(){
Pid=`ps -ef | grep 'filebeat-7.5.0' | grep -v grep | awk '{print $2}'`
if [ 1$Pid == 1 ]; then
startBeat
else
stopBeat
startBeat
fi
}
statusBeat(){
Pid=`ps -ef | grep 'filebeat-7.5.0' | grep -v grep | awk '{print $2}'`
if [ 1$Pid == 1 ]; then
echo "Filebeat is not running!"
else
echo "Filebeat is running, Pid is $Pid"
fi
}
case $1 in
start)
startBeat
;;
stop)
stopBeat
;;
restart)
restartBeat
;;
status)
statusBeat
;;
*)
echo 'Please use "bash beatManage.sh start|stop|restart|status" !'
;;
esac
Start Filebeat
cd /data/elk/filebeat-7.5.0-linux-x86_64/
bash beatManage.sh start
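With Filebeat shipping logs, the whole pipeline can be verified end to end (a quick sketch; the IPs match the deployment above):
# 1. The logstash consumer group should be reading from the logs topic with little or no lag
cd /data/elk/kafka_2.12-2.0.0/bin
bash kafka-consumer-groups.sh --bootstrap-server 172.16.181.150:9092 --describe --group logstash
# 2. The corresponding index should show up in ElasticSearch
curl -s 'http://172.16.181.176:9200/_cat/indices?v' | grep reading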
Miscellaneous
Issues
1.
Symptom
"error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2000]/[2000] maximum shards open;"}
Cause
Elasticsearch 7 and later allow at most 1,000 open shards per data node by default (cluster.max_shards_per_node). Once that cluster-wide limit is reached, new indices cannot be created and newly collected logs have nowhere to be stored or displayed.
Solution
Run the following command against any ES node:
curl -XPUT -H "Content-Type:application/json" -d '{"persistent":{"cluster":{"max_shards_per_node":10000}}}' 'http://es-host:9200/_cluster/settings'
Or run the following in Kibana's Dev Tools:
PUT /_cluster/settings
{
"transient": {
"cluster": {
"max_shards_per_node":10000
}
}
}
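The new limit can be confirmed by reading the cluster settings back:
curl -s 'http://172.16.181.176:9200/_cluster/settings?pretty'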
ElasticSearch Index Cleanup
#!/bin/bash
cd /data/elk/scripts
#Log file for this script's runs
CLEAN_LOG="/data/elk/scripts/clean-es-index.log"
#Number of days of indices to keep
keepDate=7
#Earliest date inside the retention window; e.g. if today is the 24th and data older than 5 days should be deleted, preDate is the 19th
preDate=`date "+%Y%m%d" -d "-$keepDate days"`
#ES service IP address and port
esServer="172.16.181.176:9200"
#Start from the date of the oldest index currently present
beginDate=`curl -s $esServer/_cat/indices?v |awk '{print $3}' |grep '[0-9][0-9]$' |grep -v grep |awk -F[-] '{print $NF}' |sort |head -1`
#Number of days' worth of indices to delete:
#the difference between the earliest date in the retention window and the oldest index date,
#divided by 86400 (seconds per day),
#gives the total number of days to delete
delDate=`expr $(expr $(date -d "$preDate" +%s) - $(date +%s -d "$beginDate")) / 86400`
echo "----------------------------clean time is $(date '+%Y-%m-%d %H:%M:%S') ------------------------------" >>${CLEAN_LOG}
for ((i=1;i<=${delDate};i++)); do
#Filter out the indices whose names contain the date currently being deleted
indices=`curl -s $esServer/_cat/indices?v |awk '{print $3}' |grep "$beginDate" |grep -v grep`
#Loop over them and delete them one by one
for index in $indices; do
delResult=$(curl -s -XDELETE $esServer/$index?pretty |sed -n '2p')
echo "$(date '+%Y-%m-%d %H:%M:%S') delResult is ${delResult} $index was cleaned" >>${CLEAN_LOG}
echo "" >>${CLEAN_LOG}
done
#Advance to the next date to delete, one day at a time
beginDate=$(date -d "$beginDate +1 day" "+%Y%m%d")
done
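To run the cleanup automatically, the script can be scheduled with cron (a sketch; the file name clean-es-index.sh is an assumption, adjust it to whatever the script above is saved as):
# crontab -e (as the elk user): run the cleanup every day at 01:00
0 1 * * * /bin/bash /data/elk/scripts/clean-es-index.sh >/dev/null 2>&1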
[1] https://www.elastic.co/guide/en/beats/filebeat/7.5/kafka-output.html#kafka-compatibility