Environment:
Virtual machines created locally with VMware:
VM OS: Ubuntu 20.04
Kubernetes cluster
| Hostname | IP | k8s version |
| --- | --- | --- |
| master01 | 192.168.66.25 | v1.28.2 |
| node01 | 192.168.66.35 | v1.28.2 |
| node02 | 192.168.66.36 | v1.28.2 |
Kafka and ZooKeeper cluster
| Hostname | IP | Services | Versions |
| --- | --- | --- | --- |
| kafka01 | 192.168.66.77 | zookeeper, kafka | 3.7.2, 2.13-3.5.2 |
| kafka02 | 192.168.66.78 | zookeeper, kafka | 3.7.2, 2.13-3.5.2 |
| kafka03 | 192.168.66.79 | zookeeper, kafka | 3.7.2, 2.13-3.5.2 |
ELK cluster
| Hostname | IP | Service | Version |
| --- | --- | --- | --- |
| es01 | 192.168.66.71 | elasticsearch | 7.17.0 |
| es02 | 192.168.66.72 | elasticsearch | 7.17.0 |
| es03 | 192.168.66.73 | elasticsearch | 7.17.0 |
| es02 | 192.168.66.72 | logstash | 7.17.0 |
| kafka01 | 192.168.66.77 | kibana | 7.17.0 |
Log collection architecture: Logstash is deployed in the k8s cluster as a DaemonSet (logstash-daemonset) and collects the cluster's system logs and microservice logs. The collected logs are sent to the Kafka cluster; a standalone Logstash instance then consumes the data from Kafka and writes it to the Elasticsearch cluster, and Kibana is used to display the collected logs.
Deployment
1: Deploy the ZooKeeper cluster
ZooKeeper and Kafka both require a Java runtime, so install the JDK on all three nodes:
root@kafka01:#apt-get install openjdk-11-jdk
root@kafka02:#apt-get install openjdk-11-jdk
root@kafka03:#apt-get install openjdk-11-jdk
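After installing, a quick sanity check on each node (the exact OpenJDK build string will vary):
root@kafka01:~# java -version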
Download the ZooKeeper package from the official ZooKeeper site:
#wget https://dlcdn.apache.org/zookeeper/zookeeper-3.7.2/apache-zookeeper-3.7.2-bin.tar.gz
Unpack apache-zookeeper-3.7.2-bin.tar.gz:
#tar -zxvf apache-zookeeper-3.7.2-bin.tar.gz
After unpacking, create a zookeeper symlink to make switching directories easier:
#ln -sv apache-zookeeper-3.7.2-bin zookeeper
Change into the zookeeper directory and edit the configuration. ZooKeeper ships a template at conf/zoo_sample.cfg; copy it to zoo.cfg and then edit it:
#cd zookeeper/conf
#cp zoo_sample.cfg zoo.cfg
The edited configuration:
# egrep -v '^#|^$' zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=192.168.66.77:2888:3888
server.2=192.168.66.78:2888:3888
server.3=192.168.66.79:2888:3888
The configuration file is identical on all three ZooKeeper nodes.
We set dataDir to /data/zookeeper, so that directory has to be created along with a myid file that defines the node's ID. Note: the myid value must be different on each of the three nodes.
#mkdir /data/zookeeper
root@kafka01:/opt/zookeeper/conf# echo 1 > /data/zookeeper/myid
root@kafka02:/opt/zookeeper/conf# echo 2 > /data/zookeeper/myid
root@kafka03:/opt/zookeeper/conf# echo 3 > /data/zookeeper/myid
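A quick check that each node got the right ID (run it on every node; the value must match the node's server.N entry in zoo.cfg):
root@kafka01:/opt/zookeeper/conf# cat /data/zookeeper/myid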
Start the ZooKeeper cluster by switching to /opt/zookeeper/bin on each node:
root@kafka01:/opt/zookeeper/bin#./zkServer.sh start
root@kafka02:/opt/zookeeper/bin#./zkServer.sh start
root@kafka03:/opt/zookeeper/bin#./zkServer.sh start
Verify the ZooKeeper cluster state. If the Mode field shows follower or leader, the cluster formed successfully; if Mode shows standalone, ZooKeeper came up as a single node rather than as a cluster member and needs troubleshooting.
root@kafka01:/opt/zookeeper/bin# ./zkServer.sh status
root@kafka02:/opt/zookeeper/bin# ./zkServer.sh status
root@kafka03:/opt/zookeeper/bin# ./zkServer.sh status
From the status output we can see that ZooKeeper on kafka03 is the leader, while kafka01 and kafka02 are followers. ZooKeeper uses three ports: 2181, 2888 and 3888. 2181 is the client port, 3888 is used for leader election between cluster members, and 2888 is the port the leader opens for followers to connect to, mainly for data sync and heartbeats.
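A node's role can also be confirmed from its listening ports: every node listens on 2181 and 3888, but only the current leader listens on 2888 (shown here on kafka03, the leader in this cluster):
root@kafka03:/opt/zookeeper/bin# netstat -nltp | grep -E '2181|2888|3888'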
2: Deploy the Kafka cluster
Download Kafka from the official Apache Kafka downloads page:
root@kafka03:/opt/#wget https://downloads.apache.org/kafka/3.5.2/kafka_2.13-3.5.2.tgz
In the package name, 2.13 is the Scala version and 3.5.2 is the Kafka version.
Unpack Kafka and create a symlink to make switching directories easier:
root@kafka01:/opt#tar -zxvf kafka_2.13-3.5.2.tgz
root@kafka01:/opt#ln -sv kafka_2.13-3.5.2 kafka
Switch into the Kafka directory and edit the configuration; Kafka's configuration file is server.properties:
root@kafka01:/opt#cd kafka/config
root@kafka01:/opt/kafka/config# egrep -v '^#|^$' server.properties
broker.id=1
listeners=PLAINTEXT://192.168.66.77:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/data/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.66.77:2181,192.168.66.78:2181,192.168.66.79:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
root@kafka02:/opt/kafka/config# egrep -v '^#|^$' server.properties
broker.id=2
listeners=PLAINTEXT://192.168.66.78:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/data/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.66.77:2181,192.168.66.78:2181,192.168.66.79:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
root@kafka03:/opt/kafka/config# egrep -v '^#|^$' server.properties
broker.id=3
listeners=PLAINTEXT://192.168.66.79:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/data/kafka
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.66.77:2181,192.168.66.78:2181,192.168.66.79:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
The Kafka configuration differs between the three nodes only in broker.id and listeners. We store Kafka's data under /data/kafka, so create that directory on each node:
root@kafka01:/opt/kafka/config#mkdir /data/kafka
root@kafka02:/opt/kafka/config#mkdir /data/kafka
root@kafka03:/opt/kafka/config#mkdir /data/kafka
Start Kafka
Switch to /opt/kafka/bin and start Kafka as a daemon, specifying the configuration file:
root@kafka01:/opt/kafka/config#cd ../bin
root@kafka01:/opt/kafka/bin#./kafka-server-start.sh -daemon /opt/kafka/config/server.properties
root@kafka02:/opt/kafka/config#cd ../bin
root@kafka02:/opt/kafka/bin#./kafka-server-start.sh -daemon /opt/kafka/config/server.properties
root@kafka03:/opt/kafka/config#cd ../bin
root@kafka03:/opt/kafka/bin#./kafka-server-start.sh -daemon /opt/kafka/config/server.properties
Use the jps and netstat commands to check that Kafka is running on all three nodes:
root@kafka03:/opt/kafka/bin# jps
Check that Kafka is listening on port 9092:
root@kafka03:/opt/kafka/bin# netstat -nltpu | grep 9092
To manage the Kafka cluster, install Kafka Tool (Offset Explorer) on your local machine, or deploy Kafka Manager (CMAK) on any of the Kafka nodes.
Kafka Tool download page: Offset Explorer
Download the Kafka Tool build that matches your local machine.
Kafka Manager download: https://github.com/yahoo/CMAK
Installation instructions for both tools are easy to find online; I used Kafka Manager to verify the Kafka cluster.
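The cluster can also be verified straight from the Kafka CLI, without a GUI, by creating a test topic with a replica on every broker and then describing it (the topic name test-topic is just an example):
root@kafka01:/opt/kafka/bin# ./kafka-topics.sh --bootstrap-server 192.168.66.77:9092 --create --topic test-topic --partitions 3 --replication-factor 3
root@kafka01:/opt/kafka/bin# ./kafka-topics.sh --bootstrap-server 192.168.66.77:9092 --describe --topic test-topic
If all three broker IDs appear in the Replicas and Isr columns, the cluster is healthy.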
3: Deploy the Elasticsearch cluster
Logstash, Elasticsearch and Kibana can all be downloaded from the Elastic website:
logstash:https://www.elastic.co/cn/downloads/past-releases#logstash
elasticsearch: https://www.elastic.co/cn/downloads/past-releases#elasticsearch
kibana:https://www.elastic.co/cn/downloads/past-releases#kibana
The ELK packages bundle their own JDK, so there is no need to install a JDK separately, but the Logstash, Elasticsearch and Kibana versions must all match.
elasticsearch:
root@es01:/opt#wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.0-amd64.deb
kibana:
root@es01:/opt#wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.0-amd64.deb
logstash:
root@es01:/opt#wget https://artifacts.elastic.co/downloads/logstash/logstash-7.17.0-amd64.deb
Install Elasticsearch on the three es nodes:
root@es01:/opt#dpkg -i elasticsearch-7.17.0-amd64.deb
root@es02:/opt#dpkg -i elasticsearch-7.17.0-amd64.deb
root@es03:/opt#dpkg -i elasticsearch-7.17.0-amd64.deb
Edit the configuration file:
root@es01:/opt#cd /etc/elasticsearch
root@es01:/etc/elasticsearch# egrep -v '^#|^$' elasticsearch.yml
cluster.name: my-es
node.name: es01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.66.71
http.port: 9200
discovery.seed_hosts: ["es01","es02","es03"]
cluster.initial_master_nodes: ["192.168.66.71","192.168.66.72","192.168.66.73"]
http.cors.enabled: true
http.cors.allow-origin: "*"
root@es02:/etc/elasticsearch# egrep -v '^#|^$' elasticsearch.yml
cluster.name: my-es
node.name: es02
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.66.72
http.port: 9200
discovery.seed_hosts: ["es01", "es02","es03"]
cluster.initial_master_nodes: ["192.168.66.71", "192.168.66.72","192.168.66.73"]
http.cors.enabled: true
http.cors.allow-origin: "*"
root@es03:/etc/elasticsearch# egrep -v '^#|^$' elasticsearch.yml
cluster.name: my-es
node.name: es03
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.66.73
http.port: 9200
discovery.seed_hosts: ["es01", "es02","es03"]
cluster.initial_master_nodes: ["192.168.66.71", "192.168.66.72","192.168.66.73"]
http.cors.enabled: true
http.cors.allow-origin: "*"
Differences between the es01, es02 and es03 configuration files: node.name is the node's unique identifier within the cluster and must not repeat; network.host is the node's own IP address; discovery.seed_hosts and cluster.initial_master_nodes list the cluster's nodes and must be configured. http.cors.enabled: true and http.cors.allow-origin: "*" enable cross-origin requests, which are required when accessing Elasticsearch from browser-based tools.
Start the Elasticsearch service and enable it at boot:
root@es01:/opt#systemctl start elasticsearch.service
root@es01:/opt#systemctl enable elasticsearch.service
Start the Elasticsearch service on es02 and es03 and enable it at boot in the same way.
Verify the Elasticsearch service and that the cluster is healthy:
root@es01:/opt#systemctl status elasticsearch.service
root@es01:/opt#netstat -nltpu | grep 9200
root@es01:/opt#curl http://192.168.66.71:9200/_cat/nodes
In the _cat/nodes output, es01 is marked with *, which means es01 is the elected master node and es02 and es03 are non-master nodes.
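Cluster health can also be checked directly; with all three nodes joined, status should be green and number_of_nodes should be 3:
root@es01:/opt#curl "http://192.168.66.71:9200/_cluster/health?pretty"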
To access Elasticsearch from a browser, install the elasticsearch-head extension in Chrome; downloading the extension requires access to the Chrome Web Store (a VPN may be needed).
4: Deploy Logstash as a DaemonSet
The k8s environment was set up with kubeadm + containerd. We collect two kinds of logs: system logs under /var/log/ and microservice (pod) logs under /var/log/pods/. The log collector runs as a logstash DaemonSet.
Logstash pipeline configuration, logstash.conf:
input {
  file {
    #path => "/var/lib/docker/containers/*/*-json.log"   # docker log path
    path => "/var/log/pods/*/*/*.log"                     # containerd log path
    start_position => "beginning"
    type => "jsonfile-daemonset-applog"
  }
  file {
    path => "/var/log/*.log"
    start_position => "beginning"
    type => "jsonfile-daemonset-syslog"
  }
}
output {
  if [type] == "jsonfile-daemonset-applog" {
    kafka {
      bootstrap_servers => "${KAFKA_SERVER}"
      topic_id => "${TOPIC_ID}"
      batch_size => 16384          # size of each batch logstash sends, in bytes
      codec => "${CODEC}"
    }
  }
  if [type] == "jsonfile-daemonset-syslog" {
    kafka {
      bootstrap_servers => "${KAFKA_SERVER}"
      topic_id => "${TOPIC_ID}"
      batch_size => 16384
      codec => "${CODEC}"          # system logs are not in JSON format
    }
  }
}
logstash.yml
http.host: "0.0.0.0"
#xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
The Logstash configuration files are mounted into the logstash DaemonSet pods as ConfigMaps.
Create the ConfigMaps. Logstash needs the two configuration files logstash.conf and logstash.yml, so we create two ConfigMaps:
root@master01:/opt/logstash-daemonset/#kubectl create configmap logstash-conf --from-file=logstash.conf -n kube-system
root@master01:/opt/logstash-daemonset/#kubectl describe configmap -n kube-system logstash-conf
root@master01:/opt/logstash-daemonset/dockerfile# kubectl create configmap logstash-yml --from-file=logstash.yml -n kube-system
root@master01:/opt/logstash-daemonset/dockerfile#kubectl describe configmap -n kube-system logstash-yml
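The DaemonSet below references a locally built image, logstash:v1.17.0. Its Dockerfile is not shown in this post; a minimal sketch, assuming it is simply based on the official Logstash 7.17.0 image (the kafka output plugin ships with that image, and the pipeline and settings files are mounted from the ConfigMaps at runtime):
# Dockerfile (hypothetical)
FROM docker.elastic.co/logstash/logstash:7.17.0
Build it and make it available to every node, for example by pushing it to a registry the nodes can pull from, since the cluster runs containerd:
root@master01:/opt/logstash-daemonset/dockerfile# docker build -t logstash:v1.17.0 .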
Create the logstash DaemonSet:
root@master01:/opt/logstash-daemonset# vim DaemonSet-logstash.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logstash-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: logstash-logging
spec:
  selector:
    matchLabels:
      name: logstash-elasticsearch
  template:
    metadata:
      labels:
        name: logstash-elasticsearch
    spec:
      tolerations:
      # this toleration is to have the daemonset runnable on master nodes
      # remove it if your masters can't run pods
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: logstash-elasticsearch
        image: logstash:v1.17.0
        env:
        - name: "KAFKA_SERVER"
          value: "192.168.66.77:9092,192.168.66.78:9092,192.168.66.79:9092"
        - name: "TOPIC_ID"
          value: "jsonfile-log-topic"
        - name: "CODEC"
          value: "json"
        #resources:
        #  limits:
        #    cpu: 1000m
        #    memory: 1024Mi
        #  requests:
        #    cpu: 500m
        #    memory: 1024Mi
        securityContext:
          runAsUser: 0      # run as root; logstash reads host system logs and otherwise hits permission errors on some of them
        command: ["/usr/share/logstash/bin/logstash","-f","/usr/share/logstash/pipeline/logstash.conf"]
        volumeMounts:
        - name: varlog                     # mount for the host's system logs
          mountPath: /var/log              # host system log mount point
        - name: varlibdockercontainers     # mount for container logs; must match the collection path in logstash.conf
          #mountPath: /var/lib/docker/containers   # docker mount path
          mountPath: /var/log/pods                 # containerd mount path
          readOnly: false
        - name: logstash-conf
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
        - name: logstash-yml
          mountPath: /usr/share/logstash/config/logstash.yml
          subPath: logstash.yml
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log                   # host system logs
      - name: varlibdockercontainers
        hostPath:
          #path: /var/lib/docker/containers   # docker log path on the host
          path: /var/log/pods                 # containerd log path on the host
      - name: logstash-conf
        configMap:
          name: logstash-conf
          defaultMode: 493                 # permissions of the mounted file (0755)
          items:
          - key: logstash.conf
            path: logstash.conf
      - name: logstash-yml
        configMap:
          name: logstash-yml
          defaultMode: 493                 # permissions of the mounted file (0755)
          items:
          - key: logstash.yml
            path: logstash.yml
root@master01:/opt/logstash-daemonset# kubectl apply -f DaemonSet-logstash.yaml
Check the pods that were created:
root@master01:/opt/logstash-daemonset# kubectl get pods -n kube-system
At this point, checking in Kafka shows that a new topic has been created.
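A quick way to confirm that log messages are actually flowing into Kafka is to read a few of them with the console consumer on any Kafka node:
root@kafka01:/opt/kafka/bin# ./kafka-console-consumer.sh --bootstrap-server 192.168.66.77:9092 --topic jsonfile-log-topic --from-beginning --max-messages 5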
5: Deploy Logstash
Logstash is deployed on the es02 node and its job is to consume messages from Kafka. Due to limited resources we deploy only a single Logstash node here; in a real production environment a Logstash cluster of at least three nodes should be deployed.
root@es02:/opt# dpkg -i logstash-7.17.0-amd64.deb
Edit the Logstash configuration. The configuration lives under /etc/logstash/; create a new file in the /etc/logstash/conf.d/ directory:
root@es02:/etc/logstash/conf.d# vim logstash-daemon-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.66.77:9092,192.168.66.78:9092,192.168.66.79:9092"
    topics => ["jsonfile-log-topic"]
    codec => "json"
  }
}
output {
  #if [fields][type] == "app1-access-log" {
  if [type] == "jsonfile-daemonset-applog" {
    elasticsearch {
      hosts => ["192.168.66.71:9200","192.168.66.73:9200"]
      index => "jsonfile-daemonset-applog-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "jsonfile-daemonset-syslog" {
    elasticsearch {
      hosts => ["192.168.66.71:9200","192.168.66.73:9200"]
      index => "jsonfile-daemonset-syslog-%{+YYYY.MM.dd}"
    }
  }
}
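Before starting the service, the pipeline syntax can be checked with Logstash's built-in config test, which exits cleanly when the file parses:
root@es02:/etc/logstash/conf.d# /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/logstash-daemon-es.conf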
Start Logstash and check the port and service status:
root@es02:/etc/logstash/conf.d# systemctl start logstash.service
root@es02:/etc/logstash/conf.d# systemctl status logstash.service
root@es02:/etc/logstash/conf.d# netstat -nltpu | grep 9600
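Once Logstash has been consuming for a short while, the two daily indices defined in the output section should show up in Elasticsearch:
root@es02:/etc/logstash/conf.d# curl http://192.168.66.71:9200/_cat/indices | grep jsonfile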
6: Deploy Kibana
root@kafka01:/opt#dpkg -i kibana-7.17.0-amd64.deb
Edit the configuration file, /etc/kibana/kibana.yml:
server.host: "192.168.66.77"
server.name: "kafka01"
elasticsearch.hosts: ["http://192.168.66.71:9200"]
i18n.locale: "zh-CN"
Start Kibana and check the service status:
root@kafka01:/opt# systemctl start kibana
root@kafka01:/opt# systemctl status kibana
root@kafka01:/opt# netstat -nltpu | grep 5601
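Kibana can take a minute or so to come up; once it does, it should respond on port 5601 (a basic reachability check, not part of the original steps):
root@kafka01:/opt# curl -I http://192.168.66.77:5601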
Access Kibana from a local browser and create index patterns for the collected indices.
Verify and view the collected logs.