Kafka + Zookeeper + ELK (Cluster Setup)
Reference blog: https://www.cnblogs.com/panwenbin-logs/p/7807105.html
Part 5 (testing the whole cluster) will be added later.
Environment:
master:192.168.141.107
slave1:192.168.141.105
slave2:192.168.141.103
Required software:
kafka_2.11-2.0.0.tgz
logstash-6.3.2.tar.gz
elasticsearch-6.3.2.tar.gz
kibana-6.3.2-linux-x86_64.tar.gz
elasticsearch-head.zip
Role assignment:
Hostname | IP | Role |
---|---|---|
master | 192.168.141.107 | Kafka+Zookeeper+Elasticsearch |
slave1 | 192.168.141.105 | Kafka+Zookeeper+Elasticsearch+Logstash |
slave2 | 192.168.141.103 | Kafka+Zookeeper+Elasticsearch+Kibana |
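Installation steps:
Step 1: Install and configure the Kafka + Zookeeper cluster
Zookeeper can be deployed as a separate cluster, or you can use the Zookeeper that ships with Kafka. This guide uses the bundled one.
1) Extract
Create an elk directory under /usr/local/ and place the downloaded archives in it: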
[root@master local]# mkdir elk
[root@master local]# cd elk
[root@master elk]# tar -zxvf kafka_2.11-2.0.0.tgz
Create a symlink for kafka:
[root@master elk]# ln -sv kafka_2.11-2.0.0 kafka
‘kafka’ -> ‘kafka_2.11-2.0.0’
[root@master elk]# cd kafka
2) Configure Zookeeper
Open the Zookeeper configuration file:
[root@master kafka]# vim config/zookeeper.properties
# the directory where the snapshot is stored.
dataDir=/kafkaData/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
tickTime=2000
initLimit=20
syncLimit=10
server.1=192.168.141.107:2888:3888
server.2=192.168.141.105:2888:3888
server.3=192.168.141.103:2888:3888
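initLimit and syncLimit are counted in ticks, not milliseconds, so their real duration depends on tickTime. A small sketch of the arithmetic for the values above (the function name ticks_to_seconds is only for illustration):

```shell
# Convert a Zookeeper limit expressed in ticks to seconds.
# $1 = tickTime in milliseconds, $2 = number of ticks
ticks_to_seconds() {
  echo $(( $1 * $2 / 1000 ))
}

ticks_to_seconds 2000 20   # initLimit: prints 40
ticks_to_seconds 2000 10   # syncLimit: prints 20
```

With these settings a follower has up to 40 seconds to connect to and sync with the leader at startup, and may fall at most 20 seconds behind it afterwards.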
Create the data directory and the myid file that Zookeeper needs:
[root@master kafka]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’
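Write the number "1" to the myid file and confirm it. The value must match the N of this host's server.N entry in zookeeper.properties: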
[root@master kafka]# echo "1" > /kafkaData/zookeeper/myid
[root@master kafka]# cat /kafkaData/zookeeper/myid
1
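Because the myid value has to agree with the server.N line that carries the host's IP, it can be derived from the configuration file instead of typed by hand. A minimal sketch; zk_myid_for is a hypothetical helper name, and it assumes the server.N=ip:port:port format used above:

```shell
# Print the N of the "server.N=<ip>:..." line that matches the given IP.
# $1 = this host's IP, $2 = path to zookeeper.properties
zk_myid_for() {
  awk -F'[=:]' -v ip="$1" '
    /^server\.[0-9]+=/ && $2 == ip {
      sub(/^server\./, "", $1)   # keep only the numeric id
      print $1
    }' "$2"
}

# Usage on master (paths as in this guide):
# zk_myid_for 192.168.141.107 config/zookeeper.properties > /kafkaData/zookeeper/myid
```

If the IP is not listed in the file, the function prints nothing, so an empty myid file is a sign that the server list and the host address disagree.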
3) Configure Kafka
Open the Kafka configuration file:
[root@master kafka]# vim config/server.properties
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.141.107:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3
# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/kafkaData/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=10
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended for to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=192.168.141.107:2181,192.168.141.105:2181,192.168.141.103:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
Copy /usr/local/elk to the other two nodes:
[root@master ~]# scp -r /usr/local/elk/ root@192.168.141.105:/usr/local/
[root@master ~]# scp -r /usr/local/elk/ root@192.168.141.103:/usr/local/
Create the Zookeeper data directory on the other two nodes:
[root@slave1 ~]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’
[root@slave2 ~]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’
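The other two nodes use the same configuration, with these differences:
① Zookeeper: the myid value must match the node's server.N entry (2 on slave1, 3 on slave2)
echo "x" > /kafkaData/zookeeper/myid
② Kafka: the broker id must be unique, and listeners must use the node's own IP
broker.id=x
listeners=PLAINTEXT://<this node's IP>:9092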
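The per-node Kafka changes can be scripted once the directory has been copied. A minimal sketch, assuming server.properties already contains uncommented broker.id and listeners lines; patch_broker_conf is a hypothetical helper, and the id-to-host mapping in the usage comment is an assumption:

```shell
# Rewrite broker.id and listeners in a copied server.properties.
# $1 = broker id, $2 = this node's IP, $3 = path to server.properties
patch_broker_conf() {
  sed \
    -e "s|^broker\.id=.*|broker.id=$1|" \
    -e "s|^listeners=.*|listeners=PLAINTEXT://$2:9092|" \
    "$3" > "$3.tmp" && mv "$3.tmp" "$3"
}

# Usage, e.g. on slave1 (assuming it gets broker id 2):
# patch_broker_conf 2 192.168.141.105 /usr/local/elk/kafka/config/server.properties
```

Only lines that start with broker.id= or listeners= are touched, so the commented-out examples in the file stay as they are.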
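4) Start Zookeeper and Kafka
Start Zookeeper on all three nodes. The script runs in the foreground; pass -daemon as its first argument to run it in the background: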
[root@master kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties
[root@slave1 kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties
[root@slave2 kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties
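Check that Zookeeper is listening. 2181 is the client port and 3888 the leader-election port; 2888 is only open on the current leader (slave1 in the output below):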
[root@master ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6 0 0 :::2181 :::* LISTEN 4382/java
tcp6 0 0 192.168.141.107:3888 :::* LISTEN 4382/java
[root@slave1 ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6 0 0 :::2181 :::* LISTEN 4029/java
tcp6 0 0 192.168.141.105:2888 :::* LISTEN 4029/java
tcp6 0 0 192.168.141.105:3888 :::* LISTEN 4029/java
[root@slave2 ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6 0 0 :::2181 :::* LISTEN 3649/java
tcp6 0 0 192.168.141.103:3888 :::* LISTEN 3649/java
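Besides looking at open ports, each node can be asked for its role directly. The sketch below parses the reply to Zookeeper's srvr four-letter command; it assumes nc (netcat) is installed and that the command is enabled on the bundled Zookeeper version. zk_mode is a hypothetical helper name:

```shell
# Read `srvr` output on stdin and print the value of the "Mode:" line
# (leader, follower or standalone).
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# Usage:
# for h in 192.168.141.107 192.168.141.105 192.168.141.103; do
#   printf '%s ' "$h"; echo srvr | nc "$h" 2181 | zk_mode
# done
```

A healthy three-node ensemble should report one leader and two followers.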
Start Kafka on all three nodes (this script also runs in the foreground unless -daemon is passed):
[root@master kafka]# bin/kafka-server-start.sh config/server.properties
[root@slave1 kafka]# bin/kafka-server-start.sh config/server.properties
[root@slave2 kafka]# bin/kafka-server-start.sh config/server.properties
Verify that the cluster started successfully
Create a topic:
[root@master kafka]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 1 --topic summer
Created topic "summer".
List the existing topics through each Zookeeper node:
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.107:2181
summer
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.105:2181
summer
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.103:2181
summer
Show the topic details:
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.107:2181 --topic summer
Topic:summer PartitionCount:1 ReplicationFactor:2 Configs:
Topic: summer Partition: 0 Leader: 2 Replicas: 2,3 Isr: 2,3
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.105:2181 --topic summer
Topic:summer PartitionCount:1 ReplicationFactor:2 Configs:
Topic: summer Partition: 0 Leader: 2 Replicas: 2,3 Isr: 2,3
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.103:2181 --topic summer
Topic:summer PartitionCount:1 ReplicationFactor:2 Configs:
Topic: summer Partition: 0 Leader: 2 Replicas: 2,3 Isr: 2,3
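A partition is healthy when every broker in its Replicas list also appears in its Isr (in-sync replicas) list, as in the output above. The sketch below reads --describe output and prints the partitions where that is not the case; under_replicated is a hypothetical helper name:

```shell
# Read `kafka-topics.sh --describe` output on stdin and print the number of
# every partition whose Isr list is shorter than its Replicas list.
under_replicated() {
  awk '
    /Partition:/ {
      part = ""; reps = ""; isr = ""
      for (k = 1; k < NF; k++) {
        if ($k == "Partition:") part = $(k + 1)
        if ($k == "Replicas:")  reps = $(k + 1)
        if ($k == "Isr:")       isr  = $(k + 1)
      }
      if (split(isr, a, ",") < split(reps, b, ",")) print part
    }'
}

# Usage:
# bin/kafka-topics.sh --describe --zookeeper 192.168.141.107:2181 --topic summer | under_replicated
```

No output means all replicas are in sync. kafka-topics.sh --describe also accepts an --under-replicated-partitions flag that applies a similar filter inside the tool.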
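This confirms that the Kafka + Zookeeper cluster is up and running.
Step 2: Set up the Elasticsearch cluster
The ELK stack runs on the JVM, so a JDK must be installed first. This guide uses JDK 1.8; installing it is straightforward and is not covered here.
1) Install and configure Elasticsearch
Extract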
[root@master elk]# tar -zxvf elasticsearch-6.3.2.tar.gz
Create a symlink for elasticsearch-6.3.2:
[root@master elk]# ln -sv elasticsearch-6.3.2 elasticsearch
‘elasticsearch’ -> ‘elasticsearch-6.3.2’
Configure Elasticsearch:
[root@master elasticsearch]# vim config/elasticsearch.yml
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /kafkaData/es/data
#
# Path to log files:
#
path.logs: /kafkaData/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 192.168.141.107
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["192.168.141.107", "192.168.141.105", "192.168.141.103"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 2
# The next two lines enable CORS so that browser-based tools such as elasticsearch-head can connect
http.cors.enabled: true
http.cors.allow-origin: "*"
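The split-brain comment in the file gives the formula for discovery.zen.minimum_master_nodes: total master-eligible nodes / 2 + 1, using integer division. A small sketch of that arithmetic (min_master_nodes is a name made up for illustration):

```shell
# Quorum for N master-eligible nodes: N / 2 + 1 (integer division).
min_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # prints 2
```

For the three nodes in this guide, all of which are master-eligible by default, the formula gives 2.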
Create the directories referenced in elasticsearch.yml:
[root@master elasticsearch]# mkdir -pv /kafkaData/es/{data,logs}
mkdir: created directory ‘/kafkaData/es’
mkdir: created directory ‘/kafkaData/es/data’
mkdir: created directory ‘/kafkaData/es/logs’
2) Start Elasticsearch
For security reasons, Elasticsearch refuses to run as root. Create a regular user, give it sudo rights, and use it for the Elasticsearch-related steps.
[root@master elasticsearch]# useradd elk
[root@master elasticsearch]# chown -R elk:elk /kafkaData/es/
[root@master elasticsearch]# chown -R elk:elk /usr/local/elk/elasticsearch-6.3.2
[root@master elasticsearch]# chmod 640 /etc/sudoers
[root@master elasticsearch]# vim /etc/sudoers
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
elk ALL=(ALL) ALL
[root@master elasticsearch]# chmod 440 /etc/sudoers
Configure the remaining system limits (the nofile entries must name the user that runs Elasticsearch, here elk):
[root@master elasticsearch]# echo "elk hard nofile 65536" >> /etc/security/limits.conf
[root@master elasticsearch]# echo "elk soft nofile 65536" >> /etc/security/limits.conf
[root@master elasticsearch]# echo "vm.max_map_count=262144 " >> /etc/sysctl.conf
[root@master elasticsearch]# sysctl -p
vm.max_map_count = 262144
vm.max_map_count = 262144
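Elasticsearch checks both limits at startup when it is bound to a non-loopback address, and refuses to start if the open-file limit is below 65536 or vm.max_map_count is below 262144. The sketch below compares the effective values against those minimums before the first start; es_limits_ok is a hypothetical helper name:

```shell
# Return 0 if both values meet the Elasticsearch bootstrap-check minimums.
# $1 = open-file limit (ulimit -n), $2 = vm.max_map_count
es_limits_ok() {
  [ "$1" -ge 65536 ] && [ "$2" -ge 262144 ]
}

# Usage (run as root; "elk" is the user that will start Elasticsearch):
# es_limits_ok "$(su - elk -c 'ulimit -n')" "$(sysctl -n vm.max_map_count)" \
#   && echo "limits OK" || echo "limits too low"
```

The limits.conf change only applies to new login sessions, which is why the usage line opens a fresh session for the elk user.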
The other nodes are configured the same way, except for these two settings in elasticsearch.yml:
network.host: x.x.x.x
node.name: xxxxxx
After a successful start, each node answers HTTP requests on port 9200.
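Once every node is configured, Elasticsearch can be started as the elk user and the cluster state checked over HTTP. The commands in the usage comment are a sketch based on the paths and addresses used in this guide, and es_status is a hypothetical helper that pulls the status field out of the _cluster/health response:

```shell
# Read a _cluster/health JSON response on stdin and print its status
# (green, yellow or red).
es_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Usage (on each node, as root; -d runs Elasticsearch as a daemon):
# su - elk -c "/usr/local/elk/elasticsearch/bin/elasticsearch -d"
#
# Then, from any node:
# curl -s http://192.168.141.107:9200/_cluster/health | es_status
```

A green status with number_of_nodes equal to 3 in the full response indicates that all three nodes have joined es-cluster.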
Step 3: Set up Logstash
Extract
[root@slave1 elk]# tar -zxvf logstash-6.3.2.tar.gz
Add a symlink:
[root@slave1 elk]# ln -sv logstash-6.3.2 logstash
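This guide stops at extracting Logstash. Given the role layout, a pipeline would typically read from a Kafka topic and write to Elasticsearch. The sketch below writes such a pipeline file; the topic summer is reused from the Kafka test above, while the helper name write_pipeline, the file name kafka-es.conf and the index pattern are assumptions, not taken from the original:

```shell
# Write a minimal Kafka -> Elasticsearch pipeline to the given path.
write_pipeline() {
  cat > "$1" <<'EOF'
input {
  kafka {
    bootstrap_servers => "192.168.141.107:9092,192.168.141.105:9092,192.168.141.103:9092"
    topics => ["summer"]
  }
}
output {
  elasticsearch {
    hosts => ["192.168.141.107:9200", "192.168.141.105:9200", "192.168.141.103:9200"]
    index => "summer-%{+YYYY.MM.dd}"
  }
}
EOF
}

# Usage on slave1:
# write_pipeline /usr/local/elk/logstash/config/kafka-es.conf
# cd /usr/local/elk/logstash
# bin/logstash -f config/kafka-es.conf --config.test_and_exit   # syntax check only
# bin/logstash -f config/kafka-es.conf                          # run the pipeline
```

Running the syntax check first catches typos in the pipeline file before Logstash connects to Kafka or Elasticsearch.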
Step 4: Set up Kibana
Extract
[root@slave2 elk]# tar -zxvf kibana-6.3.2-linux-x86_64.tar.gz
Add a symlink:
[root@slave2 elk]# ln -sv kibana-6.3.2-linux-x86_64 kibana
Configure
[root@slave2 config]# vim kibana.yml
# Kibana is served by a back end server. This setting specifies the port to use.
server.port: 5601
# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "192.168.141.103"
# The URL of the Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://192.168.141.107:9200"
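With this configuration Kibana should listen on 192.168.141.103:5601 once it has been started with bin/kibana from the kibana directory. A minimal sketch of a reachability check, similar in spirit to the netstat checks used for Zookeeper; port_open is a hypothetical helper and relies on bash's /dev/tcp feature:

```shell
# Return 0 if a TCP connection to host:port can be opened (bash only).
# $1 = host, $2 = port
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Usage, after starting Kibana on slave2:
# port_open 192.168.141.103 5601 && echo "Kibana is listening"
```

If the port is open, the Kibana UI should be reachable in a browser at http://192.168.141.103:5601.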