Kafka+Zookeeper+ELK (Cluster Setup)

安多的风ch

Article link: https://blog.csdn.net/u014231889/article/details/81902942

Reference blog: https://www.cnblogs.com/panwenbin-logs/p/7807105.html
Part 5, testing the whole cluster, will be added later.

Environment:

master：192.168.141.107

slave1：192.168.141.105

slave2：192.168.141.103

Required software:

kafka_2.11-2.0.0.tgz

logstash-6.3.2.tar.gz

elasticsearch-6.3.2.tar.gz

kibana-6.3.2-linux-x86_64.tar.gz

elasticsearch-head.zip

Role assignment:

Hostname	IP	Roles
master	192.168.141.107	Kafka+Zookeeper+Elasticsearch
slave1	192.168.141.105	Kafka+Zookeeper+Elasticsearch+Logstash
slave2	192.168.141.103	Kafka+Zookeeper+Elasticsearch+Kibana

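Installation steps:

Step 1: Install and configure the Kafka+Zookeeper cluster

ZooKeeper can be deployed as a separate cluster, or you can use the ZooKeeper bundled with Kafka. This walkthrough uses the bundled one.

1) Unpack

Create an elk directory under /usr/local/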
[root@master local]# mkdir elk
[root@master local]# cd elk
[root@master elk]# tar -zxvf kafka_2.11-2.0.0.tgz
Create a symlink for kafka
[root@master elk]# ln -sv kafka_2.11-2.0.0 kafka
‘kafka’ -> ‘kafka_2.11-2.0.0’
[root@master elk]# cd kafka

2) Configure ZooKeeper

Open the ZooKeeper configuration file
[root@master kafka]# vim config/zookeeper.properties 
# the directory where the snapshot is stored.
dataDir=/kafkaData/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
tickTime=2000
initLimit=20
syncLimit=10
server.1=192.168.141.107:2888:3888
server.2=192.168.141.105:2888:3888
server.3=192.168.141.103:2888:3888
Create the data directory and the myid file that ZooKeeper needs
[root@master kafka]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’
Write the number "1" to the myid file and confirm it was written
[root@master kafka]# echo "1" > /kafkaData/zookeeper/myid
[root@master kafka]# cat /kafkaData/zookeeper/myid
1

3) Configure Kafka

Open the Kafka configuration file
[root@master kafka]# vim config/server.properties 
############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from 
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.141.107:9092

# Hostname and port the broker will advertise to producers and consumers. If not set, 
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/kafkaData/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=10

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings  #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended for to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
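#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.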
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=192.168.141.107:2181,192.168.141.105:2181,192.168.141.103:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000


############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0

Copy /usr/local/elk to the other two nodes
[root@master ~]# scp -r /usr/local/elk/ root@192.168.141.105:/usr/local/
[root@master ~]# scp -r /usr/local/elk/ root@192.168.141.103:/usr/local/

Create the ZooKeeper data directory on the other two nodes:
[root@slave1 ~]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’
[root@slave2 ~]# mkdir -pv /kafkaData/zookeeper
mkdir: created directory ‘/kafkaData’
mkdir: created directory ‘/kafkaData/zookeeper’

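The other two nodes use the same configuration, except for the following:
① ZooKeeper configuration (x must match the server.x entry for that node: 2 on slave1, 3 on slave2)
echo "x" > /kafkaData/zookeeper/myid
② Kafka configuration (host.name is only used when listeners is not set, so change listeners instead)
broker.id=x
listeners=PLAINTEXT://<this node's IP>:9092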
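
The per-node differences can be applied in one step after copying /usr/local/elk to a node. What follows is a minimal sketch, not part of the original walkthrough: the function name configure_node and its argument order are hypothetical, it assumes GNU sed (for -i), and it assumes the node id equals the server.N number assigned to that IP in zookeeper.properties. Because server.properties in this guide sets listeners, the sketch rewrites that line rather than host.name.

```shell
# Hypothetical helper: apply the per-node ZooKeeper and Kafka settings.
#   $1: node id (1, 2 or 3)      $2: this node's IP
#   $3: path to server.properties $4: ZooKeeper data directory
configure_node() {
    node_id=$1; node_ip=$2; props=$3; zk_dir=$4

    # ZooKeeper reads its id from <dataDir>/myid
    mkdir -p "$zk_dir"
    echo "$node_id" > "$zk_dir/myid"

    # Rewrite broker.id and listeners in place; every other line is left untouched
    sed -i \
        -e "s|^broker\.id=.*|broker.id=${node_id}|" \
        -e "s|^listeners=.*|listeners=PLAINTEXT://${node_ip}:9092|" \
        "$props"
}

# Example for slave1 (not executed here):
#   configure_node 2 192.168.141.105 /usr/local/elk/kafka/config/server.properties /kafkaData/zookeeper
```

Keeping the id and the IP as explicit arguments avoids guessing which network interface a node should advertise.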

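4) Start ZooKeeper and Kafka

Start ZooKeeper on all three nodes (the script runs in the foreground; pass -daemon before the config file to run it in the background)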
[root@master kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties 
[root@slave1 kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties 
[root@slave2 kafka]# bin/zookeeper-server-start.sh config/zookeeper.properties 

Check that ZooKeeper is running (port 2888 is open only on the current leader, here slave1)
[root@master ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6       0      0 :::2181                 :::*                    LISTEN      4382/java          
tcp6       0      0 192.168.141.107:3888    :::*                    LISTEN      4382/java 

[root@slave1 ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6       0      0 :::2181                 :::*                    LISTEN      4029/java           
tcp6       0      0 192.168.141.105:2888    :::*                    LISTEN      4029/java           
tcp6       0      0 192.168.141.105:3888    :::*                    LISTEN      4029/java
[root@slave2 ~]# netstat -nlpt | grep -E "2181|2888|3888"
tcp6       0      0 :::2181                 :::*                    LISTEN      3649/java           
tcp6       0      0 192.168.141.103:3888    :::*                    LISTEN      3649/java

Start Kafka on all three nodes
[root@master kafka]# bin/kafka-server-start.sh config/server.properties
[root@slave1 kafka]# bin/kafka-server-start.sh config/server.properties
[root@slave2 kafka]# bin/kafka-server-start.sh config/server.properties

Verify that the cluster started successfully
Create a topic
[root@master kafka]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 1 --topic summer
Created topic "summer".
List the topics that have been created
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.107:2181
summer
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.105:2181
summer
[root@master kafka]# bin/kafka-topics.sh --list --zookeeper 192.168.141.103:2181
summer
Show the topic details:
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.107:2181 --topic summer
Topic:summer    PartitionCount:1    ReplicationFactor:2 Configs:
    Topic: summer   Partition: 0    Leader: 2   Replicas: 2,3   Isr: 2,3
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.105:2181 --topic summer
Topic:summer    PartitionCount:1    ReplicationFactor:2 Configs:
    Topic: summer   Partition: 0    Leader: 2   Replicas: 2,3   Isr: 2,3
[root@master kafka]# bin/kafka-topics.sh --describe --zookeeper 192.168.141.103:2181 --topic summer
Topic:summer    PartitionCount:1    ReplicationFactor:2 Configs:
    Topic: summer   Partition: 0    Leader: 2   Replicas: 2,3   Isr: 2,3

All three nodes report the same topic metadata, which shows that the Kafka+Zookeeper cluster has been set up successfully.
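
Listing and describing a topic only exercises the metadata stored in ZooKeeper. A further end-to-end check, not part of the original walkthrough, is to send a message through the brokers and read it back with the console tools shipped with Kafka 2.0.0. The sketch below is hedged: broker_list is a hypothetical helper that only builds the connection string, and the Kafka commands are left as comments because they need the running cluster.

```shell
# Join host IPs into a Kafka broker list: host1:port,host2:port,...
broker_list() {
    port=$1; shift
    list=""
    for host in "$@"; do
        # Prefix a comma only when the list already has an entry
        list="${list:+$list,}$host:$port"
    done
    echo "$list"
}

BROKERS=$(broker_list 9092 192.168.141.107 192.168.141.105 192.168.141.103)

# Run from the kafka directory on any node (not executed here):
#   echo "hello from master" | bin/kafka-console-producer.sh --broker-list "$BROKERS" --topic summer
#   bin/kafka-console-consumer.sh --bootstrap-server "$BROKERS" --topic summer --from-beginning --max-messages 1
```

If the consumer prints the message that the producer sent, the brokers are reachable on port 9092 and replication for the summer topic is working.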

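Step 2: Set up the Elasticsearch cluster

The ELK stack runs on the JVM, so a JDK must be installed first (this walkthrough uses JDK 1.8; the installation is straightforward and is covered by many online guides).

1) Install and configure Elasticsearch

Unpack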
[root@master elk]# tar -zxvf elasticsearch-6.3.2.tar.gz 
Create a symlink for elasticsearch-6.3.2
[root@master elk]# ln -sv elasticsearch-6.3.2 elasticsearch
‘elasticsearch’ -> ‘elasticsearch-6.3.2’
Configure Elasticsearch
[root@master elasticsearch]# vim config/elasticsearch.yml 
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /kafkaData/es/data
#
# Path to log files:
#
path.logs: /kafkaData/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 192.168.141.107
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["192.168.141.107", "192.168.141.105", "192.168.141.103"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 1
# The next two lines enable CORS to solve the cross-origin request problem
http.cors.enabled: true
http.cors.allow-origin: "*"
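
The comment above discovery.zen.minimum_master_nodes gives the majority formula: master-eligible nodes / 2 + 1. Here is a quick worked calculation, assuming all three nodes in this guide are master-eligible (the default when node.master is not set). The formula yields 2, whereas the listing uses 1, a value that allows a single isolated node to elect itself master. The function name is hypothetical.

```shell
# Majority quorum for N master-eligible nodes: N / 2 + 1 (integer division)
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # prints 2
```

With the value set to 2, a node that loses contact with both peers cannot form a cluster on its own, which is the split-brain protection the comment describes.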

Create the directories referenced in elasticsearch.yml
[root@master elasticsearch]# mkdir -pv /kafkaData/es/{data,logs}
mkdir: created directory ‘/kafkaData/es’
mkdir: created directory ‘/kafkaData/es/data’
mkdir: created directory ‘/kafkaData/es/logs’

2) Start Elasticsearch

For security reasons, Elasticsearch refuses to start as root. The fix is to create an ordinary user, grant it sudo privileges, and run Elasticsearch as that user.
[root@master elasticsearch]# useradd elk
[root@master elasticsearch]# chown -R  elk:elk  /kafkaData/es/
[root@master elasticsearch]# chown -R  elk:elk  /usr/local/elk/elasticsearch-6.3.2
[root@master elasticsearch]# chmod 640 /etc/sudoers
[root@master elasticsearch]# vim /etc/sudoers
## Allow root to run any commands anywhere 
root    ALL=(ALL)       ALL
elk     ALL=(ALL)       ALL
[root@master elasticsearch]# chmod 440 /etc/sudoers

Configure the remaining system limits (the limits.conf entries must name the user that runs Elasticsearch, here elk)
[root@master elasticsearch]# echo "elk hard nofile 65536"  >> /etc/security/limits.conf
[root@master elasticsearch]# echo "elk soft nofile 65536"  >> /etc/security/limits.conf
[root@master elasticsearch]# echo "vm.max_map_count=262144 "  >>    /etc/sysctl.conf
[root@master elasticsearch]# sysctl -p
vm.max_map_count = 262144
vm.max_map_count = 262144

The other nodes are configured the same way, except for:
network.host: x.x.x.x
node.name: xxxxxx

After a successful start:
[Image in the original post]
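
The walkthrough does not show the start command itself. A sketch, assuming the elk user and the /usr/local/elk/elasticsearch symlink created earlier: start each node in the background with bin/elasticsearch -d, then query the _cluster/health endpoint. The helper names start_es and cluster_status are hypothetical; cluster_status only extracts the status field from the JSON response with sed.

```shell
# Start Elasticsearch as the non-root elk user; -d runs it as a daemon.
start_es() {
    su - elk -c "/usr/local/elk/elasticsearch/bin/elasticsearch -d"
}

# Read _cluster/health JSON on stdin and print the status: green, yellow or red.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# On each node, then against any node once all three are up (not executed here):
#   start_es
#   curl -s "http://192.168.141.107:9200/_cluster/health" | cluster_status
#   curl -s "http://192.168.141.107:9200/_cat/nodes?v"
```

A status of green with three entries in the _cat/nodes output would indicate that all nodes joined es-cluster.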

Step 3: Set up Logstash

Unpack
[root@slave1 elk]# tar -zxvf logstash-6.3.2.tar.gz 
Add a symlink
[root@slave1 elk]# ln -sv logstash-6.3.2 logstash
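
The source stops at unpacking Logstash. To connect the pieces, Logstash needs a pipeline that reads from Kafka and writes to Elasticsearch. The following is a minimal sketch, assuming the summer topic created earlier as the input; the function name write_pipeline, the file name kafka-es.conf, the group_id value, and the index pattern are illustrative choices, not taken from the original.

```shell
# Hypothetical helper: write a Kafka -> Elasticsearch pipeline to the given path.
write_pipeline() {
    cat > "$1" <<'EOF'
input {
  kafka {
    bootstrap_servers => "192.168.141.107:9092,192.168.141.105:9092,192.168.141.103:9092"
    topics => ["summer"]
    group_id => "logstash"
  }
}
output {
  elasticsearch {
    hosts => ["192.168.141.107:9200", "192.168.141.105:9200", "192.168.141.103:9200"]
    index => "summer-%{+YYYY.MM.dd}"
  }
}
EOF
}

# On slave1, from /usr/local/elk/logstash (not executed here):
#   write_pipeline config/kafka-es.conf
#   bin/logstash -f config/kafka-es.conf --config.test_and_exit   # syntax check only
#   bin/logstash -f config/kafka-es.conf
```

Listing all three brokers and all three Elasticsearch hosts lets Logstash keep working when a single node is down.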

Step 4: Set up Kibana

Unpack
[root@slave2 elk]# tar -zxvf kibana-6.3.2-linux-x86_64.tar.gz 
Add a symlink
[root@slave2 elk]# ln -sv kibana-6.3.2-linux-x86_64 kibana
Configure
[root@slave2 config]# vim kibana.yml 
# Kibana is served by a back end server. This setting specifies the port to use.
server.port: 5601

# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "192.168.141.103"

# The URL of the Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://192.168.141.107:9200"