Versions:
zookeeper-3.4.10
kafka_2.11-0.9.0.1
1. First set each server's hostname and hosts table.
vi /etc/sysconfig/network
Change the hostname:
HOSTNAME=server01
vi /etc/hosts
127.0.0.1 localhost localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.1.201 server01
172.25.1.202 server02
172.25.1.203 server03
2. A JDK is required; I installed 1.8.0.
3. Three-server cluster (ZooKeeper prefers an odd number of nodes).
I used zookeeper 3.4.10.
In /opt/:
tar zxvf zookeeper-3.4.10.tar.gz
# Configure the environment variables
vi /etc/profile
Add:
# Zookeeper
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.10
export PATH=$ZOOKEEPER_HOME/bin:$PATH
Then apply the changes:
source /etc/profile
# Edit the configuration file
cd /opt/zookeeper-3.4.10/conf
mv zoo_sample.cfg zoo.cfg
vi zoo.cfg
# The complete file is as follows:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=server01:2888:3888
server.2=server02:2888:3888
server.3=server03:2888:3888
I only added the last three lines; everything else is the default. (Copy this configuration file to the other two servers.)
Configure myid on each server:
in the dataDir specified above,
vi myid
and write the myid value, i.e. the x in server.x from the configuration file. For example, server.1 gets 1.
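Writing the file can also be scripted instead of using vi. A minimal sketch, assuming the dataDir of /tmp/zookeeper from the zoo.cfg above:

```shell
# On server01 (use 2 on server02 and 3 on server03, matching server.x in zoo.cfg)
DATADIR=/tmp/zookeeper        # the dataDir from zoo.cfg
mkdir -p "$DATADIR"
echo 1 > "$DATADIR/myid"
cat "$DATADIR/myid"           # -> 1
```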
Start the server.
In bin/:
zkServer.sh start
Output:
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Check whether startup succeeded:
zkServer.sh status
You may get an error:
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
First check whether the firewall is disabled.
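A quick way to check, sketched for a CentOS 6-style box (an assumption, suggested by the /etc/sysconfig/network edit above; on CentOS 7+ the firewalld/systemctl equivalents apply):

```shell
# Show the firewall state; if the service isn't found this just says so.
service iptables status 2>/dev/null || echo "iptables service not found"
# To disable it (destructive -- only on these lab machines):
# service iptables stop && chkconfig iptables off
```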
Use
netstat -tunlp
to see whether port 3888 is listening normally. Mine looks like this:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1755/sshd
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 1644/cupsd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1943/master
tcp 0 0 ::ffff:172.25.1.201:3888 :::* LISTEN 26473/java
tcp 0 0 :::22 :::* LISTEN 1755/sshd
tcp 0 0 ::1:631 :::* LISTEN 1644/cupsd
tcp 0 0 :::45528 :::* LISTEN 26473/java
tcp 0 0 ::1:25 :::* LISTEN 1943/master
tcp 0 0 :::40666 :::* LISTEN 26968/java
tcp 0 0 :::9092 :::* LISTEN 26968/java
tcp 0 0 :::2181 :::* LISTEN 26473/java
udp 0 0 0.0.0.0:631 0.0.0.0:* 1644/cupsd
udp 0 0 0.0.0.0:68 0.0.0.0:* 254
After a successful start:
server01
[root@server01 zookeeper-3.4.10]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
server02
[root@server02 zookeeper-3.4.10]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
server03
[root@server03 zookeeper-3.4.10]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
II. Kafka cluster setup
- Extract the Kafka archive
- Configure server.properties (172.25.1.201)
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1
############################# Socket Server Settings #############################
listeners=PLAINTEXT://:9092
# The port the socket server listens on
port=9092
# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=172.25.1.201
# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured. Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
#advertised.host.name=<hostname routable by clients>
# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=<port accessible by clients>
# The number of threads handling network requests
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/tmp/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=172.25.1.201:2181,172.25.1.202:2181,172.25.1.203:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
The main configuration file is server.properties. The producer and consumer have their own producer.properties and consumer.properties, but these usually don't need to be configured separately.
3. Distribute this configuration file to each node, modify broker.id and the listeners/host.name address, create the corresponding directories, then start each node:
[root@slave1 kafka]# ./bin/kafka-server-start.sh -daemon config/server.properties
[root@slave2 kafka]# ./bin/kafka-server-start.sh -daemon config/server.properties
[root@slave3 kafka]# ./bin/kafka-server-start.sh -daemon config/server.properties
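The per-node edits in step 3 can be scripted. A sketch that derives broker 2's and broker 3's files from the broker-1 template; the /tmp working directory and the minimal stand-in template here are illustrative assumptions (run the same sed against the real config/server.properties on each node):

```shell
cd /tmp
# Minimal stand-in for the broker-1 server.properties shown above
printf 'broker.id=1\nhost.name=172.25.1.201\nlog.dirs=/tmp/kafka-logs\n' > server.properties
for i in 2 3; do
  sed -e "s/^broker.id=.*/broker.id=$i/" \
      -e "s/^host.name=.*/host.name=172.25.1.20$i/" \
      server.properties > server.properties.$i
done
mkdir -p /tmp/kafka-logs                 # the log.dirs directory must exist on every node
grep '^broker.id' server.properties.3    # -> broker.id=3
```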
- Verify the cluster
4.1. Create a topic named my-test
[root@slave1 kafka]# bin/kafka-topics.sh --create --zookeeper 172.25.1.203:2181 --replication-factor 3 --partitions 1 --topic my-test
Created topic "my-test"
4.2. Produce messages (Ctrl+C to stop)
[root@slave1 kafka]# bin/kafka-console-producer.sh --broker-list 172.25.1.201:9092 --topic my-test
Today is a good day
hello
4.3. Consume the messages on another machine
[root@slave2 kafka]# bin/kafka-console-consumer.sh --zookeeper 172.25.1.203:2181 --from-beginning --topic my-test
Today is a good day
hello
- Kafka HelloWorld
The Kafka documentation gives Java code examples for the producer and consumer.
Change the addresses (comma separated); the list only needs to be a subset of the cluster, used to discover the full cluster.
5.1. Producer example
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class Producer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // A subset of the cluster, used to bootstrap discovery of the full cluster.
        props.put("bootstrap.servers",
                "172.25.1.201:9092,172.25.1.202:9092,172.25.1.203:9092");
        props.put("acks", "all"); // wait for the full commit: slowest, but most durable
        props.put("retries", 3); // number of retries on request failure
        props.put("batch.size", 16384); // batch size in bytes
        // By default requests are sent immediately even if the buffer has free space;
        // linger.ms waits up to this many milliseconds to fill batches further,
        // trading 1 ms of latency for fewer requests to the brokers.
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432); // total memory available for buffering
        // Serializers, e.g. ByteArraySerializer or StringSerializer
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 10000; i++) {
            // Arguments: topic, key, value. send() is asynchronous: it adds the
            // record to a buffer and returns immediately, which is more efficient.
            producer.send(new ProducerRecord<String, String>("my-topic",
                    Integer.toString(i), Integer.toString(i)));
        }
        producer.close();
    }
}
5.2. Consumer example
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class Consumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // A subset of the cluster, used to bootstrap discovery of the full cluster.
        props.put("bootstrap.servers",
                "172.25.1.201:9092,172.25.1.202:9092,172.25.1.203:9092");
        props.put("group.id", "test"); // the consumer's group id
        props.put("enable.auto.commit", "true"); // commit offsets automatically
        props.put("auto.commit.interval.ms", "1000"); // commit offsets every second
        // The consumer heartbeats to the cluster; if it times out, the consumer is
        // considered dead and Kafka reassigns its partitions to other processes.
        props.put("session.timeout.ms", "30000");
        // Deserializers
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("my-topic")); // one or more topics
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset = %d, key = %s, value = %s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}
5.3. Run them; the consumer prints the consumed messages.
When I first ran it, it failed with an error that slave01:9092 could not be found.
Copying the hostname-to-IP entries from the servers' hosts file into the Windows 7 hosts file fixed it.
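The entries to copy into the Windows hosts file (typically C:\Windows\System32\drivers\etc\hosts) are the same ones added on the servers:

```
172.25.1.201 server01
172.25.1.202 server02
172.25.1.203 server03
```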