Original post, 2013-12-02 15:56:40

Zookeeper Directories

The following gives the zookeeper structures and algorithms used for co-ordination between consumers and brokers.


When an element in a path is denoted [xyz], the value of xyz is not fixed and there is in fact a zookeeper znode for each possible value of xyz. For example, /topics/[topic] would be a directory named /topics containing a sub-directory for each topic name. Numerical ranges are also given, such as [0...5] to indicate the subdirectories 0, 1, 2, 3, 4 (the upper bound is not included). An arrow (-> or -->) is used to indicate the contents of a znode. For example, /hello -> world would indicate a znode /hello containing the value "world".
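As a small illustration of the range notation, the sketch below expands an element such as [0...5] into the child names it stands for. The function name `expand_range` is hypothetical and not part of Kafka or ZooKeeper; the exclusive upper bound is taken from the example in the paragraph above.

```python
import re


def expand_range(element):
    """Expand a range element such as "[0...5]" into its child znode names."""
    match = re.fullmatch(r"\[(\d+)\.\.\.(\d+)\]", element)
    if match is None:
        raise ValueError(f"not a numeric range element: {element!r}")
    start, stop = int(match.group(1)), int(match.group(2))
    # The upper bound is excluded, as in the example in the text.
    return [str(i) for i in range(start, stop)]


def expand_children(parent, element):
    """Return the full paths of the children described by a range element."""
    return [f"{parent}/{name}" for name in expand_range(element)]
```

With these helpers, `expand_children("/brokers/ids", "[0...3]")` would list the three paths /brokers/ids/0 through /brokers/ids/2.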

Broker Node Registry

/brokers/ids/[0...N] --> host:port (ephemeral node)

This is a list of all present broker nodes, each of which provides a unique logical broker id that identifies it to consumers (the id must be given as part of the broker's configuration). On startup, a broker node registers itself by creating a znode with the logical broker id under /brokers/ids. The purpose of the logical broker id is to allow a broker to be moved to a different physical machine without affecting consumers. An attempt to register a broker id that is already in use (say, because two servers are configured with the same broker id) is an error.

Since the broker registers itself in zookeeper using ephemeral znodes, this registration is dynamic and will disappear if the broker is shut down or dies (thus notifying consumers that it is no longer available).
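The registration behaviour described above can be sketched with a small in-memory stand-in. This is only a simulation under stated assumptions: it does not talk to a real zookeeper ensemble, and the class `InMemoryBrokerRegistry`, its methods, and the `session` argument are hypothetical names introduced for illustration.

```python
class InMemoryBrokerRegistry:
    """Toy stand-in for the /brokers/ids directory (not a real ZooKeeper client)."""

    def __init__(self):
        # Maps znode path -> (value, session that created the node).
        self._nodes = {}

    def register(self, broker_id, host, port, session):
        """Create the node for broker_id; a duplicate id is an error."""
        path = f"/brokers/ids/{broker_id}"
        if path in self._nodes:
            raise ValueError(f"broker id {broker_id} is already registered")
        self._nodes[path] = (f"{host}:{port}", session)
        return path

    def close_session(self, session):
        """Simulate a broker shutting down or dying: its nodes vanish."""
        owned = [p for p, (_, s) in self._nodes.items() if s == session]
        for path in owned:
            del self._nodes[path]

    def live_brokers(self):
        """Return the currently registered brokers as {path: "host:port"}."""
        return {path: value for path, (value, _) in sorted(self._nodes.items())}
```

Registering the same broker id twice raises an error, and closing a session removes that broker's entry, which mirrors how an ephemeral znode disappears with its owner.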

Broker Topic Registry

/brokers/topics/[topic]/[0...N] --> nPartitions (ephemeral node)

Each broker registers itself under the topics it maintains and stores the number of partitions for that topic.

Consumers and Consumer Groups

Consumers of topics also register themselves in Zookeeper, in order to balance the consumption of data and track their offsets in each partition for each broker they consume from.

Multiple consumers can form a group and jointly consume a single topic. Each consumer in the same group is given a shared group_id. For example if one consumer is your foobar process, which is run across three machines, then you might assign this group of consumers the id "foobar". This group id is provided in the configuration of the consumer, and is your way to tell the consumer which group it belongs to.

The consumers in a group divide up the partitions as fairly as possible, and each partition is consumed by exactly one consumer in a consumer group.
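One possible way to divide partitions fairly is sketched below. This is an illustrative scheme only, not necessarily the algorithm Kafka itself uses: it sorts both lists and hands each consumer a contiguous slice, so slice sizes differ by at most one. The function name `assign_partitions` is hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Split partitions among consumers so that sizes differ by at most one.

    Every partition goes to exactly one consumer. Sorting both inputs means
    every consumer that runs this function computes the same result.
    """
    if not consumers:
        return {}
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional partition.
        count = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment
```

For five partitions and two consumers, the first consumer in sorted order receives three partitions and the second receives two. With more consumers than partitions, some consumers receive an empty list.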

Consumer Id Registry

In addition to the group_id which is shared by all consumers in a group, each consumer is given a transient, unique consumer_id (of the form hostname:uuid) for identification purposes. Consumer ids are registered in the following directory.

/consumers/[group_id]/ids/[consumer_id] --> {"topic1": #streams, ..., "topicN": #streams} (ephemeral node)

Each of the consumers in the group registers under its group and creates a znode with its consumer_id. The value of the znode contains a map of <topic, #streams>. The id is used simply to identify each of the consumers that are currently active within a group. This is an ephemeral node, so it will disappear if the consumer process dies.
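A minimal sketch of how the registration path and value could be built is shown below. It only constructs strings and writes nothing to zookeeper. The helper names `new_consumer_id` and `consumer_registration` are hypothetical, and the use of JSON for the topic map is an assumption based on the notation in the listing above.

```python
import json
import socket
import uuid


def new_consumer_id(hostname=None):
    """Build a transient, unique id of the form hostname:uuid."""
    host = hostname or socket.gethostname()
    return f"{host}:{uuid.uuid4()}"


def consumer_registration(group_id, consumer_id, streams_per_topic):
    """Return the (path, value) pair a consumer would register.

    streams_per_topic maps each topic name to its number of streams.
    """
    path = f"/consumers/{group_id}/ids/{consumer_id}"
    value = json.dumps(streams_per_topic, sort_keys=True)
    return path, value
```

A consumer in the group "foobar" reading two streams of topic1 would register a value equivalent to {"topic1": 2} under /consumers/foobar/ids/ followed by its own id.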

Consumer Offset Tracking

Consumers track the maximum offset they have consumed in each partition. This value is stored in a zookeeper directory:

/consumers/[group_id]/offsets/[topic]/[broker_id-partition_id] --> offset_counter_value (persistent node)
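The offset layout can be illustrated with an in-memory sketch. It is not a real client: a plain dictionary stands in for the persistent znodes, and the names `offset_path` and `OffsetTracker` are hypothetical.

```python
def offset_path(group_id, topic, broker_id, partition_id):
    """Build the znode path that holds a group's offset for one partition."""
    return f"/consumers/{group_id}/offsets/{topic}/{broker_id}-{partition_id}"


class OffsetTracker:
    """Keeps the maximum consumed offset per partition for one group."""

    def __init__(self, group_id):
        self.group_id = group_id
        # Stand-in for persistent znodes: path -> offset value.
        self.offsets = {}

    def record(self, topic, broker_id, partition_id, offset):
        """Store offset unless a larger one is already recorded."""
        path = offset_path(self.group_id, topic, broker_id, partition_id)
        self.offsets[path] = max(offset, self.offsets.get(path, offset))
        return self.offsets[path]
```

Because the node is persistent, the stored value would survive a consumer restart, which is what lets a group resume from where it left off.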

Partition Owner Registry

Each broker partition is consumed by a single consumer within a given consumer group. The consumer must establish its ownership of a given partition before any consumption can begin. To establish its ownership, a consumer writes its own id in an ephemeral node under the particular broker partition it is claiming.

/consumers/[group_id]/owners/[topic]/[broker_id-partition_id] --> consumer_node_id (ephemeral node)
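The ownership claim can be sketched as a create-if-absent operation. In the sketch below a dictionary stands in for the ephemeral owner znodes, so it shows the logic only and is not a real zookeeper interaction; `owner_path`, `claim_partition`, and `release_all` are hypothetical names.

```python
def owner_path(group_id, topic, broker_id, partition_id):
    """Build the znode path that records who owns one partition."""
    return f"/consumers/{group_id}/owners/{topic}/{broker_id}-{partition_id}"


def claim_partition(owners, path, consumer_id):
    """Try to claim a partition; return True if consumer_id now owns it.

    owners is a dict standing in for the ephemeral owner znodes. The claim
    only succeeds if no other consumer has already written its id there.
    """
    current = owners.setdefault(path, consumer_id)
    return current == consumer_id


def release_all(owners, consumer_id):
    """Simulate a consumer dying: every node it owned disappears."""
    for path in [p for p, owner in owners.items() if owner == consumer_id]:
        del owners[path]
```

A second consumer that tries to claim an already owned partition gets False and must not start consuming; once the first owner's nodes are released, the claim can succeed.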

