How ZooKeeper and Kafka work together

Reposted 2016-05-31 19:17:54




First of all, on the consumer side ZooKeeper is needed only by the high-level consumer. SimpleConsumer does not require ZooKeeper to work.

The high-level consumer needs ZooKeeper for two things: tracking consumed offsets and handling load balancing.

Now in more detail.

Regarding offset tracking, imagine the following scenario: you start a consumer, consume 100 messages, and shut the consumer down. The next time you start it, you'll probably want to resume from where you stopped (offset 100), which means the position you reached has to be stored somewhere. Here's where ZooKeeper kicks in: it stores offsets for every group/topic/partition. So the next time you start your consumer, it can ask "hey ZooKeeper, what's the offset I should start consuming from?". Kafka is actually moving towards being able to store offsets not only in ZooKeeper but in other storages as well (for now only the ZooKeeper and Kafka offset storages are available, and I'm not sure the Kafka storage is fully implemented).
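The offset bookkeeping described above can be sketched roughly as follows. This is only an illustration: OffsetStore and consume are hypothetical names, the dictionary is an in-memory stand-in for whatever storage holds the offsets, and none of this is the real Kafka or ZooKeeper client API. It also assumes offsets start at 0, so after 100 messages the next offset to read is 100.

```python
class OffsetStore:
    """In-memory stand-in for the offset storage (hypothetical, not a real API)."""

    def __init__(self):
        # One entry per (group, topic, partition), as in the text above.
        self._offsets = {}

    def commit(self, group, topic, partition, next_offset):
        # Remember the offset the consumer should resume from.
        self._offsets[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition):
        # A group that has never committed starts from offset 0.
        return self._offsets.get((group, topic, partition), 0)


def consume(store, log, group, topic, partition, max_messages):
    """Read up to max_messages from a partition log, then commit the new position."""
    start = store.fetch(group, topic, partition)
    batch = log[start:start + max_messages]
    store.commit(group, topic, partition, start + len(batch))
    return batch


store = OffsetStore()
partition_log = list(range(250))  # pretend messages; value == offset

# First run: consume 100 messages, then "shut down".
first_run = consume(store, partition_log, "my-group", "my-topic", 0, 100)

# Second run: a fresh consumer in the same group resumes at offset 100.
second_run = consume(store, partition_log, "my-group", "my-topic", 0, 100)
```

Because the position is keyed by group, a consumer in a different group would start again from offset 0 on the same partition.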

Regarding load balancing, the volume of messages produced can be too large for one machine to handle, and you'll probably want to add computing power at some point. Let's say you have a topic with 100 partitions and 10 machines to handle that volume. Several questions arise here:

  • how should these 10 machines divide the partitions among themselves?
  • what happens if one of the machines dies?
  • what happens if you want to add another machine?

And again, here's where ZooKeeper kicks in: it tracks all consumers in a group, and each high-level consumer subscribes to changes in that group. When a consumer appears or disappears, ZooKeeper notifies all consumers and triggers a rebalance, so that they split the partitions near-equally (i.e. to balance the load). This guarantees that if one consumer dies, the others will continue processing the partitions it owned.
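A rebalance of this kind can be sketched as below, using the 100 partitions and 10 machines from the example. The assign_partitions helper is hypothetical and uses a simple round-robin split over sorted consumer names; it is not Kafka's actual assignment algorithm, only one way to get a near-equal split that every consumer can compute independently from the same membership list.

```python
def assign_partitions(consumers, num_partitions):
    """Split partitions 0..num_partitions-1 near-equally among consumers.

    Hypothetical round-robin scheme for illustration only.
    """
    # Sorting gives every consumer the same view of the group ordering.
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment
    for partition in range(num_partitions):
        owner = members[partition % len(members)]
        assignment[owner].append(partition)
    return assignment


# Ten machines share a topic with 100 partitions.
group = ["machine-%02d" % i for i in range(10)]
before = assign_partitions(group, 100)

# One machine dies: the survivors are notified and recompute the split.
survivors = [m for m in group if m != "machine-03"]
after_failure = assign_partitions(survivors, 100)

# An eleventh machine joins: the same recomputation spreads the load again.
after_join = assign_partitions(group + ["machine-10"], 100)
```

In each case every partition still has exactly one owner, and no consumer holds more than one partition above any other.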


