Apache Kafka series: the ZookeeperConsumer implementation

幽灵之使

Kafka's ZookeeperConsumer fetches data in the following steps:

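The entry point is the ZookeeperConsumerConnector method def consume[T](topicCountMap: scala.collection.Map[String,Int], decoder: Decoder[T]): Map[String,List[KafkaStream[T]]].
After the client starts, it registers a ZKRebalancerListener on the consumer registry directory to watch for child-node changes. The ZKRebalancerListener instance creates a thread internally. This thread periodically checks whether the watched event has fired (that is, whether the set of consumers has changed). If nothing has changed it waits for up to 1 second; once a change has happened it calls syncedRebalance to rebalance the consumers.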

// doRebalance is a var declared before the loop
while (!isShuttingDown.get) {
  try {
    lock.lock()
    try {
      if (!isWatcherTriggered)
        cond.await(1000, TimeUnit.MILLISECONDS) // wake up periodically so that it can check the shutdown flag
    } finally {
      doRebalance = isWatcherTriggered
      isWatcherTriggered = false
      lock.unlock()
    }
    if (doRebalance)
      syncedRebalance
  } catch {
    case t => error("error during syncedRebalance", t)
  }
}
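
The loop above only covers the waiting side. A minimal, self-contained sketch of the whole handshake follows, assuming that the listener callback sets the flag and signals the condition while holding the same lock. The names RebalanceWatcher, trigger, and onRebalance are hypothetical and used for illustration only; they are not Kafka's API.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicBoolean
import java.util.concurrent.locks.ReentrantLock

// Hypothetical stand-in for the watcher thread inside ZKRebalancerListener.
class RebalanceWatcher(onRebalance: () => Unit) {
  private val lock = new ReentrantLock()
  private val cond = lock.newCondition()
  private val isShuttingDown = new AtomicBoolean(false)
  private var isWatcherTriggered = false // guarded by lock

  private val thread = new Thread("watcher-executor") {
    override def run(): Unit = {
      while (!isShuttingDown.get) {
        var doRebalance = false
        lock.lock()
        try {
          if (!isWatcherTriggered)
            cond.await(1000, TimeUnit.MILLISECONDS) // bounded wait so shutdown is noticed
          doRebalance = isWatcherTriggered
          isWatcherTriggered = false
        } catch {
          case _: InterruptedException => // interrupted by shutdown; the loop re-checks the flag
        } finally {
          lock.unlock()
        }
        if (doRebalance) onRebalance()
      }
    }
  }
  thread.setDaemon(true)

  def start(): Unit = thread.start()

  // What a child-change callback would do (assumption): set the flag and wake the thread.
  def trigger(): Unit = {
    lock.lock()
    try {
      isWatcherTriggered = true
      cond.signalAll()
    } finally {
      lock.unlock()
    }
  }

  def shutdown(): Unit = {
    isShuttingDown.set(true)
    thread.interrupt()
    thread.join()
  }
}
```

Because the flag is only read and written under the lock, a trigger that arrives while a rebalance is running is not lost: the next loop iteration sees the flag set and skips the wait.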

Internally, syncedRebalance calls def rebalance(cluster: Cluster): Boolean to do the actual work.
The pseudocode of this method is as follows:

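// close all fetchers
closeFetchers
// release ownership of the partitions
releasePartitionOwnership
// work out, according to the assignment rule, which partitions the current consumer owns, and store them in topicRegistry
topicRegistry = getCurrentConsumerPartitionInfo
// update and restart the fetchers
updateFetcher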
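
The article does not spell out the assignment rule used in the third step. The following is a minimal sketch, assuming a range-style rule: sort the consumers and the partitions, give each consumer a contiguous slice, and let the first consumers take one extra partition when the division is uneven. RangeAssignment and partitionsFor are hypothetical names, not part of Kafka.

```scala
object RangeAssignment {
  // Returns the partitions owned by `me`, given all consumers in the group
  // and all partitions of one topic. Assumed rule, for illustration only.
  def partitionsFor(me: String, consumers: Seq[String], partitions: Seq[String]): Seq[String] = {
    val sortedConsumers = consumers.sorted
    val sortedPartitions = partitions.sorted
    val myPosition = sortedConsumers.indexOf(me)
    require(myPosition >= 0, s"consumer $me is not registered")

    val perConsumer = sortedPartitions.size / sortedConsumers.size
    val withExtra = sortedPartitions.size % sortedConsumers.size

    // The first `withExtra` consumers each take one extra partition.
    val start = perConsumer * myPosition + math.min(myPosition, withExtra)
    val count = perConsumer + (if (myPosition < withExtra) 1 else 0)
    sortedPartitions.slice(start, start + count)
  }
}
```

Every consumer can run this computation on its own from the same sorted inputs and arrive at disjoint slices, which is why a rule of this kind needs no coordinator beyond the shared registry.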

updateFetcher is implemented as follows.

private def updateFetcher(cluster: Cluster) {
      // walk the current consumer's partition info stored in topicRegistry and collect it for the Fetcher
      var allPartitionInfos : List[PartitionTopicInfo] = Nil
      for (partitionInfos <- topicRegistry.values)
        for (partition <- partitionInfos.values)
          allPartitionInfos ::= partition
      info("Consumer " + consumerIdString + " selected partitions : " +
        allPartitionInfos.sortWith((s,t) => s.partition < t.partition).map(_.toString).mkString(","))

      fetcher match {
        case Some(f) =>
          // call the fetcher's startConnections method to initialize the Fetcher and start it
          f.startConnections(allPartitionInfos, cluster)
        case None =>
      }
    }

In startConnections, the Fetcher first groups the topicInfos by broker id:

for(info <- topicInfos) {
      m.get(info.brokerId) match {
        case None => m.put(info.brokerId, List(info))
        case Some(lst) => m.put(info.brokerId, info :: lst)
      }
    }
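
The same grouping can be expressed with the standard collection method groupBy. This is a minimal sketch with a simplified, hypothetical BrokerPartition type standing in for PartitionTopicInfo.

```scala
// Hypothetical, simplified stand-in for PartitionTopicInfo.
final case class BrokerPartition(topic: String, brokerId: Int, partId: Int)

object GroupByBroker {
  // One entry per broker id, each holding the partitions hosted on that broker.
  def group(infos: Seq[BrokerPartition]): Map[Int, Seq[BrokerPartition]] =
    infos.groupBy(_.brokerId)
}
```

Unlike the prepend-based loop above, groupBy keeps the elements of each group in their original order.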

It then checks that the broker for each group of topicInfos is registered in the current cluster:

val brokers = ids.map { id =>
      cluster.getBroker(id) match {
        case Some(broker) => broker
        case None => throw new IllegalStateException("Broker " + id + " is unavailable, fetchers could not be started")
      }
    }

Finally, it creates and starts one FetcherRunnable thread per broker. This thread continuously fetches data from the server and puts it into the internal blocking queue.

// create one FetchRequest per partition
val fetches = partitionTopicInfos.map(info =>
  new FetchRequest(info.topic, info.partition.partId, info.getFetchOffset, config.fetchSize))
// execute the fetches as one batch
val response = simpleConsumer.multifetch(fetches : _*)

....
// iterate over the returned data
for((messages, infopti) <- response.zip(partitionTopicInfos)) {
          try {
            var done = false
            // the offset stored in ZooKeeper no longer exists on the Kafka brokers, e.g. the consumer has not run
            // for a long time and the data at that offset has already expired and been deleted from the cluster
            if(messages.getErrorCode == ErrorMapping.OffsetOutOfRangeCode) {
              info("offset for " + infopti + " out of range")
              // see if we can fix this error
              val resetOffset = resetConsumerOffsets(infopti.topic, infopti.partition)
              if(resetOffset >= 0) {
                infopti.resetFetchOffset(resetOffset)
                infopti.resetConsumeOffset(resetOffset)
                done = true
              }
            }
            // on success, put the messages on the queue: the partition info, the fetched messages, and the fetch offset
            // used are wrapped in a FetchedDataChunk and put on the partition info's internal queue
            // (chunkQueue.put(new FetchedDataChunk(messages, this, fetchOffset)))
            if (!done)
              read += infopti.enqueue(messages, infopti.getFetchOffset)
          }
          // .... (error handling and the end of the loop are omitted in this excerpt)
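
The hand-off from the fetcher thread to the partition's queue can be sketched as follows. This is a simplified model, not Kafka's code: it assumes that offsets are byte positions, so that a successful enqueue advances the fetch offset by the number of bytes received. Chunk and PartitionBuffer are hypothetical names.

```scala
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicLong

// Hypothetical chunk: raw bytes plus the offset they were fetched at.
final case class Chunk(bytes: Array[Byte], fetchOffset: Long)

// Hypothetical, simplified stand-in for the per-partition state.
class PartitionBuffer(startOffset: Long) {
  val chunkQueue = new LinkedBlockingQueue[Chunk]()
  private val fetchedOffset = new AtomicLong(startOffset)

  def getFetchOffset: Long = fetchedOffset.get

  // Queues a non-empty chunk and advances the fetch offset by its size (assumed byte offsets).
  // Returns the number of bytes accepted.
  def enqueue(bytes: Array[Byte], fetchOffset: Long): Long = {
    val size = bytes.length.toLong
    if (size > 0) {
      chunkQueue.put(Chunk(bytes, fetchOffset))
      fetchedOffset.addAndGet(size)
    }
    size
  }
}
```

Recording the fetch offset inside each chunk is what later lets the consuming side compare it with its own consumed offset and detect a mismatch.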

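The client uses ConsumerIterator to keep pulling data from the partition info's internal queue. ConsumerIterator implements the IteratorTemplate interface and keeps an Iterator in its current field. Each call to makeNext checks current: if it still has elements, the next one is taken from it; otherwise a new chunk is taken from the queue.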

  protected def makeNext(): MessageAndMetadata[T] = {
    var currentDataChunk: FetchedDataChunk = null
    // if we don't have an iterator, get one; first try the internal variable
    var localCurrent = current.get()
    if(localCurrent == null || !localCurrent.hasNext) {
      // nothing left in the internal variable, check the timeout value
      if (consumerTimeoutMs < 0)
        currentDataChunk = channel.take // negative (-1) means never time out; if no new data arrives, the client thread blocks in channel.take
      else {
        // a timeout is set: when no new data is available, poll returns after that time;
        // a null return value means no new data was fetched, so a timeout exception is thrown
        currentDataChunk = channel.poll(consumerTimeoutMs, TimeUnit.MILLISECONDS)
        if (currentDataChunk == null) {
          // reset state to make the iterator re-iterable
          resetState()
          throw new ConsumerTimeoutException
        }
      }
      // Kafka also puts the shutdown command on the queue as a data chunk, which preserves message ordering
      if(currentDataChunk eq ZookeeperConsumerConnector.shutdownCommand) {
        debug("Received the shutdown command")
        channel.offer(currentDataChunk)
        return allDone
      } else {
        currentTopicInfo = currentDataChunk.topicInfo
        if (currentTopicInfo.getConsumeOffset != currentDataChunk.fetchOffset) {
          error("consumed offset: %d doesn't match fetch offset: %d for %s;\n Consumer may lose data"
                        .format(currentTopicInfo.getConsumeOffset, currentDataChunk.fetchOffset, currentTopicInfo))
          currentTopicInfo.resetConsumeOffset(currentDataChunk.fetchOffset)
        }
        // turn the messages of the chunk just taken into an iterator
        localCurrent = if (enableShallowIterator) currentDataChunk.messages.shallowIterator
                       else currentDataChunk.messages.iterator
        // store the new iterator in current so the next call can read from it directly
        current.set(localCurrent)
      }
    }
    // take the next message and set consumedOffset from its offset
    val item = localCurrent.next()
    consumedOffset = item.offset
    // decode the message, wrap it with its topic in a MessageAndMetadata object, and return it
    new MessageAndMetadata(decoder.toEvent(item.message), currentTopicInfo.topic)
  }
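
The control flow of makeNext (block or poll with a timeout, stop on a sentinel chunk, otherwise iterate over the chunk) can be reproduced in isolation. The sketch below is a simplified model built on a plain java.util.concurrent.BlockingQueue; Batch, Shutdown, IteratorTimeout, and QueueIterator are hypothetical names, not Kafka classes.

```scala
import java.util.concurrent.{BlockingQueue, TimeUnit}

object QueueIteratorSketch {
  // Hypothetical chunk type; one dedicated instance doubles as the shutdown sentinel.
  final class Batch(val items: List[String])
  val Shutdown = new Batch(Nil)

  final class IteratorTimeout extends RuntimeException("no data within timeout")

  // Iterates over the items of queued batches; timeoutMs < 0 means block forever.
  final class QueueIterator(channel: BlockingQueue[Batch], timeoutMs: Long) extends Iterator[String] {
    private var current: Iterator[String] = Iterator.empty
    private var done = false

    // Loads batches until one has items, the sentinel arrives, or the timeout expires.
    private def fill(): Unit =
      while (!done && !current.hasNext) {
        val batch =
          if (timeoutMs < 0) channel.take()
          else channel.poll(timeoutMs, TimeUnit.MILLISECONDS)
        if (batch == null) throw new IteratorTimeout
        if (batch eq Shutdown) {
          channel.offer(batch) // put it back so other iterators on this queue also stop
          done = true
        } else current = batch.items.iterator
      }

    override def hasNext: Boolean = { fill(); !done }

    override def next(): String = {
      fill()
      if (done) throw new NoSuchElementException else current.next()
    }
  }
}
```

Sending the shutdown signal through the same queue as the data means it is seen only after every batch queued before it, so no already-fetched data is skipped on shutdown.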

The next method of ConsumerIterator:

  override def next(): MessageAndMetadata[T] = {
    val item = super.next()
    if(consumedOffset < 0)
      throw new IllegalStateException("Offset returned by the message set is invalid %d".format(consumedOffset))
    // use the consumedOffset set by makeNext to update the topicInfo's consumed offset
    currentTopicInfo.resetConsumeOffset(consumedOffset)
    val topic = currentTopicInfo.topic
    trace("Setting %s consumed offset to %d".format(topic, consumedOffset))
    ConsumerTopicStat.getConsumerTopicStat(topic).recordMessagesPerTopic(1)
    ConsumerTopicStat.getConsumerAllTopicStat().recordMessagesPerTopic(1)
    // return the item obtained from makeNext
    item
  }

KafkaStream wraps ConsumerIterator one step further: calling the stream's next method returns the data (internally it calls ConsumerIterator's next method).

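Note:
The way ConsumerIterator is implemented can cause messages to be consumed more than once (whether this happens depends on how the producer produces the data). A FetchedDataChunk is a collection of data that contains many data blocks. One block may contain several messages, yet all messages in the same block share a single offset. So when a block holds several messages and an exception occurs after only some of them have been processed, the consumer fetches again, receives the same block again, and the messages it has already consumed are consumed a second time.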
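
The effect described in this note can be reproduced with a small model. It is a sketch under the stated assumption that a block carries several messages but only one offset, so the consumer can resume only at a block boundary. Block, consume, and failAfter are hypothetical names for illustration.

```scala
object DuplicateDeliverySketch {
  // Hypothetical model: several messages, one offset per block.
  final case class Block(offset: Long, messages: List[String])

  // Consumes blocks from `startOffset` and simulates a crash after `failAfter` messages.
  // Returns the messages processed and the offset from which a restart would resume.
  def consume(log: List[Block], startOffset: Long, failAfter: Int): (List[String], Long) = {
    val pending = log.filter(_.offset >= startOffset)
    val processed = pending.flatMap(_.messages).take(failAfter)

    // Only blocks whose messages were all processed move the resume point forward.
    val completed = pending
      .scanLeft(0)((count, block) => count + block.messages.size).tail
      .zip(pending)
      .takeWhile { case (cumulative, _) => cumulative <= processed.size }
    val resumeOffset = completed.lastOption.map(_._2.offset + 1).getOrElse(startOffset)

    (processed, resumeOffset)
  }
}
```

With a first block of three messages and a crash after two of them, the resume offset is still the start of that block, so a second run delivers those two messages again. A consumer that cannot tolerate this has to deduplicate or make its processing idempotent.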

This article is reposted from Tian Jiaguo (田加国): http://www.tianjiaguo.com/system-architecture/kafka/kafka的zookeeperconsumer实现/