kafka-python stops consuming

Latest recommended article published on 2024-06-13 21:23:00

AI算法网奇

Article link: https://blog.csdn.net/jacke121/article/details/81014376

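With kafka-python, the consumer received no data and raised no exception either.

Switching to the pykafka library solved the problem.

Usage example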

from pykafka import KafkaClient

# Connect to the local broker and look up the topic.
client = KafkaClient(hosts='127.0.0.1:9092')
topic = client.topics['logsget']
consumer = topic.get_simple_consumer()
for msg in consumer:
    if msg is not None:
        print(msg.value)

Setting consumer_group in pykafka

First, the code that meets the requirements.

Kafka producer:

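import logging
from pykafka import KafkaClient

# The original imported a module named `logger` (not in the standard library,
# presumably a local logging setup); basicConfig stands in for it here.
logging.basicConfig(level=logging.INFO)

client = KafkaClient(hosts="**")
logging.info(client.topics)

topic_name = input('please enter your topic here:')
topic = client.topics[topic_name]

producer = topic.get_sync_producer()
while True:
    event = input("Add what to event log?: ('Q' to end.): ")
    if event == 'Q':
        break
    # pykafka expects the message as bytes, so encode the text first.
    producer.produce(event.encode('utf-8'))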

Kafka consumer:

from pykafka import KafkaClient

client = KafkaClient(hosts="**")

topic_name = input('please enter the topic here:')
topic = client.topics[topic_name]
print(topic_name)
print(topic)

# To consume the same messages again on a later run, change consumer_group.
consumer = topic.get_simple_consumer(
    consumer_group='sec3',
    auto_commit_enable=True,      # commit consumed offsets automatically
    auto_commit_interval_ms=1,    # commit interval in milliseconds
    consumer_id='sec2',
)

for message in consumer:
    if message is not None:
        print(message.offset, message.value)

Now run it once.
With the consumer group set to sec2, the messages arrive normally. Run it a second time and nothing is read.

https://img-blog.csdn.net/20161123170220803

Change the group_id:

https://img-blog.csdn.net/20161123170550969

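After the change, the behavior matches Kafka's subscription model: consumed offsets are tracked per consumer group, so a group that has already consumed the messages does not receive them again, and getting the data again requires a different group_id.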
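The behavior described above can be sketched with a small in-memory model. This is only an illustration, not Kafka or pykafka code: `ToyTopic` and its `consume` method are hypothetical names, and the model assumes a group with no committed offset starts from the earliest message.

```python
class ToyTopic:
    """In-memory stand-in for one Kafka partition plus per-group committed offsets."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = {}  # consumer group name -> next offset to read

    def consume(self, group):
        """Return the messages the group has not seen yet, then commit the new offset."""
        start = self.committed.get(group, 0)  # unknown group starts at the earliest offset
        batch = self.messages[start:]
        self.committed[group] = len(self.messages)
        return batch


topic = ToyTopic(["a", "b", "c"])
first_run = topic.consume("sec2")    # new group: reads everything
second_run = topic.consume("sec2")   # same group: its offset is already committed
new_group = topic.consume("sec3")    # different group: reads from the start again
print(first_run, second_run, new_group)
```

The second call returns an empty list because the committed offset for `sec2` already points past the last message, while `sec3` has no committed offset and therefore sees all three messages.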

How I got there

In Java, the group_id looks easy to configure:

private static ConsumerConfig createConsumerConfig()
{
    Properties props = new Properties();
    props.put("zookeeper.connect", "**");
    props.put("group.id", "password");
    props.put("zookeeper.session.timeout.ms", "40000");
    props.put("zookeeper.sync.time.ms", "200");
    props.put("auto.commit.interval.ms", "1000");
    return new ConsumerConfig(props);
}

Then I tried the same thing myself:

consumer = topic.get_simple_consumer(consumer_group='sec3', consumer_id='sec2')

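But every time I consumed again, I still got the same data back.
That clearly will not do. Kafka currently handles several thousand messages per second, which adds up to more than ten million a day. The next time I consume, I do not want to receive all of that historical data.
I searched through a lot of material, but none of it mentioned this problem; it was all copied from other people and less useful than the official documentation. So I turned to the documentation: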

>>> from pykafka import KafkaClient
>>> client = KafkaClient(hosts="**")
>>> topic = client.topics['logsget']
>>> consumer = topic.get_simple_consumer()
>>> help(consumer)

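That is pretty much all the documentation pykafka has. There was nothing for it but to go through the parameters one by one and see what each of them does.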
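One way to go through the parameters without reading the whole `help()` output is to list them with their defaults using the standard `inspect` module. The sketch below is an assumption-laden illustration: `list_parameters` and `fake_get_consumer` are hypothetical names, and the stand-in function with its default values exists only so the example runs without pykafka installed.

```python
import inspect


def list_parameters(func):
    """Return (name, default) pairs for every parameter that has a default value."""
    signature = inspect.signature(func)
    return [
        (name, param.default)
        for name, param in signature.parameters.items()
        if param.default is not inspect.Parameter.empty
    ]


# Hypothetical stand-in so the sketch runs without pykafka installed;
# the parameter names mirror the ones used in this article.
def fake_get_consumer(topic, consumer_group=None, auto_commit_enable=False,
                      auto_commit_interval_ms=60000):
    return None


for name, default in list_parameters(fake_get_consumer):
    print(name, "=", default)
```

With pykafka installed, the same helper could presumably be pointed at the consumer class constructor instead of the stand-in, since `get_simple_consumer` forwards its keyword arguments there.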
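After adding the auto_commit_enable parameter (the author's "commit_enable", as used in the consumer code above), the consumer finally behaved as required.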
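The final setup can be summarized as a minimal sketch. The helper names `consumer_settings` and `consume` and the `skip_history` flag are hypothetical; the `skip_history` branch is an addition not used in the article, relying on pykafka's `auto_offset_reset` and `reset_offset_on_start` options to start from the newest offset. The pykafka imports sit inside `consume` so the settings helper can be tried without a broker.

```python
def consumer_settings(group, commit_interval_ms=1000):
    """Keyword arguments that make pykafka commit offsets for `group` automatically."""
    return {
        "consumer_group": group,
        "auto_commit_enable": True,
        "auto_commit_interval_ms": commit_interval_ms,
    }


def consume(hosts, topic_name, group, skip_history=False):
    """Print messages for `group`; needs pykafka and a reachable broker."""
    from pykafka import KafkaClient          # imported lazily so the helper above works alone
    from pykafka.common import OffsetType

    settings = consumer_settings(group)
    if skip_history:
        # Start from the newest offset instead of replaying the backlog.
        settings["auto_offset_reset"] = OffsetType.LATEST
        settings["reset_offset_on_start"] = True

    topic = KafkaClient(hosts=hosts).topics[topic_name]
    consumer = topic.get_simple_consumer(**settings)
    for message in consumer:
        if message is not None:
            print(message.offset, message.value)


print(consumer_settings("sec3"))
```

Note that with `skip_history=True` the consumer jumps to the newest offset on every start, so messages produced while it was stopped are skipped as well; whether that is acceptable depends on the use case.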