Python Kafka consumer not receiving messages from the beginning?

I installed Kafka on my Windows PC, created a topic quickstart-events, and sent some messages. Running the console consumer with the --from-beginning parameter receives the messages.

.\bin\windows\kafka-console-consumer.bat --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
Picked up JAVA_TOOL_OPTIONS: -agentpath:"C:\WINDOWS\system32\Aternity\Java\JavaHookLoader.dll"="C:\ProgramData\Aternity\hooks"
msg1
msg2
msg3
msg4

However, the Python code below, with the parameter auto_offset_reset='earliest', prints the messages only the first time it runs. On every later run it prints nothing. Why?

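from kafka import KafkaConsumer

# Subscribe to the topic and ask to start from the earliest offset
consumer = KafkaConsumer('quickstart-events', bootstrap_servers=['localhost:9092'], auto_offset_reset='earliest')

# Print each record as it arrives (blocks while waiting for new messages)
for msg in consumer:
    print(msg)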

Solution

TL;DR

You need to provide a new group_id every time you want to read the topic from the beginning, while keeping the setting auto_offset_reset='earliest':

KafkaConsumer('quickstart-events', bootstrap_servers=['localhost:9092'], auto_offset_reset='earliest', group_id='newGroup')
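To avoid editing the group name by hand before each run, the group id can be generated. The following is a minimal sketch, assuming the kafka-python package and the broker address from the question; the helper names fresh_group_id and read_from_beginning, the "replay" prefix, and the 5 second timeout are made up for illustration.

```python
import uuid


def fresh_group_id(prefix="replay"):
    """Return a group id that no earlier run has used (the prefix is a hypothetical name)."""
    return f"{prefix}-{uuid.uuid4().hex}"


def read_from_beginning(topic, servers=("localhost:9092",)):
    """Print every message currently in `topic`, starting at the earliest offset."""
    # Imported here so fresh_group_id stays usable without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=list(servers),
        auto_offset_reset="earliest",  # applies because the new group has no committed offset
        group_id=fresh_group_id(),     # a different group on every call
        consumer_timeout_ms=5000,      # assumption: stop iterating after 5 s without messages
    )
    for msg in consumer:
        print(msg.value)
    consumer.close()
```

Calling read_from_beginning('quickstart-events') should then print the full topic on every run, at the cost of leaving one unused consumer group behind per call.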

If your code prints the output on the first run but not on subsequent runs, and restarting Kafka (your PC) also makes the problem go away, you have run into Kafka's Consumer Group concept. As this is quite an essential concept, I highly recommend getting familiar with it.

The consumer group of an application lets Kafka track which messages that application has already consumed, so that it does not read a message twice. Each consumer has a consumer group name (even though you might not see it directly in your code). The offset position of the consumer group is stored in an internal Kafka topic.

When you run the code for the first time after restarting Kafka, Kafka does not yet know the consumer group and applies the policy given in the auto_offset_reset configuration. In your case it reads from the earliest available offset. The second time you run your code, this policy is not consulted, because Kafka already knows the consumer group and its committed offset, so the consumer resumes after the last message it read and does not receive the old messages again.

Therefore, if restarting Kafka wipes this internal knowledge of the consumer group, the auto_offset_reset policy is applied again.
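The rule described above can be written down as a toy model. This is a simplified sketch of the decision, not Kafka's actual implementation; the function name starting_offset and the offset numbers are invented for illustration, using a topic that holds four messages at offsets 0 to 3.

```python
def starting_offset(committed, policy, earliest, latest):
    """Pick the offset a consumer starts reading from (simplified model).

    committed: offset stored for the consumer group, or None if the group is unknown
    policy:    the auto_offset_reset setting, 'earliest' or 'latest'
    earliest:  oldest offset still available in the partition
    latest:    offset the next produced message will get
    """
    if committed is not None:
        # A known group always resumes where it left off; the policy is ignored.
        return committed
    if policy == "earliest":
        return earliest
    if policy == "latest":
        return latest
    raise ValueError(f"unsupported policy in this sketch: {policy!r}")


# First run: the group is unknown, so 'earliest' applies and all 4 messages are read.
first_run = starting_offset(None, "earliest", earliest=0, latest=4)
# Second run: the group committed offset 4, so nothing old is delivered again.
second_run = starting_offset(4, "earliest", earliest=0, latest=4)
# A new group id makes the group unknown again.
new_group = starting_offset(None, "earliest", earliest=0, latest=4)

print(first_run, second_run, new_group)  # prints: 0 4 0
```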

Just keep in mind that using a new group id for every run is rather a hack and should not be done too often on production systems, as the abandoned consumer groups will sit idle.
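If leaving idle groups behind is a concern, one possible alternative is to keep a single group and rewind it explicitly. This is only a sketch, assuming kafka-python's assign and seek_to_beginning methods and manual partition assignment; the function name replay_topic, the group name 'my-app', and the timeout are hypothetical, and partitions_for_topic may return None when the topic's metadata is not available.

```python
def replay_topic(topic, group_id="my-app", servers=("localhost:9092",)):
    """Re-read `topic` from the oldest offset with a fixed group; return the message count."""
    # Imported here so the function can be defined without kafka-python installed.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(
        bootstrap_servers=list(servers),
        group_id=group_id,          # hypothetical fixed group name, reused on every run
        consumer_timeout_ms=5000,   # assumption: stop iterating after 5 s without messages
    )
    partition_ids = consumer.partitions_for_topic(topic)
    if not partition_ids:
        # Topic unknown or metadata not available: nothing to rewind.
        consumer.close()
        return 0

    # Assign all partitions by hand, then move each one back to its oldest offset.
    partitions = [TopicPartition(topic, p) for p in sorted(partition_ids)]
    consumer.assign(partitions)
    consumer.seek_to_beginning(*partitions)

    count = 0
    for msg in consumer:
        print(msg.value)
        count += 1
    consumer.close()
    return count
```

Because the rewind happens in code, the stored offset of the group no longer decides where reading starts, and no new group is created per run.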

As a side note: your console consumer creates a new consumer group every single time you run it. The setting "--from-beginning" just ensures that auto_offset_reset is set to 'earliest'.

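There are many possible reasons why no data can be read from Kafka. Some common ones:

1. The messages have already been consumed: if your consumer group has consumed every message in every partition of a topic, it receives nothing more until new messages are produced. You can use the kafka-console-consumer command-line tool to check which messages a topic's partitions hold.
2. The consumer is not configured correctly: consumers that share a group.id split the topic's partitions among themselves, so each one receives only part of the messages. If every consumer should see every message, give each of them a different group.id. Also, if a consumer in the group dies, Kafka rebalances the partition assignment, and you may have to wait a while before messages arrive again.
3. The consumer's offset is wrong: every message in a partition has an offset, which marks its position in that partition. A consumer has to record the last offset it consumed on each partition so that it can resume from the right position next time. A wrong offset can cause messages to be consumed twice or skipped.
4. Network problems: if the network between the Kafka cluster and the consumer is faulty, the consumer may not be able to read messages. Check the cluster's network connectivity, or try reading the same messages with a different consumer.
5. Other problems: further possibilities include expired messages and data format issues. Kafka's logs, or more detailed logging, can help narrow these down.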