When a Spark job consumes data from a Kafka topic, the consumer offsets are recorded in ZooKeeper. Sometimes the data volume is too large for the job to keep up, and at that point the Kafka offsets need to be adjusted manually. For example:
The first attempt was to call Kafka's built-in class kafka.tools.UpdateOffsetsInZK to modify the offsets, as follows:
[bsauser@bsa222 kafka]$ bin/kafka-run-class.sh kafka.tools.UpdateOffsetsInZK latest config/consumer.properties tam_format_alarm
updating partition 0 with new offset: 6776033
updating partition 1 with new offset: 6782580
updating partition 2 with new offset: 6778624
updating partition 3 with new offset: 6786418
updating partition 4 with new offset: 6780299
updated the offset for 5 partitions
But after restarting the Spark job, this turned out not to work. It then occurred to me that what should be updated is the offset stored in ZooKeeper for this consumer group id:
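A minimal sketch of where those per-group offsets live, assuming the Kafka 0.8.x high-level consumer layout (offsets are kept under /consumers/&lt;group&gt;/offsets/&lt;topic&gt;/&lt;partition&gt;; the group name below is illustrative, not from the original setup):

```python
def consumer_offset_path(group, topic, partition):
    """ZooKeeper node where the 0.8.x high-level consumer keeps a partition's offset."""
    return "/consumers/%s/offsets/%s/%d" % (group, topic, partition)

# Overwriting this node (e.g. via zkCli.sh: set <path> <offset>) moves the
# consumer group's position. UpdateOffsetsInZK writes nodes of the same shape,
# but only for the group.id configured in consumer.properties -- which is why
# running it did not affect the Spark job's own group.
print(consumer_offset_path("spark-consumer-group", "tam_format_alarm", 0))
# -> /consumers/spark-consumer-group/offsets/tam_format_alarm/0
```

If the Spark job uses a different group.id than the one in consumer.properties, the tool's update lands on the wrong nodes, which matches the failed restart above.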
Before doing that, first check the topic's maximum and minimum offsets; go into