1. Create the producer object that connects to Kafka
from kafka import KafkaProducer

kafka_hosts = ["4.4.4.2", "4.4.4.3", "4.4.4.4"]
kafka_port = 6667

kafka_producer = KafkaProducer(
    bootstrap_servers=[
        "{host}:{port}".format(host=kafka_hosts[0], port=kafka_port),
        "{host}:{port}".format(host=kafka_hosts[1], port=kafka_port),
        "{host}:{port}".format(host=kafka_hosts[2], port=kafka_port),
    ],
    max_request_size=101626282,
    buffer_memory=101626282,
    acks="all",
)
2. Parameter explanation
max_request_size=101626282, buffer_memory=101626282
    Raised well above the defaults (1 MB and 32 MB respectively) so that large payloads, e.g. big files, can be sent in a single message.
acks="all"
    The send only succeeds once all in-sync replicas have acknowledged the write.
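Even with max_request_size raised, a payload larger than the limit is rejected client-side (and the broker's own message.max.bytes must also allow it). A hypothetical helper, fits_in_one_message, for checking a payload before sending:

```python
MAX_REQUEST_SIZE = 101626282  # keep in sync with the producer's max_request_size

def fits_in_one_message(payload, limit=MAX_REQUEST_SIZE):
    # A record also carries headers and protocol overhead,
    # so staying at or below the limit is a conservative check
    return len(payload) <= limit

print(fits_in_one_message(b"x" * 1024))         # small payload fits
print(fits_in_one_message(b"x" * 10, limit=5))  # exceeds the (tiny) test limit
```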
3. Call the producer's send() method to publish data
res = kafka_producer.send(topic="full-sight-insert", value=i)  # i must be bytes (or configure a value_serializer)
print(res.get())  # get() blocks until the broker acknowledges the record
Printing the result shows where the record landed in Kafka:
RecordMetadata(topic='full-sight-insert', partition=2, topic_partition=TopicPartition(topic='full-sight-insert', partition=2), offset=640782, timestamp=-1, checksum=-2084735199, serialized_key_size=-1, serialized_value_size=7122)
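Calling res.get() blocks the sender; the future returned by send() also supports non-blocking handling via add_callback()/add_errback(). A sketch, assuming a live producer; the callback bodies are illustrative:

```python
def on_send_success(metadata):
    # metadata is a RecordMetadata like the one printed above
    return "delivered to {}[{}] @ offset {}".format(
        metadata.topic, metadata.partition, metadata.offset)

def on_send_error(exc):
    # exc is the exception that caused the delivery to fail
    return "delivery failed: {!r}".format(exc)

# Attach to the future returned by send() (requires a reachable broker):
# future = kafka_producer.send(topic="full-sight-insert", value=payload)
# future.add_callback(on_send_success).add_errback(on_send_error)
```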
4. Consume Kafka messages from Python
import os
from kafka import KafkaConsumer

# `topic` and `filepath` are assumed to be defined earlier
consumer = KafkaConsumer(topic,
                         auto_offset_reset='earliest',
                         bootstrap_servers=['10.1.4.12:9092'])

filename = os.path.join(filepath, str(topic) + ".txt")
with open(filename, "ab") as f:
    while True:
        # poll() returns {TopicPartition: [ConsumerRecord, ...]} (empty dict on timeout)
        message = consumer.poll(timeout_ms=120000)
        if message:
            for records in message.values():
                for record in records:
                    f.write(record.value + b"\n")
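Since poll() returns a dict keyed by TopicPartition, the loop body ends up two levels deep. A small helper (flatten_poll_result is a hypothetical name, not part of kafka-python) keeps it flat by merging every partition's batch into one list:

```python
def flatten_poll_result(poll_result):
    # poll_result maps TopicPartition -> [ConsumerRecord, ...];
    # merge all partition batches into a single list of records
    records = []
    for batch in poll_result.values():
        records.extend(batch)
    return records

# Usage inside the poll loop (requires a reachable broker):
# for record in flatten_poll_result(consumer.poll(timeout_ms=120000)):
#     print(record.offset, record.value)
```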