Q: How does Kafka send a message exactly once, without duplicates? By adding a PID and sequence numbers, i.e. producer idempotence.
kafka
Definition:
A high-throughput distributed message queue system. It follows the producer-consumer model, guarantees FIFO ordering (within a partition), manages consumers through consumer groups, and spreads a topic across multiple partitions.
Cluster:
producer
broker
responsible for reads, writes, and storage
consumer
zookeeper
stores consumption offsets (brokers record offsets; consumers consume by offset)
metadata such as topics and partitions
Inside Kafka
A topic is split into many partitions; every message is assigned an offset; each partition has multiple replicas. Ordering is guaranteed within a partition but not across partitions, so if the sink must see ordered output, use a single partition (a producer sketch follows this list).
A partition consists of multiple segments (each a log file plus an index file)
The log file holds the messages; each message carries a monotonically increasing offset
The index file is an index over the log
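A minimal producer sketch that pins every record to partition 0 so records stay in send order; the broker address, topic, and class name are placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "host1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // Explicit partition 0: all records land in one partition and keep their send order.
        producer.send(new ProducerRecord<>("topic_name", 0, "key", "value"));
        producer.close();
    }
}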
Kafka and ZooKeeper
Kafka stores the following metadata in ZooKeeper (a client sketch for inspecting these paths follows the list):
1. Controller (controller)
2. Admin operations (admin)
3. Configuration (config)
4. Broker nodes and topics (brokers)
    broker ids
        node id
    topics
        topic name
            partition index
5. Controller election count (controller_epoch)
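These znodes can be read with the plain ZooKeeper Java client; a minimal sketch, assuming a ZooKeeper at a placeholder address:

import org.apache.zookeeper.ZooKeeper;

public class KafkaZkPaths {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> {});
        System.out.println(zk.getChildren("/brokers/ids", false));    // live broker ids
        System.out.println(zk.getChildren("/brokers/topics", false)); // topic names
        System.out.println(new String(zk.getData("/controller", false, null))); // current controller
        zk.close();
    }
}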
Writing into Kafka from outside
Synchronous
Asynchronous
Description:
After the message is written into Kafka's buffer, Kafka returns an acknowledgment; the message is then written asynchronously into the partition.
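A minimal sketch of both styles with the Java producer (address, topic, and class name are placeholders): send(...).get() blocks until the broker acknowledges, while send(record, callback) returns after buffering.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class SyncAsyncProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "host1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Synchronous: block on the returned Future until the write is acknowledged.
        RecordMetadata meta = producer.send(new ProducerRecord<>("topic_name", "k1", "v1")).get();
        System.out.println("sync write at offset " + meta.offset());

        // Asynchronous: buffered immediately; the callback fires when the write completes.
        producer.send(new ProducerRecord<>("topic_name", "k2", "v2"), (metadata, e) -> {
            if (e != null) e.printStackTrace(); // handle the failed write
        });
        producer.close();
    }
}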
Custom topic partitioning:
Process:
Implement the Partitioner interface, computing the partition as hash(key) mod partition count
Set the partitioner.class property
public int partition(String topic, Object key, byte[] keyBytes,
                     Object value, byte[] valueBytes, Cluster cluster) {
    return Math.abs(key.hashCode()) % cluster.partitionCountForTopic(topic);
}
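A fuller sketch of such a partitioner; the class and package names are illustrative, and masking with Integer.MAX_VALUE avoids the Math.abs(Integer.MIN_VALUE) edge case:

import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class HashPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionCountForTopic(topic);
        if (key == null) return 0;                          // no key: fall back to partition 0
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}

It is wired in through the producer properties, e.g. props.put("partitioner.class", "com.example.HashPartitioner");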
Consuming
Description:
consumer_group -> consumer -> application
Single-threaded
extends Thread
Constructor
private Properties configure() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "host1:9092");  // cluster address (placeholder)
    props.put("group.id", "group1");               // consumer group (placeholder)
    props.put("enable.auto.commit", "true");       // auto-commit offsets
    props.put("auto.commit.interval.ms", "1000");  // auto-commit interval
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    return props;
}
public void run() {
    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(configure());
    TopicPartition partition = new TopicPartition("topic_name", 0);
    consumer.assign(Arrays.asList(partition));  // manually assign partition 0 (no group rebalancing)
    while (true) {
        for (ConsumerRecord<String, String> record : consumer.poll(100)) {
            // process each record here
        }
    }
}
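Assembled, the two methods above live in a Thread subclass; a minimal skeleton with the imports it needs (the class name is illustrative):

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SingleThreadConsumer extends Thread {

    // configure() and run() from the sketch above go here

    public static void main(String[] args) {
        new SingleThreadConsumer().start();  // run the consumer on its own thread
    }
}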
Multi-threaded
Create a logger
private static final Logger LOG = LoggerFactory.getLogger(XX.class);
Declare a consumer instance
private final KafkaConsumer<String, String> consumer;
Declare a thread pool
private ExecutorService executorService;
public XX() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "host1:9092");  // cluster (placeholder address)
    props.put("group.id", "group1");               // consumer group (placeholder)
    props.put("enable.auto.commit", "true");       // auto-commit, and its interval
    props.put("auto.commit.interval.ms", "1000");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");   // key deserialization
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer"); // value deserialization
    consumer = new KafkaConsumer<>(props);           // instantiate the consumer
    consumer.subscribe(Arrays.asList("topic_name")); // subscribe to the topic
}
/* Run the multi-threaded consumer */
public void execute() {
    executorService = Executors.newFixedThreadPool(6);
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        if (!records.isEmpty()) {  // poll() returns an empty set, never null
            executorService.submit(new KafkaConsumerThread(records, consumer));
        }
    }
}
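KafkaConsumerThread is referenced above but not defined; a minimal sketch of what such a worker might look like. Note that KafkaConsumer itself is not thread-safe, so the worker should only iterate the already-fetched records, never poll:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class KafkaConsumerThread implements Runnable {
    private final ConsumerRecords<String, String> records;
    private final KafkaConsumer<String, String> consumer;  // kept only for offset commits if needed

    KafkaConsumerThread(ConsumerRecords<String, String> records,
                        KafkaConsumer<String, String> consumer) {
        this.records = records;
        this.consumer = consumer;
    }

    @Override
    public void run() {
        // Process only the records already fetched by the polling thread.
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("partition=%d offset=%d value=%s%n",
                    record.partition(), record.offset(), record.value());
        }
    }
}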
Idempotence implementation: with idempotence enabled, the producer first obtains a PID (producer id) from a broker; every batch it sends then carries the PID, an epoch, and a per-partition sequence number, which lets brokers discard duplicates caused by retries. The PID-acquisition loop, from an early version of the producer's Sender internals:
private void maybeWaitForPid() {
    if (transactionState == null)
        return;

    while (!transactionState.hasPid()) {
        try {
            Node node = awaitLeastLoadedNodeReady(requestTimeout);
            if (node != null) {
                ClientResponse response = sendAndAwaitInitPidRequest(node);
                if (response.hasResponse() && (response.responseBody() instanceof InitPidResponse)) {
                    InitPidResponse initPidResponse = (InitPidResponse) response.responseBody();
                    transactionState.setPidAndEpoch(initPidResponse.producerId(), initPidResponse.epoch());
                } else {
                    log.error("Received an unexpected response type for an InitPidRequest from {}. " +
                            "We will back off and try again.", node);
                }
            } else {
                log.debug("Could not find an available broker to send InitPidRequest to. " +
                        "We will back off and try again.");
            }
        } catch (Exception e) {
            log.warn("Received an exception while trying to get a pid. Will back off and retry.", e);
        }
        log.trace("Retry InitPidRequest in {}ms.", retryBackoffMs);
        time.sleep(retryBackoffMs);
        metadata.requestUpdate();
    }
}
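On the application side this machinery is switched on with a single producer setting; a minimal sketch, with the broker address and topic as placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class IdempotentProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "host1:9092");
        props.put("enable.idempotence", "true"); // producer acquires a PID; brokers drop duplicates by (PID, partition, sequence)
        props.put("acks", "all");                // idempotence requires acks=all
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("topic_name", "key", "value")); // retries cannot duplicate the record
        producer.close();
    }
}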
#!/bin/bash
hosts=(dn1 dn2 dn3)   # broker hosts (example placeholders)
smill=`date "+%N"`
tdate=`date "+%Y-%m-%d %H:%M:%S,${smill:0:3}"`
# Start Kafka across the cluster
function start()
{
    for i in ${hosts[@]}
    do
        smill=`date "+%N"`
        stdate=`date "+%Y-%m-%d %H:%M:%S,${smill:0:3}"`
        ssh hadoop@$i "source /etc/profile; echo '[$stdate] INFO [Kafka Broker $i] begins to execute the startup operation.'; kafka-server-start.sh \$KAFKA_HOME/config/server.properties > /dev/null" &
        sleep 1
    done
}
# Stop Kafka across the cluster
function stop()
{
    ...
}
# Check the status of the Kafka broker nodes
function status()
{
    for i in ${hosts[@]}
    do
        smill=`date "+%N"`
        stdate=`date "+%Y-%m-%d %H:%M:%S,${smill:0:3}"`
        ssh hadoop@$i "source /etc/profile; echo '[$stdate] INFO [Kafka Broker $i] status message is:'; jps | grep Kafka" &
        sleep 1
    done
}
# Validate the command-line argument
case "$1" in
    start) start ;;
    stop) stop ;;
    status) status ;;
    *) echo "Usage: $0 {start|stop|status}" ;;
esac