Kafka Internals: Message Delivery Semantics

Latest recommended article published on 2023-03-31 15:41:46

Pinned zwq00451

Category columns: kafka, transactions. Article tags: kafka

Article link: https://blog.csdn.net/zwq00451/article/details/111075556

Message delivery semantics

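Kafka's message delivery semantics assume a flawless broker and describe the data guarantees from the producer's and the consumer's point of view. They matter mainly when a message has to be produced again on retry or consumed again, possibly by a different consumer instance.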

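Kafka provides three message delivery semantics:

At most once: messages may be lost but are never redelivered.
At least once: messages are never lost but may be redelivered.
Exactly once: this is what people actually want; each message is delivered once and only once.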

This article covers only the third, exactly once. Before 0.11.x, Apache Kafka supported at-least-once delivery and in-order delivery within a partition.

Since 0.11.x Kafka supports exactly-once semantics, built from three pieces:

1. Idempotence: exactly-once, in-order semantics within a partition (producer's view)

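Idempotent operation: an operation that can be performed many times without producing a result different from performing it once. The producer's send operation is now idempotent: whenever something causes the producer to retry, the same message, even if sent several times, is written to Kafka only once. To turn this on and get exactly-once delivery, no data loss and in-order semantics per partition, set enable.idempotence = true in the producer configuration (it is a producer setting, not a broker setting); it requires acks = all.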
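As a rough illustration of the setting just described, the producer configuration can be assembled as plain key/value pairs. This is only a sketch: the class name `IdempotentProducerProps` and the address `localhost:9092` are placeholders invented here, and the serializer entries are an assumption about a String-keyed producer.

```java
import java.util.Properties;

final class IdempotentProducerProps {
    // Builds producer settings as plain key/value pairs; no Kafka classes are needed for this.
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("enable.idempotence", "true"); // producer-side switch
        props.setProperty("acks", "all");                // idempotence requires acks=all
        // Assumed String keys and values; adjust to the actual record types.
        props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        System.out.println(build("localhost:9092"));
    }
}
```

The resulting `Properties` object would be handed to `new KafkaProducer<>(props)` in an application that has the kafka-clients library on its classpath.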
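How it works: much as in TCP, each batch of messages sent to Kafka carries a sequence number that is used to discard duplicates. Unlike TCP, which gives its guarantees only within a transient in-memory connection, the sequence number is persisted in the topic's log, so even if the leader replica fails, whichever broker takes over can still tell whether a message is a duplicate.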
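The overhead of this mechanism is quite low: it only adds a few extra fields to each batch of messages:
PID (producer ID): assigned when the producer is initialized; uniquely identifies each producer session.
Sequence number: every message the producer sends (more precisely, every message batch, i.e. RecordBatch) carries a sequence number that starts at 0 and increases monotonically, tracked per PID and partition. The broker uses it to decide whether an incoming write is acceptable.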
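The check the broker performs with the PID and sequence number can be pictured with a toy model of a single partition. This is a simplified sketch, not the broker's actual code: `PartitionLogSketch`, its `Result` values and the rule "accept only the exact next sequence" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionLogSketch {
    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    private final List<String> records = new ArrayList<>();
    // Next sequence number expected from each producer ID on this partition.
    private final Map<Long, Integer> nextSequenceByPid = new HashMap<>();

    // baseSequence is the sequence number of the first record in the batch.
    Result append(long pid, int baseSequence, List<String> batch) {
        int expected = nextSequenceByPid.getOrDefault(pid, 0);
        if (baseSequence < expected) {
            return Result.DUPLICATE;      // a retry of something already written
        }
        if (baseSequence > expected) {
            return Result.OUT_OF_ORDER;   // a gap: an earlier batch is missing
        }
        records.addAll(batch);
        nextSequenceByPid.put(pid, expected + batch.size());
        return Result.APPENDED;
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        PartitionLogSketch log = new PartitionLogSketch();
        List<String> batch = Arrays.asList("a", "b");
        System.out.println(log.append(7L, 0, batch)); // first attempt
        System.out.println(log.append(7L, 0, batch)); // producer retry of the same batch
        System.out.println(log.size());               // the retry added nothing
    }
}
```

Because the expected sequence is kept with the log rather than with a connection, a replica that takes over with the same log contents would reach the same decision about the retried batch.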

2. Transactions: atomic writes across partitions (consumer's view)

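The consumer controls its own read position in a partition. If a consumer never crashed, it could keep that position in memory only. But if the consumer fails and we want the topic partition taken over by another process, the new process has to choose a suitable position from which to start processing. There are two cases: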

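Consuming from one Kafka topic and writing to another topic.
Use the new transactional producer to write across partitions; this API lets a producer send a batch of messages to several partitions atomically. Because the consumer's offset is itself stored as a message in a topic, the offset can be written to Kafka in the same transaction as the processed data written to the output topic. That is what makes end-to-end exactly-once delivery achievable.
Here is a sample: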

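 import java.time.Duration;
 import java.util.HashMap;
 import java.util.Map;
 import org.apache.kafka.clients.consumer.*;
 import org.apache.kafka.clients.producer.*;
 import org.apache.kafka.common.KafkaException;
 import org.apache.kafka.common.TopicPartition;
 import org.apache.kafka.common.errors.*;

 // props (with transactional.id set) and consumer (auto-commit disabled, group "group1") are created elsewhere
 Producer<String, String> producer = new KafkaProducer<String, String>(props);
 // Initialize transactions: this also finishes any unfinished transaction left under
 // the same transactional.id, so the new transaction starts from a clean state
 producer.initTransactions();
 try {
     // Begin the transaction
     producer.beginTransaction();
     // Consume data
     ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
     // For each consumed partition, the offset to commit is the last offset read + 1
     Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
     for (ConsumerRecord<String, String> record : records) {
         offsets.put(new TopicPartition(record.topic(), record.partition()),
                 new OffsetAndMetadata(record.offset() + 1));
     }
     // Send data
     producer.send(new ProducerRecord<String, String>("Topic", "Key", "Value"));
     // Send the consumed offsets, so consumption and production share one transaction
     // (groupMetadata() needs kafka-clients 2.5+; older clients pass the group id "group1")
     producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
     // Commit only if both the data and the offsets were sent successfully
     producer.commitTransaction();
 } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
     // Fatal for this producer: it cannot abort, it can only be closed (done in finally)
 } catch (KafkaException e) {
     // Any other failure while sending data or offsets: abort the transaction
     producer.abortTransaction();
 } finally {
     // Close the producer and the consumer
     producer.close();
     consumer.close();
 }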

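This snippet uses the producer transaction API to write messages and consumer offsets atomically, which works the same way when the messages go to several partitions.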

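Note: the messages inside a Kafka topic partition may belong to a fully committed transaction or to a transaction that is still in progress. A consumer has two strategies for reading transactionally written messages, selected with "isolation.level":
a) read_committed: return only messages written by transactions that have committed (plus non-transactional messages);
b) read_uncommitted: do not wait for transactions to commit and read messages in offset order, which matches Kafka's semantics before 0.11.x;
To use the transaction API correctly, set isolation.level on the consumer side, and on the producer side use the new producer API with transactional.id set to a unique ID. That unique ID provides continuity of transaction state across producer restarts.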
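The two settings named in the note might be wired up as follows. This is a sketch under stated assumptions: `TransactionalClientProps` and the IDs passed to it are made-up names, and `enable.auto.commit=false` is added on the assumption that offsets are committed only through the producer's transaction.

```java
import java.util.Properties;

final class TransactionalClientProps {
    // Producer side: transactional.id must be stable across restarts of the same logical producer.
    static Properties producer(String transactionalId) {
        Properties p = new Properties();
        p.setProperty("transactional.id", transactionalId);
        p.setProperty("enable.idempotence", "true"); // transactions build on idempotence
        return p;
    }

    // Consumer side: read_committed hides messages from aborted or still-open transactions.
    static Properties consumer(String groupId) {
        Properties p = new Properties();
        p.setProperty("group.id", groupId);
        p.setProperty("isolation.level", "read_committed"); // the default is read_uncommitted
        p.setProperty("enable.auto.commit", "false");       // offsets go through the transaction instead
        return p;
    }

    public static void main(String[] args) {
        // "order-processor-1" and "group1" are placeholder identifiers.
        System.out.println(producer("order-processor-1"));
        System.out.println(consumer("group1"));
    }
}
```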

Consuming from a Kafka topic and writing to an external system.
The usual way to guarantee exactly-once here is to have the consumer store its offset in the same place as its output.

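For example, a Kafka Connect connector writes the data it reads to HDFS together with the data's offset, so that either both the data and the offset are updated or neither is. We follow a similar pattern for many other data systems that need these stronger semantics and have no primary key with which to deduplicate messages.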
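The "offset stored with the output" pattern can be sketched with an in-memory stand-in for the external system. Everything here is hypothetical: `OffsetWithOutputSink` is not a Kafka Connect class, and the `synchronized` method merely stands in for whatever atomic update the real store offers (a database transaction, an atomic file rename, and so on).

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetWithOutputSink {
    private final List<String> rows = new ArrayList<>();
    private long nextOffset = 0; // stored in the same place as the rows

    // Applies the row and advances the offset in one step, so both change or neither does.
    // A record replayed after a restart is recognised by its offset and ignored.
    synchronized boolean write(long offset, String row) {
        if (offset < nextOffset) {
            return false; // already applied before the crash
        }
        rows.add(row);
        nextOffset = offset + 1;
        return true;
    }

    // After a restart the consumer would seek to this position.
    synchronized long nextOffset() {
        return nextOffset;
    }

    synchronized List<String> rows() {
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        OffsetWithOutputSink sink = new OffsetWithOutputSink();
        sink.write(0, "a");
        sink.write(1, "b");
        // The consumer restarts and replays offset 0; the sink ignores it.
        System.out.println(sink.write(0, "a"));
        System.out.println(sink.rows());
        System.out.println(sink.nextOffset());
    }
}
```

The key design point is that the sink, not Kafka, is the source of truth for the position: on recovery the consumer asks the sink where to resume.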

3. Exactly-once stream processing

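Building on idempotence and atomicity, exactly-once stream processing becomes possible through the Streams API. A Streams application only needs to set processing.guarantee = exactly_once. The setting applies to everything the application does, including writing back to Kafka both the processing results and all the materialized state that the processing job creates. That is why the exactly-once guarantee of the Kafka Streams API is described as the strongest offered by any stream processing system so far. It gives end-to-end exactly-once guarantees to stream processing applications that use Kafka as their data source, because a Streams application writes any materialized state back to Kafka as its final step. Stream processing systems that rely only on an external data system for materialized state offer a weaker exactly-once guarantee. Even with Kafka as the source, when they recover from a failure they can only roll back their Kafka consumer offset to re-consume and reprocess messages; they cannot roll back the associated state, so updates that are not idempotent produce incorrect results.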
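A minimal sketch of the Streams setting mentioned above, again as plain key/value pairs. The class name `ExactlyOnceStreamsProps`, the application ID and the broker address are placeholders. The value `exactly_once` is the one this article names; later Kafka Streams releases introduced `exactly_once_v2` as its replacement, so the right value depends on the client version in use.

```java
import java.util.Properties;

final class ExactlyOnceStreamsProps {
    static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // The value named in this article; newer Streams versions use "exactly_once_v2".
        props.setProperty("processing.guarantee", "exactly_once");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder application ID and broker address.
        System.out.println(build("word-count-app", "localhost:9092"));
    }
}
```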

Appendix: how Kafka transactions work

Transactional messaging

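Transactions in this section mainly mean atomicity: the producer sends several messages as one transaction, and either all of them succeed or all of them fail. To make this possible, Kafka 0.11.0.0 introduced a server-side module, the Transaction Coordinator, which manages the transactions of messages sent by producers. The Transaction Coordinator maintains a Transaction Log stored in an internal topic. Because topic data is durable, transaction state is durable too. The producer never reads or writes the Transaction Log directly; it talks to the Transaction Coordinator, which records the transaction's state in the Transaction Log. The Transaction Log is designed much like the Offset Log that stores consumer offsets.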
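Committing offsets within a transaction
Many Kafka-based applications, Kafka Streams applications in particular, contain both a consumer and a producer: the former fetches messages from Kafka and the latter writes the processed data back to other Kafka topics. For this pattern to be atomic, Kafka must put the commit of the consumer offset and the commit of the producer's sent messages in the same transaction. Otherwise a failure between the two commits causes data loss or duplication, depending on their order:
If the producer's transaction is committed first and the consumer offset second, the result is at-least-once semantics and data may be duplicated.
If the consumer offset is committed first and the producer's transaction second, the result is at-most-once semantics and data may be lost.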
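The effect of the commit order can be replayed in a small simulation. It is a toy model under stated assumptions: `CommitOrderSketch` and its `Order` values are invented names, the "crash" happens exactly once, between the two commits of the first message, and a restart simply resumes from the last committed offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CommitOrderSketch {
    enum Order { OUTPUT_FIRST, OFFSET_FIRST, ATOMIC }

    // Processes the input one message at a time and returns what ends up in the output.
    static List<String> run(List<String> input, Order order) {
        List<String> output = new ArrayList<>();
        int committedOffset = 0;
        boolean crashPending = true; // crash once, between the two commits of the first message
        while (committedOffset < input.size()) {
            String msg = input.get(committedOffset); // (re)start from the committed offset
            if (order == Order.ATOMIC) {
                output.add(msg);      // both commits happen together, or neither does
                committedOffset++;
            } else if (order == Order.OUTPUT_FIRST) {
                output.add(msg);
                if (crashPending) { crashPending = false; continue; } // offset was not committed
                committedOffset++;
            } else {
                committedOffset++;
                if (crashPending) { crashPending = false; continue; } // output was never written
                output.add(msg);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(run(input, Order.OUTPUT_FIRST)); // [a, a, b, c]: duplicate, at least once
        System.out.println(run(input, Order.OFFSET_FIRST)); // [b, c]: loss, at most once
        System.out.println(run(input, Order.ATOMIC));       // [a, b, c]
    }
}
```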
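Control messages for transactions
To tell whether messages written to a partition were committed or aborted, Kafka introduced a special kind of message, the ControlMessage. Its value carries no application data and it is never exposed to the application; it is used only for internal communication between broker and client. For producer-side transactions, Kafka introduces a set of TransactionMarkers in the form of ControlMessages. From these markers the consumer can tell whether the corresponding messages were committed or aborted and then, according to its configured isolation level, decide whether to return them to the application.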
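How markers and the isolation level combine can be sketched with a toy partition log. This is an illustrative model rather than Kafka's implementation: `IsolationLevelSketch` and `Entry` are hypothetical types, every data record is assumed to be transactional, and each PID has at most one open transaction at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IsolationLevelSketch {
    // One log entry: either a data record or a transaction marker for a PID.
    static final class Entry {
        final long pid;
        final String value;      // null for a marker
        final Boolean committed; // null for a data record
        private Entry(long pid, String value, Boolean committed) {
            this.pid = pid; this.value = value; this.committed = committed;
        }
        static Entry data(long pid, String value) { return new Entry(pid, value, null); }
        static Entry marker(long pid, boolean committed) { return new Entry(pid, null, committed); }
    }

    static List<String> read(List<Entry> log, boolean readCommitted) {
        Boolean[] outcome = new Boolean[log.size()];     // per record: committed, aborted, or null while open
        Map<Long, List<Integer>> open = new HashMap<>(); // PID -> offsets of its open transaction
        for (int i = 0; i < log.size(); i++) {
            Entry e = log.get(i);
            if (e.value != null) {
                open.computeIfAbsent(e.pid, k -> new ArrayList<>()).add(i);
            } else {
                List<Integer> offsets = open.remove(e.pid);
                if (offsets != null) {
                    for (int o : offsets) outcome[o] = e.committed;
                }
            }
        }
        // read_committed does not read past the first record of the earliest open transaction.
        int limit = log.size();
        if (readCommitted) {
            for (List<Integer> offsets : open.values()) limit = Math.min(limit, offsets.get(0));
        }
        List<String> visible = new ArrayList<>();
        for (int i = 0; i < limit; i++) {
            Entry e = log.get(i);
            if (e.value == null) continue; // markers are never exposed to the application
            if (!readCommitted || Boolean.TRUE.equals(outcome[i])) visible.add(e.value);
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
                Entry.data(1, "a1"), Entry.data(2, "b1"),
                Entry.marker(1, true),   // PID 1 commits
                Entry.marker(2, false),  // PID 2 aborts
                Entry.data(3, "c1"));    // PID 3's transaction is still open
        System.out.println(read(log, true));  // [a1]
        System.out.println(read(log, false)); // [a1, b1, c1]
    }
}
```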


Data flow
At a high level, the data flow can be broken into four distinct types.
A: the producer and transaction coordinator interaction
When executing transactions, the producer makes requests to the transaction coordinator at the following points:
1. The initTransactions API registers a transactional.id with the coordinator. At this point, the coordinator closes any pending transactions with that transactional.id and bumps the epoch to fence out zombies. This happens only once per producer session.
2. When the producer is about to send data to a partition for the first time in a transaction, the partition is registered with the coordinator first.
3. When the application calls commitTransaction or abortTransaction, a request is sent to the coordinator to begin the two phase commit protocol.
B: the coordinator and transaction log interaction
As the transaction progresses, the producer sends the requests above to update the state of the transaction on the coordinator. The transaction coordinator keeps the state of each transaction it owns in memory, and also writes that state to the transaction log (which is replicated three ways and hence is durable).
The transaction coordinator is the only component to read and write from the transaction log. If a given broker fails, a new coordinator is elected as the leader for the transaction log partitions the dead broker owned, and it reads the messages from the incoming partitions to rebuild its in-memory state for the transactions in those partitions.
C: the producer writing data to target topic-partitions
After registering new partitions in a transaction with the coordinator, the producer sends data to the actual partitions as normal. This is exactly the same producer.send flow, but with some extra validation to ensure that the producer isn’t fenced.
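The epoch bump performed by initTransactions and the fencing check applied to later writes can be sketched as follows. This is a toy model: `EpochFencingSketch` is a hypothetical class, epochs here start at 1 purely for simplicity, and a real broker tracks more state than a single map.

```java
import java.util.HashMap;
import java.util.Map;

final class EpochFencingSketch {
    // Latest epoch handed out for each transactional.id.
    private final Map<String, Integer> currentEpoch = new HashMap<>();

    // Models initTransactions: registers the transactional.id and bumps its epoch.
    int initTransactions(String transactionalId) {
        return currentEpoch.merge(transactionalId, 1, Integer::sum);
    }

    // Models the extra validation on a write: only the latest epoch is accepted.
    boolean acceptWrite(String transactionalId, int producerEpoch) {
        Integer latest = currentEpoch.get(transactionalId);
        return latest != null && latest == producerEpoch;
    }

    public static void main(String[] args) {
        EpochFencingSketch coordinator = new EpochFencingSketch();
        int oldEpoch = coordinator.initTransactions("tx-app"); // original producer instance
        int newEpoch = coordinator.initTransactions("tx-app"); // replacement after a suspected failure
        System.out.println(coordinator.acceptWrite("tx-app", oldEpoch)); // false: the zombie is fenced
        System.out.println(coordinator.acceptWrite("tx-app", newEpoch)); // true
    }
}
```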
D: the coordinator to topic-partition interaction
After the producer initiates a commit (or an abort), the coordinator begins the two phase commit protocol.
In the first phase, the coordinator updates its internal state to “prepare_commit” and updates this state in the transaction log. Once this is done the transaction is guaranteed to be committed no matter what.
The coordinator then begins phase 2, where it writes transaction commit markers to the topic-partitions which are part of the transaction.
These transaction markers are not exposed to applications, but are used by consumers in read_committed mode to filter out messages from aborted transactions and to not return messages which are part of open transactions (i.e., those which are in the log but don’t have a transaction marker associated with them).
Once the markers are written, the transaction coordinator marks the transaction as “complete” and the producer can start the next transaction.
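The two phases, and why the transaction is guaranteed to commit once "prepare_commit" is logged, can be walked through with a toy coordinator for a single transaction. All names here (`TwoPhaseCommitSketch`, the state strings, `COMMIT_MARKER`) are illustrative assumptions and not Kafka's internal identifiers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TwoPhaseCommitSketch {
    final List<String> transactionLog = new ArrayList<>();              // durable state changes
    final Map<String, List<String>> partitions = new LinkedHashMap<>(); // partition -> entries

    // Flow A, step 2: a partition is registered before the first write to it.
    void addPartition(String partition) {
        partitions.computeIfAbsent(partition, k -> new ArrayList<>());
        transactionLog.add("ONGOING " + partition);
    }

    // Flow C: data goes to the partition itself, not through the transaction log.
    void send(String partition, String value) {
        List<String> entries = partitions.get(partition);
        if (entries == null) throw new IllegalStateException("partition not registered: " + partition);
        entries.add(value);
    }

    // Flow D, phase 1: once this is logged, the commit can no longer be undone.
    void prepareCommit() {
        transactionLog.add("PREPARE_COMMIT");
    }

    // Flow D, phase 2: write a marker to every registered partition, then log completion.
    void writeMarkersAndComplete() {
        for (List<String> entries : partitions.values()) {
            if (!entries.contains("COMMIT_MARKER")) entries.add("COMMIT_MARKER");
        }
        transactionLog.add("COMPLETE_COMMIT");
    }

    // A coordinator that takes over replays the log and finishes a prepared commit.
    void recover() {
        int last = transactionLog.size() - 1;
        if (last >= 0 && transactionLog.get(last).equals("PREPARE_COMMIT")) {
            writeMarkersAndComplete();
        }
    }

    public static void main(String[] args) {
        TwoPhaseCommitSketch tx = new TwoPhaseCommitSketch();
        tx.addPartition("out-0");
        tx.addPartition("out-1");
        tx.send("out-0", "v1");
        tx.send("out-1", "v2");
        tx.prepareCommit(); // suppose the coordinator fails right after this point
        tx.recover();       // the new coordinator completes phase 2
        System.out.println(tx.partitions);
        System.out.println(tx.transactionLog);
    }
}
```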
