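1. Three levels of delivery guarantee semantics
- At most once: a message is consumed at most once
- Messages may be lost, but are never consumed twice
- When it occurs
- After fetching messages, the consumer commits the offset first and then runs the business logic
- If the consumer crashes after committing the offset but before the business logic runs, those messages are never processed and are effectively lost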
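- At least once: a message is consumed at least once
- Messages are never lost, but may be consumed more than once
- When it occurs
- After fetching messages, the consumer runs the business logic first and then commits the offset
- If the consumer crashes after the business logic runs but before the offset is committed, it resumes from the last committed offset after restart and runs the business logic again, which causes duplicate consumption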
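- Exactly once
- Each message is delivered exactly once
- Feasibility
- Store the offset and the message-processing result in a single transaction. After a crash or a Rebalance, read the corresponding offset back from the database and call KafkaConsumer.seek() to set the consumption position manually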
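The three cases above differ only in the order of "commit offset" and "run business logic" around a crash. The following is a minimal in-memory sketch of that ordering, not Kafka client code: `DeliverySemanticsDemo`, `Store`, `Crash`, and `runWithCrash` are hypothetical names, the `Store` object stands in for both the committed offset and the database, and the EXACTLY_ONCE branch simply assumes the result and the offset are written atomically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy in-memory simulation of the three delivery guarantees (no Kafka dependency). */
final class DeliverySemanticsDemo {

    enum Mode { AT_MOST_ONCE, AT_LEAST_ONCE, EXACTLY_ONCE }

    /** Stand-in for durable state: the committed offset plus the processing results. */
    static final class Store {
        long committedOffset = 0;                        // where consumption resumes after a restart
        final List<String> results = new ArrayList<>();  // side effects of the business logic
    }

    /** Hypothetical marker for a simulated consumer crash. */
    static final class Crash extends RuntimeException { }

    /** Consumes from the committed offset; if crashAt >= 0, "crashes" while handling that offset. */
    static void consume(List<String> log, Store store, Mode mode, long crashAt) {
        for (long pos = store.committedOffset; pos < log.size(); pos++) {
            String msg = log.get((int) pos);
            boolean crash = (pos == crashAt);
            switch (mode) {
                case AT_MOST_ONCE:
                    store.committedOffset = pos + 1;  // 1. commit the offset
                    if (crash) throw new Crash();     //    crash before processing: message is lost
                    store.results.add(msg);           // 2. run the business logic
                    break;
                case AT_LEAST_ONCE:
                    store.results.add(msg);           // 1. run the business logic
                    if (crash) throw new Crash();     //    crash before commit: message is replayed
                    store.committedOffset = pos + 1;  // 2. commit the offset
                    break;
                case EXACTLY_ONCE:
                    if (crash) throw new Crash();     // crash: the "transaction" never happened
                    store.results.add(msg);           // result and offset are written together,
                    store.committedOffset = pos + 1;  // standing in for one database transaction
                    break;
            }
        }
    }

    /** Consumes three messages with a crash at offset 1, then restarts from the stored offset. */
    static List<String> runWithCrash(Mode mode) {
        List<String> log = Arrays.asList("m0", "m1", "m2");
        Store store = new Store();
        try {
            consume(log, store, mode, 1);
        } catch (Crash e) {
            // Restart: resuming from store.committedOffset plays the role of KafkaConsumer.seek().
        }
        consume(log, store, mode, -1);
        return store.results;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + runWithCrash(mode));
        }
    }
}
```

Running `main` prints `[m0, m2]` for AT_MOST_ONCE (m1 is lost), `[m0, m1, m1, m2]` for AT_LEAST_ONCE (m1 is processed twice), and `[m0, m1, m2]` for EXACTLY_ONCE.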
2. Rebalance design
3. KafkaConsumer
- The org.apache.kafka.clients.consumer.KafkaConsumer class implements the org.apache.kafka.clients.consumer.Consumer interface, which defines the API that KafkaConsumer exposes
- The org.apache.kafka.clients.consumer.Consumer interface
    public interface Consumer<K, V> extends Closeable {
        // 1. Subscribe to the given topics; partitions are assigned to the consumer automatically
        void subscribe(Collection<String> var1);
        void subscribe(Collection<String> var1, ConsumerRebalanceListener var2);
        void subscribe(Pattern var1, ConsumerRebalanceListener var2);
        void subscribe(Pattern var1);

        // 2. Manually assign specific partitions to consume; mutually exclusive with subscribe()
        void assign(Collection<TopicPartition> var1);

        // 3. Synchronously / asynchronously commit the offsets the consumer has finished processing
        void commitSync();
        void commitSync(Duration var1);
        void commitSync(Map<TopicPartition, OffsetAndMetadata> var1);
        void commitSync(Map<TopicPartition, OffsetAndMetadata> var1, Duration var2);
        void commitAsync();
        void commitAsync(OffsetCommitCallback var1);
        void commitAsync(Map<TopicPartition, OffsetAndMetadata> var1, OffsetCommitCallback var2);

        // 4. Set the position from which the consumer starts consuming
        void seek(TopicPartition var1, long var2);
        void seek(TopicPartition var1, OffsetAndMetadata var2);
        void seekToBeginning(Collection<TopicPartition> var1);
        void seekToEnd(Collection<TopicPartition> var1);

        // 5. Fetch messages from the broker
        ConsumerRecords<K, V> poll(Duration var1);

        // 6. Pause / resume consumption of the given partitions
        void pause(Collection<TopicPartition> var1);
        void resume(Collection<TopicPartition> var1);
    }
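The note that assign() and subscribe() are mutually exclusive can be illustrated with a small state guard. This is a sketch under assumptions, not Kafka's implementation: `SubscriptionModeGuard` and its `Mode` enum are hypothetical, partitions are written as plain "topic-partition" strings instead of TopicPartition objects, the choice of IllegalStateException belongs to the sketch, and `unsubscribe()` is included only to show how a consumer could switch modes.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical guard illustrating that subscribe() and assign() cannot be mixed. */
final class SubscriptionModeGuard {

    enum Mode { NONE, AUTO_TOPICS, USER_ASSIGNED }

    private Mode mode = Mode.NONE;
    private final Set<String> targets = new HashSet<>();

    /** Topic subscription: partitions would be assigned automatically by the group. */
    void subscribe(Collection<String> topics) {
        switchTo(Mode.AUTO_TOPICS);
        targets.clear();
        targets.addAll(topics);
    }

    /** Manual assignment: the caller names the exact partitions to consume. */
    void assign(Collection<String> partitions) {
        switchTo(Mode.USER_ASSIGNED);
        targets.clear();
        targets.addAll(partitions);
    }

    /** Clears the current mode so that the other style may be used afterwards. */
    void unsubscribe() {
        mode = Mode.NONE;
        targets.clear();
    }

    /** Rejects a change of mode unless the guard was reset first. */
    private void switchTo(Mode requested) {
        if (mode != Mode.NONE && mode != requested) {
            throw new IllegalStateException(
                "subscribe() and assign() are mutually exclusive; current mode is " + mode);
        }
        mode = requested;
    }

    Mode mode() {
        return mode;
    }

    Set<String> targets() {
        return Collections.unmodifiableSet(targets);
    }
}
```

Calling `subscribe(...)` and then `assign(...)` on the same guard throws, while calling `unsubscribe()` in between allows the switch.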
    public class KafkaConsumer<K, V> implements Consumer<K, V> {
        private static final long NO_CURRENT_THREAD = -1L;
        private static final AtomicInteger CONSUMER_CLIENT_ID_SEQUENCE = new AtomicInteger(1);
        private static final String JMX_PREFIX = "kafka.consumer";
        static final long DEFAULT_CLOSE_TIMEOUT_MS = 30000L;
        final Metrics metrics;
        private final Logger log;
        private final String clientId;                        // unique identifier of this consumer
        private String groupId;                               // ID of the consumer group
        private final ConsumerCoordinator coordinator;        // handles communication between the Consumer and the broker-side GroupCoordinator
        private final Deserializer<K> keyDeserializer;        // key deserializer
        private final Deserializer<V> valueDeserializer;      // value deserializer
        private final Fetcher<K, V> fetcher;                  // fetches messages from the broker
        private final ConsumerInterceptors<K, V> interceptors; // interceptors
        private final Time time;
        private final ConsumerNetworkClient client;           // network communication between the Consumer and the Kafka brokers
        private final SubscriptionState subscriptions;        // tracks the consumer's consumption state
        private final Metadata metadata;                      // Kafka cluster metadata
        private final long retryBackoffMs;
        private final long requestTimeoutMs;
        private final int defaultApiTimeoutMs;
        private volatile boolean closed;
        private List<PartitionAssignor> assignors;
        private final AtomicLong currentThread;
        private final AtomicInteger refcount;
        private boolean cachedSubscriptionHashAllFetchPositions;
    }
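The NO_CURRENT_THREAD, currentThread, and refcount fields suggest a lightweight guard that lets only one thread use the consumer at a time while still allowing nested calls from that thread. Below is a minimal sketch of how such fields can be combined; `SingleThreadGuard` and `acquirableFromOtherThread` are hypothetical names, and the real client's logic may differ in detail.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical re-entrant, non-blocking guard that admits a single owner thread. */
final class SingleThreadGuard {
    private static final long NO_CURRENT_THREAD = -1L;

    // ID of the thread currently inside the guard, or NO_CURRENT_THREAD when free
    private final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);
    // Nesting depth of acquire() calls made by the owner thread
    private final AtomicInteger refcount = new AtomicInteger(0);

    /** Claims the guard for the calling thread, or fails fast if another thread owns it. */
    void acquire() {
        long threadId = Thread.currentThread().getId();
        if (threadId != currentThread.get()
                && !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId)) {
            throw new ConcurrentModificationException("guard is held by another thread");
        }
        refcount.incrementAndGet();
    }

    /** Releases one level of nesting; the guard becomes free when the depth reaches zero. */
    void release() {
        if (refcount.decrementAndGet() == 0) {
            currentThread.set(NO_CURRENT_THREAD);
        }
    }

    /** Demo helper: tries acquire() from a fresh thread and reports whether it succeeded. */
    static boolean acquirableFromOtherThread(SingleThreadGuard guard) {
        boolean[] ok = {false};
        Thread t = new Thread(() -> {
            try {
                guard.acquire();
                ok[0] = true;
                guard.release();
            } catch (ConcurrentModificationException e) {
                ok[0] = false;
            }
        });
        t.start();
        try {
            t.join();  // join() makes the write to ok[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }
}
```

The guard fails fast instead of blocking, so a second thread learns immediately that it is misusing the object rather than waiting on a lock.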