Kafka Source Code: The Consumer (1)

w-小菜

Article link: https://blog.csdn.net/W1427259949/article/details/106966501

★ Reading notes on 《Kafka源码剖析》 (Apache Kafka Source Code Analysis)

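I. Concepts

1.Offset

Question: how does a consumer know which position in a partition it has consumed up to?

Question: if a consumer crashes, how does the next consumer that takes over the partition resume where the previous one stopped?

Both questions point to the concept of a consumed position: we need a variable that records how far consumption has progressed. This variable is called the offset.

In older versions, consumers recorded their position in ZooKeeper. To relieve the load on ZooKeeper, newer versions added an internal topic named "__consumer_offsets" on the Kafka broker side, called the offset topic for short. It stores the offsets that consumers commit, and recording positions in the offset topic is the default.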

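Question: when does a consumer commit its offset?

1.Auto commit: when poll() is called to fetch messages, and the auto-commit interval (auto.commit.interval.ms) has elapsed, the offsets of the messages returned by the previous poll() are committed to the offset topic.

Drawback: if the consumer crashes before it has finished processing the messages returned by poll(), their offsets are never committed, so those messages are consumed again after recovery.

2.Commit before processing: as soon as poll() returns messages, commit their offsets before processing any of them.

Drawback: this avoids duplicate consumption, but if the consumer crashes before processing the messages, they are lost.

This leads to three levels of delivery guarantee:

1.At most once: messages may be lost but are never duplicated. This corresponds to scheme 2 above.

2.At least once: messages are never lost but may be duplicated. This corresponds to scheme 1 above.

3.Exactly once: every message is delivered once and only once (the ideal case).

Losing messages is rarely acceptable, and in most cases we want the third level. Achieving it takes more than the consumer, though: the producer must also make sure that each message is written to the broker's partition exactly once. If the partition already contains the same message twice, the consumer cannot avoid seeing a duplicate (unless there is a way to detect it, such as a unique message id).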
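
To make the difference between the two commit orders concrete, here is a minimal sketch. It is a hypothetical in-memory model of a single partition, not Kafka code: `CommitOrderDemo` and `run` are names made up for illustration, and the "crash" is simply the point where the first loop stops.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of one partition; none of these names are Kafka APIs.
final class CommitOrderDemo {

    /**
     * Simulates a consumer that fetches offsets [0, logSize) in one poll(),
     * crashes after processing crashAfter records, then restarts from the
     * committed offset. Returns every offset that was processed, in order.
     */
    static List<Integer> run(int logSize, int crashAfter, boolean commitBeforeProcessing) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0;

        // Scheme 2 commits as soon as the batch has been fetched.
        if (commitBeforeProcessing) {
            committed = logSize;
        }
        // The consumer processes part of the batch, then crashes.
        for (int offset = 0; offset < crashAfter; offset++) {
            processed.add(offset);
        }
        // Scheme 1 would commit here, but the crash came first,
        // so committed is still 0 in that case.

        // After the restart, consumption resumes from the committed offset.
        for (int offset = committed; offset < logSize; offset++) {
            processed.add(offset);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("commit first:  " + run(5, 2, true));
        System.out.println("process first: " + run(5, 2, false));
    }
}
```

With five records and a crash after two, committing first processes only offsets 0 and 1 (offsets 2 to 4 are lost), while processing first handles offsets 0 and 1 twice.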

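How the producer ensures exactly once:

1.Only one producer writes to each partition. After a crash and restart, it checks where its last message ended up and then decides whether to resend or continue.

2.Give every message a globally unique key. The producer does nothing special and can simply retransmit; the consumer deduplicates.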
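
The second option can be sketched on the consumer side as follows. This is only an illustration under assumptions: `DedupingHandler` is a made-up class, the ids are plain strings, and the set of seen ids lives in memory, whereas a real system would have to persist it.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical consumer-side deduplication keyed by a globally unique message id.
final class DedupingHandler {
    private final Set<String> seenIds = new HashSet<>();
    private int applied = 0;

    /** Returns true if the message was applied, false if it was a duplicate. */
    boolean handle(String messageId, String payload) {
        // Set.add returns false when the id is already present.
        if (!seenIds.add(messageId)) {
            return false;
        }
        applied++; // stand-in for the real business logic on payload
        return true;
    }

    int appliedCount() {
        return applied;
    }

    public static void main(String[] args) {
        DedupingHandler handler = new DedupingHandler();
        System.out.println(handler.handle("m-1", "first send"));   // applied
        System.out.println(handler.handle("m-1", "retransmit"));   // skipped as duplicate
        System.out.println(handler.appliedCount());
    }
}
```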

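How the consumer ensures exactly once:

Disable automatic offset commits and stop using the internal offset topic. Instead the consumer stores the offset itself, for example in a database, and relies on transactional atomicity: the offset is committed only after the whole batch returned by one poll() has been fully processed, and both the results and the offset are rolled back on failure.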
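
A rough sketch of that idea, with an in-memory list standing in for the database. `TransactionalSink` and its methods are hypothetical; a real implementation would use an actual database transaction, and on restart would position the consumer at the stored offset (for example with `KafkaConsumer.seek`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical store that keeps processed rows and the next offset together,
// so that both change or neither does.
final class TransactionalSink {
    private final List<String> rows = new ArrayList<>();
    private long storedOffset = 0L; // next offset to consume

    /** Applies a whole batch and advances the offset; on failure nothing changes. */
    void commitBatch(List<String> batch, long nextOffset) {
        // Stage the changes first; the live state is untouched until all records pass.
        List<String> staged = new ArrayList<>(rows);
        for (String record : batch) {
            if (record == null) {
                throw new IllegalArgumentException("cannot process record, rolling back");
            }
            staged.add(record);
        }
        rows.clear();
        rows.addAll(staged);
        storedOffset = nextOffset;
    }

    long storedOffset() {
        return storedOffset;
    }

    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        TransactionalSink sink = new TransactionalSink();
        sink.commitBatch(Arrays.asList("a", "b"), 2L);
        try {
            sink.commitBatch(Arrays.asList("c", null), 4L);
        } catch (IllegalArgumentException e) {
            System.out.println("rolled back, offset stays at " + sink.storedOffset());
        }
    }
}
```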

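2.Consumer groups and consumers

Unlike many other MQs, Kafka adds the notion of a consumer group. Consumers belong to a consumer group, and partitions are mapped to consumers. Within one consumer group, each partition is consumed by exactly one consumer; if the same partition has to be mapped to several consumers, those consumers must belong to different groups. A single consumer, however, may be mapped to any number of partitions, as long as no partition is assigned twice within the group.

When a new consumer joins a group, the existing consumers give up some of their partitions so that the new consumer has something to consume. This is much like a Redis cluster: the consumers divide the topic's partitions among themselves.

If consumers in different consumer groups all subscribe to the same topic, the result is broadcast mode. A consumer crashing or a consumer joining the group triggers a rebalance. Consumers cannot consume during the rebalance; once it completes, they resume from the offsets described in subsection 1.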

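Question: once the consumers of a group are connected to the brokers, how are partitions assigned to consumers?

Scheme 1: Kafka originally implemented this with ZooKeeper watchers. Each consumer group maintains a /consumers/group_id/ids path in ZooKeeper that records the ids of its consumers. At the same level, an offsets node records the group's consumed position in each partition, and an owners node records the partition-to-consumer mapping (the key part). Brokers also have nodes in ZooKeeper holding partition, leader, and ISR information.

Drawback: this puts a heavy load on ZooKeeper, and ZooKeeper is prone to split-brain problems.

Scheme 2: Kafka later changed the design. All consumer groups are divided into subsets, and each subset is managed by a GroupCoordinator on the broker side. Consumers no longer depend on ZooKeeper; only the GroupCoordinator has watchers on ZooKeeper, which greatly reduces the load. When a new consumer joins or an existing one crashes, the value in ZooKeeper changes, the GroupCoordinator's watcher fires, and the GroupCoordinator performs a rebalance.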

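The process in brief:

1.When a consumer is about to join a Consumer Group, or when the GroupCoordinator fails over, the consumer does not know the GroupCoordinator's network location. It sends a ConsumerMetadataRequest carrying the GroupId to any broker, and the broker's response contains the GroupCoordinator information for that GroupId.

2.Using the GroupCoordinator information in the ConsumerMetadataResponse, the consumer connects to the GroupCoordinator and periodically sends a HeartbeatRequest to prove that it is still alive. If the GroupCoordinator receives no heartbeat for too long, it considers the consumer dead and starts a new round of rebalance.

3.If a HeartbeatResponse carries an IllegalGeneration error, the GroupCoordinator has started a rebalance. The consumer then sends a JoinGroupRequest to the GroupCoordinator (telling it that the consumer wants to join the given GroupId).

4.Once the GroupCoordinator has finished the assignment, it writes the result to ZooKeeper and returns it to the consumers in the JoinGroupResponse. Each consumer then consumes data according to the JoinGroupResponse.

5.After joining the group, the consumer keeps sending HeartbeatRequests periodically; whenever a response carries an error, it sends a JoinGroupRequest again, and the cycle repeats.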

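Drawbacks: 1.Partition assignment is done in the broker-side GroupCoordinator, so the broker has to implement the partition assignment strategies. Using a new strategy means changing broker code or configuration and restarting, which is cumbersome.

2.Different rebalance strategies have different validation requirements, so custom partition assignment strategies are hard to support.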
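
The liveness check in step 2 of the process above can be pictured with a small sketch. It is an assumption-laden model, not the broker's implementation: `SessionTracker`, its method names, and the explicit `nowMs` parameters are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical coordinator-side bookkeeping of the last heartbeat per member.
final class SessionTracker {
    private final long sessionTimeoutMs;
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();

    SessionTracker(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Records a heartbeat (or the initial join) of a member. */
    void onHeartbeat(String memberId, long nowMs) {
        lastHeartbeatMs.put(memberId, nowMs);
    }

    /** Drops members whose session timed out; true means a rebalance is needed. */
    boolean expire(long nowMs) {
        return lastHeartbeatMs.values().removeIf(last -> nowMs - last > sessionTimeoutMs);
    }

    int liveMembers() {
        return lastHeartbeatMs.size();
    }

    public static void main(String[] args) {
        SessionTracker tracker = new SessionTracker(10_000L);
        tracker.onHeartbeat("c1", 0L);
        tracker.onHeartbeat("c2", 0L);
        tracker.onHeartbeat("c1", 8_000L);   // only c1 keeps sending heartbeats
        System.out.println("rebalance needed: " + tracker.expire(11_000L));
        System.out.println("live members: " + tracker.liveMembers());
    }
}
```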

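Scheme 3: the assignment work of a rebalance moves to the consumers, while Consumer Group management remains in the GroupCoordinator. After a consumer discovers the GroupCoordinator, it enters the join-group phase and sends a JoinGroupRequest. Once the broker has collected the requests of all consumers (how does it decide that all consumers have arrived?), it picks one consumer as the Leader and sends it the information of all members. The broker only chooses which assignment strategy to use; the Leader carries out the actual assignment and returns the result to the GroupCoordinator (in the protocol this is the SyncGroupRequest/SyncGroupResponse exchange), which then passes it on to every consumer in the group.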
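
The broker's part in scheme 3 (pick a Leader, pick a strategy every member supports) can be sketched like this. The selection rules here are simplifying assumptions for illustration only: the first member to join becomes the Leader, and the strategy is the Leader's most preferred one among those all members support. `JoinGroupSketch` and `elect` are made-up names, and the real GroupCoordinator logic is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the coordinator's decision once all JoinGroupRequests are in.
final class JoinGroupSketch {

    /**
     * members maps memberId to the assignor names it supports, in preference order;
     * iteration order of the map is taken to be the join order.
     * Returns {leaderId, chosenAssignor}.
     */
    static String[] elect(Map<String, List<String>> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("empty group");
        }
        // Assumption: the first member to join leads the group.
        String leader = members.keySet().iterator().next();
        // Keep only the strategies that every member supports.
        List<String> common = new ArrayList<>(members.get(leader));
        for (List<String> supported : members.values()) {
            common.retainAll(supported);
        }
        if (common.isEmpty()) {
            throw new IllegalStateException("no assignor is supported by all members");
        }
        return new String[] {leader, common.get(0)};
    }

    public static void main(String[] args) {
        Map<String, List<String>> members = new LinkedHashMap<>();
        members.put("c1", Arrays.asList("range", "roundrobin"));
        members.put("c2", Arrays.asList("roundrobin"));
        String[] result = elect(members);
        System.out.println("leader=" + result[0] + ", assignor=" + result[1]);
    }
}
```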

II. KafkaConsumer Analysis

Starting a consumer:

public void doWork() {
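    // Subscribe to the topic; several topics can be subscribed at once
    consumer.subscribe(Collections.singletonList(this.topic));
    // Fetch from the broker; each poll() can return multiple records  ★
    ConsumerRecords<Integer, String> records = consumer.poll(1000);
    // Iterate over the records and print them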
    for (ConsumerRecord<Integer, String> record : records) {
        System.out.println("Received message: (" + record.key() + ", " + record.value() + ") at offset " + record.offset());
    }
}

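There are two steps. 1.Subscribe to the topic. Recall from above that consumers form a consumer group and that each partition is consumed by only one consumer in the group; yet here we subscribe to the topic itself without naming any partition. The partitions are assigned automatically during the rebalance.

2.Fetch messages with the consumer. The class doing the work is KafkaConsumer.

KafkaConsumer is not thread-safe, whereas KafkaProducer is. First, a look at the important fields: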

public class KafkaConsumer<K, V> implements Consumer<K, V> {
    private final String clientId;
    private final ConsumerCoordinator coordinator;  // group membership and assignment strategy management
    private final Deserializer<K> keyDeserializer;
    private final Deserializer<V> valueDeserializer;
    private final Fetcher<K, V> fetcher;        // fetches messages from the brokers
    private final ConsumerInterceptors<K, V> interceptors;

    private final Time time;
    private final ConsumerNetworkClient client;   // network communication
    private final SubscriptionState subscriptions; // tracks the subscribed topics
    private final Metadata metadata;
    private final long retryBackoffMs;
    private final long requestTimeoutMs;
    private volatile boolean closed = false;
    private List<PartitionAssignor> assignors;
...
}
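
On the thread-safety remark above: KafkaConsumer does not lock; it detects access from a second thread and fails fast. The sketch below shows that kind of guard in isolation. It is written in the spirit of the consumer's internal acquire()/release() pair, but `SingleThreadGuard` and its field names are illustrative, not the library's code.

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative fail-fast guard: one owning thread at a time, reentrant for the owner.
final class SingleThreadGuard {
    private static final long NO_OWNER = -1L;
    private final AtomicLong owner = new AtomicLong(NO_OWNER);
    private final AtomicInteger depth = new AtomicInteger(0);

    void acquire() {
        long me = Thread.currentThread().getId();
        // Either this thread already owns the guard, or it must win the CAS from NO_OWNER.
        if (me != owner.get() && !owner.compareAndSet(NO_OWNER, me)) {
            throw new ConcurrentModificationException("not safe for multi-threaded access");
        }
        depth.incrementAndGet();
    }

    void release() {
        // The outermost release hands the guard back.
        if (depth.decrementAndGet() == 0) {
            owner.set(NO_OWNER);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SingleThreadGuard guard = new SingleThreadGuard();
        guard.acquire(); // the main thread now owns the guard
        Thread other = new Thread(() -> {
            try {
                guard.acquire();
            } catch (ConcurrentModificationException e) {
                System.out.println("second thread rejected");
            }
        });
        other.start();
        other.join();
        guard.release();
    }
}
```

Failing fast rather than blocking keeps the single-threaded fast path free of locks; sharing a consumer across threads then requires external synchronization by the caller.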

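Outline:

1.SubscriptionState: tracks subscriptions and consumed positions

2.ConsumerNetworkClient: network communication

3.ConsumerCoordinator: interacts with the broker-side GroupCoordinator

4.PartitionAssignor: partition assignment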

2.1 SubscriptionState

In the first step, KafkaConsumer.subscribe() in fact delegates to the subscribe() method of SubscriptionState:

public void subscribe(Collection<String> topics, ConsumerRebalanceListener listener) {
    // ... validate topics
    this.subscriptions.subscribe(new HashSet<>(topics), listener);
    metadata.setTopics(subscriptions.groupSubscription());
}

Moving on to SubscriptionState.subscribe():

public void subscribe(Set<String> topics, ConsumerRebalanceListener listener) {
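    // When the user supplies no listener, KafkaConsumer passes a NoOpConsumerRebalanceListener
    // (all of its methods are empty), so null is rejected here
    if (listener == null)
        throw new IllegalArgumentException("RebalanceListener cannot be null");
    // set the subscription mode to AUTO_TOPICS
    setSubscriptionType(SubscriptionType.AUTO_TOPICS);
    this.listener = listener;
    // [step into] add the subscribed topics to the groupSubscription set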
    changeSubscription(topics);
}


private void changeSubscription(Set<String> topicsToSubscribe) {
    if (!this.subscription.equals(topicsToSubscribe)) { // change only if the set differs
        this.subscription = topicsToSubscribe;
        this.groupSubscription.addAll(topicsToSubscribe);
    }
}

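Subscribing does not open any network connection; it only stores the topics and sets the subscription mode. SubscriptionState is analyzed in detail below:

When fetching messages, the consumer sends a FetchRequest, which must specify an offset. To get at that value quickly, the consumer uses SubscriptionState to track the mapping between each TopicPartition and its offset.

Main fields: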

public class SubscriptionState {
    // the four subscription modes
    private enum SubscriptionType {
        NONE, AUTO_TOPICS, AUTO_PATTERN, USER_ASSIGNED
    }
    private SubscriptionType subscriptionType; // the current subscription mode

    // AUTO_PATTERN mode: the regular expression matched against all topics; every matching topic is subscribed
    private Pattern subscribedPattern;
    // AUTO_TOPICS and AUTO_PATTERN modes: the set of all subscribed topics
    private Set<String> subscription;
    // the group leader records the topics subscribed by every consumer; other consumers keep only their own
    private final Set<String> groupSubscription;
    // the partitions assigned to this consumer (set directly by the user in USER_ASSIGNED mode)
    private final PartitionStates<TopicPartitionState> assignment; // consumption state of each TopicPartition
    // whether the latest committed offsets need to be fetched from the GroupCoordinator
    private boolean needsFetchCommittedOffsets;
    // the default OffsetResetStrategy
    private final OffsetResetStrategy defaultResetStrategy;
    // listens for partition assignment changes
    private ConsumerRebalanceListener listener;
...
}

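1.SubscriptionType: the meaning of the four enum values:

1.NONE: the initial value

2.AUTO_TOPICS: subscribe by the given topic names; partitions are assigned automatically

3.AUTO_PATTERN: subscribe to the topics matching a regular expression; partitions are assigned automatically

4.USER_ASSIGNED: the user manually specifies the topics and partition numbers the consumer consumes

The last three modes are mutually exclusive.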
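
How the mutual exclusion might be enforced can be shown with a small stand-alone sketch. `SubscriptionModeSketch` is a made-up class that mirrors only the mode check, under the assumption that a mode can be set freely from NONE and must be cleared by unsubscribing before switching.

```java
// Illustrative model of the mode check; not the real SubscriptionState.
final class SubscriptionModeSketch {
    enum SubscriptionType { NONE, AUTO_TOPICS, AUTO_PATTERN, USER_ASSIGNED }

    private SubscriptionType type = SubscriptionType.NONE;

    /** Sets the mode; switching between two non-NONE modes is rejected. */
    void setSubscriptionType(SubscriptionType requested) {
        if (type == SubscriptionType.NONE) {
            type = requested;
        } else if (type != requested) {
            throw new IllegalStateException(
                "Subscription to topics, partitions and pattern are mutually exclusive");
        }
    }

    /** Clearing the subscription returns to NONE, after which any mode may be chosen. */
    void unsubscribe() {
        type = SubscriptionType.NONE;
    }

    SubscriptionType type() {
        return type;
    }

    public static void main(String[] args) {
        SubscriptionModeSketch state = new SubscriptionModeSketch();
        state.setSubscriptionType(SubscriptionType.AUTO_TOPICS);
        try {
            state.setSubscriptionType(SubscriptionType.USER_ASSIGNED);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```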

2.TopicPartitionState: consumption state

It is an inner class of SubscriptionState. All of its fields:

private static class TopicPartitionState {
    private Long position; // the offset of the next message to fetch
    private Long highWatermark; // the high watermark from last fetch
    private Long lastStableOffset;
    private OffsetAndMetadata committed;  // the most recently committed offset
    private boolean paused;  // whether this TopicPartition is currently paused
    private OffsetResetStrategy resetStrategy;  // enum: the strategy for resetting position
...
}
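
To show how these fields work together, here is a reduced sketch that keeps only position, committed, and paused. `PartitionStateSketch` and its method names are assumptions for illustration; the idea it demonstrates is that a partition can be fetched only when it is not paused and its position is known.

```java
// Illustrative, reduced version of the per-partition consumption state.
final class PartitionStateSketch {
    private Long position;   // offset of the next record to fetch; null until known
    private Long committed;  // last committed offset; null until a commit happens
    private boolean paused;

    void seek(long offset) {
        position = offset;
    }

    void pause() {
        paused = true;
    }

    void resume() {
        paused = false;
    }

    /** A partition is fetchable only if it is not paused and has a valid position. */
    boolean isFetchable() {
        return !paused && position != null;
    }

    /** After returning records up to lastConsumedOffset, the next fetch starts one past it. */
    void advancePast(long lastConsumedOffset) {
        position = lastConsumedOffset + 1;
    }

    void commit() {
        committed = position;
    }

    Long position() {
        return position;
    }

    Long committed() {
        return committed;
    }

    public static void main(String[] args) {
        PartitionStateSketch state = new PartitionStateSketch();
        System.out.println("fetchable before seek: " + state.isFetchable());
        state.seek(5L);
        state.advancePast(9L);
        state.commit();
        System.out.println("position=" + state.position() + ", committed=" + state.committed());
    }
}
```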

2.2 ConsumerNetworkClient

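After the subscription in step one succeeds, step two calls poll() to fetch messages, which goes through client.

ConsumerNetworkClient wraps NetworkClient. NetworkClient relies on the KSelector, InFlightRequests, and Metadata components and manages the connections between the client and every Node in the Kafka cluster.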

public class ConsumerNetworkClient implements Closeable {
    private final KafkaClient client;      // the NetworkClient instance
    private final UnsentRequests unsent = new UnsentRequests();  // buffer of requests not yet sent
    private final Metadata metadata;  // manages the Kafka cluster metadata
    private final Time time;
    private final long retryBackoffMs;
    private final long unsentExpiryMs;  // how long a request may stay in the buffer before it expires
    private final AtomicBoolean wakeupDisabled = new AtomicBoolean();
...
}

The core method is poll(), which drives the network I/O used to fetch messages for consumption:

 public void poll(long timeout, long now, PollCondition pollCondition, boolean disableWakeup) {
        firePendingCompletedRequests();

        synchronized (this) {
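            // hand the buffered requests to the client for sending
            trySend(now);
            if (pollCondition == null || pollCondition.shouldBlock()) {
                if (client.inFlightRequestCount() == 0)
                    timeout = Math.min(timeout, retryBackoffMs);
                // perform the network I/O, blocking for at most the timeout
                client.poll(Math.min(MAX_POLL_TIMEOUT_MS, timeout), now);
                now = time.milliseconds();
            } else {
                client.poll(0, now);
            }
            // fail the requests buffered for disconnected nodes
            checkDisconnects(now);
            if (!disableWakeup) {
                // check whether a wakeup was requested
                maybeTriggerWakeup();
            }
            // throw InterruptException if this thread is interrupted
            maybeThrowInterruptException();
            // try to send again now that the poll has completed
            trySend(now);
            // fail the requests that have stayed in the buffer for too long
            failExpiredRequests(now);
            unsent.clean();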
        }
        firePendingCompletedRequests();
    }

1.trySend() sends the buffered ClientRequests; unsent is an instance of UnsentRequests, an inner class of ConsumerNetworkClient

private boolean trySend(long now) {
    boolean requestsSent = false;
    for (Node node : unsent.nodes()) {
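        // iterate over the requests buffered for this node (a node may have several)
        Iterator<ClientRequest> iterator = unsent.requestIterator(node);
        while (iterator.hasNext()) {
            ClientRequest request = iterator.next();
            // check whether the node is ready to receive requests
            if (client.ready(node, now)) {
                client.send(request, now);  // hand the request to NetworkClient
                iterator.remove();  // remove it from the buffer
                requestsSent = true;  // at least one request was sent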
            }
        }
    }
    return requestsSent;
}

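The unsent buffer is an instance of UnsentRequests, an inner class of ConsumerNetworkClient, which holds a single map:

private final ConcurrentMap<Node, ConcurrentLinkedQueue<ClientRequest>> unsent;

It maps each node to the requests waiting to be sent to it.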
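
The buffer and the trySend() loop can be modeled without any networking. In this sketch nodes and requests are plain strings, and the set of "ready" nodes is passed in explicitly; `UnsentBufferSketch` and its methods are illustrative names rather than Kafka classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Illustrative per-node buffer of requests that have not been sent yet.
final class UnsentBufferSketch {
    private final ConcurrentMap<String, Queue<String>> unsent = new ConcurrentHashMap<>();

    void put(String node, String request) {
        unsent.computeIfAbsent(node, n -> new ConcurrentLinkedQueue<>()).add(request);
    }

    /** Drains the queues of the nodes that are ready; other nodes keep their requests. */
    List<String> trySend(Set<String> readyNodes) {
        List<String> sent = new ArrayList<>();
        for (Map.Entry<String, Queue<String>> entry : unsent.entrySet()) {
            if (!readyNodes.contains(entry.getKey())) {
                continue; // connection not ready: leave the requests buffered
            }
            Iterator<String> iterator = entry.getValue().iterator();
            while (iterator.hasNext()) {
                sent.add(iterator.next()); // stand-in for client.send(request, now)
                iterator.remove();
            }
        }
        return sent;
    }

    int pending(String node) {
        Queue<String> queue = unsent.get(node);
        return queue == null ? 0 : queue.size();
    }

    public static void main(String[] args) {
        UnsentBufferSketch buffer = new UnsentBufferSketch();
        buffer.put("node-1", "fetch");
        buffer.put("node-2", "heartbeat");
        System.out.println("sent: " + buffer.trySend(Collections.singleton("node-1")));
        System.out.println("still pending for node-2: " + buffer.pending("node-2"));
    }
}
```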

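2.checkDisconnects() checks the connection state

It removes disconnected Nodes from unsent and completes their buffered requests as failed or disconnected.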

    private void checkDisconnects(long now) {
        for (Node node : unsent.nodes()) {
            if (client.connectionFailed(node)) {
                // Remove entry before invoking request callback to avoid callbacks handling
                // coordinator failures traversing the unsent list again.
                Collection<ClientRequest> requests = unsent.remove(node);
                for (ClientRequest request : requests) {
                    RequestFutureCompletionHandler handler = (RequestFutureCompletionHandler) request.callback();
                    AuthenticationException authenticationException = client.authenticationException(node);
                    if (authenticationException != null)
                        handler.onFailure(authenticationException);
                    else
                        handler.onComplete(new ClientResponse(request.makeHeader(request.requestBuilder().latestAllowedVersion()),
                            request.callback(), request.destination(), request.createdTimeMs(), now, true,
                            null, null));
                }
            }
        }
    }

2.3 ConsumerCoordinator

ConsumerCoordinator is the component that interacts with the broker-side GroupCoordinator. It extends the abstract class AbstractCoordinator.

1.AbstractCoordinator

public abstract class AbstractCoordinator implements Closeable {
    public static final String HEARTBEAT_THREAD_PREFIX = "kafka-coordinator-heartbeat-thread";

    private enum MemberState {
        UNJOINED,    // the client is not part of a group
        REBALANCING, // the client has begun rebalancing
        STABLE,      // the client has joined and is sending heartbeats
    }

    private final Logger log;
    private final int sessionTimeoutMs;
    private final boolean leaveGroupOnClose;
    private final GroupCoordinatorMetrics sensors;
    private final Heartbeat heartbeat;      // helper that tracks heartbeat timing
    protected final int rebalanceTimeoutMs;
    protected final String groupId;        // id of the group this consumer belongs to
    protected final ConsumerNetworkClient client;  // network communication and timed tasks
    protected final Time time;
    protected final long retryBackoffMs;

    private HeartbeatThread heartbeatThread = null;
    private boolean rejoinNeeded = true;       // one of the conditions for sending a JoinGroupRequest again
    private boolean needsJoinPrepare = true;   // whether the preparation before sending a JoinGroupRequest still has to run
    private MemberState state = MemberState.UNJOINED;
    private RequestFuture<ByteBuffer> joinFuture = null;
    private Node coordinator = null;
    private Generation generation = Generation.NO_GENERATION;

    private RequestFuture<Void> findCoordinatorFuture = null;
...
}

2.ConsumerCoordinator

public final class ConsumerCoordinator extends AbstractCoordinator {
    private final Logger log;
    private final List<PartitionAssignor> assignors; // assignment strategies, carried in the JoinGroupRequest the consumer sends; the server picks one strategy that every consumer supports
    private final Metadata metadata;                  // Kafka cluster metadata
    private final ConsumerCoordinatorMetrics sensors;
    private final SubscriptionState subscriptions;      // analyzed earlier; holding it here as well feels somewhat redundant
    private final OffsetCommitCallback defaultOffsetCommitCallback;
    private final boolean autoCommitEnabled;            // whether offsets are committed automatically
    private final int autoCommitIntervalMs;             // interval between automatic commits
    private final ConsumerInterceptors<?, ?> interceptors;  // interceptors
    private final boolean excludeInternalTopics;            // whether internal topics are excluded
    private final AtomicInteger pendingAsyncCommits;

    // this collection must be thread-safe because it is modified from the response handler
    // of offset commit requests, which may be invoked from the heartbeat thread
    private final ConcurrentLinkedQueue<OffsetCommitCompletion> completedOffsetCommits;

    private boolean isLeader = false;
    private Set<String> joinedSubscription;
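    private MetadataSnapshot metadataSnapshot;  // metadata snapshot used to detect changes in a topic's partition count (updated by a metadata listener)
    private MetadataSnapshot assignmentSnapshot;  // metadata snapshot used to detect partition count changes while partitions are being assigned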
    private long nextAutoCommitDeadline;
...
}

3.The metadata listener

// add a listener to metadata
private void addMetadataListener() {
    this.metadata.addListener(new Metadata.Listener() {
        @Override
        public void onMetadataUpdate(Cluster cluster, Set<String> unavailableTopics) {
            // if we encounter any unauthorized topics, raise an exception to the user
            if (!cluster.unauthorizedTopics().isEmpty())
                throw new TopicAuthorizationException(new HashSet<>(cluster.unauthorizedTopics()));
            // in AUTO_PATTERN mode, update the set of subscribed topics
            if (subscriptions.hasPatternSubscription())
                updatePatternSubscription(cluster);

            // check if there are any changes to the metadata which should trigger a rebalance
            // this only applies to AUTO_PATTERN or AUTO_TOPICS mode
            if (subscriptions.partitionsAutoAssigned()) {
                MetadataSnapshot snapshot = new MetadataSnapshot(subscriptions, cluster); // build a snapshot
                if (!snapshot.equals(metadataSnapshot))   // compare it with the previous snapshot
                    metadataSnapshot = snapshot;          // remember the new snapshot
            }
            }

            if (!Collections.disjoint(metadata.topics(), unavailableTopics))
                metadata.requestUpdate();
        }
    });
}
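
The snapshot comparison in the listener boils down to comparing partition counts of the subscribed topics. The following sketch shows that idea with plain maps; `MetadataSnapshotSketch` is an illustrative stand-in, and treating an unknown topic as having 0 partitions is an assumption made here for simplicity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative snapshot: partition count per subscribed topic at one point in time.
final class MetadataSnapshotSketch {
    private final Map<String, Integer> partitionsPerTopic = new HashMap<>();

    MetadataSnapshotSketch(Set<String> subscribedTopics, Map<String, Integer> clusterPartitionCounts) {
        for (String topic : subscribedTopics) {
            Integer count = clusterPartitionCounts.get(topic);
            // Assumption: a topic missing from the metadata counts as 0 partitions.
            partitionsPerTopic.put(topic, count == null ? 0 : count);
        }
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof MetadataSnapshotSketch)) return false;
        return partitionsPerTopic.equals(((MetadataSnapshotSketch) other).partitionsPerTopic);
    }

    @Override
    public int hashCode() {
        return partitionsPerTopic.hashCode();
    }

    public static void main(String[] args) {
        Set<String> subscribed = Collections.singleton("orders");
        Map<String, Integer> before = new HashMap<>();
        before.put("orders", 3);
        Map<String, Integer> after = new HashMap<>();
        after.put("orders", 4); // a partition was added
        boolean changed = !new MetadataSnapshotSketch(subscribed, before)
                .equals(new MetadataSnapshotSketch(subscribed, after));
        System.out.println("rebalance needed: " + changed);
    }
}
```

Changes to topics the consumer is not subscribed to leave the snapshot equal, so they do not trigger a rebalance.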

2.4 PartitionAssignor

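After the Leader consumer receives the JoinGroupResponse, it assigns partitions according to the strategy named in the response. Each assignment strategy is an implementation of the PartitionAssignor interface, which defines two inner classes: Assignment and Subscription.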

    class Subscription {
        private final List<String> topics;  // the subscribed topics; which partitions they map to is not yet decided
        private final ByteBuffer userData;
        ...
    }

    class Assignment {
        private final List<TopicPartition> partitions;  // the TopicPartitions assigned to a consumer
        private final ByteBuffer userData;              // user-defined data
        ...
    }

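Partition assignment needs two key inputs: the cluster metadata and each member's subscription. The user's subscription information is wrapped in a Subscription.

Assignment holds the result of the assignment: partitions is the set of TopicPartitions assigned to a consumer, and userData is user-defined data.

assign() performs the assignment; onAssignment() is the callback invoked on each consumer when it receives the Leader's assignment result.

1.assign()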

  public Map<String, Assignment> assign(Cluster metadata, Map<String, Subscription> subscriptions) {
        Set<String> allSubscribedTopics = new HashSet<>();
        for (Map.Entry<String, Subscription> subscriptionEntry : subscriptions.entrySet()) // collect every subscribed topic
            allSubscribedTopics.addAll(subscriptionEntry.getValue().topics());

        Map<String, Integer> partitionsPerTopic = new HashMap<>();
        for (String topic : allSubscribedTopics) {    // for each topic
            Integer numPartitions = metadata.partitionCountForTopic(topic);  // number of partitions in the topic
            if (numPartitions != null && numPartitions > 0)            // keep only topics with known partitions
                partitionsPerTopic.put(topic, numPartitions);
            else
                log.debug("Skipping assignment for topic {} since no metadata is available", topic);
        }
        // input: topic -> partition count, plus the subscriptions
        // output: consumer id -> assigned partitions (possibly from different topics)
        Map<String, List<TopicPartition>> rawAssignments = assign(partitionsPerTopic, subscriptions); // [step into] delegated to the concrete strategy

        // this class maintains no user data, so just wrap the results
        // wrap each partition list in an Assignment
        Map<String, Assignment> assignments = new HashMap<>();
        for (Map.Entry<String, List<TopicPartition>> assignmentEntry : rawAssignments.entrySet())
            assignments.put(assignmentEntry.getKey(), new Assignment(assignmentEntry.getValue()));
        return assignments;
    }

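2.assign() in RangeAssignor

For each topic, n = number of partitions / number of consumers, and the first m consumers each get one extra partition, where m is the remainder. The weakness of this simplest even split is that topics are handled in isolation: the same first m consumers may end up with n+1 partitions in many topics.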

public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                    Map<String, Subscription> subscriptions) {
        Map<String, List<String>> consumersPerTopic = consumersPerTopic(subscriptions);
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        for (String memberId : subscriptions.keySet())
            assignment.put(memberId, new ArrayList<TopicPartition>());

        for (Map.Entry<String, List<String>> topicEntry : consumersPerTopic.entrySet()) {
            String topic = topicEntry.getKey();
            List<String> consumersForTopic = topicEntry.getValue();

            Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
            if (numPartitionsForTopic == null)
                continue;

            Collections.sort(consumersForTopic);

            int numPartitionsPerConsumer = numPartitionsForTopic / consumersForTopic.size();
            int consumersWithExtraPartition = numPartitionsForTopic % consumersForTopic.size();

            List<TopicPartition> partitions = AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic);
            for (int i = 0, n = consumersForTopic.size(); i < n; i++) {
                int start = numPartitionsPerConsumer * i + Math.min(i, consumersWithExtraPartition);
                int length = numPartitionsPerConsumer + (i + 1 > consumersWithExtraPartition ? 0 : 1);
                assignment.get(consumersForTopic.get(i)).addAll(partitions.subList(start, start + length));
            }
        }
        return assignment;
    }
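
The start/length arithmetic above is easier to follow on a single topic with plain integers. The sketch below reuses the same two formulas; `RangeAssignSketch` is an illustrative helper that works on partition numbers instead of TopicPartition objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative range assignment for one topic, using the same start/length arithmetic.
final class RangeAssignSketch {

    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);

        int perConsumer = numPartitions / sorted.size(); // n
        int withExtra = numPartitions % sorted.size();   // m: the first m consumers get n + 1

        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            int start = perConsumer * i + Math.min(i, withExtra);
            int length = perConsumer + (i < withExtra ? 1 : 0);
            List<Integer> range = new ArrayList<>();
            for (int partition = start; partition < start + length; partition++) {
                range.add(partition);
            }
            result.put(sorted.get(i), range);
        }
        return result;
    }

    public static void main(String[] args) {
        // 7 partitions over 3 consumers: n = 2, m = 1
        System.out.println(assign(7, Arrays.asList("c2", "c0", "c1")));
    }
}
```

With 7 partitions and 3 consumers, c0 receives partitions 0 to 2 while c1 and c2 receive two each; repeat this for several topics and c0 is always the one carrying the extra partition.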

3.assign() in RoundRobinAssignor

It sorts the partitions of [all] topics in lexicographic order and hands them out to the consumers round-robin, so the partition counts bound to the consumers come out fairly even.

    public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                    Map<String, Subscription> subscriptions) {
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        for (String memberId : subscriptions.keySet())
            assignment.put(memberId, new ArrayList<TopicPartition>());

        CircularIterator<String> assigner = new CircularIterator<>(Utils.sorted(subscriptions.keySet()));
        for (TopicPartition partition : allPartitionsSorted(partitionsPerTopic, subscriptions)) {
            final String topic = partition.topic();
            while (!subscriptions.get(assigner.peek()).topics().contains(topic))
                assigner.next();
            assignment.get(assigner.next()).add(partition);
        }
        return assignment;
    }
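
The same loop can be tried out without Kafka classes. In this sketch partitions are represented as "topic-number" strings and the circular iterator is replaced by a cursor with a modulo; `RoundRobinAssignSketch` is an illustrative name, and it assumes every topic has at least one subscriber.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative round-robin assignment over the sorted partitions of all subscribed topics.
final class RoundRobinAssignSketch {

    static Map<String, List<String>> assign(Map<String, Integer> partitionsPerTopic,
                                            Map<String, Set<String>> subscriptions) {
        List<String> members = new ArrayList<>(new TreeSet<>(subscriptions.keySet()));
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        // Topics come from the subscriptions, so each one has at least one subscriber.
        Set<String> topics = new TreeSet<>();
        for (Set<String> subscribed : subscriptions.values()) {
            topics.addAll(subscribed);
        }
        int cursor = 0; // plays the role of the circular iterator
        for (String topic : topics) {
            Integer count = partitionsPerTopic.get(topic);
            if (count == null) {
                continue; // no metadata for this topic
            }
            for (int partition = 0; partition < count; partition++) {
                // Skip members that are not subscribed to this topic.
                while (!subscriptions.get(members.get(cursor % members.size())).contains(topic)) {
                    cursor++;
                }
                assignment.get(members.get(cursor % members.size())).add(topic + "-" + partition);
                cursor++;
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> partitionsPerTopic = new HashMap<>();
        partitionsPerTopic.put("a", 3);
        partitionsPerTopic.put("b", 3);
        Map<String, Set<String>> subscriptions = new HashMap<>();
        subscriptions.put("c0", new HashSet<>(Arrays.asList("a", "b")));
        subscriptions.put("c1", new HashSet<>(Arrays.asList("a", "b")));
        System.out.println(assign(partitionsPerTopic, subscriptions));
    }
}
```

For two topics with three partitions each and two consumers, both consumers end up with three partitions, whereas the range strategy would give the first consumer four and the second two.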