9. A First Look at the Kafka Consumer

Consumer configuration parameters

| No. | Parameter | Default | Description |
| --- | --------- | ------- | ----------- |
| 1 | bootstrap.servers | | List of broker addresses used to establish the initial connection to the Kafka cluster |
| 2 | client.dns.lookup | default | Valid values: default, use_all_dns_ips, resolve_canonical_bootstrap_servers_only |
| 3 | group.id | | Name of the consumer group this consumer belongs to. Subscribing with an invalid group id throws: Exception in thread "main" org.apache.kafka.common.errors.InvalidGroupIdException: The configured groupId is invalid |
| 4 | group.instance.id | | |
| 5 | session.timeout.ms | 10000 | |
| 6 | heartbeat.interval.ms | 3000ms | With Kafka group management, the expected interval between heartbeats to the consumer coordinator. Heartbeats keep the consumer's session alive and ease rebalancing when consumers join or leave the group. Must be smaller than session.timeout.ms, typically no more than 1/3 of it |
| 7 | partition.assignment.strategy | RangeAssignor | Partition assignment strategy |
| 8 | metadata.max.age.ms | 300000ms (5 min) | Metadata expiry time: metadata not refreshed within this window is forcibly refreshed, even if no partitions changed and no new broker joined |
| 9 | enable.auto.commit | true | |
| 10 | auto.commit.interval.ms | 5000 | Interval between automatic offset commits when auto-commit is enabled |
| 11 | client.id | | |
| 12 | client.rack | | |
| 13 | max.partition.fetch.bytes | 1048576B (1MB) | Maximum amount of data returned to the consumer per partition. Like fetch.max.bytes this is not an absolute limit: an oversized first batch is still returned so consumption is not blocked |
| 14 | send.buffer.bytes | 131072B (128KB) | Size of the socket send buffer |
| 15 | receive.buffer.bytes | 65536B (64KB) | Size of the socket receive buffer; -1 uses the operating system default |
| 16 | fetch.min.bytes | 1B | Minimum amount of data the consumer will pull from Kafka in a single fetch request |
| 17 | fetch.max.bytes | 52428800B (50MB) | Maximum amount of data per fetch. Not an absolute limit: if the first message in the first non-empty partition is larger than this value, it is still returned so the consumer can make progress. The largest message Kafka accepts is set via the broker-side message.max.bytes (topic-side max.message.bytes) |
| 18 | fetch.max.wait.ms | 500ms | Maximum time the broker waits for fetch.min.bytes to accumulate before answering a fetch |
| 19 | reconnect.backoff.ms | 50ms | Wait time before attempting to reconnect to a host, to avoid reconnecting in a tight loop |
| 20 | reconnect.backoff.max.ms | 1000 | |
| 21 | retry.backoff.ms | 100ms | Wait time before retrying a failed request to a given topic partition, to avoid retrying in a tight loop |
| 22 | auto.offset.reset | latest | Valid values: earliest, latest, none |
| 23 | check.crcs | true | |
| 24 | metrics.sample.window.ms | 30000 | |
| 25 | metrics.num.samples | 2 | |
| 26 | metrics.recording.level | INFO | Valid values: INFO, DEBUG |
| 27 | metric.reporters | | |
| 28 | key.deserializer | | Deserializer class for keys |
| 29 | value.deserializer | | Deserializer class for values |
| 30 | request.timeout.ms | 30000ms | Maximum time the consumer waits for the response to a request |
| 31 | default.api.timeout.ms | 60 * 1000 | |
| 32 | connections.max.idle.ms | 540000ms (9 min) | How long an idle connection may stay open before it is closed |
| 33 | interceptor.classes | | Interceptors |
| 34 | max.poll.records | 500 | Maximum number of records returned in a single poll |
| 35 | max.poll.interval.ms | 300000 | With group management, the maximum idle time between calls to poll(); if no poll happens within this interval, the consumer is considered to have left the group and a rebalance is triggered |
| 36 | exclude.internal.topics | true | Whether Kafka's internal topics are exposed to the consumer. If true, internal topics can only be subscribed with subscribe(Collection), not subscribe(Pattern) |
| 37 | isolation.level | read_uncommitted | Transaction isolation level of the consumer; valid values are "read_uncommitted" and "read_committed". With read_committed the consumer ignores messages of uncommitted transactions and can only read up to the LSO (LastStableOffset); the default read_uncommitted reads up to the HW |
| 38 | allow.auto.create.topics | true | |
| 39 | security.providers | | |
| 40 | security.protocol | | |
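
A few of these parameters are easiest to understand when seen together, since they trade latency against throughput. Below is a minimal sketch of wiring them into a Properties object, assuming the usual org.apache.kafka.clients.consumer imports; the broker address and group name are placeholders, and the values are illustrative, not recommendations:

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder group name
// the broker holds a fetch for up to fetch.max.wait.ms until at least fetch.min.bytes is available
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 50 * 1024);
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
// cap the number of records a single poll() returns
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);
// read only committed transactional messages, i.e. up to the LSO instead of the HW
props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");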

Parameters related to group rebalancing

  • session.timeout.ms
  • max.poll.interval.ms
  • heartbeat.interval.ms
  • group.id
  • group.instance.id
  • retry.backoff.ms
  • internal.leave.group.on.close
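
Of these, group.instance.id deserves a note: setting it enables static membership, so a consumer that restarts and rejoins within session.timeout.ms does not trigger a rebalance. A minimal sketch, assuming the usual ConsumerConfig imports (the group and instance names are illustrative):

Properties props = new Properties();
props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-workers");        // illustrative group name
// static membership: restarts within session.timeout.ms do not cause a rebalance
props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "worker-1");    // illustrative instance id
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);
// must be smaller than session.timeout.ms, typically no more than 1/3 of it
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 10000);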

Parameters related to ConsumerMetadata

  • retry.backoff.ms
  • metadata.max.age.ms
  • exclude.internal.topics
  • allow.auto.create.topics

Consumer startup example

import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class MyConsume {
    public static void main(String[] args) {
        Consumer<String, String> consumer = new KafkaConsumer<>(getConsumeProp());
        consumer.subscribe(Collections.singletonList("test_2"));
        System.out.println("consumer started -------------------------------");
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            if (!records.isEmpty()) {
                // collect, per partition, the offset of the next record to consume
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                try {
                    for (TopicPartition partition : records.partitions()) {
                        List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
                        for (ConsumerRecord<String, String> record : partitionRecords) {
                            System.out.println("-----------------------------------" + record.value());
                        }
                        long lastConsumedOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
                        // the committed offset is the offset of the next message to fetch, hence +1
                        offsets.put(partition, new OffsetAndMetadata(lastConsumedOffset + 1));
                    }
                    // auto commit is disabled, so commit the collected offsets explicitly
                    consumer.commitAsync(offsets, null);
                } catch (Exception e) {
                    // do not swallow errors silently
                    e.printStackTrace();
                }
            }
        }
    }

    public static Properties getConsumeProp() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092,localhost:9093,localhost:9094");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "mykafka-group");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }
}
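
The example commits asynchronously but never learns whether the commit succeeded. A small variant (a sketch, not part of the original example) passes an OffsetCommitCallback so failures are at least logged:

// drop-in replacement for the consumer.commitAsync(offsets, null) line above
consumer.commitAsync(offsets, (committedOffsets, exception) -> {
    if (exception != null) {
        // async commits are not retried automatically, so at least record the failure
        System.err.println("offset commit failed for " + committedOffsets + ": " + exception);
    }
});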

Source code analysis: client initialization

org.apache.kafka.clients.consumer.KafkaConsumer#KafkaConsumer(java.util.Properties)

private KafkaConsumer(ConsumerConfig config, Deserializer<K> keyDeserializer, Deserializer<V> valueDeserializer) {
        try {
            // parameters related to group rebalancing
            GroupRebalanceConfig groupRebalanceConfig = new GroupRebalanceConfig(config,
                    GroupRebalanceConfig.ProtocolType.CONSUMER);
            // initialize groupId and clientId
            this.groupId = Optional.ofNullable(groupRebalanceConfig.groupId);
            this.clientId = buildClientId(config.getString(CommonClientConfigs.CLIENT_ID_CONFIG), groupRebalanceConfig);

            LogContext logContext;

            // If group.instance.id is set, we will append it to the log context.
            if (groupRebalanceConfig.groupInstanceId.isPresent()) {
                logContext = new LogContext("[Consumer instanceId=" + groupRebalanceConfig.groupInstanceId.get() +
                        ", clientId=" + clientId + ", groupId=" + groupId.orElse("null") + "] ");
            } else {
                logContext = new LogContext("[Consumer clientId=" + clientId + ", groupId=" + groupId.orElse("null") + "] ");
            }

            this.log = logContext.logger(getClass());
            boolean enableAutoCommit = config.getBoolean(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG);
            if (!groupId.isPresent()) { // overwrite in case of default group id where the config is not explicitly provided
                if (!config.originals().containsKey(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG)) {
                    enableAutoCommit = false;
                } else if (enableAutoCommit) {
                    throw new InvalidConfigurationException(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG + " cannot be set to true when default group id (null) is used.");
                }
            } else if (groupId.get().isEmpty()) {
                log.warn("Support for using the empty group id by consumers is deprecated and will be removed in the next major release.");
            }

            log.debug("Initializing the Kafka consumer");
            this.requestTimeoutMs = config.getInt(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG);
            this.defaultApiTimeoutMs = config.getInt(ConsumerConfig.DEFAULT_API_TIMEOUT_MS_CONFIG);
            this.time = Time.SYSTEM;
            // client metrics
            this.metrics = buildMetrics(config, time, clientId);
            this.retryBackoffMs = config.getLong(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG);

            // load user-defined consumer interceptors
            Map<String, Object> userProvidedConfigs = config.originals();
            userProvidedConfigs.put(ConsumerConfig.CLIENT_ID_CONFIG, clientId);
            List<ConsumerInterceptor<K, V>> interceptorList = (List) (new ConsumerConfig(userProvidedConfigs, false)).getConfiguredInstances(ConsumerConfig.INTERCEPTOR_CLASSES_CONFIG,
                    ConsumerInterceptor.class);
            this.interceptors = new ConsumerInterceptors<>(interceptorList);
            // key and value deserializers
            if (keyDeserializer == null) {
                this.keyDeserializer = config.getConfiguredInstance(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, Deserializer.class);
                this.keyDeserializer.configure(config.originals(), true);
            } else {
                config.ignore(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG);
                this.keyDeserializer = keyDeserializer;
            }
            if (valueDeserializer == null) {
                this.valueDeserializer = config.getConfiguredInstance(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, Deserializer.class);
                this.valueDeserializer.configure(config.originals(), false);
            } else {
                config.ignore(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG);
                this.valueDeserializer = valueDeserializer;
            }
            // offset reset strategy
            OffsetResetStrategy offsetResetStrategy = OffsetResetStrategy.valueOf(config.getString(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG).toUpperCase(Locale.ROOT));
            // initialize the SubscriptionState
            this.subscriptions = new SubscriptionState(logContext, offsetResetStrategy);
            ClusterResourceListeners clusterResourceListeners = configureClusterResourceListeners(keyDeserializer,
                    valueDeserializer, metrics.reporters(), interceptorList);
            // initialize the ConsumerMetadata
            this.metadata = new ConsumerMetadata(retryBackoffMs,
                    config.getLong(ConsumerConfig.METADATA_MAX_AGE_CONFIG),
                    !config.getBoolean(ConsumerConfig.EXCLUDE_INTERNAL_TOPICS_CONFIG),
                    config.getBoolean(ConsumerConfig.ALLOW_AUTO_CREATE_TOPICS_CONFIG),
                    subscriptions, logContext, clusterResourceListeners);
            List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(
                    config.getList(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG), config.getString(ConsumerConfig.CLIENT_DNS_LOOKUP_CONFIG));
            this.metadata.bootstrap(addresses);
            String metricGrpPrefix = "consumer";

            FetcherMetricsRegistry metricsRegistry = new FetcherMetricsRegistry(Collections.singleton(CLIENT_ID_METRIC_TAG), metricGrpPrefix);
            ChannelBuilder channelBuilder = ClientUtils.createChannelBuilder(config, time, logContext);
            IsolationLevel isolationLevel = IsolationLevel.valueOf(
                    config.getString(ConsumerConfig.ISOLATION_LEVEL_CONFIG).toUpperCase(Locale.ROOT));
            Sensor throttleTimeSensor = Fetcher.throttleTimeSensor(metrics, metricsRegistry);
            int heartbeatIntervalMs = config.getInt(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG);

            ApiVersions apiVersions = new ApiVersions();
            NetworkClient netClient = new NetworkClient(
                    new Selector(config.getLong(ConsumerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), metrics, time, metricGrpPrefix, channelBuilder, logContext),
                    this.metadata,
                    clientId,
                    100, // a fixed large enough value will suffice for max in-flight requests
                    config.getLong(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG),
                    config.getLong(ConsumerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG),
                    config.getInt(ConsumerConfig.SEND_BUFFER_CONFIG),
                    config.getInt(ConsumerConfig.RECEIVE_BUFFER_CONFIG),
                    config.getInt(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG),
                    ClientDnsLookup.forConfig(config.getString(ConsumerConfig.CLIENT_DNS_LOOKUP_CONFIG)),
                    time,
                    true,
                    apiVersions,
                    throttleTimeSensor,
                    logContext);
            this.client = new ConsumerNetworkClient(
                    logContext,
                    netClient,
                    metadata,
                    time,
                    retryBackoffMs,
                    config.getInt(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG),
                    heartbeatIntervalMs); //Will avoid blocking an extended period of time to prevent heartbeat thread starvation
            // load the partition assignment strategies
            this.assignors = getAssignorInstances(config.getList(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG), config.originals());

            // no coordinator will be constructed for the default (null) group id
            // initialize the ConsumerCoordinator and ConsumerGroupMetadata
            this.coordinator = !groupId.isPresent() ? null :
                new ConsumerCoordinator(groupRebalanceConfig,
                        logContext,
                        this.client,
                        assignors,
                        this.metadata,
                        this.subscriptions,
                        metrics,
                        metricGrpPrefix,
                        this.time,
                        enableAutoCommit,
                        config.getInt(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG),
                        this.interceptors);
            this.fetcher = new Fetcher<>(
                    logContext,
                    this.client,
                    config.getInt(ConsumerConfig.FETCH_MIN_BYTES_CONFIG),
                    config.getInt(ConsumerConfig.FETCH_MAX_BYTES_CONFIG),
                    config.getInt(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG),
                    config.getInt(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG),
                    config.getInt(ConsumerConfig.MAX_POLL_RECORDS_CONFIG),
                    config.getBoolean(ConsumerConfig.CHECK_CRCS_CONFIG),
                    config.getString(ConsumerConfig.CLIENT_RACK_CONFIG),
                    this.keyDeserializer,
                    this.valueDeserializer,
                    this.metadata,
                    this.subscriptions,
                    metrics,
                    metricsRegistry,
                    this.time,
                    this.retryBackoffMs,
                    this.requestTimeoutMs,
                    isolationLevel,
                    apiVersions);
            // consumer-level metrics
            this.kafkaConsumerMetrics = new KafkaConsumerMetrics(metrics, metricGrpPrefix);

            config.logUnused();
            AppInfoParser.registerAppInfo(JMX_PREFIX, clientId, metrics, time.milliseconds());
            log.debug("Kafka consumer initialized");
        } catch (Throwable t) {
            // call close methods if internal objects are already constructed; this is to prevent resource leak. see KAFKA-2121
            // we do not need to call `close` at all when `log` is null, which means no internal objects were initialized.
            if (this.log != null) {
                close(0, true);
            }
            // now propagate the exception
            throw new KafkaException("Failed to construct kafka consumer", t);
        }
    }

Summary

Client initialization mainly does the following:
1. Read the configuration
2. Initialize the SubscriptionState
3. Initialize the client metrics fetch-throttle-time-avg and fetch-throttle-time-max
4. Initialize the client metadata (ConsumerMetadata)
5. Load the partition assignment strategies
6. Initialize the consumer coordinator, which in turn:

  • initializes the consumer group metadata (ConsumerGroupMetadata)
  • initializes the group-related metrics
7. Initialize the consumer-level metrics
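
One detail of the constructor is worth demonstrating: when Deserializer instances are passed in directly, the key.deserializer/value.deserializer config entries are ignored (the config.ignore(...) branch above). A minimal sketch of that overload, with placeholder broker address and group name:

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder
// deserializers supplied as instances; any key/value.deserializer config entries are ignored
KafkaConsumer<String, String> consumer =
        new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer());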

Source code analysis: subscribing the client to topics

org.apache.kafka.clients.consumer.KafkaConsumer#subscribe(java.util.Collection<java.lang.String>, org.apache.kafka.clients.consumer.ConsumerRebalanceListener)

public void subscribe(Collection<String> topics, ConsumerRebalanceListener listener) {
        acquireAndEnsureOpen();
        try {
            // validate the groupId
            maybeThrowInvalidGroupIdException();
            if (topics == null)
                throw new IllegalArgumentException("Topic collection to subscribe to cannot be null");
            if (topics.isEmpty()) {
                // treat subscribing to empty topic list as the same as unsubscribing
                this.unsubscribe();
            } else {
                for (String topic : topics) {
                    if (topic == null || topic.trim().isEmpty())
                        throw new IllegalArgumentException("Topic collection to subscribe to cannot contain null or empty topic");
                }
                // throw if no partition assignment strategy is configured; clear buffered data for topics no longer subscribed
                throwIfNoAssignorsConfigured();
                fetcher.clearBufferedDataForUnassignedTopics(topics);
                log.info("Subscribed to topic(s): {}", Utils.join(topics, ", "));
                // request a metadata update if the subscription changed
                if (this.subscriptions.subscribe(new HashSet<>(topics), listener))
                    metadata.requestUpdateForNewTopics();
            }
        } finally {
            release();
        }
    }

Summary

1. Validate the groupId
2. If the topic collection is empty, unsubscribe instead
3. Request an immediate update of the client metadata (ConsumerMetadata)
Subscription is dynamic: whenever the set of subscribed topics changes, the subscribe code requests an immediate metadata update.
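
subscribe() also has an overload that takes a ConsumerRebalanceListener, which is the hook the group-management machinery calls around a rebalance. A minimal sketch, assuming the matching imports (the topic name is reused from the earlier example; the method bodies are illustrative):

consumer.subscribe(Collections.singletonList("test_2"), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // invoked before partitions are taken away: a natural place to commit offsets
        System.out.println("revoked: " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // invoked after the rebalance completes with the new assignment
        System.out.println("assigned: " + partitions);
    }
});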

What the client's poll() does

org.apache.kafka.clients.consumer.KafkaConsumer#poll(org.apache.kafka.common.utils.Timer, boolean)

  private ConsumerRecords<K, V> poll(final Timer timer, final boolean includeMetadataInTimeout) {
        // acquire the light-weight lock guarding against multi-threaded use of the KafkaConsumer
        acquireAndEnsureOpen();
        try {
            // record the poll start in the KafkaConsumer metrics initialized earlier
            this.kafkaConsumerMetrics.recordPollStart(timer.currentTimeMs());

            if (this.subscriptions.hasNoSubscriptionOrUserAssignment()) {
                throw new IllegalStateException("Consumer is not subscribed to any topics or assigned any partitions");
            }

            do {
                // check whether a wakeup should fire: if the thread is not executing an uninterruptible request and wakeup was requested, throw an exception
                client.maybeTriggerWakeup();

                if (includeMetadataInTimeout) {
                    // this method does three things:
                    // 1. update the metadata
                    // 2. join the consumer group (non-blocking here)
                    // 3. fetch the committed consume positions
                    updateAssignmentMetadataIfNeeded(timer, false);
                } else {
                    while (!updateAssignmentMetadataIfNeeded(time.timer(Long.MAX_VALUE), true)) {
                        log.warn("Still waiting for metadata");
                    }
                }
                // fetch data
                final Map<TopicPartition, List<ConsumerRecord<K, V>>> records = pollForFetches(timer);
                if (!records.isEmpty()) {
                    // before returning the fetched records, we can send off the next round of fetches
                    // and avoid block waiting for their responses to enable pipelining while the user
                    // is handling the fetched records.
                    //
                    // NOTE: since the consumed position has already been updated, we must not allow
                    // wakeups or any other errors to be triggered prior to returning the fetched records.
                    if (fetcher.sendFetches() > 0 || client.hasPendingRequests()) {
                        client.transmitSends();
                    }
                    // run the records through the user-defined interceptors
                    return this.interceptors.onConsume(new ConsumerRecords<>(records));
                }
                // if the timer has not expired, keep polling
            } while (timer.notExpired());

            return ConsumerRecords.empty();
        } finally {
            release();
            this.kafkaConsumerMetrics.recordPollEnd(timer.currentTimeMs());
        }
    }

Summary

1. Contact the least-loaded node to find the group coordinator
2. Join the group (rebalance)
3. Synchronize the rebalance result with the other members
4. Fetch data
5. Process the data
6. Commit offsets back to the group coordinator
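
Because poll() calls client.maybeTriggerWakeup() on every loop iteration, the supported way to break a poll loop from another thread is consumer.wakeup(), which makes the blocked poll() throw a WakeupException. A minimal shutdown sketch, assuming the consumer from the earlier example and an import of org.apache.kafka.common.errors.WakeupException:

// from a shutdown hook (or any other thread)
Runtime.getRuntime().addShutdownHook(new Thread(consumer::wakeup));

try {
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
        // ... process records ...
    }
} catch (WakeupException e) {
    // expected on shutdown: wakeup() interrupts the blocked poll()
} finally {
    consumer.close();
}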

Simplified flow of the client's poll

[Figure 1: simplified flow of the poll loop]
