Kafka Consumer Asynchronous Processing and Performance Optimization

Introduction

This article focuses on what happens after data has been read from Kafka: asynchronous processing, batched offset commits, and other techniques aimed at improving performance. The Kafka consumer itself, consumer groups, and consumer configuration are not explained in depth here; if you need that background, see Kafka消费者-从Kafka读取数据 (a write-up I found quite thorough).

gradle

implementation "org.jetbrains.kotlin:kotlin-stdlib" // Generate optimized JSON (de)serializers
implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-jdk8:1.3.7'
// kafka
implementation 'org.apache.kafka:kafka-clients:2.0.0'

code

1. Kafka consumer configuration options

data class KafkaConfig(
    val host: String,
    val topic: String,
    val consumer: KafkaConsumerConfig
)

data class KafkaConsumerConfig(
    val groupId: String,
    val keyDeserializer: String,
    val valueDeserializer: String,
    val maxPoll: Int = 500,
    val timeoutRequest: Int = 3000,
    val timeoutSession: Int = 3000,
    val pollDuration: Long = 500,
    val autoOffsetReset: String,
    val pollInterval: Int = 500
)

Here we define the Kafka consumer settings directly as data classes; they could just as well be defined as a map and loaded straight from a configuration file, as sketched below.
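As an illustration of that second option, here is a minimal sketch that reads the values from a plain properties file. The file path and the key names (kafka.host, kafka.consumer.group.id, and so on) are made-up examples for this sketch, not something defined by the article.

import java.io.FileInputStream
import java.util.Properties

// Minimal sketch: read a properties file and map its entries onto the data classes.
// All property key names here are assumptions chosen for this example.
fun loadKafkaConfig(path: String): KafkaConfig {
    val props = Properties().apply { FileInputStream(path).use { load(it) } }
    return KafkaConfig(
        host = props.getProperty("kafka.host"),
        topic = props.getProperty("kafka.topic"),
        consumer = KafkaConsumerConfig(
            groupId = props.getProperty("kafka.consumer.group.id"),
            keyDeserializer = props.getProperty("kafka.consumer.key.deserializer"),
            valueDeserializer = props.getProperty("kafka.consumer.value.deserializer"),
            autoOffsetReset = props.getProperty("kafka.consumer.auto.offset.reset", "latest")
        )
    )
}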

2. Reading data from Kafka


import java.time.Duration
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.flow.collectIndexed
import kotlinx.coroutines.flow.receiveAsFlow
import kotlinx.coroutines.launch
import org.apache.kafka.clients.consumer.CommitFailedException
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.clients.consumer.KafkaConsumer as ApacheKafkaConsumer
import org.apache.kafka.clients.consumer.OffsetAndMetadata
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.errors.TimeoutException
import org.slf4j.LoggerFactory

private typealias Record = ConsumerRecord<String, String>
// Alias the client class so it does not clash with our own KafkaConsumer class below
private typealias RecordConsumer = ApacheKafkaConsumer<String, String>

class KafkaConsumer(
        // Business class that processes each Kafka record
        private val service: Services,
        private val kafkaConfig: KafkaConfig
) {
    private val kafkaConsumerConfig = kafkaConfig.consumer
    // Create the underlying Kafka consumer
    private val consumer: RecordConsumer = createConsumer()
    // How long each poll() call may block waiting for records
    private val pollTimeout = Duration.ofMillis(kafkaConsumerConfig.pollDuration)

    // Buffer up to 1000 records before blocking
    private val recordsChannel = Channel<Record>(MAX_UNCOMMITTED_ITEMS)

    // Tracks the latest offset read from Kafka for each partition
    private val offsets = mutableMapOf<TopicPartition, OffsetAndMetadata>()

    // This function needs to be an extension of the coroutine scope since we're launching a background job.
    private fun CoroutineScope.forwardRecords(from: RecordConsumer, to: Channel<Record>) =
            launch(Dispatchers.IO) {
                while (true) {
                    val records = from.poll(pollTimeout)
                    records.forEach { to.send(it) }
                }
            }

    
    private fun CoroutineScope.consumeRecords() =
            launch(Dispatchers.IO) {
                recordsChannel.receiveAsFlow().collectIndexed { index, record ->
                    // Hand the record to the business service
                    service.set(record)
                    // Record the next offset to commit for this partition
                    offsets[TopicPartition(record.topic(), record.partition())] =
                            OffsetAndMetadata(record.offset() + 1, "")
                    commitIfNeeded(index)
                }
            }

    private fun createConsumer() = RecordConsumer(
            mapOf(
                    "bootstrap.servers" to kafkaConfig.host,
                    "group.id" to kafkaConsumerConfig.groupId,
                    "key.deserializer" to Class.forName(kafkaConsumerConfig.keyDeserializer),
                    "value.deserializer" to Class.forName(kafkaConsumerConfig.valueDeserializer),
                    "max.poll.interval.ms" to kafkaConsumerConfig.pollInterval,
                    "session.timeout.ms" to kafkaConsumerConfig.timeoutSession,
                    "auto.offset.reset" to kafkaConsumerConfig.autoOffsetReset,
                    "enable.auto.commit" to false, // Always commit manually
                    "max.poll.records" to kafkaConsumerConfig.maxPoll,
                    "request.timeout.ms" to kafkaConsumerConfig.timeoutRequest
            )
    )

    suspend fun fetchData() {
        consumer.use { consumer ->
            consumer.subscribe(listOf(kafkaConfig.topic))
            coroutineScope {
                forwardRecords(consumer, recordsChannel)
                consumeRecords()
            }
        }
    }

    /**
     * Attempts an asynchronous commit of the collected offsets.
     * Returns false if the commit call itself fails with a CommitFailedException or a TimeoutException.
     */
    private fun tryCommit(): Boolean {
        return try {
            consumer.commitAsync(offsets, null)
            true
        } catch (e: CommitFailedException) {
            false
        } catch (e: TimeoutException) {
            false
        }
    }

    private fun commitIfNeeded(index: Int) {
        if (index % MAX_UNCOMMITTED_ITEMS == MAX_UNCOMMITTED_ITEMS - 1) {
            tryCommit()
        }
    }

    // Called from a shutdown hook: closing the channel stops consumeRecords,
    // and wakeup() makes the blocked poll() throw WakeupException, ending forwardRecords.
    fun close() {
        logger.info("Closing kafka consumer.")
        recordsChannel.close()
        consumer.wakeup()
    }

    companion object {
        const val MAX_UNCOMMITTED_ITEMS = 1000
        private val logger = LoggerFactory.getLogger(KafkaConsumer::class.java)
    }
}

This example launches two coroutines, forwardRecords and consumeRecords, which respectively read records from Kafka and consume them asynchronously, connected by a channel. Combined with asynchronous processing in the business class, this makes the whole pipeline asynchronous and further improves performance. The offsets map keeps track of the latest offset read from each partition, and commitIfNeeded commits them in batches.
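To show how the pieces fit together, here is a minimal usage sketch. The Services stub, the broker address, topic, and group id are placeholders assumed for this example; the real business class and configuration values are up to your application.

import kotlinx.coroutines.runBlocking
import org.apache.kafka.clients.consumer.ConsumerRecord

// Hypothetical stand-in for the business class; the real Services implementation
// is outside the scope of this article.
class Services {
    fun set(record: ConsumerRecord<String, String>) {
        println("processed offset=${record.offset()} value=${record.value()}")
    }
}

fun main() = runBlocking {
    val kafkaConfig = KafkaConfig(
        host = "localhost:9092",   // placeholder broker address
        topic = "demo-topic",      // placeholder topic
        consumer = KafkaConsumerConfig(
            groupId = "demo-group",
            keyDeserializer = "org.apache.kafka.common.serialization.StringDeserializer",
            valueDeserializer = "org.apache.kafka.common.serialization.StringDeserializer",
            autoOffsetReset = "earliest"
        )
    )
    val kafkaConsumer = KafkaConsumer(Services(), kafkaConfig)

    // close() is meant to run from a shutdown hook: it closes the channel and
    // wakes up the blocked poll() so fetchData() can finish.
    Runtime.getRuntime().addShutdownHook(Thread { kafkaConsumer.close() })

    kafkaConsumer.fetchData()
}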
