KafkaConsumer's "not safe for multi-threaded access" exception in multi-threaded code
By default the Kafka consumer commits offsets automatically. To switch to manual commits, set the configuration parameter enable.auto.commit = false.
In manual-commit mode, a single call to kafkaConsumer.commitSync() commits the offsets of all partitions covered by the most recent poll. A pseudo-code sketch:
Properties kafkaProperties = new Properties();
......
kafkaProperties.put("enable.auto.commit", false);
kafkaProperties.put("auto.offset.reset", "latest");
kafkaProperties.put("isolation.level", "read_committed");
......
// create the consumer once, outside the loop
Consumer<String, String> kafkaConsumer = new KafkaConsumer<>(kafkaProperties);
while (running) {
    ConsumerRecords<String, String> records = kafkaConsumer.poll(500L);
    ......
    kafkaConsumer.commitSync();
}
Sometimes we use a pattern like this: after kafkaConsumer polls messages, instead of processing them immediately, we buffer them in a blocking queue or a RingBuffer, and let other threads drain that queue and process the messages. This decouples the fetch logic from the processing logic. The asynchrony also improves throughput: messages keep being fetched while earlier ones are processed, so slow processing does not stall fetching; fetching only blocks when the queue is full. A pseudo-code sketch:
private ArrayBlockingQueue<ConsumerRecords<String, String>> queue = new ArrayBlockingQueue<>(100);
/**
 * Initialization
 */
public void init() {
    Properties properties = new Properties();
    ......
    properties.put("enable.auto.commit", false);
    properties.put("auto.offset.reset", "latest");
    properties.put("isolation.level", "read_committed");
    this.kafkaConsumer = new KafkaConsumer<>(properties);
    // note: "topic" is not a consumer config entry; the topic name goes to subscribe()
    this.kafkaConsumer.subscribe(Collections.singletonList("mytest"));
    this.running = true;
    new Thread(() -> {
        try {
            testPrintAndCommit();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).start();
}
/**
 * Fetch messages and buffer them in the blocking queue
 */
public void consumeMessage() throws InterruptedException {
    while (running) {
        ConsumerRecords<String, String> records = this.kafkaConsumer.poll(200L);
        if (records != null && records.count() > 0) {
            queue.put(records);
        }
    }
}
/**
 * Print messages and commit offsets
 */
private void testPrintAndCommit() throws InterruptedException {
    while (running) {
        ConsumerRecords<String, String> records = queue.poll(200, TimeUnit.MILLISECONDS);
        if (records == null) {
            continue; // queue.poll returns null on timeout when the queue is empty
        }
        for (ConsumerRecord<String, String> record : records) {
            System.out.println(record.value());
        }
        this.kafkaConsumer.commitSync();
    }
}
However, running this code directly throws an exception: KafkaConsumer is not safe for multi-threaded access. In other words, the KafkaConsumer class is not thread-safe and must not be accessed from multiple threads concurrently. In the example above, kafkaConsumer.poll() and kafkaConsumer.commitSync() run concurrently in two different threads, which triggers the exception. So how do we solve this? Common advice is to keep all kafkaConsumer calls on a single thread, or to fall back to auto-commit, but neither satisfies the requirement described above.
Looking at the exception message again, KafkaConsumer is not safe for multi-threaded access, it says the object is unsafe under concurrent multi-threaded access. Does that mean the exception goes away if we ourselves guarantee thread-safe access to the object? Let's test by guarding the poll() and commitSync() calls with a read-write lock:
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

public void consumeMessage() throws InterruptedException {
    while (running) {
        ConsumerRecords<String, String> records;
        final Lock lock = readWriteLock.readLock();
        lock.lock();
        try {
            records = this.kafkaConsumer.poll(200L);
        } finally {
            lock.unlock();
        }
        if (records != null && records.count() > 0) {
            queue.put(records);
        }
    }
}
private void testPrintAndCommit() throws InterruptedException {
    while (running) {
        ConsumerRecords<String, String> records = queue.poll(200, TimeUnit.MILLISECONDS);
        if (records == null) {
            continue;
        }
        for (ConsumerRecord<String, String> record : records) {
            System.out.println(record.value());
        }
        final Lock lock = readWriteLock.writeLock();
        lock.lock();
        try {
            this.kafkaConsumer.commitSync();
        } finally {
            lock.unlock();
        }
    }
}
With a read lock around poll and a write lock around commit, the two calls are serialized: a ReentrantReadWriteLock's write lock is exclusive against read locks, and since only one thread ever polls, no two read-lock holders overlap either. Running the code again, the KafkaConsumer is not safe for multi-threaded access exception no longer appears.
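The exclusion between the read lock and the write lock can be demonstrated in isolation. The sketch below (class name is just for illustration) simulates the poll thread holding the read lock while the commit thread tries to take the write lock:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwLockDemo {
    public static void main(String[] args) throws Exception {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

        // Simulate the poll thread holding the read lock.
        rw.readLock().lock();

        final boolean[] gotWrite = {true};
        Thread committer = new Thread(() -> {
            // The commit thread cannot take the write lock while any
            // other thread holds the read lock.
            gotWrite[0] = rw.writeLock().tryLock();
        });
        committer.start();
        committer.join();

        System.out.println("write lock acquired while read lock held: " + gotWrite[0]);
        rw.readLock().unlock();
    }
}
```

Because the write lock is also exclusive against other write locks, using the write lock on both sides would work equally well here; the read/write split simply documents which side is the fetch path.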
Now let's look at how Kafka implements this check.
Find KafkaConsumer's poll and commitSync methods (kafka-clients version 1.1.1 is used here):
public ConsumerRecords<K, V> poll(long timeout) {
    acquireAndEnsureOpen();
    try {
        ......
    } finally {
        release();
    }
}

public void commitSync() {
    acquireAndEnsureOpen();
    try {
        coordinator.commitOffsetsSync(subscriptions.allConsumed(), Long.MAX_VALUE);
    } finally {
        release();
    }
}
Both methods call acquireAndEnsureOpen() at the start and release() in a finally block at the end. Here is what those methods do:
private void acquireAndEnsureOpen() {
    acquire();
    if (this.closed) {
        release();
        throw new IllegalStateException("This consumer has already been closed.");
    }
}

private void acquire() {
    long threadId = Thread.currentThread().getId();
    if (threadId != currentThread.get() && !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId))
        throw new ConcurrentModificationException("KafkaConsumer is not safe for multi-threaded access");
    refcount.incrementAndGet();
}

private void release() {
    if (refcount.decrementAndGet() == 0)
        currentThread.set(NO_CURRENT_THREAD);
}
At the start of each method, acquire() runs and checks the current thread id: threadId != currentThread.get() && !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId). Here currentThread is an AtomicLong member of KafkaConsumer, private final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);, and the constant NO_CURRENT_THREAD has the value -1.
For the first thread to enter, the first half of the condition, threadId != currentThread.get(), is certainly true. The second half, !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId), first checks whether currentThread still equals NO_CURRENT_THREAD and, if so, sets it to the current thread id; the CAS succeeds, the negation yields false, and the if branch is not entered.
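The behavior of that condition can be checked in isolation. In the sketch below, mayProceed is a hypothetical helper that returns the inverse of the if condition in acquire(), with 42 and 99 standing in for two thread ids:

```java
import java.util.concurrent.atomic.AtomicLong;

public class CasDemo {
    static final long NO_CURRENT_THREAD = -1L;
    static final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);

    // Inverse of acquire()'s check: true means the caller may proceed.
    static boolean mayProceed(long threadId) {
        return threadId == currentThread.get()
                || currentThread.compareAndSet(NO_CURRENT_THREAD, threadId);
    }

    public static void main(String[] args) {
        System.out.println(mayProceed(42L));  // first "thread": CAS succeeds
        System.out.println(mayProceed(99L));  // second "thread" before release(): both checks fail
        currentThread.set(NO_CURRENT_THREAD); // simulate release() resetting the owner
        System.out.println(mayProceed(99L));  // now the CAS succeeds again
    }
}
```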
At the end of each method, release() runs. It decrements refcount, which tracks the current thread's reentrancy count; once the count drops to zero, currentThread is reset to NO_CURRENT_THREAD. It is now easy to see what happens if another thread calls acquire() after the first thread's acquire() but before its release(): threadId != currentThread.get() is true, and !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId) is also true because currentThread still holds the first thread's id, so the if condition is met and the KafkaConsumer is not safe for multi-threaded access exception is thrown.
So with the lock in place, every poll and commitSync executes serially, which guarantees that the acquire()/release() pairs never overlap, and the check always passes.
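This guard logic can be reproduced outside of the Kafka client. The sketch below re-implements acquire()/release() as shown in the 1.1.1 source (class name is illustrative) and demonstrates that strictly serialized access from two threads succeeds, while overlapping access throws:

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ThreadGuardDemo {
    private static final long NO_CURRENT_THREAD = -1L;
    private final AtomicLong currentThread = new AtomicLong(NO_CURRENT_THREAD);
    private final AtomicInteger refcount = new AtomicInteger(0);

    void acquire() {
        long threadId = Thread.currentThread().getId();
        if (threadId != currentThread.get()
                && !currentThread.compareAndSet(NO_CURRENT_THREAD, threadId))
            throw new ConcurrentModificationException(
                    "KafkaConsumer is not safe for multi-threaded access");
        refcount.incrementAndGet();
    }

    void release() {
        if (refcount.decrementAndGet() == 0)
            currentThread.set(NO_CURRENT_THREAD);
    }

    public static void main(String[] args) throws Exception {
        ThreadGuardDemo guard = new ThreadGuardDemo();

        // Serialized access: each acquire() sees NO_CURRENT_THREAD
        // because the previous release() reset the owner.
        guard.acquire();
        guard.release();
        Thread t1 = new Thread(() -> { guard.acquire(); guard.release(); });
        t1.start();
        t1.join();
        System.out.println("serialized access: ok");

        // Overlapping access: main holds the guard while another thread
        // tries to acquire, so the CAS fails and acquire() throws.
        guard.acquire();
        final boolean[] threw = {false};
        Thread t2 = new Thread(() -> {
            try {
                guard.acquire();
            } catch (ConcurrentModificationException e) {
                threw[0] = true;
            }
        });
        t2.start();
        t2.join();
        guard.release();
        System.out.println("overlapping access threw: " + threw[0]);
    }
}
```

Note that the refcount also makes the guard reentrant: the same thread may nest acquire() calls (for example commitSync() invoked from a rebalance callback during poll()) without triggering the exception.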