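Abstract:
This post grew out of a problem we hit at work: Kafka on Windows runs for a while and then fails with java.nio.file.FileSystemException (the process cannot access the file because it is being used by another process). After researching the issue, we identified its cause and collected several possible fixes. We chose to apply a patch, and an extended period of running has confirmed that this approach works.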
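Problem description:
An earlier project used Kafka on Linux, where it worked very well. Another device scenario also needed Kafka, but this time the operating system was Windows. After deployment on Windows, Kafka crashed frequently during use, and the log showed an error.
The error is: java.nio.file.FileSystemException, the process cannot access the file because it is being used by another process.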
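Approach:
Reference: https://stackoverflow.com/questions/59187659/windows-kafka-java-nio-file-filesystemexception
Research showed that the cause is the rename that Kafka performs during log cleanup. On Linux, one program can rename a file that another program has open without any problem. Windows, however, locks open files, so a rename fails while the file is still open elsewhere. Kafka then throws the exception and the broker goes down.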
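The rename behavior described above can be tried out with a small Java sketch. It is not Kafka code: the class name, the directory names, and the segment file name are hypothetical and chosen only for illustration. The sketch keeps a file open inside a directory, tries to rename the directory, and if the operating system refuses, closes the handle and retries, which is the same "close before rename" idea the patch further down applies. On Linux the first attempt should succeed; on Windows it is expected to fail and the retry after closing should go through, though we have not measured this on every Windows version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class RenameWhileOpenDemo {
    // Hypothetical segment file name, used only for this illustration.
    static final String SEGMENT_NAME = "00000000000000000000.log";

    /**
     * Renames dir to renamedDir while a file inside dir is still open.
     * If the OS rejects the rename, the handle is closed and the rename is retried.
     * Returns true when the first attempt (handle still open) succeeded.
     */
    public static boolean renameWithOpenHandle(Path dir, Path renamedDir) {
        Path segment = dir.resolve(SEGMENT_NAME);
        try {
            FileChannel channel = FileChannel.open(segment, StandardOpenOption.CREATE,
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
            try {
                Files.move(dir, renamedDir);   // attempt with the handle still open
                return true;
            } catch (FileSystemException e) {
                channel.close();               // release the handle first
                Files.move(dir, renamedDir);   // then retry the rename
                return false;
            } finally {
                channel.close();               // closing twice is harmless
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a throwaway directory, runs the demo, and returns the renamed path. */
    public static Path runDemo() {
        try {
            Path base = Files.createTempDirectory("rename-demo");
            Path dir = Files.createDirectory(base.resolve("demo-0"));
            Path renamed = base.resolve("demo-0.renamed");
            boolean firstTry = renameWithOpenHandle(dir, renamed);
            System.out.println("renamed on first attempt: " + firstTry);
            return renamed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        runDemo();
    }
}
```

Running it on both operating systems and comparing the printed flag is a quick way to confirm whether a given machine shows the locking behavior.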
The possible fixes are:
1. Manual restart
Manually delete Kafka's log files, then restart Kafka.
2. Modify the configuration file
log.retention.hours=-1
log.cleaner.enable=false
With these settings Kafka's log cleanup is never triggered, but the logs keep growing without bound.
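Before restarting the broker it can help to confirm that both settings are really present in the configuration. The following is a minimal Java sketch, assuming the configuration is a standard Java properties file; the class and method names are hypothetical, and only the two keys come from the settings above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class RetentionConfigCheck {

    /** Parses properties-format text, such as the contents of the broker config file. */
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns true only when both settings are present with the values shown above. */
    public static boolean cleanupDisabled(Properties props) {
        String hours = props.getProperty("log.retention.hours", "").trim();
        String cleaner = props.getProperty("log.cleaner.enable", "").trim();
        return hours.equals("-1") && cleaner.equalsIgnoreCase("false");
    }

    public static void main(String[] args) {
        // Sample text for illustration; in practice read the real config file instead.
        String sample = "log.retention.hours=-1\nlog.cleaner.enable=false\n";
        System.out.println("cleanup disabled: " + cleanupDisabled(parse(sample)));
    }
}
```

The check is deliberately strict: a file that sets only one of the two keys is reported as not disabled, since cleanup could still run in that case.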
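3. Set up Docker on Windows, then install ZooKeeper and Kafka
This approach does solve the problem, but our machines are low-spec, and adding more components to them is clearly not a good fit. If you have enough resources, this method is a workable option.
Installation guide: https://blog.csdn.net/sayoko06/article/details/104020621
4. Apply a patch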
https://github.com/apache/kafka/pull/6329
https://github.com/apache/kafka/pull/6329/commits/3eceb9ea1d96d545d1c60713ea40e1ce7b354dce
Patch contents:
diff --git a/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java b/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java
index 481bacfa3f..f81e62daeb 100644
--- a/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java
+++ b/clients/src/main/java/org/apache/kafka/common/record/FileRecords.java
@@ -110,14 +110,7 @@ public class FileRecords extends AbstractRecords implements Closeable {
public FileChannel channel() {
if (OperatingSystem.IS_WINDOWS) {
synchronized (mutex) {
- if (channel == null) {
- try {
- channel = FileChannel.open(file.toPath(), StandardOpenOption.CREATE, StandardOpenOption.READ,
- StandardOpenOption.WRITE);
- } catch (IOException e) {
- throw new UncheckedIOException(e);
- }
- }
+ reopenChannelIfClosed();
return channel;
}
} else {
@@ -187,8 +180,25 @@ public class FileRecords extends AbstractRecords implements Closeable {
* Commit all written data to the physical disk
*/
public void flush() throws IOException {
- if (channel != null) {
- channel.force(true);
+ if (OperatingSystem.IS_WINDOWS) {
+ synchronized (mutex) {
+ reopenChannelIfClosed();
+ }
+ } else {
+ if (channel != null) {
+ channel.force(true);
+ }
+ }
+ }
+
+ private void reopenChannelIfClosed() {
+ if (channel == null) {
+ try {
+ channel = FileChannel.open(file.toPath(), StandardOpenOption.CREATE, StandardOpenOption.READ,
+ StandardOpenOption.WRITE);
+ } catch (IOException e) {
+ throw new UncheckedIOException(e);
+ }
}
}
diff --git a/core/src/main/scala/kafka/log/Log.scala b/core/src/main/scala/kafka/log/Log.scala
index 5ad3c3e581..41158fd76b 100644
--- a/core/src/main/scala/kafka/log/Log.scala
+++ b/core/src/main/scala/kafka/log/Log.scala
@@ -41,7 +41,7 @@ import org.apache.kafka.common.record.FileRecords.TimestampAndOffset
import org.apache.kafka.common.record._
import org.apache.kafka.common.requests.FetchResponse.AbortedTransaction
import org.apache.kafka.common.requests.{EpochEndOffset, ListOffsetRequest}
-import org.apache.kafka.common.utils.{Time, Utils}
+import org.apache.kafka.common.utils.{OperatingSystem, Time, Utils}
import org.apache.kafka.common.{KafkaException, TopicPartition}
import scala.collection.JavaConverters._
@@ -758,6 +758,11 @@ class Log(@volatile var dir: File,
lock synchronized {
maybeHandleIOException(s"Error while renaming dir for $topicPartition in log dir ${dir.getParent}") {
val renamedDir = new File(dir.getParent, name)
+
+ if (OperatingSystem.IS_WINDOWS) {
+ this.close()
+ }
+
Utils.atomicMoveWithFallback(dir.toPath, renamedDir.toPath)
if (renamedDir != dir) {
dir = renamedDir
Implementation:
We took the following patched version of the source and recompiled it:
https://github.com/apache/kafka/tree/0baf9c158b5681a55df4de3a0e6193d32b1433ff
Compiled package:
Link: https://pan.baidu.com/s/18L2kCY2WXsKqqZoEDJSaLA
Extraction code: vaj2
We have verified this package many times, and it resolves the problem.
References:
https://hiddenpps.blog.csdn.net/article/details/80418297?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromBaidu-1.not_use_machine_learn_pai&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromBaidu-1.not_use_machine_learn_pai
https://blog.csdn.net/u013160932/article/details/79874709?utm_medium=distribute.pc_relevant.none-task-blog-searchFromBaidu-4.not_use_machine_learn_pai&depth_1-utm_source=distribute.pc_relevant.none-task-blog-searchFromBaidu-4.not_use_machine_learn_pai
https://blog.csdn.net/qq_40125653/article/details/111867706
https://www.bbsmax.com/A/ke5jPeW7zr/