Starting a Flink job on a YARN cluster threw the following exception:
Exception in thread "Thread-5" java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164)
at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183)
at org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2737)
at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:2993)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2952)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2925)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2805)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1199)
at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1787)
at org.apache.hadoop.util.ShutdownHookManager.getShutdownTimeout(ShutdownHookManager.java:183)
at org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:145)
at org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
The job still ran normally, but to be safe I googled the error and found the official issue:
https://issues.apache.org/jira/browse/FLINK-19916
This is because Hadoop 3 starts asynchronous threads to execute some shutdown hooks.
These hooks are run after the job is executed, as a result, the classloader has been released, but in hooks, configuration still holds the released classloader, so it will fail to throw an exception in this asynchronous thread.
Now it doesn't affect our function, it just prints the exception stack on the console.
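In short: Hadoop 3 introduced asynchronous threads to run its shutdown hooks. These hooks run after the job has finished, by which point Flink has already released the user-code classloader. The Hadoop Configuration used inside the hook still holds a reference to that released classloader, so the exception is thrown in the asynchronous thread. It does not affect normal functionality; it only prints the stack trace to the console.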
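One workaround suggested online, and the same option the exception message itself points to: in the Flink configuration file flink-conf.yaml, set classloader.check-leaked-classloader: false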
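For reference, a minimal sketch of what that entry might look like in flink-conf.yaml. The key name comes from the exception message above; where the file lives (typically the conf directory of the Flink distribution) is an assumption about a standard installation.

```yaml
# flink-conf.yaml
# Turns off Flink's check for access to an already-closed user-code classloader.
# This only suppresses the IllegalStateException shown above; it does not change
# how Hadoop's shutdown hooks behave.
classloader.check-leaked-classloader: false
```

Since the check exists to catch real classloader leaks, it is probably worth disabling only when the stack trace points at a third-party library, as it does here with Hadoop's ShutdownHookManager.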