Error log analysis
After starting the Spark Thrift Server on Spark 1.6, Beeline kept failing with this error:
0: jdbc:hive2://10.59.34.204:10000> show tables;
Error: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient (state=,code=0)
The server-side error log looks like this:
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1562)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:67)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:82)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3313)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3332)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1184)
... 60 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1560)
... 65 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not org.apache.hadoop.hive.metastore.MetaStoreFilterHook
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2234)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.loadFilterHooks(HiveMetaStoreClient.java:248)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:200)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 70 more
Caused by: java.lang.RuntimeException: class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not org.apache.hadoop.hive.metastore.MetaStoreFilterHook
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2228)
... 73 more
The source code corresponding to the last two stack frames:
// HiveMetaStoreClient
private MetaStoreFilterHook loadFilterHooks() throws IllegalStateException {
  Class<? extends MetaStoreFilterHook> authProviderClass = conf.getClass(
      HiveConf.ConfVars.METASTORE_FILTER_HOOK.varname,
      DefaultMetaStoreFilterHookImpl.class,
      MetaStoreFilterHook.class);
  String msg = "Unable to create instance of " + authProviderClass.getName() + ": ";
  // ...
}
// Configuration
public <U> Class<? extends U> getClass(String name,
                                       Class<? extends U> defaultValue,
                                       Class<U> xface) {
  try {
    Class<?> theClass = getClass(name, defaultValue);
    if (theClass != null && !xface.isAssignableFrom(theClass))
      throw new RuntimeException(theClass + " not " + xface.getName());
    // ...
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}
So the Class object is loaded reflectively through the Configuration instance.
It was confirmed that the runtime environment contains exactly one copy each of DefaultMetaStoreFilterHookImpl.class and MetaStoreFilterHook.class, and the source clearly shows that DefaultMetaStoreFilterHookImpl implements MetaStoreFilterHook.
Tracing shows, however, that DefaultMetaStoreFilterHookImpl is loaded by Launcher$AppClassLoader, while MetaStoreFilterHook is loaded by the IsolatedClientLoader's class loader. To the JVM these are therefore two unrelated types, and isAssignableFrom finds no inheritance relationship between them.
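To make the failure concrete, here is a minimal sketch (the jar path is hypothetical): the same class file defined by two unrelated loaders yields two distinct Class objects, and isAssignableFrom treats them as strangers.
import java.net.{URL, URLClassLoader}
object ClassIdentityDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical jar containing the metastore classes.
    val jars = Array(new URL("file:/tmp/hive-metastore.jar"))
    // parent = null: neither loader delegates to the application class loader,
    // so each one defines its own copy of the class.
    val loaderA = new URLClassLoader(jars, null)
    val loaderB = new URLClassLoader(jars, null)
    val a = loaderA.loadClass("org.apache.hadoop.hive.metastore.MetaStoreFilterHook")
    val b = loaderB.loadClass("org.apache.hadoop.hive.metastore.MetaStoreFilterHook")
    println(a == b)                // false: different defining loaders
    println(a.isAssignableFrom(b)) // false: the JVM sees two unrelated types
  }
}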
Next, let's see why these two classes end up being loaded by two different class loaders.
Loading of MetaStoreFilterHook
MetaStoreFilterHook is loaded by the IsolatedClientLoader's class loader. When Spark talks to Hive, the withHiveState function swaps in a different classLoader for the duration of the call and switches back to the original ClassLoader once the work is done. This is where the IsolatedClientLoader's class loader gets installed.
// ClientWrapper
def withHiveState[A](f: => A): A = retryLocked {
  val original = Thread.currentThread().getContextClassLoader
  // The classloader in clientLoader could be changed after addJar, always use the latest
  // classloader
  state.getConf.setClassLoader(clientLoader.classLoader)
  Thread.currentThread().setContextClassLoader(clientLoader.classLoader)
  // Set the thread local metastore client to the client associated with this ClientWrapper.
  Hive.set(client)
  // setCurrentSessionState will use the classLoader associated
  // with the HiveConf in `state` to override the context class loader of the current
  // thread.
  shim.setCurrentSessionState(state)
  val ret = try f finally {
    Thread.currentThread().setContextClassLoader(original)
    state.getConf.setClassLoader(original)
  }
  ret
}
Most of the code that runs inside f interacts with Hive, so by default the classes it needs are loaded by the IsolatedClientLoader's class loader. Consequently, when the code references MetaStoreFilterHook.class, the loader involved is the IsolatedClientLoader's class loader.
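The swap-and-restore idiom above boils down to a few lines; this is a sketch of the pattern in our own words, not Spark's code:
// A generic sketch of withHiveState's context-class-loader swap: run `f`
// under `loader`, then restore the original loader even if `f` throws.
def withContextClassLoader[A](loader: ClassLoader)(f: => A): A = {
  val original = Thread.currentThread().getContextClassLoader
  Thread.currentThread().setContextClassLoader(loader)
  try f finally Thread.currentThread().setContextClassLoader(original)
}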
Loading of DefaultMetaStoreFilterHookImpl
DefaultMetaStoreFilterHookImpl is loaded through the getClass method of Hadoop's Configuration class. The actual loading happens in getClassByNameOrNull(), by which point the classLoader to use has already been fixed: if that classLoader can load the class (including lazily), the class is returned; otherwise loading simply fails. Here that classLoader is Launcher$AppClassLoader.
// Configuration : Class<?> theClass = getClass(name, defaultValue);
// Configuration : getClassByName(valueString);
// Configuration : getClassByNameOrNull(name);
/**
* Load a class by name, returning null rather than throwing an exception
* if it couldn't be loaded. This is to avoid the overhead of creating
* an exception.
*
* @param name the class name
* @return the class object, or null if it could not be found.
*/
public Class<?> getClassByNameOrNull(String name) {
  Map<String, WeakReference<Class<?>>> map;
  synchronized (CACHE_CLASSES) {
    map = CACHE_CLASSES.get(classLoader);
    if (map == null) {
      map = Collections.synchronizedMap(
          new WeakHashMap<String, WeakReference<Class<?>>>());
      CACHE_CLASSES.put(classLoader, map);
    }
  }
  Class<?> clazz = null;
  WeakReference<Class<?>> ref = map.get(name);
  if (ref != null) {
    clazz = ref.get();
  }
  if (clazz == null) {
    try {
      clazz = Class.forName(name, true, classLoader);
    } catch (ClassNotFoundException e) {
      // Leave a marker that the class isn't found
      map.put(name, new WeakReference<Class<?>>(NEGATIVE_CACHE_SENTINEL));
      return null;
    }
    // two putters can race here, but they'll put the same class
    map.put(name, new WeakReference<Class<?>>(clazz));
    return clazz;
  } else if (clazz == NEGATIVE_CACHE_SENTINEL) {
    return null; // not found
  } else {
    // cache hit
    return clazz;
  }
}
The class loader inside HiveConf
This per-loader caching design provides good resource isolation in concurrent programs. Both Hadoop and Hive also expose an entry point for setting the classLoader explicitly:
// Hadoop : Configuration
// Hive : public class HiveConf extends Configuration
public void setClassLoader(ClassLoader classLoader) {
  this.classLoader = classLoader;
}
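As a hypothetical usage of this entry point: once setClassLoader is called, name-based lookups such as getClassByName resolve through the supplied loader (the jar URL below is made up).
import java.net.{URL, URLClassLoader}
import org.apache.hadoop.conf.Configuration
object ConfLoaderDemo {
  def main(args: Array[String]): Unit = {
    // Hypothetical extra jar; the parent is the loader of this class.
    val isolated = new URLClassLoader(Array(new URL("file:/tmp/extra.jar")), getClass.getClassLoader)
    val conf = new Configuration()
    conf.setClassLoader(isolated)
    // Resolved via `isolated` (possibly delegating up its parent chain).
    val cls = conf.getClassByName("org.apache.hadoop.conf.Configuration")
    println(cls.getClassLoader)
  }
}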
Now the puzzle: in withHiveState, Spark already calls state.getConf.setClassLoader(clientLoader.classLoader) to set the classLoader, so why is the classLoader inside this Configuration still Java's default AppClassLoader?
state.getConf.setClassLoader(clientLoader.classLoader) does change the class loader of the HiveConf held by the SessionState. The problem lies in the client object inside ClientWrapper that is responsible for talking to Hive:
// ClientWrapper
def client: Hive = {
  if (clientLoader.cachedHive != null) {
    clientLoader.cachedHive.asInstanceOf[Hive]
  } else {
    val c = Hive.get(conf)
    clientLoader.cachedHive = c
    c
  }
}
// IsolatedClientLoader
/**
* The place holder for shared Hive client for all the HiveContext sessions (they share an
* IsolatedClientLoader).
*/
private[hive] var cachedHive: Any = null
As the code shows, IsolatedClientLoader caches one Hive client that is shared across all HiveContext sessions. Once val c = Hive.get(conf) has initialized it, every later call through the same clientLoader gets back that cached client, whose Hive conf (and hence class loader) was fixed when the instance was created.
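The essence of the staleness can be reduced to a small sketch with hypothetical types: whoever initializes the cache pins its conf, and every later caller gets that old conf back.
// Hypothetical types, not Spark's: the cached client keeps the conf
// (and therefore the class loader) it was first created with.
class Conf(var loader: ClassLoader)
class CachedClient(val conf: Conf)
object ClientCache {
  private var cached: CachedClient = _
  def client(conf: Conf): CachedClient = {
    if (cached == null) cached = new CachedClient(conf) // first caller's conf wins
    cached // later callers receive the same client and its original conf/loader
  }
}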
The IsolatedClientLoader's classLoader loads classes as follows: isSharedClass() decides the strategy for each class, and Hadoop classes, when sharing is enabled (the default), are delegated to baseClassLoader rather than loaded in isolation. That is why we end up seeing Configuration loaded by Launcher$AppClassLoader:
// IsolatedClientLoader
private[hive] val classLoader: MutableURLClassLoader = {
  val isolatedClassLoader =
    if (isolationOn) {
      new URLClassLoader(allJars, rootClassLoader) {
        override def loadClass(name: String, resolve: Boolean): Class[_] = {
          val loaded = findLoadedClass(name)
          if (loaded == null) doLoadClass(name, resolve) else loaded
        }
        def doLoadClass(name: String, resolve: Boolean): Class[_] = {
          val classFileName = name.replaceAll("\\.", "/") + ".class"
          if (isBarrierClass(name)) {
            // For barrier classes, we construct a new copy of the class.
            val bytes = IOUtils.toByteArray(baseClassLoader.getResourceAsStream(classFileName))
            logDebug(s"custom defining: $name - ${util.Arrays.hashCode(bytes)}")
            defineClass(name, bytes, 0, bytes.length)
          } else if (!isSharedClass(name)) {
            logDebug(s"hive class: $name - ${getResource(classToPath(name))}")
            super.loadClass(name, resolve)
          } else {
            // For shared classes, we delegate to baseClassLoader.
            logDebug(s"shared class: $name")
            baseClassLoader.loadClass(name)
          }
        }
      }
    } else {
      baseClassLoader
    }
  // ...
  new NonClosableMutableURLClassLoader(isolatedClassLoader)
}
protected def isSharedClass(name: String): Boolean = {
  val isHadoopClass =
    name.startsWith("org.apache.hadoop.") && !name.startsWith("org.apache.hadoop.hive.")
  name.contains("slf4j") ||
  name.contains("log4j") ||
  name.startsWith("org.apache.spark.") ||
  (sharesHadoopClasses && isHadoopClass) ||
  name.startsWith("scala.") ||
  (name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
  name.startsWith("java.lang.") ||
  name.startsWith("java.net") ||
  sharedPrefixes.exists(name.startsWith)
}
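A few illustrative inputs (assuming the default sharesHadoopClasses = true and empty sharedPrefixes) show why Configuration and the Hive classes part ways:
// Illustrative results of isSharedClass under the defaults:
isSharedClass("org.apache.hadoop.conf.Configuration")                  // true  -> baseClassLoader
isSharedClass("org.apache.hadoop.hive.metastore.MetaStoreFilterHook") // false -> isolated loader
isSharedClass("org.apache.spark.sql.SQLContext")                      // true  -> baseClassLoader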
The rest follows directly: because Configuration itself was loaded by Launcher$AppClassLoader, the classes that Configuration goes on to load are, by default, also loaded by Launcher$AppClassLoader.
Class loaders used in the program
- ExtClassLoader
- AppClassLoader
- class URLClassLoader: the JDK's built-in ClassLoader; loads classes from the jars at the given locations
- class MutableURLClassLoader extends URLClassLoader: merely overrides addURL(url: URL) and getURLs(); no other extension
- class ChildFirstURLClassLoader extends MutableURLClassLoader: in loadClass(), tries super.loadClass() first and only then parent.loadClass() (see the sketch after this list)
- class NonClosableMutableURLClassLoader extends MutableURLClassLoader: overrides close() so this classLoader can never be closed
- IsolatedClientLoader: a loader for the Hive client, not a class loader itself; its internal classLoader field is a NonClosableMutableURLClassLoader instance
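As promised above, a minimal sketch of the child-first idea (our own illustration, not Spark's actual ChildFirstURLClassLoader):
import java.net.{URL, URLClassLoader}
// Child-first: search our own URLs before delegating to the real parent.
// Passing null as the super parent keeps super.loadClass() away from the
// application class loader.
class ChildFirstLoader(urls: Array[URL], realParent: ClassLoader)
    extends URLClassLoader(urls, null) {
  override def loadClass(name: String, resolve: Boolean): Class[_] =
    try super.loadClass(name, resolve) // our URLs (plus bootstrap) first
    catch { case _: ClassNotFoundException => realParent.loadClass(name) }
}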
Why Spark-SQL does not hit this problem
The investigation above makes the root cause clear: the same class loaded by two different class loaders no longer matches, which produces the error we saw. So why doesn't Spark-SQL run into a similar problem?
First, a few terms need to be pinned down:
// HiveContext
override def newSession(): HiveContext = {
  new HiveContext(
    sc = sc,
    cacheManager = cacheManager,
    listener = listener,
    execHive = executionHive.newSession(),
    metaHive = metadataHive.newSession(),
    isRootContext = false)
}
// ClientWrapper
def newSession(): ClientWrapper = {
  // created via `new` (isolationOn = false) or via reflection (isolationOn = true)
  clientLoader.createClient().asInstanceOf[ClientWrapper]
}
- Session & HiveContext: each new session is created through HiveContext's newSession() method, which builds a new HiveContext object. From the HiveContext comment: "Returns a new HiveContext as new session, which will have separated SQLConf, UDF/UDAF, temporary tables and SessionState, but sharing the same CacheManager, IsolatedClientLoader and Hive client (both of execution and metadata) with existing HiveContext."
- SessionState: stores the data shared within the current session (including the Hive conf). The newSession() call above creates a fresh ClientWrapper and SessionState, but objects such as the Hive client remain shared.
- val clientLoader: IsolatedClientLoader: the Hive client loader, responsible for sharing selected classes and for wrapping the Hive client.
- clientLoader.cachedHive: "The place holder for shared Hive client for all the HiveContext sessions"; that is, all Hive operations going through the same clientLoader use the same Hive client object (although each Hive client's conf can be completely different).
- Hive instance: the client that talks to Hive, cached per thread (a ThreadLocal variable); on initialization it is created from the conf of the current SessionState.
When we use Spark-SQL, everything runs in the same Hive session. When that session was created, the HiveConf's classLoader was set to the IsolatedClientLoader.classLoader property, which matches the current thread's class loader, so no error occurs later:
19/06/25 14:51:56 INFO client.ClientWrapper: init new HiveClient : clientLoader=org.apache.spark.sql.hive.client.IsolatedClientLoader@35536760, conf classLoader=org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@194224ca, Hive=org.apache.hadoop.hive.ql.metadata.Hive@3278d065,
19/06/25 14:51:56 INFO client.ClientWrapper: withHiveState client = org.apache.hadoop.hive.ql.metadata.Hive@3278d065 conf classLoader = org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@194224ca
When we create a new session through newSession, the clientLoader object is the same one, but the classLoader held by the new session's Hive conf is a plain MutableURLClassLoader. Classes loaded through that Configuration therefore come from the MutableURLClassLoader, while the current thread loads classes through the IsolatedClientLoader's internal classLoader:
19/06/25 14:52:12 INFO client.ClientWrapper: newSession createClient client = org.apache.spark.sql.hive.client.ClientWrapper@af9eb18, clientLoader = org.apache.spark.sql.hive.client.IsolatedClientLoader@35536760 new SessionState initClassLoader conf loader = org.apache.hadoop.hive.ql.exec.UDFClassLoader@226b8d79
19/06/25 14:52:12 INFO client.ClientWrapper: withHiveState client = org.apache.hadoop.hive.ql.metadata.Hive@491893f8 conf classLoader = org.apache.spark.util.MutableURLClassLoader@38bc8ab5
19/06/25 14:52:12 INFO client.ClientWrapper: withHiveState current thread = org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@197ce367 state classLoader = org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@197ce367 conf classLoader = org.apache.spark.util.MutableURLClassLoader@38bc8ab5
Creation hierarchy of the objects involved
HiveContext
  metadataHive --> new IsolatedClientLoader(isolationOn = true)
    cachedHive   --> never changes
    hiveConf     --> changes along with the SessionState
    classLoader  --> changes
  executionHive --> new IsolatedClientLoader(isolationOn = false)
    cachedHive   --> never changes
    hiveConf     --> changes along with the SessionState
    classLoader  --> changes
Each of these two objects uses its own independent IsolatedClientLoader.
Dynamically updating the class loader of the HiveClient's Configuration
Updating the ClassLoader ourselves
In Spark's design, both the current thread's class loader and the conf's class loader are meant to track the classLoader of the code currently executing. The cleanest fix follows the same idea: make the Hive conf track the current ClassLoader dynamically.
The modified ClientWrapper code:
// ClientWrapper
def client: Hive = {
  if (clientLoader.cachedHive != null) {
    val c = clientLoader.cachedHive.asInstanceOf[Hive]
    c.getConf.setClassLoader(clientLoader.classLoader)
    c
  } else {
    val c = Hive.get(conf)
    clientLoader.cachedHive = c
    c
  }
}
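With this change, every access to the cached Hive client first re-points its conf at the latest clientLoader.classLoader, so class lookups driven by that conf and the classes referenced by the isolated Hive code now resolve through the same loader.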
The official patch
Looking at the upstream Spark code, a patch from 2017 (SPARK-19540) dynamically updates the ClassLoader of the HiveClient's configuration:
/**
 * Runs `f` with ThreadLocal session state and classloaders configured for this version of hive.
 */
def withHiveState[A](f: => A): A = retryLocked {
  val original = Thread.currentThread().getContextClassLoader
  // The classloader in clientLoader could be changed after addJar, always use the latest
  // classloader
  state.getConf.setClassLoader(clientLoader.classLoader)
  Thread.currentThread().setContextClassLoader(clientLoader.classLoader)
  // Set the thread local metastore client to the client associated with this ClientWrapper.
  Hive.set(client)
  // Replace conf in the thread local Hive with current conf
  // (here the thread-local Hive client is looked up by conf, and that
  // client's conf is updated in step)
  Hive.get(conf)
  // setCurrentSessionState will use the classLoader associated
  // with the HiveConf in `state` to override the context class loader of the current
  // thread.
  shim.setCurrentSessionState(state)
  val ret = try f finally {
    Thread.currentThread().setContextClassLoader(original)
    state.getConf.setClassLoader(original)
  }
  ret
}