HBase Internals: Coprocessors Explained in One Article
1. Introduction
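An HBase coprocessor is built by implementing the Coprocessor interface, which has four subtypes: RegionCoprocessor, RegionServerCoprocessor, WALCoprocessor, and MasterCoprocessor. The Coprocessor methods cover the coprocessor's lifecycle (loading, unloading, and so on). Each subtype also declares an accessor such as getMasterObserver(), which the implementation provides to expose its observer.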
2. Types of coprocessor
2.1 Endpoint
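Implementing one of the four Coprocessor interfaces is only the first step. The next step is to implement what the coprocessor actually does.
For stored-procedure-like behavior, the coprocessor exposes its services through getServices(), and the host registers them on the region server side (RegionServerServices) or on the master (MasterServices). This is the endpoint style.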
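The registration step can be pictured with a small, self-contained model. Everything named here (`SketchService`, `SketchCoprocessor`, `SketchServiceRegistry`, `EchoEndpoint`) is a hypothetical stand-in rather than an HBase class. Only the overall shape is taken from the text: the coprocessor exposes services through a `getServices()`-style method and the host registers each one.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a service exposed by an endpoint coprocessor. */
interface SketchService {
    String name();
    String call(String request);
}

/** Hypothetical stand-in for a coprocessor; by default it exposes no services. */
interface SketchCoprocessor {
    default Iterable<SketchService> getServices() {
        return Collections.emptyList();
    }
}

/** Hypothetical host-side registry, playing the role the text gives to MasterServices. */
final class SketchServiceRegistry {
    private final Map<String, SketchService> services = new LinkedHashMap<>();

    /** Registers a service under its name; returns false if that name is already taken. */
    boolean registerService(SketchService service) {
        return services.putIfAbsent(service.name(), service) == null;
    }

    /** Registers every service the coprocessor exposes. */
    void registerAll(SketchCoprocessor coprocessor) {
        for (SketchService service : coprocessor.getServices()) {
            registerService(service);
        }
    }

    /** Dispatches a request to a registered service by name. */
    String invoke(String serviceName, String request) {
        SketchService service = services.get(serviceName);
        if (service == null) {
            throw new IllegalArgumentException("No such service: " + serviceName);
        }
        return service.call(request);
    }
}

/** Endpoint-style coprocessor: server-side logic that clients call by name. */
final class EchoEndpoint implements SketchCoprocessor {
    @Override
    public Iterable<SketchService> getServices() {
        SketchService echo = new SketchService() {
            @Override public String name() { return "EchoService"; }
            @Override public String call(String request) { return "echo:" + request; }
        };
        return Collections.singletonList(echo);
    }
}
```

In this model a client would reach the logic with `registry.invoke("EchoService", "ping")` after `registry.registerAll(new EchoEndpoint())`. The real endpoint mechanism goes through RPC services, which this sketch deliberately leaves out.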
2.2 Observer
A coprocessor that implements one of the following interfaces follows the observer pattern:
MasterObserver
RegionObserver
RegionServerObserver
WALObserver
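To make the observer style concrete, here is a minimal sketch in plain Java. The types `SketchMasterObserver`, `SketchMasterCoprocessor`, `AuditCoprocessor`, and `SketchMasterHost` are hypothetical simplifications, and the hook names are illustrative. The sketch only shows the pattern: the coprocessor hands out its observer through an accessor, and the host calls pre and post hooks around the real operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for an observer interface: every hook defaults to a no-op. */
interface SketchMasterObserver {
    default void preCreateTable(String tableName) {}
    default void postCreateTable(String tableName) {}
}

/** Hypothetical stand-in for a coprocessor type that exposes its observer through an accessor. */
interface SketchMasterCoprocessor {
    default Optional<SketchMasterObserver> getMasterObserver() {
        return Optional.empty();
    }
}

/** A coprocessor that acts as its own observer and records table creations. */
final class AuditCoprocessor implements SketchMasterCoprocessor, SketchMasterObserver {
    final List<String> events = new ArrayList<>();

    @Override
    public Optional<SketchMasterObserver> getMasterObserver() {
        return Optional.of(this);
    }

    @Override
    public void preCreateTable(String tableName) {
        events.add("pre:" + tableName);
    }

    @Override
    public void postCreateTable(String tableName) {
        events.add("post:" + tableName);
    }
}

/** Hypothetical host: wraps the real operation with the observers' hooks. */
final class SketchMasterHost {
    private final List<SketchMasterCoprocessor> coprocessors = new ArrayList<>();

    void add(SketchMasterCoprocessor coprocessor) {
        coprocessors.add(coprocessor);
    }

    void createTable(String tableName, List<String> catalog) {
        for (SketchMasterCoprocessor cp : coprocessors) {
            cp.getMasterObserver().ifPresent(o -> o.preCreateTable(tableName));
        }
        catalog.add(tableName); // the actual operation being observed
        for (SketchMasterCoprocessor cp : coprocessors) {
            cp.getMasterObserver().ifPresent(o -> o.postCreateTable(tableName));
        }
    }
}
```

A coprocessor that returns an empty `Optional` from the accessor is simply skipped by the host, so not every loaded coprocessor has to be an observer.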
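3. How coprocessors are loaded
Each type of coprocessor is loaded differently. This article walks through two of them: master and region.
3.1 Loading on the master
When HMaster starts, it creates a MasterCoprocessorHost object (a region server likewise creates its own coprocessor host). MasterCoprocessorHost provides the coprocessor framework and environment, and HMaster interacts with the loaded coprocessors through this class.
loadSystemCoprocessors is called only once, at startup. It loads the configured coprocessors and sets up their environments.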
protected void loadSystemCoprocessors(Configuration conf, String confKey) {
boolean coprocessorsEnabled = conf.getBoolean(COPROCESSORS_ENABLED_CONF_KEY,
DEFAULT_COPROCESSORS_ENABLED);
if (!coprocessorsEnabled) {
return;
}
Class<?> implClass;
// Read the coprocessor class names from the configuration; confKey is one of:
// MASTER = "hbase.coprocessor.master.classes";
// WAL = "hbase.coprocessor.wal.classes";
// REGIONSERVER = "hbase.coprocessor.regionserver.classes";
String[] defaultCPClasses = conf.getStrings(confKey);
if (defaultCPClasses == null || defaultCPClasses.length == 0)
return;
int priority = Coprocessor.PRIORITY_SYSTEM;
for (String className : defaultCPClasses) {
className = className.trim();
if (findCoprocessor(className) != null) {
// If already loaded will just continue
LOG.warn("Attempted duplicate loading of " + className + "; skipped");
continue;
}
// Take this class's class loader and make it the thread's context class loader
ClassLoader cl = this.getClass().getClassLoader();
Thread.currentThread().setContextClassLoader(cl);
try {
implClass = cl.loadClass(className);
// Instantiate the class and create its environment (duplicates were already skipped above)
E env = checkAndLoadInstance(implClass, priority, conf);
if (env != null) {
this.coprocEnvironments.add(env);
LOG.info("System coprocessor {} loaded, priority={}.", className, priority);
++priority;
}
} catch (Throwable t) {
// We always abort if system coprocessors cannot be loaded
abortServer(className, t);
}
}
}
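The de-duplication and priority bookkeeping in the loop above can be isolated in a few lines. This is a simplified sketch, not the HBase implementation: `SystemLoadSketch` and `planLoad` are hypothetical names, the comma-separated string stands in for `conf.getStrings(confKey)`, and the value of `PRIORITY_SYSTEM` is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SystemLoadSketch {
    /** Assumed stand-in for Coprocessor.PRIORITY_SYSTEM; the exact value is illustrative. */
    static final int PRIORITY_SYSTEM = Integer.MAX_VALUE / 4;

    /**
     * Maps each configured class name to the priority it would receive:
     * names are trimmed, duplicates are skipped, and the priority grows by
     * one for every coprocessor that is accepted.
     */
    static Map<String, Integer> planLoad(String configuredClasses) {
        Map<String, Integer> plan = new LinkedHashMap<>();
        if (configuredClasses == null) {
            return plan; // nothing configured
        }
        int priority = PRIORITY_SYSTEM;
        for (String className : configuredClasses.split(",")) {
            className = className.trim();
            if (className.isEmpty() || plan.containsKey(className)) {
                continue; // blank entry or already planned: skip it
            }
            plan.put(className, priority);
            ++priority;
        }
        return plan;
    }
}
```

For example, `planLoad("org.example.A, org.example.B ,org.example.A")` yields two entries in configuration order, with `org.example.B` one priority step after `org.example.A`. The class names are made up for the example.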
Next, checkAndLoadInstance: it checks the class type and creates the coprocessor environment.
public E checkAndLoadInstance(Class<?> implClass, int priority, Configuration conf)
throws IOException {
// create the instance
C impl;
try {
// Check the class type and instantiate it
impl = checkAndGetInstance(implClass);
if (impl == null) {
LOG.error("Cannot load coprocessor " + implClass.getSimpleName());
return null;
}
} catch (InstantiationException|IllegalAccessException e) {
throw new IOException(e);
}
// create the environment
E env = createEnvironment(impl, priority, loadSequence.incrementAndGet(), conf);
assert env instanceof BaseEnvironment;
((BaseEnvironment<C>) env).startup();
// HBASE-4014: maintain list of loaded coprocessors for later crash analysis
// if server (master or regionserver) aborts.
coprocessorNames.add(implClass.getName());
return env;
}
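The sequence in checkAndLoadInstance (wrap the instance in an environment, assign a load sequence number, start it, record its name) can be sketched without any HBase dependency. All types below are hypothetical stand-ins. In particular, that `startup()` ends up calling the coprocessor's own `start` callback is an assumption of this sketch and is not shown in the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a coprocessor with lifecycle callbacks. */
interface LifecycleCoprocessor {
    default void start(LifecycleEnvironment env) {}
    default void stop(LifecycleEnvironment env) {}
}

/** Hypothetical stand-in for the environment wrapping one loaded instance. */
final class LifecycleEnvironment {
    final LifecycleCoprocessor instance;
    final int priority;
    final int seq;
    private boolean active;

    LifecycleEnvironment(LifecycleCoprocessor instance, int priority, int seq) {
        this.instance = instance;
        this.priority = priority;
        this.seq = seq;
    }

    /** Starts the coprocessor once; repeated calls are ignored. */
    void startup() {
        if (!active) {
            instance.start(this);
            active = true;
        }
    }

    /** Stops the coprocessor if it is running. */
    void shutdown() {
        if (active) {
            instance.stop(this);
            active = false;
        }
    }

    boolean isActive() {
        return active;
    }
}

/** Hypothetical host that loads instances in order and remembers what it loaded. */
final class LifecycleHost {
    private final AtomicInteger loadSequence = new AtomicInteger();
    private final List<LifecycleEnvironment> environments = new ArrayList<>();
    private final List<String> loadedNames = new ArrayList<>();

    LifecycleEnvironment load(LifecycleCoprocessor impl, int priority) {
        LifecycleEnvironment env =
            new LifecycleEnvironment(impl, priority, loadSequence.incrementAndGet());
        env.startup();
        loadedNames.add(impl.getClass().getName()); // kept for later diagnostics
        environments.add(env);
        return env;
    }

    void shutdownAll() {
        for (LifecycleEnvironment env : environments) {
            env.shutdown();
        }
    }

    List<String> loadedNames() {
        return loadedNames;
    }
}

/** Example coprocessor that records its lifecycle events. */
final class RecordingCoprocessor implements LifecycleCoprocessor {
    final List<String> log = new ArrayList<>();

    @Override
    public void start(LifecycleEnvironment env) {
        log.add("start#" + env.seq);
    }

    @Override
    public void stop(LifecycleEnvironment env) {
        log.add("stop#" + env.seq);
    }
}
```

Keeping the list of loaded names separate from the environments matches the intent of the HBASE-4014 comment in the excerpt: the names remain available for crash analysis.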
checkAndGetInstance returns an instance of the class as a MasterCoprocessor. Taking the master as an example:
public MasterCoprocessor checkAndGetInstance(Class<?> implClass)
throws InstantiationException, IllegalAccessException {
try {
// The class implements MasterCoprocessor
if (MasterCoprocessor.class.isAssignableFrom(implClass)) {
return implClass.asSubclass(MasterCoprocessor.class).getDeclaredConstructor().newInstance();
// The class implements CoprocessorService.
// isAssignableFrom is true when CoprocessorService is the same type as implClass,
// or is a superclass or superinterface of it.
} else if (CoprocessorService.class.isAssignableFrom(implClass)) {
// For backward compatibility with old CoprocessorService impl which don't extend
// MasterCoprocessor.
CoprocessorService cs;
// Cast implClass to a Class object that represents a subclass of CoprocessorService
cs = implClass.asSubclass(CoprocessorService.class).getDeclaredConstructor().newInstance();
return new CoprocessorServiceBackwardCompatiblity.MasterCoprocessorService(cs);
} else {
LOG.error("{} is not of type MasterCoprocessor. Check the configuration of {}",
implClass.getName(), CoprocessorHost.MASTER_COPROCESSOR_CONF_KEY);
return null;
}
} catch (NoSuchMethodException | InvocationTargetException e) {
throw (InstantiationException) new InstantiationException(implClass.getName()).initCause(e);
}
}
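The type check and reflective instantiation used above rely only on standard Java reflection, so they can be tried in isolation. In this sketch `Plugin`, `GoodPlugin`, and `NotAPlugin` are hypothetical types standing in for MasterCoprocessor and its candidates. Unlike the excerpt, reflection failures are wrapped in an unchecked exception to keep the example short.

```java
final class InstanceCheckSketch {
    /** Hypothetical interface standing in for MasterCoprocessor. */
    public interface Plugin {
        String id();
    }

    /** A valid candidate: implements Plugin and has a no-argument constructor. */
    public static final class GoodPlugin implements Plugin {
        public GoodPlugin() {}

        @Override
        public String id() {
            return "good";
        }
    }

    /** An invalid candidate: unrelated to Plugin. */
    public static final class NotAPlugin {
        public NotAPlugin() {}
    }

    /**
     * Returns a new instance if implClass is a Plugin type, or null otherwise.
     * isAssignableFrom answers "can a Plugin variable hold an implClass object?".
     */
    static Plugin checkAndGetInstance(Class<?> implClass) {
        if (!Plugin.class.isAssignableFrom(implClass)) {
            return null;
        }
        try {
            // asSubclass narrows Class<?> to Class<? extends Plugin> without an unchecked cast
            return implClass.asSubclass(Plugin.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + implClass.getName(), e);
        }
    }
}
```

Calling `checkAndGetInstance(GoodPlugin.class)` returns a usable instance, while `NotAPlugin.class` or `String.class` returns null, which corresponds to the "is not of type MasterCoprocessor" branch in the excerpt.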
createEnvironment builds the environment. Again taking the master as an example:
public MasterEnvironment createEnvironment(final MasterCoprocessor instance, final int priority,
final int seq, final Configuration conf) {
// Services returned by instance.getServices() are registered with masterServices, so an endpoint implementation must implement getServices()
for (Service service : instance.getServices()) {
masterServices.registerService(service);
}
// CoreCoprocessor types include MasterQuotasObserver, ReplicationObserver, AccessController, and so on
// If a CoreCoprocessor, return a 'richer' environment, one laden with MasterServices.
return instance.getClass().isAnnotationPresent(CoreCoprocessor.class)?
new MasterEnvironmentForCoreCoprocessors(instance, priority, seq, conf, masterServices):
new MasterEnvironment(instance, priority, seq, conf, masterServices);
}
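The annotation check at the end of createEnvironment is ordinary Java reflection and can be reproduced in a small example. The annotation `TrustedSketch` and the classes `BasicEnv`, `RichEnv`, `BuiltInCp`, `UserCp`, and `EnvFactorySketch` are hypothetical names invented for this sketch. They mirror the idea of giving annotated, built-in coprocessors a richer environment.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker standing in for the CoreCoprocessor annotation. */
@Retention(RetentionPolicy.RUNTIME) // must be visible at runtime for isAnnotationPresent
@Target(ElementType.TYPE)
@interface TrustedSketch {}

/** Plain environment handed to ordinary coprocessors. */
class BasicEnv {
    final Object instance;

    BasicEnv(Object instance) {
        this.instance = instance;
    }

    String describe() {
        return "basic";
    }
}

/** Richer environment that also carries a handle to internal services. */
final class RichEnv extends BasicEnv {
    final String internalServices;

    RichEnv(Object instance, String internalServices) {
        super(instance);
        this.internalServices = internalServices;
    }

    @Override
    String describe() {
        return "rich(" + internalServices + ")";
    }
}

/** A built-in coprocessor, marked as trusted. */
@TrustedSketch
final class BuiltInCp {}

/** A user-supplied coprocessor without the marker. */
final class UserCp {}

final class EnvFactorySketch {
    /** Picks the environment type from the annotation on the instance's class. */
    static BasicEnv createEnvironment(Object instance, String internalServices) {
        return instance.getClass().isAnnotationPresent(TrustedSketch.class)
            ? new RichEnv(instance, internalServices)
            : new BasicEnv(instance);
    }
}
```

The design choice worth noting is that the decision is made per class, not per configuration entry: a user coprocessor cannot obtain the richer environment just by being configured differently, only by carrying the marker.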
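System coprocessors are loaded at startup: the configuration is read and the coprocessor classes are loaded dynamically through reflection. Each configured coprocessor ends up wrapped in a MasterEnvironment or RegionEnvironment object and added to the coprocessor host's coprocEnvironments.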
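3.2 Loading region coprocessors
Creating a region object works much like the master: a RegionCoprocessorHost is created, which provides the coprocessor framework and environment, and the region interacts with the loaded coprocessors through it. Unlike the master, a region loads several kinds of coprocessors, through both loadSystemCoprocessors and loadTableCoprocessors: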
// load system default cp's from configuration.
loadSystemCoprocessors(conf, REGION_COPROCESSOR_CONF_KEY);
// load system default cp's for user tables from configuration.
if (!region.getRegionInfo().getTable().isSystemTable()) {
loadSystemCoprocessors(conf, USER_REGION_COPROCESSOR_CONF_KEY);
}
// load Coprocessor From HDFS
loadTableCoprocessors(conf);
The loadTableCoprocessors method deserves a closer look.
void loadTableCoprocessors(final Configuration conf) {
// Check the enable flags
boolean coprocessorsEnabled = conf.getBoolean(COPROCESSORS_ENABLED_CONF_KEY,
DEFAULT_COPROCESSORS_ENABLED);
boolean tableCoprocessorsEnabled = conf.getBoolean(USER_COPROCESSORS_ENABLED_CONF_KEY,
DEFAULT_USER_COPROCESSORS_ENABLED);
if (!(coprocessorsEnabled && tableCoprocessorsEnabled)) {
return;
}
// scan the table attributes for coprocessor load specifications
// initialize the coprocessors
List<RegionCoprocessorEnvironment> configured = new ArrayList<>();
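// Build TableCoprocessorAttribute objects from the table descriptor; each one holds the path, class name, and priority.
// These are defined when the table is created or its descriptor is modified. The path can be a local path or an HDFS path;
// if no path is defined, the class is loaded from the current classpath.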
for (TableCoprocessorAttribute attr: getTableCoprocessorAttrsFromSchema(conf,
region.getTableDescriptor())) {
// Load encompasses classloading and coprocessor initialization
try {
RegionCoprocessorEnvironment env = load(attr.getPath(), attr.getClassName(),
attr.getPriority(), attr.getConf());
if (env == null) {
continue;
}
configured.add(env);
LOG.info("Loaded coprocessor " + attr.getClassName() + " from HTD of " +
region.getTableDescriptor().getTableName().getNameAsString() + " successfully.");
} catch (Throwable t) {
// Coprocessor failed to load, do we abort on error?
if (conf.getBoolean(ABORT_ON_ERROR_KEY, DEFAULT_ABORT_ON_ERROR)) {
abortServer(attr.getClassName(), t);
} else {
LOG.error("Failed to load coprocessor " + attr.getClassName(), t);
}
}
}
// add together to coprocessor set for COW efficiency
coprocEnvironments.addAll(configured);
}
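Each attribute consumed by the loop above carries a path, a class name, and a priority. A table-level coprocessor is commonly declared as a single pipe-separated string of the form `path|class|priority|key=value,...`. Treat that layout, the default priority, and every name in the sketch below (`TableCpSpecSketch`, `parse`, the example jar path and class) as assumptions of this illustration rather than something the excerpt states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TableCpSpecSketch {
    /** Assumed default priority for table-level coprocessors; the value is illustrative. */
    static final int PRIORITY_USER = Integer.MAX_VALUE / 2;

    final String path;              // null means "load from the server classpath"
    final String className;
    final int priority;
    final Map<String, String> args;

    private TableCpSpecSketch(String path, String className, int priority,
                              Map<String, String> args) {
        this.path = path;
        this.className = className;
        this.priority = priority;
        this.args = args;
    }

    /** Parses "path|class|priority|k1=v1,k2=v2"; path, priority, and args may be empty. */
    static TableCpSpecSketch parse(String spec) {
        String[] parts = spec.split("\\|", -1); // -1 keeps trailing empty fields
        if (parts.length < 2 || parts[1].trim().isEmpty()) {
            throw new IllegalArgumentException("Missing class name in: " + spec);
        }
        String path = parts[0].trim().isEmpty() ? null : parts[0].trim();
        String className = parts[1].trim();
        int priority = (parts.length > 2 && !parts[2].trim().isEmpty())
            ? Integer.parseInt(parts[2].trim())
            : PRIORITY_USER;
        Map<String, String> args = new LinkedHashMap<>();
        if (parts.length > 3) {
            for (String kv : parts[3].split(",")) {
                int eq = kv.indexOf('=');
                if (eq > 0) {
                    args.put(kv.substring(0, eq).trim(), kv.substring(eq + 1).trim());
                }
            }
        }
        return new TableCpSpecSketch(path, className, priority, args);
    }
}
```

With this model, `parse("hdfs:///cp/foo.jar|org.example.FooObserver|1001|arg1=1,arg2=2")` gives a jar path, a priority of 1001, and two arguments, while `parse("|org.example.FooObserver|")` gives a null path, which corresponds to the "no path defined, load from the current classpath" case described in the comments above.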
The load method is the key part:
public E load(Path path, String className, int priority,
Configuration conf, String[] includedClassPrefixes) throws IOException {
Class<?> implClass;
LOG.debug("Loading coprocessor class " + className + " with path " +
path + " and priority " + priority);
boolean skipLoadDuplicateCoprocessor = conf.getBoolean(SKIP_LOAD_DUPLICATE_TABLE_COPROCESSOR,
DEFAULT_SKIP_LOAD_DUPLICATE_TABLE_COPROCESSOR);
if (skipLoadDuplicateCoprocessor && findCoprocessor(className) != null) {
// If already loaded will just continue
LOG.warn("Attempted duplicate loading of {}; skipped", className);
return null;
}
ClassLoader cl = null;
if (path == null) {
try {
// path is null: load from the current classpath with this class's own class loader
implClass = getClass().getClassLoader().loadClass(className);
} catch (ClassNotFoundException e) {
throw new IOException("No jar path specified for " + className);
}
} else {
// path is not null: the jar may be on HDFS, so a separate class loader may be used
cl = CoprocessorClassLoader.getClassLoader(
path, getClass().getClassLoader(), pathPrefix, conf);
try {
implClass = ((CoprocessorClassLoader)cl).loadClass(className, includedClassPrefixes);
} catch (ClassNotFoundException e) {
throw new IOException("Cannot load external coprocessor class " + className, e);
}
}
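// The thread context class loader supplements the usual parent-delegation model: it gives
// code loaded by one class loader a way to reach classes through a different loader.
// While the coprocessor is instantiated, the context class loader is switched to the loader that loaded it.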
//load custom code for coprocessor
Thread currentThread = Thread.currentThread();
ClassLoader hostClassLoader = currentThread.getContextClassLoader();
try{
// switch temporarily to the thread classloader for custom CP
currentThread.setContextClassLoader(cl);
// Create the RegionEnvironment
// and register its services if the coprocessor provides any (endpoint)
E cpInstance = checkAndLoadInstance(implClass, priority, conf);
return cpInstance;
} finally {
// restore the fresh (host) classloader
currentThread.setContextClassLoader(hostClassLoader);
}
}