Split-related changes in the hindex source code
1. CompactSplitThread.requestSplit: if the region belongs to an index table, return immediately.
2. SplitTransaction.execute: the original execute method is split into two methods, execute and stepsAfterPONR, with a new call to
this.parent.getCoprocessorHost().preSplitAfterPONR(); added between them.
3. SplitTransaction.createDaughters: the original createDaughters is broken up into several methods.
Added behavior:
First check whether the secondary-index feature is enabled; if it is, call info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow); to trigger the preSplitBeforePONR hook.
Then, if the index feature is disabled, only the main table's parent region is registered as offline in meta, together with its two daughter-region entries. If the index feature is enabled, both the main-table and the index-table parent regions are registered as offline in meta, each with its own daughter-region entries.
--------------------------------------------
IndexRegionObserver analysis
--------------------------------------------
1. preSplit
(1) Instantiate a SplitInfo and store it in the ThreadLocal<SplitInfo> splitThreadLocal variable for later use.
2. preSplitBeforePONR
(1) Get the index region on the current HRegionServer that corresponds to the current region.
(2) st = new SplitTransaction(indexRegion, splitKey); — build a SplitTransaction from the index region and the table's splitKey.
(3) Build and return the index's splitInfo.
3. preSplitAfterPONR
(1) If the current table is an index table, return immediately.
(2) If the main table has an index, fetch the index table's splitInfo.
(3) Obtain the index table's splitTransaction and its daughter regions.
(4) Execute splitTransaction.stepsAfterPONR(rs, rs, daughters);
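The three hooks above can be sketched as a minimal skeleton. This is an illustrative reconstruction of the flow only, not the real hindex class: SplitInfo's fields, the method parameters, and all placeholder types here are assumptions.

```java
// Schematic sketch of the IndexRegionObserver split hooks described above.
// SplitInfo's fields and every parameter type are simplified placeholders,
// not the real hindex/HBase classes.
class SplitInfo {
    Object splitTransaction;  // stands in for the index region's SplitTransaction
    Object daughters;         // stands in for PairOfSameType<HRegion>
}

class IndexRegionObserverSketch {
    // One SplitInfo per splitting thread, carried from preSplit
    // through preSplitBeforePONR to preSplitAfterPONR.
    static final ThreadLocal<SplitInfo> splitThreadLocal = new ThreadLocal<SplitInfo>();

    // 1. preSplit: park a fresh SplitInfo for the later hooks on this thread.
    void preSplit() {
        splitThreadLocal.set(new SplitInfo());
    }

    // 2. preSplitBeforePONR: locate the index region, create its
    // SplitTransaction, and hand the SplitInfo back to createDaughters.
    SplitInfo preSplitBeforePONR(byte[] splitKey, Object indexRegion) {
        SplitInfo info = splitThreadLocal.get();
        // Placeholder for: st = new SplitTransaction(indexRegion, splitKey)
        info.splitTransaction = indexRegion;
        return info;
    }

    // 3. preSplitAfterPONR: no-op on index tables; otherwise replay
    // stepsAfterPONR on the index region's SplitTransaction.
    void preSplitAfterPONR(boolean isIndexTable) {
        if (isIndexTable) {
            return;  // 3(1): an index table has no index of its own
        }
        SplitInfo info = splitThreadLocal.get();
        if (info == null) {
            return;  // no split in flight on this thread
        }
        // 3(2)-(4): splitTransaction.stepsAfterPONR(rs, rs, daughters) would run here.
    }
}
```

The ThreadLocal is what lets state created in preSplit (before the PONR) reach preSplitAfterPONR, since the coprocessor host calls each hook on the same splitting thread.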
Index table rowkey format:
/*
 * Format for the rowkey for index table: [Startkey for the index region] + [one 0 byte] + [Index
 * name] + [Padding for the max index name] + [[index col value] + [padding for the max col value]
 * for each of the index col] + [user table row key]. To know the reason for adding the empty byte
 * array refer to HDP-1666.
 */
[Startkey for the index region]
+ [one 0 byte]
+ [Index name] + [Padding for the max index name]
+ [[index col value] + [padding for the max col value] for each of the index col]
+ [user table row key]
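A minimal sketch of assembling this layout as plain byte concatenation. The fixed widths (INDEX_NAME_MAX, COL_VALUE_MAX), the helper names, and the class itself are illustrative assumptions, not hindex's actual constants or builder API.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative builder for the index-table rowkey layout described above.
// All widths and names here are assumptions made for this sketch.
class IndexRowKeySketch {
    static final int INDEX_NAME_MAX = 18;  // assumed max index-name width
    static final int COL_VALUE_MAX = 16;   // assumed max indexed-column width

    // Right-pad (or truncate) a value to a fixed width with 0x00 bytes,
    // so every component occupies the same number of bytes in the key.
    static byte[] pad(byte[] value, int width) {
        return Arrays.copyOf(value, width);
    }

    static byte[] build(byte[] regionStartKey, byte[] indexName,
                        byte[][] indexColValues, byte[] userRowKey) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // [Startkey for the index region]
        out.write(regionStartKey, 0, regionStartKey.length);
        // [one 0 byte] -- sorts index rows immediately after the region start key
        out.write(0);
        // [Index name] + [Padding for the max index name]
        byte[] name = pad(indexName, INDEX_NAME_MAX);
        out.write(name, 0, name.length);
        // [[index col value] + [padding for the max col value]] per indexed column
        for (byte[] colValue : indexColValues) {
            byte[] padded = pad(colValue, COL_VALUE_MAX);
            out.write(padded, 0, padded.length);
        }
        // [user table row key]
        out.write(userRowKey, 0, userRowKey.length);
        return out.toByteArray();
    }
}
```

Because every component before the user rowkey has a fixed width, index rows for one index sort by their index-column values, and the original user rowkey can be recovered from a fixed offset at the end of the key.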
1. CompactSplitThread.requestSplit: if the region belongs to an index table, return immediately.
public synchronized void requestSplit(final HRegion r, byte[] midKey) {
  if (midKey == null) {
    LOG.debug("Region " + r.getRegionNameAsString() +
      " not splittable because midkey=null");
    return;
  }
  boolean indexUsed = this.conf.getBoolean("hbase.use.secondary.index", false);
  if (indexUsed) {
    if (r.getRegionInfo().getTableNameAsString().endsWith("_idx")) {
      LOG.warn("Split issued on the index region which is not allowed."
          + " Returning without splitting the region.");
      return;
    }
  }
  try {
    this.splits.execute(new SplitRequest(r, midKey, this.server));
    if (LOG.isDebugEnabled()) {
      LOG.debug("Split requested for " + r + ". " + this);
    }
  } catch (RejectedExecutionException ree) {
    LOG.info("Could not execute split for " + r, ree);
  }
}
2. SplitTransaction.execute: the original execute method is split into two methods, execute and stepsAfterPONR, with a new call to
this.parent.getCoprocessorHost().preSplitAfterPONR(); added between them.
public PairOfSameType<HRegion> execute(final Server server,
    final RegionServerServices services)
    throws IOException {
  PairOfSameType<HRegion> regions = createDaughters(server, services);
  if (this.parent.getCoprocessorHost() != null) {
    this.parent.getCoprocessorHost().preSplitAfterPONR();
  }
  stepsAfterPONR(server, services, regions);
  return regions;
}

public void stepsAfterPONR(final Server server, final RegionServerServices services,
    PairOfSameType<HRegion> regions) throws IOException {
  openDaughters(server, services, regions.getFirst(), regions.getSecond());
  transitionZKNode(server, services, regions.getFirst(), regions.getSecond());
}
3. SplitTransaction.createDaughters: the original createDaughters is broken up into several methods.
Added behavior:
First check whether the secondary-index feature is enabled; if it is, call info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow); to trigger the preSplitBeforePONR hook.
Then, if the index feature is disabled, only the main table's parent region is registered as offline in meta, together with its two daughter-region entries. If the index feature is enabled, both the main-table and the index-table parent regions are registered as offline in meta, each with its own daughter-region entries.
// Edit parent in meta. Offlines parent region and adds splita and splitb.
/* package */ PairOfSameType<HRegion> createDaughters(final Server server,
    final RegionServerServices services) throws IOException {
  LOG.info("Starting split of region " + this.parent);
  boolean secondaryIndex = server == null ? false :
      server.getConfiguration().getBoolean("hbase.use.secondary.index", false);
  boolean indexRegionAvailable = false;
  if ((server != null && server.isStopped()) ||
      (services != null && services.isStopping())) {
    throw new IOException("Server is stopped or stopping");
  }
  assert !this.parent.lock.writeLock().isHeldByCurrentThread() :
      "Unsafe to hold write lock while performing RPCs";
  // Coprocessor callback
  if (this.parent.getCoprocessorHost() != null) {
    this.parent.getCoprocessorHost().preSplit();
  }
  boolean testing = server == null ? true :
      server.getConfiguration().getBoolean("hbase.testing.nocluster", false);
  PairOfSameType<HRegion> daughterRegionsPair = stepsBeforeAddingPONR(server, services, testing);
  SplitInfo info = null;
  // Coprocessor callback
  if (secondaryIndex) {
    if (this.parent.getCoprocessorHost() != null) {
      info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow);
      if (info == null) {
        throw new IOException("Pre split of Index region has failed.");
      }
      if (info.getSplitTransaction() != null && info.getDaughters() != null) {
        indexRegionAvailable = true;
      }
    }
  }
  // add one hook
  // do the step till started_region_b_creation
  // This is the point of no return. Adding subsequent edits to .META. as we
  // do below when we do the daughter opens adding each to .META. can fail in
  // various interesting ways the most interesting of which is a timeout
  // BUT the edits all go through (See HBASE-3872). IF we reach the PONR
  // then subsequent failures need to crash out this regionserver; the
  // server shutdown processing should be able to fix-up the incomplete split.
  // The offlined parent will have the daughters as extra columns. If
  // we leave the daughter regions in place and do not remove them when we
  // crash out, then they will have their references to the parent in place
  // still and the server shutdown fixup of .META. will point to these
  // regions.
  // We should add the PONR JournalEntry before offlineParentInMeta, so even if
  // offlineParentInMeta times out, this will cause the regionserver to exit, and
  // then master ServerShutdownHandler will fix the daughters & avoid data loss.
  // (See HBASE-4562.)
  this.journal.add(JournalEntry.PONR);
  // Edit parent in meta. Offlines parent region and adds splita and splitb.
  if (!testing) {
    if (!indexRegionAvailable) {
      MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(),
        daughterRegionsPair.getFirst().getRegionInfo(),
        daughterRegionsPair.getSecond().getRegionInfo());
    } else {
      offlineParentInMetaBothIndexAndMainRegion(server.getCatalogTracker(),
        this.parent.getRegionInfo(), daughterRegionsPair.getFirst().getRegionInfo(),
        daughterRegionsPair.getSecond().getRegionInfo(),
        info.getSplitTransaction().parent.getRegionInfo(),
        info.getDaughters().getFirst().getRegionInfo(),
        info.getDaughters().getSecond().getRegionInfo());
    }
  }
  return daughterRegionsPair;
}
private static void offlineParentInMetaBothIndexAndMainRegion(CatalogTracker catalogTracker,
    HRegionInfo parent, final HRegionInfo a, final HRegionInfo b, final HRegionInfo parentIdx,
    final HRegionInfo idxa, final HRegionInfo idxb) throws NotAllMetaRegionsOnlineException,
    IOException {
  HRegionInfo copyOfParent = new HRegionInfo(parent);
  copyOfParent.setOffline(true);
  copyOfParent.setSplit(true);
  List<Put> list = new ArrayList<Put>();
  Put put = new Put(copyOfParent.getRegionName());
  put.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
    Writables.getBytes(copyOfParent));
  put.add(HConstants.CATALOG_FAMILY, HConstants.SPLITA_QUALIFIER, Writables.getBytes(a));
  put.add(HConstants.CATALOG_FAMILY, HConstants.SPLITB_QUALIFIER, Writables.getBytes(b));
  list.add(put);
  HRegionInfo copyOfIdxParent = new HRegionInfo(parentIdx);
  copyOfIdxParent.setOffline(true);
  copyOfIdxParent.setSplit(true);
  Put putForIdxRegion = new Put(copyOfIdxParent.getRegionName());
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
    Writables.getBytes(copyOfIdxParent));
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.SPLITA_QUALIFIER,
    Writables.getBytes(idxa));
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.SPLITB_QUALIFIER,
    Writables.getBytes(idxb));
  list.add(putForIdxRegion);
  putToMetaTable(catalogTracker, list);
  LOG.info("Offlined parent region " + parent.getRegionNameAsString() + " in META");
}
private static void putToMetaTable(final CatalogTracker ct, final List<Put> p)
    throws IOException {
  org.apache.hadoop.hbase.client.HConnection c = ct.getConnection();
  if (c == null) throw new NullPointerException("No connection");
  put(new HTable(c.getConfiguration(), HConstants.META_TABLE_NAME), p);
}

private static void put(final HTable t, final List<Put> p) throws IOException {
  try {
    t.put(p);
  } finally {
    t.close();
  }
}
public PairOfSameType<HRegion> stepsBeforeAddingPONR(final Server server,
    final RegionServerServices services, boolean testing) throws IOException {
  // Set ephemeral SPLITTING znode up in zk. Mocked servers sometimes don't
  // have zookeeper so don't do zk stuff if server or zookeeper is null
  if (server != null && server.getZooKeeper() != null) {
    try {
      createNodeSplitting(server.getZooKeeper(),
        this.parent.getRegionInfo(), server.getServerName());
    } catch (KeeperException e) {
      throw new IOException("Failed creating SPLITTING znode on " +
        this.parent.getRegionNameAsString(), e);
    }
  }
  this.journal.add(JournalEntry.SET_SPLITTING_IN_ZK);
  if (server != null && server.getZooKeeper() != null) {
    try {
      // Transition node from SPLITTING to SPLITTING after creating the split node.
      // Master will get the callback for node change only if the transition is successful.
      // Note that if the transition fails then the rollback will delete the created znode.
      // TODO: Maybe we can add some new state to the znode and handle the new state
      // in case of success/failure.
      this.znodeVersion = transitionNodeSplitting(server.getZooKeeper(),
        this.parent.getRegionInfo(), server.getServerName(), -1);
    } catch (KeeperException e) {
      throw new IOException("Failed setting SPLITTING znode on "
        + this.parent.getRegionNameAsString(), e);
    }
  }
  createSplitDir(this.parent.getFilesystem(), this.splitdir);
  this.journal.add(JournalEntry.CREATE_SPLIT_DIR);
  List<StoreFile> hstoreFilesToSplit = null;
  Exception exceptionToThrow = null;
  try {
    hstoreFilesToSplit = this.parent.close(false);
  } catch (Exception e) {
    exceptionToThrow = e;
  }
  if (exceptionToThrow == null && hstoreFilesToSplit == null) {
    // The region was closed by a concurrent thread. We can't continue
    // with the split, instead we must just abandon the split. If we
    // reopen or split this could cause problems because the region has
    // probably already been moved to a different server, or is in the
    // process of moving to a different server.
    exceptionToThrow = closedByOtherException;
  }
  if (exceptionToThrow != closedByOtherException) {
    this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
  }
  if (exceptionToThrow != null) {
    if (exceptionToThrow instanceof IOException) throw (IOException) exceptionToThrow;
    throw new IOException(exceptionToThrow);
  }
  if (!testing) {
    services.removeFromOnlineRegions(this.parent.getRegionInfo().getEncodedName());
  }
  this.journal.add(JournalEntry.OFFLINED_PARENT);
  // TODO: If splitStoreFiles were multithreaded would we complete steps in
  // less elapsed time? St.Ack 20100920
  //
  // splitStoreFiles creates daughter region dirs under the parent splits dir.
  // Nothing to unroll here if failure -- clean up of CREATE_SPLIT_DIR will
  // clean this up.
  splitStoreFiles(this.splitdir, hstoreFilesToSplit);
  // Log to the journal that we are creating region A, the first daughter
  // region. We could fail halfway through. If we do, we could have left
  // stuff in fs that needs cleanup -- a storefile or two. That's why we
  // add the entry to the journal BEFORE rather than AFTER the change.
  this.journal.add(JournalEntry.STARTED_REGION_A_CREATION);
  HRegion a = createDaughterRegion(this.hri_a, this.parent.rsServices);
  // Ditto
  this.journal.add(JournalEntry.STARTED_REGION_B_CREATION);
  HRegion b = createDaughterRegion(this.hri_b, this.parent.rsServices);
  return new PairOfSameType<HRegion>(a, b);
}