Split-related changes in the hindex source code
1. CompactSplitThread.requestSplit: if the region belongs to an index table, return immediately.
2. SplitTransaction.execute: the original execute method is split into two methods, execute and stepsAfterPONR, with a new call to
this.parent.getCoprocessorHost().preSplitAfterPONR(); added between them.
3. SplitTransaction.createDaughters: the original createDaughters is broken up into several methods.
Added behavior:
First check whether the secondary-index feature is enabled; if it is, call info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow); to trigger the preSplitBeforePONR hook.
Then, if the index feature is disabled, only the main table's parent region is registered as offline in meta, together with its two daughter-region entries. If the index feature is enabled, both the main-table and the index-table parent regions are registered as offline in meta, each with its own daughter-region entries.
--------------------------------------------
IndexRegionObserver analysis
--------------------------------------------
1. preSplit
(1) Instantiate a SplitInfo and store it in the ThreadLocal<SplitInfo> splitThreadLocal variable for later use.
2. preSplitBeforePONR
(1) Get the index region on the current HRegionServer that corresponds to the current region.
(2) st = new SplitTransaction(indexRegion, splitKey); — build a SplitTransaction from the index region and the table's splitKey.
(3) Build and return the index's splitInfo.
3. preSplitAfterPONR
(1) If the current table is an index table, return immediately.
(2) If the main table has an index, fetch the index table's splitInfo.
(3) Obtain the index table's splitTransaction and its daughter regions.
(4) Execute splitTransaction.stepsAfterPONR(rs, rs, daughters);
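The three hooks above can be sketched as a minimal skeleton. This is an illustrative reconstruction of the flow only, not the real hindex class: SplitInfo's fields, the method parameters, and all placeholder types here are assumptions.

```java
// Schematic sketch of the IndexRegionObserver split hooks described above.
// SplitInfo's fields and every parameter type are simplified placeholders,
// not the real hindex/HBase classes.
class SplitInfo {
    Object splitTransaction;  // stands in for the index region's SplitTransaction
    Object daughters;         // stands in for PairOfSameType<HRegion>
}

class IndexRegionObserverSketch {
    // One SplitInfo per splitting thread, carried from preSplit
    // through preSplitBeforePONR to preSplitAfterPONR.
    static final ThreadLocal<SplitInfo> splitThreadLocal = new ThreadLocal<SplitInfo>();

    // 1. preSplit: park a fresh SplitInfo for the later hooks on this thread.
    void preSplit() {
        splitThreadLocal.set(new SplitInfo());
    }

    // 2. preSplitBeforePONR: locate the index region, create its
    // SplitTransaction, and hand the SplitInfo back to createDaughters.
    SplitInfo preSplitBeforePONR(byte[] splitKey, Object indexRegion) {
        SplitInfo info = splitThreadLocal.get();
        // Placeholder for: st = new SplitTransaction(indexRegion, splitKey)
        info.splitTransaction = indexRegion;
        return info;
    }

    // 3. preSplitAfterPONR: no-op on index tables; otherwise replay
    // stepsAfterPONR on the index region's SplitTransaction.
    void preSplitAfterPONR(boolean isIndexTable) {
        if (isIndexTable) {
            return;  // 3(1): an index table has no index of its own
        }
        SplitInfo info = splitThreadLocal.get();
        if (info == null) {
            return;  // no split in flight on this thread
        }
        // 3(2)-(4): splitTransaction.stepsAfterPONR(rs, rs, daughters) would run here.
    }
}
```

The ThreadLocal is what lets state created in preSplit (before the PONR) reach preSplitAfterPONR, since the coprocessor host calls each hook on the same splitting thread.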
Index table rowkey format:
/*
 * Format for the rowkey for index table: [Startkey for the index region] + [one 0 byte] + [Index
 * name] + [Padding for the max index name] + [[index col value] + [padding for the max col value]
 * for each of the index col] + [user table row key]. To know the reason for adding the empty byte
 * array refer to HDP-1666.
 */
[Startkey for the index region]
+ [one 0 byte]
+ [Index name] + [Padding for the max index name]
+ [[index col value] + [padding for the max col value] for each of the index col]
+ [user table row key]
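A minimal sketch of assembling this layout as plain byte concatenation. The fixed widths (INDEX_NAME_MAX, COL_VALUE_MAX), the helper names, and the class itself are illustrative assumptions, not hindex's actual constants or builder API.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative builder for the index-table rowkey layout described above.
// All widths and names here are assumptions made for this sketch.
class IndexRowKeySketch {
    static final int INDEX_NAME_MAX = 18;  // assumed max index-name width
    static final int COL_VALUE_MAX = 16;   // assumed max indexed-column width

    // Right-pad (or truncate) a value to a fixed width with 0x00 bytes,
    // so every component occupies the same number of bytes in the key.
    static byte[] pad(byte[] value, int width) {
        return Arrays.copyOf(value, width);
    }

    static byte[] build(byte[] regionStartKey, byte[] indexName,
                        byte[][] indexColValues, byte[] userRowKey) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // [Startkey for the index region]
        out.write(regionStartKey, 0, regionStartKey.length);
        // [one 0 byte] -- sorts index rows immediately after the region start key
        out.write(0);
        // [Index name] + [Padding for the max index name]
        byte[] name = pad(indexName, INDEX_NAME_MAX);
        out.write(name, 0, name.length);
        // [[index col value] + [padding for the max col value]] per indexed column
        for (byte[] colValue : indexColValues) {
            byte[] padded = pad(colValue, COL_VALUE_MAX);
            out.write(padded, 0, padded.length);
        }
        // [user table row key]
        out.write(userRowKey, 0, userRowKey.length);
        return out.toByteArray();
    }
}
```

Because every component before the user rowkey has a fixed width, index rows for one index sort by their index-column values, and the original user rowkey can be recovered from a fixed offset at the end of the key.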
1. CompactSplitThread.requestSplit: if the region belongs to an index table, return immediately.
public synchronized void requestSplit(final HRegion r, byte[] midKey) {
  if (midKey == null) {
    LOG.debug("Region " + r.getRegionNameAsString() +
      " not splittable because midkey=null");
    return;
  }
  boolean indexUsed = this.conf.getBoolean("hbase.use.secondary.index", false);
  if (indexUsed) {
    if (r.getRegionInfo().getTableNameAsString().endsWith("_idx")) {
      LOG.warn("Split issued on the index region which is not allowed."
          + " Returning without splitting the region.");
      return;
    }
  }
  try {
    this.splits.execute(new SplitRequest(r, midKey, this.server));
    if (LOG.isDebugEnabled()) {
      LOG.debug("Split requested for " + r + ". " + this);
    }
  } catch (RejectedExecutionException ree) {
    LOG.info("Could not execute split for " + r, ree);
  }
}
2. SplitTransaction.execute: the original execute method is split into two methods, execute and stepsAfterPONR, with a new call to
this.parent.getCoprocessorHost().preSplitAfterPONR(); added between them.
public PairOfSameType<HRegion> execute(final Server server,
    final RegionServerServices services)
    throws IOException {
  PairOfSameType<HRegion> regions = createDaughters(server, services);
  if (this.parent.getCoprocessorHost() != null) {
    this.parent.getCoprocessorHost().preSplitAfterPONR();
  }
  stepsAfterPONR(server, services, regions);
  return regions;
}

public void stepsAfterPONR(final Server server, final RegionServerServices services,
    PairOfSameType<HRegion> regions) throws IOException {
  openDaughters(server, services, regions.getFirst(), regions.getSecond());
  transitionZKNode(server, services, regions.getFirst(), regions.getSecond());
}
3. SplitTransaction.createDaughters: the original createDaughters is broken up into several methods.
Added behavior:
First check whether the secondary-index feature is enabled; if it is, call info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow); to trigger the preSplitBeforePONR hook.
Then, if the index feature is disabled, only the main table's parent region is registered as offline in meta, together with its two daughter-region entries. If the index feature is enabled, both the main-table and the index-table parent regions are registered as offline in meta, each with its own daughter-region entries.
// Edit parent in meta. Offlines parent region and adds splita and splitb.
/* package */ PairOfSameType<HRegion> createDaughters(final Server server,
    final RegionServerServices services) throws IOException {
  LOG.info("Starting split of region " + this.parent);
  boolean secondaryIndex = server == null ? false :
      server.getConfiguration().getBoolean("hbase.use.secondary.index", false);
  boolean indexRegionAvailable = false;
  if ((server != null && server.isStopped()) ||
      (services != null && services.isStopping())) {
    throw new IOException("Server is stopped or stopping");
  }
  assert !this.parent.lock.writeLock().isHeldByCurrentThread() :
      "Unsafe to hold write lock while performing RPCs";
  // Coprocessor callback
  if (this.parent.getCoprocessorHost() != null) {
    this.parent.getCoprocessorHost().preSplit();
  }
  boolean testing = server == null ? true :
      server.getConfiguration().getBoolean("hbase.testing.nocluster", false);
  PairOfSameType<HRegion> daughterRegionsPair = stepsBeforeAddingPONR(server, services, testing);
  SplitInfo info = null;
  // Coprocessor callback
  if (secondaryIndex) {
    if (this.parent.getCoprocessorHost() != null) {
      info = this.parent.getCoprocessorHost().preSplitBeforePONR(this.splitrow);
      if (info == null) {
        throw new IOException("Pre split of Index region has failed.");
      }
      if (info.getSplitTransaction() != null && info.getDaughters() != null) {
        indexRegionAvailable = true;
      }
    }
  }
  // add one hook
  // do the step till started_region_b_creation
  // This is the point of no return. Adding subsequent edits to .META. as we
  // do below when we do the daughter opens adding each to .META. can fail in
  // various interesting ways the most interesting of which is a timeout
  // BUT the edits all go through (See HBASE-3872). IF we reach the PONR
  // then subsequent failures need to crash out this regionserver; the
  // server shutdown processing should be able to fix-up the incomplete split.
  // The offlined parent will have the daughters as extra columns. If
  // we leave the daughter regions in place and do not remove them when we
  // crash out, then they will have their references to the parent in place
  // still and the server shutdown fixup of .META. will point to these
  // regions.
  // We should add the PONR JournalEntry before offlineParentInMeta, so even if
  // offlineParentInMeta times out, this will cause the regionserver to exit, and
  // then master ServerShutdownHandler will fix the daughters & avoid data loss.
  // (See HBASE-4562.)
  this.journal.add(JournalEntry.PONR);
  // Edit parent in meta. Offlines parent region and adds splita and splitb.
  if (!testing) {
    if (!indexRegionAvailable) {
      MetaEditor.offlineParentInMeta(server.getCatalogTracker(), this.parent.getRegionInfo(),
        daughterRegionsPair.getFirst().getRegionInfo(),
        daughterRegionsPair.getSecond().getRegionInfo());
    } else {
      offlineParentInMetaBothIndexAndMainRegion(server.getCatalogTracker(),
        this.parent.getRegionInfo(), daughterRegionsPair.getFirst().getRegionInfo(),
        daughterRegionsPair.getSecond().getRegionInfo(),
        info.getSplitTransaction().parent.getRegionInfo(),
        info.getDaughters().getFirst().getRegionInfo(),
        info.getDaughters().getSecond().getRegionInfo());
    }
  }
  return daughterRegionsPair;
}
private static void offlineParentInMetaBothIndexAndMainRegion(CatalogTracker catalogTracker,
    HRegionInfo parent, final HRegionInfo a, final HRegionInfo b, final HRegionInfo parentIdx,
    final HRegionInfo idxa, final HRegionInfo idxb) throws NotAllMetaRegionsOnlineException,
    IOException {
  HRegionInfo copyOfParent = new HRegionInfo(parent);
  copyOfParent.setOffline(true);
  copyOfParent.setSplit(true);
  List<Put> list = new ArrayList<Put>();
  Put put = new Put(copyOfParent.getRegionName());
  put.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
    Writables.getBytes(copyOfParent));
  put.add(HConstants.CATALOG_FAMILY, HConstants.SPLITA_QUALIFIER, Writables.getBytes(a));
  put.add(HConstants.CATALOG_FAMILY, HConstants.SPLITB_QUALIFIER, Writables.getBytes(b));
  list.add(put);
  HRegionInfo copyOfIdxParent = new HRegionInfo(parentIdx);
  copyOfIdxParent.setOffline(true);
  copyOfIdxParent.setSplit(true);
  Put putForIdxRegion = new Put(copyOfIdxParent.getRegionName());
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
    Writables.getBytes(copyOfIdxParent));
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.SPLITA_QUALIFIER,
    Writables.getBytes(idxa));
  putForIdxRegion.add(HConstants.CATALOG_FAMILY, HConstants.SPLITB_QUALIFIER,
    Writables.getBytes(idxb));
  list.add(putForIdxRegion);
  putToMetaTable(catalogTracker, list);
  LOG.info("Offlined parent region " + parent.getRegionNameAsString() + " in META");
}
private static void putToMetaTable(final CatalogTracker ct, final List<Put> p)
    throws IOException {
  org.apache.hadoop.hbase.client.HConnection c = ct.getConnection();
  if (c == null) throw new NullPointerException("No connection");
  put(new HTable(c.getConfiguration(), HConstants.META_TABLE_NAME), p);
}

private static void put(final HTable t, final List<Put> p) throws IOException {
  try {
    t.put(p);
  } finally {
    t.close();
  }
}
public PairOfSameType<HRegion> stepsBeforeAddingPONR(final Server server,
    final RegionServerServices services, boolean testing) throws IOException {
  // Set ephemeral SPLITTING znode up in zk. Mocked servers sometimes don't
  // have zookeeper so don't do zk stuff if server or zookeeper is null
  if (server != null && server.getZooKeeper() != null) {
    try {
      createNodeSplitting(server.getZooKeeper(),
        this.parent.getRegionInfo(), server.getServerName());
    } catch (KeeperException e) {
      throw new IOException("Failed creating SPLITTING znode on " +
        this.parent.getRegionNameAsString(), e);
    }
  }
  this.journal.add(JournalEntry.SET_SPLITTING_IN_ZK);
  if (server != null && server.getZooKeeper() != null) {
    try {
      // Transition node from SPLITTING to SPLITTING after creating the split node.
      // Master will get the callback for node change only if the transition is successful.
      // Note that if the transition fails then the rollback will delete the created znode.
      // TODO: Maybe we can add some new state to the znode and handle the new state
      // in case of success/failure.
      this.znodeVersion = transitionNodeSplitting(server.getZooKeeper(),
        this.parent.getRegionInfo(), server.getServerName(), -1);
    } catch (KeeperException e) {
      throw new IOException("Failed setting SPLITTING znode on "
        + this.parent.getRegionNameAsString(), e);
    }
  }
  createSplitDir(this.parent.getFilesystem(), this.splitdir);
  this.journal.add(JournalEntry.CREATE_SPLIT_DIR);
  List<StoreFile> hstoreFilesToSplit = null;
  Exception exceptionToThrow = null;
  try {
    hstoreFilesToSplit = this.parent.close(false);
  } catch (Exception e) {
    exceptionToThrow = e;
  }
  if (exceptionToThrow == null && hstoreFilesToSplit == null) {
    // The region was closed by a concurrent thread. We can't continue
    // with the split, instead we must just abandon the split. If we
    // reopen or split this could cause problems because the region has
    // probably already been moved to a different server, or is in the
    // process of moving to a different server.
    exceptionToThrow = closedByOtherException;
  }
  if (exceptionToThrow != closedByOtherException) {
    this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
  }
  if (exceptionToThrow != null) {
    if (exceptionToThrow instanceof IOException) throw (IOException) exceptionToThrow;
    throw new IOException(exceptionToThrow);
  }
  if (!testing) {
    services.removeFromOnlineRegions(this.parent.getRegionInfo().getEncodedName());
  }
  this.journal.add(JournalEntry.OFFLINED_PARENT);
  // TODO: If splitStoreFiles were multithreaded would we complete steps in
  // less elapsed time? St.Ack 20100920
  //
  // splitStoreFiles creates daughter region dirs under the parent splits dir.
  // Nothing to unroll here if failure -- clean up of CREATE_SPLIT_DIR will
  // clean this up.
  splitStoreFiles(this.splitdir, hstoreFilesToSplit);
  // Log to the journal that we are creating region A, the first daughter
  // region. We could fail halfway through. If we do, we could have left
  // stuff in fs that needs cleanup -- a storefile or two. That's why we
  // add the entry to the journal BEFORE rather than AFTER the change.
  this.journal.add(JournalEntry.STARTED_REGION_A_CREATION);
  HRegion a = createDaughterRegion(this.hri_a, this.parent.rsServices);
  // Ditto
  this.journal.add(JournalEntry.STARTED_REGION_B_CREATION);
  HRegion b = createDaughterRegion(this.hri_b, this.parent.rsServices);
  return new PairOfSameType<HRegion>(a, b);
}