HBase source code analysis (5): how RegionLocator finds a region, in detail

陈奉刚11

Link to this article: https://blog.csdn.net/chenfenggang/article/details/75041244

According to plan, today's topic is the HBase GET path.

Previous post: HBase source code analysis (4): region assignment in createTable
http://blog.csdn.net/chenfenggang/article/details/75000230

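My earlier impression was that a get worked like this:
the client asks ZooKeeper for the address of the HMaster and then talks to the HMaster; the HMaster finds the RegionServer, which yields the region; the region then looks in the MemStore or a StoreFile, and finally the request reaches HDFS.

But while reading the code today, it seemed to jump straight from the client to the RegionServer. The steps in between never matched my picture, and only after careful analysis did I find that the flow is actually this:

The client asks ZooKeeper for the RegionServer that hosts the META table and then accesses that machine directly. In the META table on that RegionServer it looks up the row for the target table and finds the host of the region. It then sends a second client request to that host, and that is where the data is found.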
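
The flow just described can be sketched as a toy model. This is only an illustration under loud assumptions: none of the class, method, or server names below are HBase APIs, ZooKeeper and the META table are replaced by in-memory maps, and each table is assumed to have a single region.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup chain; every name here is hypothetical, not an HBase API.
class TwoHopLookupSketch {
    // Stand-in for ZooKeeper: it only knows which server hosts META.
    static final String META_SERVER = "rs-meta:16020";

    // Stand-in for the META table: table name -> server hosting its (single) region.
    static final Map<String, String> META_ROWS = new HashMap<>();

    // Stand-in for user data: server -> (row -> value).
    static final Map<String, Map<String, String>> DATA = new HashMap<>();

    static {
        META_ROWS.put("orders", "rs-7:16020");
        Map<String, String> rows = new HashMap<>();
        rows.put("row-1", "value-1");
        DATA.put("rs-7:16020", rows);
    }

    static String askZooKeeperForMetaServer() {
        return META_SERVER;
    }

    static String askMetaForRegionServer(String metaServer, String table) {
        if (!META_SERVER.equals(metaServer)) {
            throw new IllegalStateException("not the META server: " + metaServer);
        }
        String server = META_ROWS.get(table);
        if (server == null) {
            throw new IllegalArgumentException("unknown table: " + table);
        }
        return server;
    }

    static String get(String table, String row) {
        String metaServer = askZooKeeperForMetaServer();                 // ask ZooKeeper
        String regionServer = askMetaForRegionServer(metaServer, table); // first request: META RS
        return DATA.get(regionServer).get(row);                          // second request: data RS
    }

    public static void main(String[] args) {
        System.out.println(get("orders", "row-1"));
    }
}
```

Note that the HMaster does not appear anywhere in this chain, which is the point of the corrected picture.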

The first part of that flow is really the RegionLocator process. I will need it later anyway, so I am writing it up first.

    RegionLocator regionLocator = connection.getRegionLocator(TableName.META_TABLE_NAME);
    // Bytes.toBytes encodes as UTF-8 instead of relying on the platform default charset
    regionLocator.getRegionLocation(Bytes.toBytes("rowKey"));

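RegionLocator is very simple, so simple that it holds just two things: a connection and a table. Through the connection it can reach the whole world; through the table it can pin you down in that world. Its most important method is getRegionLocation: given a rowkey, it fetches all the information about the region of the table that currently holds that rowkey.

The locations of several region replicas can be fetched here, but by default fetching one (the default replica) is enough. The steps that follow also check in ZooKeeper whether the table is enabled.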

    @Override
    public HRegionLocation relocateRegion(final TableName tableName,
        final byte [] row) throws IOException {
      RegionLocations locations = relocateRegion(tableName, row,
          RegionReplicaUtil.DEFAULT_REPLICA_ID);
      return locations == null ? null :
          locations.getRegionLocation(RegionReplicaUtil.DEFAULT_REPLICA_ID);
    }

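Then comes a branch: the META table goes through locateMeta, and every other table goes through locateRegionInMeta. Right now we are locating a non-META table. After that, the cache is checked once for the location; if nothing is there, the lookup continues.

    if (tableName.equals(TableName.META_TABLE_NAME)) {
      return locateMeta(tableName, useCache, replicaId);
    } else {
      // Region not in the cache - have to go to the meta RS
      return locateRegionInMeta(tableName, row, useCache, retry, replicaId);
    }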
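
As a rough illustration of that branch and the cache check, here is a minimal sketch. It is not the HBase implementation: the class, the counters, and the server names are hypothetical, and the cache is keyed by table and row for brevity, whereas a real client would cache per region.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "branch on META, otherwise cache first, then look up in META".
class LocateBranchSketch {
    static final String META_TABLE = "hbase:meta";

    private final Map<String, String> cache = new HashMap<>();
    int metaLookups = 0;      // simulated trips to the META RegionServer
    int registryLookups = 0;  // simulated trips to the registry (ZooKeeper)

    String locate(String table, String row, boolean useCache) {
        if (META_TABLE.equals(table)) {
            // META itself cannot be looked up in META; ask the registry instead.
            registryLookups++;
            return "rs-meta";
        }
        String cacheKey = table + "," + row;
        if (useCache) {
            String cached = cache.get(cacheKey);
            if (cached != null) {
                return cached;
            }
        }
        // Cache miss, or the caller asked to bypass the cache: go to the META RS.
        metaLookups++;
        String server = "rs-for-" + table;
        cache.put(cacheKey, server);
        return server;
    }

    public static void main(String[] args) {
        LocateBranchSketch client = new LocateBranchSketch();
        client.locate("orders", "row-1", true);
        client.locate("orders", "row-1", true);
        System.out.println("META lookups: " + client.metaLookups);
    }
}
```

Passing useCache as false models a relocate: the cached entry is ignored and then overwritten with the fresh result.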

At this point the following method is called:

    /*
     * Search the hbase:meta table for the HRegionLocation
     * info that contains the table and row we're seeking.
     */
    private RegionLocations locateRegionInMeta(TableName tableName, byte[] row,
        boolean useCache, boolean retry, int replicaId) throws IOException {
      // ...
      rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
          rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);

      regionInfoRow = rcs.next();
      // ...
      // convert the row result into the HRegionLocation we need!
      RegionLocations locations = MetaTableAccessor.getRegionLocations(regionInfoRow);
      // ...
      ServerName serverName = locations.getRegionLocation(replicaId).getServerName();
      if (serverName == null) {
        throw new NoServerForRegionException("No server address listed " +
            "in " + TableName.META_TABLE_NAME + " for region " +
            regionInfo.getRegionNameAsString() + " containing row " +
            Bytes.toStringBinary(row));
      }
      // ...
    }

This is the main place where the serverName is obtained, and rcs.next() is the most important call.
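
Why a reversed scan that reads a single row is enough can be shown with a sorted map. The sketch below assumes that META rows are ordered by region start key, so that the first row at or before the requested key describes the region containing it. The table layout and server names are made up, and real META row keys also carry the table name and a region id, which are left out here.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-table view of META: region start key -> hosting server.
class MetaFloorLookupSketch {
    static final TreeMap<String, String> REGIONS = new TreeMap<>();

    static {
        REGIONS.put("", "rs-1");   // first region has an empty start key
        REGIONS.put("g", "rs-2");  // region covering [g, p)
        REGIONS.put("p", "rs-3");  // region covering [p, end)
    }

    // Equivalent of "scan backwards from row and take the first entry":
    // the greatest start key that is less than or equal to the row.
    static String serverFor(String row) {
        Map.Entry<String, String> entry = REGIONS.floorEntry(row);
        return entry.getValue();
    }

    public static void main(String[] args) {
        System.out.println(serverFor("monkey"));
    }
}
```

Because the first region starts at the empty key, floorEntry always finds an entry in this model.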

Inside it there is this method:

  Result[] call(ScannerCallableWithReplicas callable,
      RpcRetryingCaller<Result[]> caller, int scannerTimeout)
      throws IOException, RuntimeException {
    if (Thread.interrupted()) {
      throw new InterruptedIOException();
    }
    // callWithoutRetries is at this layer. Within the ScannerCallableWithReplicas,
    // we do a callWithRetries
    return caller.callWithoutRetries(callable, scannerTimeout);
  }

Next comes callable.call(scannerTimeout), so the key is to find out what callable is. Go back to where the object was constructed:

    rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
        rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);

and there we find:

 protected boolean nextScanner(int nbRows, final boolean done) throws IOException {
   .......
    try {
      callable = getScannerCallable(localStartKey, nbRows);
      // Open a scanner on the region server starting at the
      // beginning of the region
      call(callable, caller, scannerTimeout);
      this.currentRegion = callable.getHRegionInfo();
      .....
     }
  }

Inside this there is yet another callable. So what is this callable?

 @InterfaceAudience.Private
    protected ScannerCallableWithReplicas getScannerCallable(byte [] localStartKey,
        int nbRows) {
      scan.setStartRow(localStartKey);
      ScannerCallable s =
          new ScannerCallable(getConnection(), getTable(), scan, this.scanMetrics,
              this.rpcControllerFactory);
      s.setCaching(nbRows);
      ScannerCallableWithReplicas sr = new ScannerCallableWithReplicas(tableName, getConnection(),
       s, pool, primaryOperationTimeout, scan,
       retries, scannerTimeout, caching, conf, caller);
      return sr;
    }

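This is a bit convoluted, and it is not over yet, because we still do not know the ServerName this callable will use: every call needs a stub, and a stub needs a target host.
Fortunately, when the caller invokes it, there is this: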

    callable.prepare(false);
    return callable.call(callTimeout);

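This prepare is important. The previous prepare I looked at was empty, so several times I assumed this one was empty as well. In the end I did find it.

This prepare is interesting enough that I copy the code in full: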

 @Override
  public void prepare(boolean reload) throws IOException {
    if (Thread.interrupted()) {
      throw new InterruptedIOException();
    }
    RegionLocations rl = RpcRetryingCallerWithReadReplicas.getRegionLocations(!reload,
        id, getConnection(), getTableName(), getRow());
    location = id < rl.size() ? rl.getRegionLocation(id) : null;
    if (location == null || location.getServerName() == null) {
      // With this exception, there will be a retry. The location can be null for a replica
      //  when the table is created or after a split.
      throw new HBaseIOException("There is no location for replica id #" + id);
    }
    ServerName dest = location.getServerName();
    setStub(super.getConnection().getClient(dest));
    if (!instantiated || reload) {
      checkIfRegionServerIsRemote();
      instantiated = true;
    }
    // check how often we retry.
    // HConnectionManager will call instantiateServer with reload==true
    // if and only if for retries.
    if (reload && this.scanMetrics != null) {
      this.scanMetrics.countOfRPCRetries.incrementAndGet();
      if (isRegionServerRemote) {
        this.scanMetrics.countOfRemoteRPCRetries.incrementAndGet();
      }
    }
  }
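
The role of the reload flag can be illustrated with a small retry loop. This is a sketch under assumptions, not the HBase retrying caller: the class and its fields are hypothetical, and it assumes the caller passes reload as false on the first attempt and true on every retry, which is what the comment in the code above suggests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: prepare() picks the target server, call() talks to it.
class PrepareRetrySketch {
    final Map<String, String> cache = new HashMap<>(); // row -> server, possibly stale
    String actualServer = "rs-new";                     // where the region lives now
    String stub;                                        // target of the next call
    int relocations = 0;

    void prepare(String row, boolean reload) {
        // reload == true means "do not trust the cache", like passing !reload as useCache.
        String location = reload ? null : cache.get(row);
        if (location == null) {
            relocations++;
            location = actualServer;
            cache.put(row, location);
        }
        stub = location;
    }

    String call(String row) {
        if (!stub.equals(actualServer)) {
            throw new IllegalStateException("region for " + row + " is not on " + stub);
        }
        return "result-from-" + stub;
    }

    String callWithRetries(String row, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IllegalStateException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            prepare(row, attempt != 0); // first try uses the cache, retries relocate
            try {
                return call(row);
            } catch (IllegalStateException e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        PrepareRetrySketch caller = new PrepareRetrySketch();
        caller.cache.put("row-1", "rs-old"); // stale entry left over from before a move
        System.out.println(caller.callWithRetries("row-1", 3));
    }
}
```

In this model a stale cache entry costs exactly one failed call: the retry relocates, refreshes the cache, and the second attempt reaches the right server.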

Inside it, the getRegionLocations method is called.

    RegionLocations rl = RpcRetryingCallerWithReadReplicas.getRegionLocations(!reload,
        id, getConnection(), getTableName(), getRow());

    static RegionLocations getRegionLocations(boolean useCache, int replicaId,
        ClusterConnection cConnection, TableName tableName, byte[] row)
        throws RetriesExhaustedException, DoNotRetryIOException, InterruptedIOException {
      RegionLocations rl;
      try {
        if (!useCache) {
          rl = cConnection.relocateRegion(tableName, row, replicaId);
        } else {
          rl = cConnection.locateRegion(tableName, row, useCache, true, replicaId);
        }
      }
      // ...
    }

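cConnection.relocateRegion(tableName, row, replicaId): does this not take us right back to where we started?

The ServerName is then read from the result and becomes the destination:
setStub(super.getConnection().getClient(dest));

At this point everything seems to have gone back to the beginning.
It was a little maddening, but I did notice something: tableName has become TableName.META_TABLE_NAME, because the table being scanned is now the META table from earlier. That changes everything. Remember the branch from before?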

if (tableName.equals(TableName.META_TABLE_NAME)) {
        return locateMeta(tableName, useCache, replicaId);
      }

Inside it there is this line:
locations = this.registry.getMetaRegionLocation();
and this registry is the ZooKeeper-backed ZooKeeperRegistry:

 static Registry getRegistry(final Connection connection)
  throws IOException {
    String registryClass = connection.getConfiguration().get(REGISTRY_IMPL_CONF_KEY,
      ZooKeeperRegistry.class.getName());
    Registry registry = null;
    try {
      registry = (Registry)Class.forName(registryClass).newInstance();
    } catch (Throwable t) {
      throw new IOException(t);
    }
    registry.init(connection);
    return registry;
  }
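
The pattern in getRegistry, reading a class name from configuration and instantiating it reflectively, can be sketched in isolation. The interface and class names below are hypothetical stand-ins, not the HBase Registry types. The sketch uses getDeclaredConstructor().newInstance() because Class.newInstance() has been deprecated since Java 9.

```java
// Hypothetical stand-ins for a pluggable registry; not the HBase classes.
class RegistryLoadingSketch {
    interface Registry {
        String metaLocation();
    }

    // Default implementation, playing the role of a ZooKeeper-backed registry.
    public static class FixedRegistry implements Registry {
        @Override
        public String metaLocation() {
            return "rs-meta";
        }
    }

    // Instantiate the configured class by name; fall back to the default when unset.
    static Registry load(String configuredClassName) {
        String className = configuredClassName != null
            ? configuredClassName
            : FixedRegistry.class.getName();
        try {
            return (Registry) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load(null).metaLocation());
    }
}
```

The benefit of this design is that the lookup of META's location is swappable through configuration alone, without touching the connection code.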

In any case, inside it there is:

   List<ServerName> servers = new MetaTableLocator().blockUntilAvailable(zkw, hci.rpcTimeout,
          hci.getConfiguration());

This is how the ServerName of META is obtained from ZooKeeper. Now everything connects.

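Summary
The client first goes to ZooKeeper to get the ServerName of META and sends the first callable.call(). From the META RS it then gets the ServerName of the region of the target table, and sends the second callable.call() to obtain the region information.

This post only analyzes the client-side code; the next one will analyze the RS-side code.

To be continued...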
