HBase Source Code Analysis: Locating Regions

lipeng_bigdata

Category: HBase 1.0.2 Source Code Analysis

Article link: https://blog.csdn.net/lipeng_bigdata/article/details/50904229

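This article takes a close look at how HBase uses the RowKey to locate a Region and its RegionServer. It walks through the Region-location process, covering ClusterConnection.relocateRegion() for the no-cache case as well as the flow when the cache is used. It explains the locateRegionInMeta() method in detail, including how the Meta table is handled and how the MetaRegionLocation is obtained from ZooKeeper, and lays out the key steps and strategies of Region location.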

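As we know, HBase is a distributed database that retrieves data by RowKey. It splits a table's data by row into Regions, and each Region has a start row (StartKey) and an end row (EndKey). A Region is eventually brought online on one RegionServer, which serves reads and writes for it. So how does HBase retrieve data; that is, how does it map the Row being read or written to the exact Region and RegionServer that hold it? In this article we study how an HRegion is located.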
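
To make the StartKey/EndKey idea concrete, here is a minimal sketch of looking up the Region for a row in a sorted map. It is not HBase code: the class RegionLookupSketch, the server names, and the use of String keys (HBase compares raw byte arrays) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not an HBase class: locate the Region for a row key.
class RegionLookupSketch {
    // Region StartKey -> name of the server hosting that Region (made-up names).
    private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

    void addRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // A row belongs to the Region with the greatest StartKey <= row,
    // because each Region covers [StartKey, EndKey) and the next Region
    // starts exactly at this Region's EndKey.
    String locate(String row) {
        Map.Entry<String, String> entry = startKeyToServer.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "rs1");   // first Region: empty StartKey
        table.addRegion("g", "rs2");
        table.addRegion("p", "rs3");
        System.out.println(table.locate("apple"));
        System.out.println(table.locate("orange"));
        System.out.println(table.locate("zebra"));
    }
}
```

Keying the map by StartKey alone is enough here because adjacent Regions are contiguous, so a single floor lookup finds the owning Region.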

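We previously studied Scan, HBase's application for reading data. During a Scan, every RPC exchange with the server is a read request against a specific Region and the RegionServer hosting it, and the returned data is cached on the client. The next() method of the Scanner that iterates over the data checks whether the cache holds any data; if not, it loads the cache, and then it pulls data straight from the cache. The code is as follows: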

    @Override
    public Result next() throws IOException {
      // If the scanner is closed and there's nothing left in the cache, next is a no-op.
      if (cache.size() == 0 && this.closed) {
        return null;
      }
      
      // If the cache holds no data, call loadCache() to fill it
      if (cache.size() == 0) {
        loadCache();
      }

      // If the cache holds data, pull from it directly and return the result to the caller
      if (cache.size() > 0) {
        return cache.poll();
      }

      // if we exhausted this scanner before calling close, write out the scan metrics
      writeScanMetrics();
      return null;
    }
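
The cache-then-fetch pattern in next() can be sketched in isolation as follows. This is a simplified stand-in rather than the HBase scanner: CachedScannerSketch, its String results, and the list of pre-built batches (standing in for RPC responses) are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of a client-side scanner that serves rows from a local cache.
class CachedScannerSketch {
    private final Queue<String> cache = new ArrayDeque<>();
    private final Iterator<List<String>> batches; // each batch stands in for one RPC response
    private boolean closed = false;
    int fetchCount = 0;                           // how many "RPCs" were issued

    CachedScannerSketch(List<List<String>> serverBatches) {
        this.batches = serverBatches.iterator();
    }

    // Refill the cache from the next non-empty batch; mark closed when nothing is left.
    private void loadCache() {
        while (cache.isEmpty() && batches.hasNext()) {
            fetchCount++;
            cache.addAll(batches.next());
        }
        if (cache.isEmpty()) {
            closed = true;
        }
    }

    // Mirrors the shape of next(): no-op when closed and empty, otherwise refill and poll.
    String next() {
        if (cache.isEmpty() && closed) {
            return null;
        }
        if (cache.isEmpty()) {
            loadCache();
        }
        return cache.poll(); // poll() yields null once the scanner is exhausted
    }
}
```

With batches [a, b] and [c], three calls to next() return the three rows while only two fetches are issued, which is the point of caching on the client.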

The loadCache() method in turn invokes call(), which sends an RPC request to the Region on the corresponding RegionServer. So how does it locate the Region? Let's look at call() first:

  Result[] call(Scan scan, ScannerCallableWithReplicas callable,
      RpcRetryingCaller<Result[]> caller, int scannerTimeout)
      throws IOException, RuntimeException {
    if (Thread.interrupted()) {
      throw new InterruptedIOException();
    }
    // callWithoutRetries is at this layer. Within the ScannerCallableWithReplicas,
    // we do a callWithRetries
    // caller is of type RpcRetryingCaller
    // callable is of type ScannerCallableWithReplicas
    return caller.callWithoutRetries(callable, scannerTimeout);
  }

Here caller is of type RpcRetryingCaller and callable is of type ScannerCallableWithReplicas. Let's look at RpcRetryingCaller's callWithoutRetries() method; the key code is:

      // Call prepare() first, then call(), with callTimeout as the timeout
      callable.prepare(false);
      return callable.call(callTimeout);

As you can see, what is ultimately invoked is the callable's call() method, that is, ScannerCallableWithReplicas.call(). Let's follow the key code:

  @Override
  public Result [] call(int timeout) throws IOException {

    // ... some code omitted ...

    // Look up the Region location by the scan's startRow, using the cache
    RegionLocations rl = RpcRetryingCallerWithReadReplicas.getRegionLocations(true,
        RegionReplicaUtil.DEFAULT_REPLICA_ID, cConnection, tableName,
        currentScannerCallable.getRow());

    // ... some code omitted ...
  }

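At last we reach the main topic! Region location is done by calling RpcRetryingCallerWithReadReplicas.getRegionLocations(). It takes the use-cache flag useCache, the region replica ID replicaId, the ClusterConnection cluster connection cConnection, the table name tableName, and the row, among other key parameters, and returns a RegionLocations object representing the Region's location. RegionLocations holds an array named locations, defined as follows: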

  // locations array contains the HRL objects for known region replicas indexes by the replicaId.
  // elements can be null if the region replica is not known at all. A null value indicates
  // that there is a region replica with the index as replicaId, but the location is not known
  // in the cache.
  private final HRegionLocation[] locations; // replicaId -> HRegionLocation.

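It is an array of HRegionLocation that effectively stores a mapping from replicaId to HRegionLocation, with replicaId as the array index. The call to getRegionLocations() above passes RegionReplicaUtil.DEFAULT_REPLICA_ID, i.e. 0, as the replicaId. So what is an HRegionLocation? Its two key member variables tell us: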

  private final HRegionInfo regionInfo;
  private final ServerName serverName;

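HRegionLocation is the location of a Region and carries two key pieces of information: 1. the HRegionInfo describing the Region that contains the row of the read/write request; 2. the ServerName of the server hosting that Region. With these two, we know where the Region for a given row lives.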
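
The replicaId-as-index layout described above can be illustrated with a small stand-alone sketch. The classes below are hypothetical simplifications, not the HBase types: Location reduces HRegionInfo and ServerName to plain strings, and the accessor names are chosen for this example only.

```java
// Hypothetical sketch of a replicaId -> location array, modeled loosely on the text above.
class ReplicaLocationsSketch {
    static final int DEFAULT_REPLICA_ID = 0; // the primary replica, as in the article

    // Simplified stand-in for HRegionLocation: which Region, and which server hosts it.
    static final class Location {
        final String regionName;
        final String serverName;

        Location(String regionName, String serverName) {
            this.regionName = regionName;
            this.serverName = serverName;
        }
    }

    // Index = replicaId; an element may be null when that replica's location is unknown.
    private final Location[] locations;

    ReplicaLocationsSketch(Location... locations) {
        this.locations = locations.clone();
    }

    // Returns null for an unknown or out-of-range replicaId instead of throwing.
    Location getLocation(int replicaId) {
        if (replicaId < 0 || replicaId >= locations.length) {
            return null;
        }
        return locations[replicaId];
    }

    Location getDefaultLocation() {
        return getLocation(DEFAULT_REPLICA_ID);
    }
}
```

An array indexed by replicaId keeps the lookup constant-time and lets a null slot express "this replica exists but its location is not cached".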

Back to the point: we start from RpcRetryingCallerWithReadReplicas.getRegionLocations(). The code is as follows:

  static RegionLocations getRegionLocations(boolean useCache, int replicaId,
                 ClusterConnection cConnection, TableName tableName, byte[] row)
      throws RetriesExhaustedException, DoNotRetryIOException, InterruptedIOException {

    RegionLocations rl;
    try {
    	
      // Locate the Region by table name, row and replicaId, yielding RegionLocations (rl)
      if (!useCache) {
        // No cache: locate the Region via ClusterConnection.relocateRegion()
        rl = cConnection.relocateRegion(tableName, row, replicaId);
      } else {
        // Use cache: locate the Region via ClusterConnection.locateRegion()
        rl = cConnection.locateRegion(tableName, row, useCache, true, replicaId);
      }
    } catch (DoNotRetryIOException e) {
      throw e;
    } catch (RetriesExhaustedException e) {
      throw e;
    } catch (InterruptedIOException e) {
      throw e;
    } catch (IOException e) {
      throw new RetriesExhaustedException("Can't get the location", e);
    }
    if (rl == null) {
      throw new RetriesExhaustedException("Can't get the locations");
    }

    return rl;
  }

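The logic is simple: there are just two cases, with the cache and without it. We can also guess that even when the cache is used, a cache miss still goes through the no-cache flow, loads the resulting Region location into the cache, and then returns it to the caller, so what we really need to study is only how a Region is located without the cache. Is that actually the case? Let's keep it in mind and verify it later.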
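
The guess stated above (a cache miss falls through to the uncached lookup and then populates the cache) can be sketched as follows. This only illustrates the hypothesis, not HBase's implementation: CachingLocatorSketch, the metaLookup function, and caching per row (rather than per Region key range) are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: cached lookup that falls back to an authoritative lookup on a miss.
class CachingLocatorSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> metaLookup; // stands in for the uncached lookup
    int metaLookups = 0;                                // counts uncached lookups performed

    CachingLocatorSketch(Function<String, String> metaLookup) {
        this.metaLookup = metaLookup;
    }

    String locate(String row, boolean useCache) {
        if (useCache) {
            String cached = cache.get(row);
            if (cached != null) {
                return cached;          // cache hit: no remote lookup needed
            }
        }
        // Cache bypassed or missed: do the authoritative lookup and refresh the cache.
        metaLookups++;
        String location = metaLookup.apply(row);
        if (location != null) {
            cache.put(row, location);
        }
        return location;
    }

    // "Relocate" is simply a lookup that ignores whatever is cached.
    String relocate(String row) {
        return locate(row, false);
    }
}
```

Under this assumed design, both paths share one code route and differ only in whether the cache is consulted first.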

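First, let's see how a Region is located when the cache is not used. This path calls ClusterConnection.relocateRegion(). ClusterConnection is an interface; its instance is created in HTable and passed down layer by layer. Let's look at the instantiation first, in HTable's constructor: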

    this.connection = ConnectionManager.getConnectionInternal(conf);

It is obtained from the configuration conf through ConnectionManager's static method getConnectionInternal(). Here is its code:

  static ClusterConnection getConnectionInternal(final Configuration conf)
    throws IOException {
	  
    // Build an HConnectionKey from the configuration conf
    HConnectionKey connectionKey = new HConnectionKey(conf);

    synchronized (CONNECTION_INSTANCES) {

      // First look up the HConnectionImplementation for this HConnectionKey in CONNECTION_INSTANCES,
      // which maps HConnectionKey to HConnectionImplementation
      HConnectionImplementation connection = CONNECTION_INSTANCES.get(connectionKey);

      if (connection == null) { // not present in CONNECTION_INSTANCES

        // Create an HConnectionImplementation via createConnection()
        connection = (HConnectionImplementation)createConnection(conf, true);

        // Record the new HConnectionKey -> HConnectionImplementation mapping in CONNECTION_INSTANCES
        CONNECTION_INSTANCES.put(connectionKey, connection);
      } else if (connection.isClosed()) { // present in CONNECTION_INSTANCES but already closed

        // Call ConnectionManager.deleteConnection() to drop the record for connectionKey:
        // 1. call decCount() to decrement the count;
        // 2. remove the connectionKey entry from CONNECTION_INSTANCES;
        // 3. call HConnectionImplementation.internalClose() to handle closing the connection
        ConnectionManager.deleteConnection(connectionKey, true);

        // Create an HConnectionImplementation via createConnection()
        connection = (HConnectionImplementation)createConnection(conf, true);

        // Record the new HConnectionKey -> HConnectionImplementation mapping in CONNECTION_INSTANCES
        CONNECTION_INSTANCES.put(connectionKey, connection);
      }

      // Increment the connection's reference count
      connection.incCount();

      // Return the connection
      return connection;
    }
  }

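HConnectionKey is the key class for a connection; it captures important settings such as hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort. Getting a connection is straightforward. If a connection with the same key was created before, it is fetched from CONNECTION_INSTANCES by its HConnectionKey. If none is found, a new one is created from the configuration and added to CONNECTION_INSTANCES. If the one found is already closed, ConnectionManager.deleteConnection() removes the record for connectionKey, and a new HConnectionImplementation is created and added to CONNECTION_INSTANCES. In every case the connection's reference count is incremented before it is returned.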
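
The get-or-create logic with reference counting can be reduced to the following stand-alone sketch. It is a simplification under stated assumptions: ConnectionCacheSketch and Conn are hypothetical, the key is a plain String rather than an HConnectionKey, and closing a connection is reduced to setting a flag.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a keyed, reference-counted connection cache.
class ConnectionCacheSketch {
    // Minimal stand-in for a cached connection.
    static final class Conn {
        private int refCount = 0;
        private boolean closed = false;

        void incCount() { refCount++; }
        int getRefCount() { return refCount; }
        boolean isClosed() { return closed; }
        void close() { closed = true; }
    }

    private final Map<String, Conn> instances = new HashMap<>();

    // Return the cached connection for the key, replacing it if missing or closed,
    // and bump its reference count in every case.
    synchronized Conn get(String key) {
        Conn conn = instances.get(key);
        if (conn == null || conn.isClosed()) {
            conn = new Conn();        // a stale closed entry is simply overwritten
            instances.put(key, conn);
        }
        conn.incCount();
        return conn;
    }
}
```

Callers that share a key share one connection, and the count records how many of them still hold it.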

We now know that the implementation class of ClusterConnection is HConnectionImplementation, so let's get back on track and continue with Region location, starting with the no-cache case. We step into HConnectionImplementation's relocateRegion() method:

    @Override
    public RegionLocations relocateRegion(final TableName tableName,
        final byte [] row, int replicaId) throws IOException{
      // Since this is an explicit request not to use any caching, finding
      // disabled tables should not be desirable.  This will ensure that an exception is thrown when
      // the first time a disabled table is interacted with.
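      // (Translation of the comment above) Since this is an explicit request not to use any cache,
      // finding a disabled table is undesirable. The first time we discover that the table is disabled,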