2021SC@SDUSC HBase Source Code Analysis (3): Write Path (2)


Multiple Put requests, RegionServer (RS) lookup, and building the RPC

1. Multiple Put requests
@Override
public void put(final List<Put> puts) throws IOException {
  for (Put put : puts) {
    validatePut(put);
  }
  Object[] results = new Object[puts.size()];
  try {
    // batch() receives the whole list of Puts as its argument
    batch(puts, results, writeRpcTimeoutMs);
      
  } catch (InterruptedException e) {
    throw (InterruptedIOException) new InterruptedIOException().initCause(e);
  }
}

The validatePut method was covered in the previous post, so it is not repeated here.

Next, look at the batch method, which is specific to the multi-Put path:

public void batch(final List<? extends Row> actions, final Object[] results, int rpcTimeout)
    throws InterruptedException, IOException {
  AsyncProcessTask task = AsyncProcessTask.newBuilder()
          .setPool(pool)
          .setTableName(tableName)
          .setRowAccess(actions)
          .setResults(results)
          .setRpcTimeout(rpcTimeout)
          .setOperationTimeout(operationTimeoutMs)
          .setSubmittedRows(AsyncProcessTask.SubmittedRows.ALL)
          .build();
  AsyncRequestFuture ars = multiAp.submit(task);
  ars.waitUntilDone();
  if (ars.hasError()) {
    throw ars.getErrors();
  }
}

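Above, the final AsyncProcess field multiAp calls submit with a newly built AsyncProcessTask. The task carries the Put requests from the put method (passed in through setRowAccess(actions)) together with the results array created in put, which receives the outcome for each row.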
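
The submit, waitUntilDone, hasError sequence in batch follows a common pattern: hand the work to a pool, block until every row has an outcome, then report whether anything failed. Below is a minimal stdlib-only sketch of that pattern. The class BatchSketch, its worker function, and the pool size are hypothetical, and the code does not use HBase's AsyncProcess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical stand-in for the batch() flow above; it does not use HBase.
final class BatchSketch {

  // Submits every action to a pool, waits for all of them, and fills
  // results[i] with the value or the exception produced by action i.
  // Returns true if any action failed (compare ars.hasError()).
  static <A> boolean batch(List<A> actions, Object[] results, Function<A, Object> worker) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<Object>> futures = new ArrayList<>();
      for (A action : actions) {
        Callable<Object> call = () -> worker.apply(action);
        futures.add(pool.submit(call));
      }
      boolean hasError = false;
      for (int i = 0; i < futures.size(); i++) {
        try {
          results[i] = futures.get(i).get(); // blocks, like waitUntilDone()
        } catch (ExecutionException e) {
          results[i] = e.getCause();         // keep the failure in that row's slot
          hasError = true;
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting", e);
        }
      }
      return hasError;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> rows = List.of("row1", "", "row3");
    Object[] results = new Object[rows.size()];
    boolean failed = batch(rows, results, r -> {
      if (r.isEmpty()) {
        throw new IllegalArgumentException("empty row");
      }
      return "ok:" + r;
    });
    System.out.println(failed + " " + Arrays.toString(results));
  }
}
```

Keeping one result slot per input row, as the results array does in put, lets the caller see which rows succeeded even when the batch as a whole reports an error.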

2. RegionServer lookup

Now look at the corresponding submit method in the AsyncProcess class (only part of its core code is listed):

while (it.hasNext()) {
  Row r = it.next();
  HRegionLocation loc;
  try {
    if (r == null) {
      throw new IllegalArgumentException("#" + id + ", row cannot be null");
    }
    // Make sure we get 0-s replica.
    RegionLocations locs = connection.locateRegion(
        tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);
    if (locs == null || locs.isEmpty() || locs.getDefaultRegionLocation() == null) {
      throw new IOException("#" + id + ", no location found, aborting submit for"
          + " tableName=" + tableName + " rowkey=" + Bytes.toStringBinary(r.getRow()));
    }
    loc = locs.getDefaultRegionLocation();
  } catch (IOException ex) {
	...
  }
	...
    return submitMultiActions(task, retainedActions, nonceGroup,
        locationErrors, locationErrorRows, actionsByServer);
}

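The code above is where submit in AsyncProcess walks through the rows of the submitted AsyncProcessTask and looks up the region location for each one. At the end, submit calls submitMultiActions in the same class, which sends the asynchronous request: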

<CResult> AsyncRequestFuture submitMultiActions(AsyncProcessTask task,
    List<Action> retainedActions, long nonceGroup, List<Exception> locationErrors,
    List<Integer> locationErrorRows, Map<ServerName, MultiAction> actionsByServer) {
  AsyncRequestFutureImpl<CResult> ars = createAsyncRequestFuture(task, retainedActions, nonceGroup);
  ...
  ars.sendMultiAction(actionsByServer, 1, null, false);
  return ars;
}

The final sendMultiAction call in submitMultiActions is what actually sends the asynchronous request.

3. RPC request

Before the request is submitted, HBase uses the rowkeys to look up, in the metadata table hbase:meta, the RegionServers (RS) they belong to.
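
The idea behind this lookup and the grouping that follows it can be sketched with a sorted map. This is a simplified model, not HBase code: it assumes each region is identified only by its start key, that a row belongs to the region with the greatest start key not larger than the rowkey, and it uses String keys in place of byte[] rowkeys. The class RegionLookupSketch, the region boundaries, and the server names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of "rowkey -> owning server" resolution.
final class RegionLookupSketch {

  // region start key -> server hosting that region
  private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

  void addRegion(String startKey, String server) {
    startKeyToServer.put(startKey, server);
  }

  // The owning region is the one with the greatest start key <= rowkey.
  String locate(String rowkey) {
    Map.Entry<String, String> entry = startKeyToServer.floorEntry(rowkey);
    if (entry == null) {
      throw new IllegalStateException("no location found for rowkey=" + rowkey);
    }
    return entry.getValue();
  }

  // Groups rowkeys by server, in the spirit of actionsByServer in submit.
  Map<String, List<String>> groupByServer(List<String> rowkeys) {
    Map<String, List<String>> byServer = new LinkedHashMap<>();
    for (String row : rowkeys) {
      byServer.computeIfAbsent(locate(row), s -> new ArrayList<>()).add(row);
    }
    return byServer;
  }

  public static void main(String[] args) {
    RegionLookupSketch meta = new RegionLookupSketch();
    meta.addRegion("", "rs1");  // first region has an empty start key
    meta.addRegion("m", "rs2");
    meta.addRegion("t", "rs1");
    System.out.println(meta.groupByServer(List.of("apple", "mango", "zebra", "nut")));
  }
}
```

Grouping by server rather than by region means rows from different regions on the same server can travel in one request, which is why the map passed to sendMultiAction is keyed by ServerName.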

Once the asynchronous request is being sent, sendMultiAction wraps each MultiAction into RPC tasks that are executed on multiple threads:

void sendMultiAction(Map<ServerName, MultiAction> actionsByServer,
                             int numAttempt, List<Action> actionsForReplicaThread, boolean reuseThread) {

  int actionsRemaining = actionsByServer.size();

  for (Map.Entry<ServerName, MultiAction> e : actionsByServer.entrySet()) {
    ServerName server = e.getKey();
    MultiAction multiAction = e.getValue();
    Collection<? extends Runnable> runnables = getNewMultiActionRunnable(server, multiAction,
        numAttempt);


    for (Runnable runnable : runnables) {
      if ((--actionsRemaining == 0) && reuseThread
          && numAttempt % HConstants.DEFAULT_HBASE_CLIENT_RETRIES_NUMBER != 0) {
        runnable.run();
        // runs the task directly on the calling thread (no new thread is started)
      } else {
		...
      }
    }
  }
	...
}
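
The dispatch loop above can be approximated with a small stdlib-only sketch: one task per server, with the last task optionally run on the calling thread. The class DispatchSketch is hypothetical. The sketch leaves out the numAttempt check, and handing the other tasks to a thread pool is an assumption here, since the else branch is elided in the listing above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of "one task per server, reuse the caller for the last one".
final class DispatchSketch {

  // Returns server -> name of the thread that handled that server's task.
  static Map<String, String> dispatch(List<String> servers, boolean reuseThread) {
    Map<String, String> handledBy = new ConcurrentHashMap<>();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    List<Future<?>> pending = new ArrayList<>();
    int remaining = servers.size();
    try {
      for (String server : servers) {
        Runnable task = () -> handledBy.put(server, Thread.currentThread().getName());
        if (--remaining == 0 && reuseThread) {
          task.run();                      // last task: run on the calling thread
        } else {
          pending.add(pool.submit(task));  // assumed: other tasks go to the pool
        }
      }
      for (Future<?> f : pending) {
        f.get();                           // wait for the pooled tasks to finish
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
    return handledBy;
  }

  public static void main(String[] args) {
    System.out.println(dispatch(List.of("rs1", "rs2", "rs3"), true));
  }
}
```

Running the last task inline saves one hand-off to the pool when the caller would otherwise just wait.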

The run method of each task goes through an RpcRetryingCaller to reach the call() method of MultiServerCallable and send the RPC request:

  public void run() {
    AbstractResponse res = null;
    CancellableRegionServerCallable callable = currentCallable;
    try {
      if (callable == null) {
        callable = createCallable(server, tableName, multiAction);
      }
      RpcRetryingCaller<AbstractResponse> caller = asyncProcess.createCaller(callable,rpcTimeout);
      try {
        if (callsInProgress != null) {
          callsInProgress.add(callable);
        }
                    
        // entry point of the remote RPC call
        res = caller.callWithoutRetries(callable, operationTimeout);
                 
        if (res == null) {
          return;
        }
      } catch (IOException e) {
        ...
      }
      ...
  }
}

callWithoutRetries in turn calls the call() method of MultiServerCallable, which sends the RPC request:

public T callWithoutRetries(RetryingCallable<T> callable, int callTimeout)
throws IOException, RuntimeException {
  // The code of this method should be shared with withRetries.
  try {
    callable.prepare(false);
    return callable.call(callTimeout);
  } catch (Throwable t) {
	...
  }
}
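
The shape of callWithoutRetries (prepare once, call once, translate any failure) can be illustrated without HBase. In the sketch below, the interface SketchCallable and the class SingleAttemptSketch are hypothetical. How the real catch block translates exceptions is elided in the listing above, so wrapping everything into IOException is an assumption made only for this example.

```java
import java.io.IOException;

// Hypothetical sketch of a single-attempt caller; not the HBase implementation.
final class SingleAttemptSketch {

  // Reduced two-step callable: prepare (for example, resolve the target), then call.
  interface SketchCallable<T> {
    void prepare(boolean reload) throws IOException;
    T call(int callTimeoutMs) throws Exception;
  }

  // Exactly one attempt: no loop, no sleep, no retry.
  static <T> T callWithoutRetries(SketchCallable<T> callable, int callTimeoutMs)
      throws IOException {
    try {
      callable.prepare(false);             // false: do not force a reload
      return callable.call(callTimeoutMs);
    } catch (IOException e) {
      throw e;                             // already the declared type
    } catch (Exception e) {
      throw new IOException(e);            // assumed translation, see lead-in
    }
  }

  public static void main(String[] args) throws IOException {
    SketchCallable<String> echo = new SketchCallable<String>() {
      @Override
      public void prepare(boolean reload) {
        System.out.println("prepare(reload=" + reload + ")");
      }

      @Override
      public String call(int callTimeoutMs) {
        return "called with timeout " + callTimeoutMs;
      }
    };
    System.out.println(callWithoutRetries(echo, 500));
  }
}
```

Because the caller does not retry here, any retry decision has to be made by whoever invoked it.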

Summary of the client-side phase of put

  1. Put operations that pass validation are submitted asynchronously in batches through AsyncProcess.

  2. Before submitting, the RegionServer that owns each rowkey has to be found. This lookup is done through the locateRegion method, and the rows are then grouped by HRegionLocation. (HBase 2.x and later have removed HConnection.)

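  3. One MultiServerCallable is built per HRegionLocation, and the calls are executed on multiple threads. Leaving aside resubmission on failure and error handling, the client-side submission ends here.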

This concludes the client-side write phase of the put method. The phases that follow are the Region write phase and the MemStore flush phase.

If anything here is incomplete or wrong, corrections are welcome.
