2021SC@SDUSC
Multiple Put Requests, RegionServer Addressing, and Building the RPC
1. Multiple put requests
@Override
public void put(final List<Put> puts) throws IOException {
  for (Put put : puts) {
    validatePut(put);
  }
  Object[] results = new Object[puts.size()];
  try {
    // batch receives the whole list of puts as its argument
    batch(puts, results, writeRpcTimeoutMs);
  } catch (InterruptedException e) {
    throw (InterruptedIOException) new InterruptedIOException().initCause(e);
  }
}
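The catch block in put converts an InterruptedException into an InterruptedIOException, so callers only have to deal with IOException. Below is a minimal, JDK-only sketch of that idiom. The class InterruptWrapDemo and the methods blockingStep and putLike are hypothetical names used for illustration only; they are not part of HBase.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

final class InterruptWrapDemo {
  // Hypothetical stand-in for a blocking call such as batch(...).
  static void blockingStep(boolean interrupt) throws InterruptedException {
    if (interrupt) {
      throw new InterruptedException("interrupted while waiting");
    }
  }

  // Same shape as put(List<Put>): only IOException (or a subclass) escapes.
  static void putLike(boolean interrupt) throws IOException {
    try {
      blockingStep(interrupt);
    } catch (InterruptedException e) {
      // initCause returns Throwable, hence the cast back to InterruptedIOException.
      throw (InterruptedIOException) new InterruptedIOException().initCause(e);
    }
  }

  public static void main(String[] args) throws IOException {
    putLike(false); // completes normally
    try {
      putLike(true);
    } catch (InterruptedIOException e) {
      // The original InterruptedException is still reachable as the cause.
      System.out.println("cause: " + e.getCause().getClass().getSimpleName());
    }
  }
}
```

The original exception stays attached as the cause, so the information that the thread was interrupted is not lost.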
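The validatePut method was covered in the previous post, so I won't repeat it here.
Next, let's look at the batch method, which is specific to the multi-put path: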
public void batch(final List<? extends Row> actions, final Object[] results, int rpcTimeout)
    throws InterruptedException, IOException {
  AsyncProcessTask task = AsyncProcessTask.newBuilder()
      .setPool(pool)
      .setTableName(tableName)
      .setRowAccess(actions)
      .setResults(results)
      .setRpcTimeout(rpcTimeout)
      .setOperationTimeout(operationTimeoutMs)
      .setSubmittedRows(AsyncProcessTask.SubmittedRows.ALL)
      .build();
  AsyncRequestFuture ars = multiAp.submit(task);
  ars.waitUntilDone();
  if (ars.hasError()) {
    throw ars.getErrors();
  }
}
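Here the AsyncProcess object multiAp calls submit with a newly built AsyncProcessTask. The task carries the puts handed over by the put method (passed in through setRowAccess(actions)), together with the results array created in put, which receives the outcome of each operation. batch then blocks in waitUntilDone and throws if any action failed.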
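The "submit everything asynchronously, then block until done" shape of batch can be sketched with plain JDK executors. This is only an analogy under stated assumptions: BatchSketch, its batch method, and the "OK:" result strings are hypothetical, and the real AsyncProcess groups actions per server and aggregates errors rather than failing on the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BatchSketch {
  // Hypothetical analogue of batch(actions, results, ...): each action is submitted
  // to the pool, and results[i] is filled in by the task for actions.get(i).
  static void batch(List<String> actions, Object[] results, ExecutorService pool)
      throws Exception {
    List<Future<?>> pending = new ArrayList<>();
    for (int i = 0; i < actions.size(); i++) {
      final int index = i;
      pending.add(pool.submit(() -> {
        results[index] = "OK:" + actions.get(index);
      }));
    }
    // Rough analogue of waitUntilDone(): block until every task has finished.
    // Unlike HBase, this sketch surfaces only the first failure (ExecutionException).
    for (Future<?> f : pending) {
      f.get();
    }
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Object[] results = new Object[3];
      batch(Arrays.asList("row1", "row2", "row3"), results, pool);
      System.out.println(Arrays.toString(results));
    } finally {
      pool.shutdown();
    }
  }
}
```

The results array is indexed by position in the input list, which is also why put sizes it as puts.size().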
2. RegionServer (RS) addressing
Let's look at the corresponding submit method in the AsyncProcess class (only part of the core code of submit is shown):
  while (it.hasNext()) {
    Row r = it.next();
    HRegionLocation loc;
    try {
      if (r == null) {
        throw new IllegalArgumentException("#" + id + ", row cannot be null");
      }
      // Make sure we get 0-s replica.
      RegionLocations locs = connection.locateRegion(
          tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);
      if (locs == null || locs.isEmpty() || locs.getDefaultRegionLocation() == null) {
        throw new IOException("#" + id + ", no location found, aborting submit for"
            + " tableName=" + tableName + " rowkey=" + Bytes.toStringBinary(r.getRow()));
      }
      loc = locs.getDefaultRegionLocation();
    } catch (IOException ex) {
      ...
    }
    ...
  return submitMultiActions(task, retainedActions, nonceGroup,
      locationErrors, locationErrorRows, actionsByServer);
}
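In the code above, AsyncProcess iterates over the rows of the AsyncProcessTask passed in through submit and looks up the region location of each row with locateRegion. At the end, the method calls submitMultiActions of the same class to send the asynchronous request: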
<CResult> AsyncRequestFuture submitMultiActions(AsyncProcessTask task,
    List<Action> retainedActions, long nonceGroup, List<Exception> locationErrors,
    List<Integer> locationErrorRows, Map<ServerName, MultiAction> actionsByServer) {
  AsyncRequestFutureImpl<CResult> ars = createAsyncRequestFuture(task, retainedActions, nonceGroup);
  ...
  ars.sendMultiAction(actionsByServer, 1, null, false);
  return ars;
}
The sendMultiAction call at the end of submitMultiActions is what sends the asynchronous request.
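The two steps that feed submitMultiActions, locating the region for each rowkey and grouping the rows per server, can be sketched as follows. This is a simplified model, not HBase code: it assumes regions are identified by their start key in a sorted map, uses String rowkeys instead of byte arrays, and the names RegionGroupingSketch, regionTable, locate, and groupByServer as well as the servers rs1 and rs2 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegionGroupingSketch {
  // Hypothetical region table: region start key -> server hosting that region.
  static TreeMap<String, String> regionTable() {
    TreeMap<String, String> regions = new TreeMap<>();
    regions.put("", "rs1");  // region covering ["", "g")
    regions.put("g", "rs2"); // region covering ["g", "p")
    regions.put("p", "rs1"); // region covering ["p", end of table)
    return regions;
  }

  // The owning region is the one with the greatest start key <= rowkey.
  static String locate(TreeMap<String, String> regions, String rowkey) {
    Map.Entry<String, String> entry = regions.floorEntry(rowkey);
    if (entry == null) {
      throw new IllegalStateException("no location found for rowkey=" + rowkey);
    }
    return entry.getValue();
  }

  // Analogue of building actionsByServer: one bucket of rows per target server.
  static Map<String, List<String>> groupByServer(TreeMap<String, String> regions,
      List<String> rowkeys) {
    Map<String, List<String>> byServer = new TreeMap<>();
    for (String rowkey : rowkeys) {
      byServer.computeIfAbsent(locate(regions, rowkey), s -> new ArrayList<>()).add(rowkey);
    }
    return byServer;
  }

  public static void main(String[] args) {
    // prints {rs1=[apple, zebra], rs2=[kiwi]}
    System.out.println(groupByServer(regionTable(), Arrays.asList("apple", "kiwi", "zebra")));
  }
}
```

Grouping before sending means each server receives one request containing all of its rows, instead of one request per row.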
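3. RPC request
Before submitting, HBase uses each rowkey to look up, in the metadata table hbase:meta, the RS that the row belongs to.
Once the request is handed over, sendMultiAction wraps each MultiAction into RPC tasks that are executed by multiple threads: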
void sendMultiAction(Map<ServerName, MultiAction> actionsByServer,
    int numAttempt, List<Action> actionsForReplicaThread, boolean reuseThread) {
  int actionsRemaining = actionsByServer.size();
  for (Map.Entry<ServerName, MultiAction> e : actionsByServer.entrySet()) {
    ServerName server = e.getKey();
    MultiAction multiAction = e.getValue();
    Collection<? extends Runnable> runnables = getNewMultiActionRunnable(server, multiAction,
        numAttempt);
    for (Runnable runnable : runnables) {
      if ((--actionsRemaining == 0) && reuseThread
          && numAttempt % HConstants.DEFAULT_HBASE_CLIENT_RETRIES_NUMBER != 0) {
        // run() executes the task directly on the current thread (thread reuse)
        runnable.run();
      } else {
        ...
      }
    }
  }
  ...
}
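The branch on reuseThread above distinguishes running the last task on the calling thread from handing a task to another thread. A minimal JDK-only sketch of that dispatch decision is below. It is a simplification under stated assumptions: the numAttempt condition is left out, the else branch is assumed to submit to a thread pool, and DispatchSketch and dispatch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DispatchSketch {
  // Runs the last task on the caller's thread when reuseThread is true and submits
  // every other task to the pool. Returns task name -> name of the thread that ran it.
  static Map<String, String> dispatch(List<String> taskNames, boolean reuseThread,
      ExecutorService pool) throws Exception {
    Map<String, String> ranOn = new ConcurrentHashMap<>();
    List<Future<?>> submitted = new ArrayList<>();
    int remaining = taskNames.size();
    for (String name : taskNames) {
      Runnable task = () -> ranOn.put(name, Thread.currentThread().getName());
      if (--remaining == 0 && reuseThread) {
        task.run(); // synchronous: no new thread is involved
      } else {
        submitted.add(pool.submit(task)); // asynchronous: runs on a pool thread
      }
    }
    for (Future<?> f : submitted) {
      f.get(); // wait so that the returned map is complete
    }
    return ranOn;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      System.out.println(dispatch(Arrays.asList("a", "b", "c"), true, pool));
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling run() on a Runnable never starts a thread by itself; only the submit path moves work off the caller.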
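The run method of each task creates a MultiServerCallable (if one does not exist yet) and passes it to an RpcRetryingCaller, which performs the RPC: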
public void run() {
  AbstractResponse res = null;
  CancellableRegionServerCallable callable = currentCallable;
  try {
    if (callable == null) {
      callable = createCallable(server, tableName, multiAction);
    }
    RpcRetryingCaller<AbstractResponse> caller = asyncProcess.createCaller(callable, rpcTimeout);
    try {
      if (callsInProgress != null) {
        callsInProgress.add(callable);
      }
      // entry point of the remote RPC call
      res = caller.callWithoutRetries(callable, operationTimeout);
      if (res == null) {
        return;
      }
    } catch (IOException e) {
      ...
    }
    ...
  }
}
callWithoutRetries in turn invokes the call() method of MultiServerCallable, which sends the RPC request:
public T callWithoutRetries(RetryingCallable<T> callable, int callTimeout)
    throws IOException, RuntimeException {
  // The code of this method should be shared with withRetries.
  try {
    callable.prepare(false);
    return callable.call(callTimeout);
  } catch (Throwable t) {
    ...
  }
}
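The prepare-then-call contract used by callWithoutRetries can be sketched in isolation. This is an illustrative model only: TwoPhaseCallable, RecordingCallable, and PrepareCallSketch are hypothetical names, and the error translation in the elided catch block is deliberately not reproduced.

```java
import java.util.ArrayList;
import java.util.List;

final class PrepareCallSketch {
  // Hypothetical two-phase callable, loosely modelled on the prepare/call pair above.
  interface TwoPhaseCallable<T> {
    void prepare(boolean reload) throws Exception;
    T call(int callTimeoutMs) throws Exception;
  }

  // A single attempt: prepare once, call once, no retry loop.
  static <T> T callWithoutRetries(TwoPhaseCallable<T> callable, int callTimeoutMs)
      throws Exception {
    callable.prepare(false);
    return callable.call(callTimeoutMs);
  }

  // Demo implementation that records the order in which it is invoked.
  static final class RecordingCallable implements TwoPhaseCallable<String> {
    final List<String> events = new ArrayList<>();

    @Override
    public void prepare(boolean reload) {
      events.add("prepare(" + reload + ")");
    }

    @Override
    public String call(int callTimeoutMs) {
      events.add("call(" + callTimeoutMs + ")");
      return "response";
    }
  }

  public static void main(String[] args) throws Exception {
    RecordingCallable callable = new RecordingCallable();
    // prints response [prepare(false), call(1000)]
    System.out.println(callWithoutRetries(callable, 1000) + " " + callable.events);
  }
}
```

Keeping the retry policy out of this method means the caller decides whether and when a failed attempt is resubmitted.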
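Summary of the client-side processing stage of put
- Put operations that pass validation are submitted asynchronously in a batch through AsyncProcess.
- Before submitting, each rowkey has to be mapped to the region server it belongs to. This lookup is done by the locateRegion method, and the rowkeys are then grouped by HRegionLocation (in the code, into the actionsByServer map keyed by ServerName). (HBase 2.x and later have removed HConnection.)
- A MultiServerCallable is built for each target server and the calls are executed by multiple threads. Leaving aside resubmission on failure and error handling, this is where the client's submit operation ends.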
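This completes the client-side write stage of put. The next stages are the Region write stage and the MemStore flush stage.
If anything here is incomplete or wrong, corrections are welcome.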