Let's start with writing data.
A call to put ultimately ends up in HTable's doPut method:
private void doPut(final List<Put> puts) throws IOException {
int n = 0;
for (Put put : puts) {
validatePut(put);
writeBuffer.add(put);
currentWriteBufferSize += put.heapSize();
// we need to periodically see if the writebuffer is full instead of waiting until the end of the List
n++;
if (n % DOPUT_WB_CHECK == 0 && currentWriteBufferSize > writeBufferSize) {
flushCommits();
}
}
if (autoFlush || currentWriteBufferSize > writeBufferSize) {
flushCommits();
}
}
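It is easy to see that HBase flushes under one of two conditions: either currentWriteBufferSize exceeds the configured writeBufferSize, or autoFlush is enabled. Inside the loop the size check only runs every DOPUT_WB_CHECK puts, so a large list can be flushed part-way through instead of only at the end.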
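To make the two triggers concrete, here is a minimal stand-alone sketch. It is not HBase code: WriteBufferSketch, CHECK_INTERVAL and flushCount are hypothetical names, a byte[] stands in for a Put, and its length stands in for put.heapSize().

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for HTable's client-side write buffer (not HBase code).
class WriteBufferSketch {
    // Hypothetical value playing the role of HTable's DOPUT_WB_CHECK.
    static final int CHECK_INTERVAL = 10;

    final List<byte[]> writeBuffer = new ArrayList<>();
    final long writeBufferSize;      // flush threshold in bytes
    final boolean autoFlush;
    long currentWriteBufferSize = 0;
    int flushCount = 0;              // only here so the behaviour can be observed

    WriteBufferSketch(long writeBufferSize, boolean autoFlush) {
        this.writeBufferSize = writeBufferSize;
        this.autoFlush = autoFlush;
    }

    void doPut(List<byte[]> puts) {
        int n = 0;
        for (byte[] put : puts) {
            writeBuffer.add(put);
            currentWriteBufferSize += put.length; // stand-in for put.heapSize()
            n++;
            // periodic check, so a huge list does not pile up before flushing
            if (n % CHECK_INTERVAL == 0 && currentWriteBufferSize > writeBufferSize) {
                flushCommits();
            }
        }
        // the two triggers: autoFlush is on, or the buffer is over the limit
        if (autoFlush || currentWriteBufferSize > writeBufferSize) {
            flushCommits();
        }
    }

    void flushCommits() {
        // the real method sends the batch to the cluster; here we only empty the buffer
        flushCount++;
        writeBuffer.clear();
        currentWriteBufferSize = 0;
    }
}
```

With autoFlush on, every doPut call flushes. With it off, puts stay in the buffer until their accumulated size passes the threshold.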
Next, let's look at the flushCommits method:
public void flushCommits() throws IOException {
try {
Object[] results = new Object[writeBuffer.size()];
try {
this.connection.processBatch(writeBuffer, tableName, pool, results);
} catch (InterruptedException e) {
throw new IOException(e);
} finally {
// mutate list so that it is empty for complete success, or contains
// only failed records results are returned in the same order as the
// requests in list walk the list backwards, so we can remove from list
// without impacting the indexes of earlier members
for (int i = results.length - 1; i>=0; i--) {
if (results[i] instanceof Result) {
// successful Puts are removed from the list here.
writeBuffer.remove(i);
}
}
}
} finally {
if (clearBufferOnFail) {
writeBuffer.clear();
currentWriteBufferSize = 0;
} else {
// the write buffer was adjusted by processBatchOfPuts
currentWriteBufferSize = 0;
for (Put aPut : writeBuffer) {
currentWriteBufferSize += aPut.heapSize();
}
}
}
}
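The actual submission is done in connection.processBatch. That method receives an Object array, results, in which it stores the outcome of each operation.
Afterwards the array is checked: if results[i] is a Result, meaning that put was committed successfully, the corresponding entry is removed from writeBuffer (the list holding the pending puts); entries that failed are kept.
Note the outer finally block. If clearBufferOnFail is true, the buffer is emptied regardless, so puts that came back with errors are not submitted again. clearBufferOnFail is true by default.
It is set in setAutoFlush: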
public void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
this.autoFlush = autoFlush;
this.clearBufferOnFail = autoFlush || clearBufferOnFail;
}
public void setAutoFlush(boolean autoFlush) {
setAutoFlush(autoFlush, autoFlush);
}
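The effect of clearBufferOnFail on what is left in the buffer after a partially failed flush can be sketched as follows. This is a simplified illustration rather than HBase code: FlushCleanupSketch and afterFlush are hypothetical names, strings stand in for Puts, and Boolean.TRUE in the results array stands in for "instanceof Result".

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the cleanup done in flushCommits' two finally blocks (not HBase code).
class FlushCleanupSketch {
    // Returns what would remain in the write buffer after one flush attempt.
    static List<String> afterFlush(List<String> buffer, Object[] results,
                                   boolean clearBufferOnFail) {
        List<String> writeBuffer = new ArrayList<>(buffer);
        // Walk backwards so that removing entry i does not shift
        // the indexes of the entries still to be visited.
        for (int i = results.length - 1; i >= 0; i--) {
            if (Boolean.TRUE.equals(results[i])) { // assumption: TRUE marks a success
                writeBuffer.remove(i);             // remove(int): removal by index
            }
        }
        if (clearBufferOnFail) {
            // failed entries are dropped too, so they are never resubmitted
            writeBuffer.clear();
        }
        return writeBuffer;
    }
}
```

For a buffer of a, b, c where only b failed, clearBufferOnFail = false leaves b in the buffer for the next flush, while clearBufferOnFail = true leaves the buffer empty.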
Now for the harder part, the connection.processBatch method.
connection is instantiated in the HConnectionManager class, and the real submission work happens in the processBatchCallback method.
This method is fairly long:
public <R> void processBatchCallback(
List<? extends Row> list,
byte[] tableName,
ExecutorService pool,
Object[] results,
Batch.Callback<R> callback)
throws IOException, InterruptedException {
// results must be the same size as list
if (results.length != list.size()) {
throw new IllegalArgumentException(
"argument results must be the same size as argument list");
}
if (list.isEmpty()) {
return;
}
// Keep track of the most recent servers for any given item for better
// exceptional reporting. We keep HRegionLocation to save on parsing.
// Later below when we use lastServers, we'll pull what we need from
// lastServers.
HRegionLocation [] lastServers = new HRegionLocation[results.length];
List<Row> workingList = new ArrayList<Row>(list);
boolean retry = true;
// count that helps presize actions array
int actionCount = 0;
Throwable singleRowCause = null;
for (int tries = 0; tries < numRetries && retry; ++tries) {
// sleep first, if this is a retry
if (tries >= 1) {
long sleepTime = getPauseTime(tries);
LOG.debug("Retry " +tries+ ", sleep for " +sleepTime+ "ms!");
Thread.sleep(sleepTime);
}
// step 1: break up into regionserver-sized chunks and build the data structs
Map<HRegionLocation, MultiAction<R>> actionsByServer =
new HashMap<HRegionLocation, MultiAction<R>>();
for (int i = 0; i < workingList.size(); i++) {
Row row = workingList.get(i);
if (row != null) {
HRegionLocation loc = locateRegion(tableName, row.getRow(), true);
byte[] regionName = loc.getRegionInfo().getRegionName();
MultiAction<R> actions = actionsByServer.get(loc);
if (actions == null) {
actions = new MultiAction<R>();
actionsByServer.put(loc, actions);
}
Action<R> action = new Action<R>(row, i);
lastServers[i] = loc;
actions.add(regionName, action);
}
}
// step 2: make the requests
Map<HRegionLocation, Future<MultiResponse>> futures =
new HashMap<HRegionLocation, Future<MultiResponse>>(
actionsByServer.size());
for (Entry<HRegionLocation, MultiAction<R>> e: actionsByServer.entrySet()) {
futures.put(e.getKey(), pool.submit(createCallable(e.getKey(), e.getValue(), tableName)));
}
// step 3: collect the failures and successes and prepare for retry
for (Entry<HRegionLocation, Future<MultiResponse>> responsePerServer
: futures.entrySet()) {
HRegionLocation loc = responsePerServer.getKey();
try {
Future<MultiResponse> future = responsePerServer.getValue();
MultiResponse resp = future.get();
if (resp == null) {
// Entire server failed
LOG.debug("Failed all for server: " + loc.getHostnamePort() +
", removing from cache");
continue;
}
for (Entry<byte[], List<Pair<Integer,Object>>> e : resp.getResults().entrySet()) {
byte[] regionName = e.getKey();
List<Pair<Integer, Object>> regionResults = e.getValue();
for (Pair<Integer, Object> regionResult : regionResults) {
if (regionResult == null) {
// if the first/only record is 'null' the entire region failed.
LOG.debug("Failures for region: " +
Bytes.toStringBinary(regionName) +
", removing from cache");
} else {
// Result might be an Exception, including DNRIOE
results[regionResult.getFirst()] = regionResult.getSecond();
if (callback != null && !(regionResult.getSecond() instanceof Throwable)) {
callback.update(e.getKey(),
list.get(regionResult.getFirst()).getRow(),
(R)regionResult.getSecond());
}
}
}
}
} catch (ExecutionException e) {
LOG.warn("Failed all from " + loc, e);
}
}
// step 4: identify failures and prep for a retry (if applicable).
// Find failures (i.e. null Result), and add them to the workingList (in
// order), so they can be retried.
retry = false;
workingList.clear();
actionCount = 0;
for (int i = 0; i < results.length; i++) {
// if null (fail) or instanceof Throwable && not instanceof DNRIOE
// then retry that row. else dont.
if (results[i] == null ||
(results[i] instanceof Throwable &&
!(results[i] instanceof DoNotRetryIOException))) {
retry = true;
actionCount++;
Row row = list.get(i);
workingList.add(row);
deleteCachedLocation(tableName, row.getRow());
} else {
if (results[i] != null && results[i] instanceof Throwable) {
actionCount++;
}
// add null to workingList, so the order remains consistent with the original list argument.
workingList.add(null);
}
}
}
if (retry) {
// Simple little check for 1 item failures.
if (singleRowCause != null) {
throw new IOException(singleRowCause);
}
}
List<Throwable> exceptions = new ArrayList<Throwable>(actionCount);
List<Row> actions = new ArrayList<Row>(actionCount);
List<String> addresses = new ArrayList<String>(actionCount);
for (int i = 0 ; i < results.length; i++) {
if (results[i] == null || results[i] instanceof Throwable) {
exceptions.add((Throwable)results[i]);
actions.add(list.get(i));
addresses.add(lastServers[i].getHostnamePort());
}
}
if (!exceptions.isEmpty()) {
throw new RetriesExhaustedWithDetailsException(exceptions,
actions,
addresses);
}
}
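Luckily the comments help. The work is split into four main steps.
step 1
Regroup the data that needs to be submitted.
It is converted into a Map<HRegionLocation, MultiAction<R>>. That looks awkward at first, but put plainly it is Map<server info (host:port), Map<regionName, actions (each one a single operation such as a put)>>.
In other words, the Put actions are grouped first by server, and then by regionName within each server.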
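The two-level grouping of step 1 can be sketched with plain nested maps. This is only an illustration: Location and locate are hypothetical stand-ins for HRegionLocation and locateRegion, the host names and region names are made up, and rows are plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of "group by server, then by region" (not HBase code).
class GroupByServerSketch {
    // Hypothetical stand-in for HRegionLocation.
    static final class Location {
        final String hostPort;
        final String regionName;
        Location(String hostPort, String regionName) {
            this.hostPort = hostPort;
            this.regionName = regionName;
        }
    }

    // Hypothetical locator: picks a region from the first character of the row key.
    static Location locate(String rowKey) {
        char c = rowKey.charAt(0);
        if (c < 'h') return new Location("rs1:60020", "region-A");
        if (c < 'p') return new Location("rs1:60020", "region-B");
        return new Location("rs2:60020", "region-C");
    }

    // server -> region -> indexes of the rows in the original list
    static Map<String, Map<String, List<Integer>>> group(List<String> rows) {
        Map<String, Map<String, List<Integer>>> byServer = new LinkedHashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            String row = rows.get(i);
            if (row == null) {
                continue; // null means this slot needs no (further) work
            }
            Location loc = locate(row);
            byServer.computeIfAbsent(loc.hostPort, k -> new LinkedHashMap<>())
                    .computeIfAbsent(loc.regionName, k -> new ArrayList<>())
                    .add(i); // keep the original index, like new Action<R>(row, i)
        }
        return byServer;
    }
}
```

Keeping the original index with each action is what later allows every response to be written back into the right slot of the results array.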
step 2
Actually send the requests, one per region server, through the thread pool.
What comes back is a Map<HRegionLocation, Future<MultiResponse>>, again keyed by server.
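step 3
Collect the successful and failed results, and prepare for retrying.
Looking closely at the code, it follows the same order as before: iterate over each region server, then over the results for each regionName on that server, and store every returned value in the results array at its original index.
For every result that is not a Throwable, the Batch.Callback (the coprocessor callback) is also invoked, if one was passed in.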
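Steps 2 and 3 together amount to "submit one task per server, then wait on each Future and copy the answers back by index". A minimal sketch using only java.util.concurrent follows; SubmitAndCollectSketch, run and fakeServerCall are hypothetical names, and the fake call simply fails for any server whose name starts with "down".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of steps 2 and 3: one request per server, results copied back by index.
class SubmitAndCollectSketch {
    static Object[] run(Map<String, List<Integer>> indexesByServer, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Object[] results = new Object[total];
        try {
            // step 2: one Future per server
            Map<String, Future<Map<Integer, Object>>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, List<Integer>> e : indexesByServer.entrySet()) {
                final String server = e.getKey();
                final List<Integer> indexes = e.getValue();
                Callable<Map<Integer, Object>> call = () -> fakeServerCall(server, indexes);
                futures.put(server, pool.submit(call));
            }
            // step 3: wait for each server and store results at their original index
            for (Map.Entry<String, Future<Map<Integer, Object>>> e : futures.entrySet()) {
                try {
                    for (Map.Entry<Integer, Object> r : e.getValue().get().entrySet()) {
                        results[r.getKey()] = r.getValue();
                    }
                } catch (ExecutionException ex) {
                    // whole server failed: its slots stay null and get retried later
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    // Hypothetical remote call; a real client would do an RPC here.
    static Map<Integer, Object> fakeServerCall(String server, List<Integer> indexes)
            throws Exception {
        if (server.startsWith("down")) {
            throw new java.io.IOException("unreachable: " + server);
        }
        Map<Integer, Object> out = new LinkedHashMap<>();
        for (int i : indexes) {
            out.put(i, "ok@" + server);
        }
        return out;
    }
}
```

Slots belonging to a server that failed as a whole are simply left null, which is the signal the next step uses to decide what to retry.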
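step 4
Identify the failures and prepare for a retry.
If results[i] is null, or results[i] instanceof Throwable && !(results[i] instanceof DoNotRetryIOException), the row is put back into workingList and its cached region location is deleted. All other slots get a null so that the order stays consistent with the original list.
Then the outer loop runs again, until nothing is left to retry or the retry limit (numRetries) is reached.
If errors still remain after that, a RetriesExhaustedWithDetailsException is thrown to the caller.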
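The retry bookkeeping of step 4 can be sketched on its own. This is a simplified model, not HBase code: RetryLoopSketch is a hypothetical name, DoNotRetry plays the role of DoNotRetryIOException, the send function stands in for steps 1 to 3, and the sleep between retries is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of the outer retry loop and the workingList bookkeeping (not HBase code).
class RetryLoopSketch {
    // Hypothetical marker for failures that must not be retried.
    static class DoNotRetry extends RuntimeException {
        DoNotRetry(String message) {
            super(message);
        }
    }

    static Object[] processBatch(List<String> rows, int numRetries,
                                 Function<String, Object> send) {
        Object[] results = new Object[rows.size()];
        List<String> workingList = new ArrayList<>(rows);
        boolean retry = true;
        for (int tries = 0; tries < numRetries && retry; tries++) {
            // stand-in for steps 1 to 3: send every non-null row, store the outcome
            for (int i = 0; i < workingList.size(); i++) {
                String row = workingList.get(i);
                if (row != null) {
                    results[i] = send.apply(row);
                }
            }
            // step 4: rebuild workingList with only the rows worth retrying
            retry = false;
            workingList.clear();
            for (int i = 0; i < results.length; i++) {
                boolean retriable = results[i] == null
                        || (results[i] instanceof Throwable
                            && !(results[i] instanceof DoNotRetry));
                if (retriable) {
                    retry = true;
                    workingList.add(rows.get(i));
                } else {
                    workingList.add(null); // placeholder keeps indexes aligned
                }
            }
        }
        return results; // the caller would throw if any slot is null or a Throwable
    }
}
```

The null placeholders are the important detail: because workingList always has the same length as the original list, index i refers to the same row in every round.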
That's as far as we'll read for today.
To be continued.