Basic Principles of HBase MVCC

HBase MVCC (Multi Version Consistency Control)

MVCC, multi-version concurrency control, is an approach to handling concurrency that serves as an alternative to locking.

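In HBase, a write number greater than the read number indicates that a write to the memstore is still in progress; a read that needs to see that write has to wait.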

1. MVCC initialization
MVCC is initialized in HRegion's initializeRegionInternals method.
The relevant source comment reads: Return the largest memstoreTS found across all storefiles in the given list. Store files that were created by a mapreduce bulk load are ignored.

long maxStoreMemstoreTS = store.getMaxMemstoreTS();
if (maxStoreMemstoreTS > maxMemstoreTS) {
  maxMemstoreTS = maxStoreMemstoreTS;
}
// ...
mvcc.initialize(maxMemstoreTS + 1);
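
The start-point computation above can be sketched in isolation. This is a minimal illustration, not HBase code: MvccInitSketch, maxMemstoreTS(...) and the list of per-store values are hypothetical stand-ins for looping over the region's stores and calling store.getMaxMemstoreTS(). The initial value of -1 and the behaviour of initialize (setting both the read point and the write point to the start point) are assumptions made for the sketch.

```java
import java.util.List;

/** Hypothetical stand-in for the MVCC start-point computation; not the HBase class. */
final class MvccInitSketch {
    private long memstoreRead = 0;
    private long memstoreWrite = 0;

    /** Largest memstoreTS over all stores; -1 is an assumed value for "no store files". */
    static long maxMemstoreTS(List<Long> perStoreMax) {
        long max = -1;
        for (long ts : perStoreMax) {
            if (ts > max) {
                max = ts;
            }
        }
        return max;
    }

    /** Assumed behaviour: the read point and the write point start at the same value. */
    void initialize(long startPoint) {
        memstoreRead = startPoint;
        memstoreWrite = startPoint;
    }

    long readPoint() { return memstoreRead; }

    long writePoint() { return memstoreWrite; }
}
```

With per-store maxima of 7, 42 and 13, maxMemstoreTS returns 42 and the sketch is initialized at 43, so every write number handed out afterwards is larger than anything already persisted in a store file.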


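2. Example: the internalFlushcache method in HRegion
First:
w = mvcc.beginMemstoreInsert();
This call computes nextWriteNumber, creates a WriteEntry object e for it, and appends e to the tail of writeQueue (a LinkedList).
Put simply, it uses MVCC to record that a write to the current memstore has begun, and that the write position is nextWriteNumber.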
  public WriteEntry beginMemstoreInsert() {
    synchronized (writeQueue) {
      long nextWriteNumber = ++memstoreWrite;
      WriteEntry e = new WriteEntry(nextWriteNumber);
      writeQueue.add(e);
      return e;
    }
  }

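mvcc.advanceMemstore(w);
The core of this method is a while loop that takes WriteEntry objects from the head of writeQueue and checks them one by one.
If nextReadValue > 0 and nextReadValue+1 != queueFirst.getWriteNumber(), it throws an exception.
If the WriteEntry at the head has completed, nextReadValue is updated and the entry is removed from writeQueue; otherwise the loop breaks.
After the loop, memstoreRead is updated and the waiting readers are woken with readWaiters.notifyAll().
Put simply, the method advances memstoreRead, the position up to which data can be read, and notifies the threads waiting on readWaiters.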

  boolean advanceMemstore(WriteEntry e) {
    synchronized (writeQueue) {
      e.markCompleted();

      long nextReadValue = -1;
      boolean ranOnce = false;
      while (!writeQueue.isEmpty()) {
        ranOnce = true;
        WriteEntry queueFirst = writeQueue.getFirst();

        if (nextReadValue > 0) {
          if (nextReadValue+1 != queueFirst.getWriteNumber()) {
            throw new RuntimeException("invariant in completeMemstoreInsert violated, prev: "
                + nextReadValue + " next: " + queueFirst.getWriteNumber());
          }
        }

        if (queueFirst.isCompleted()) {
          nextReadValue = queueFirst.getWriteNumber();
          writeQueue.removeFirst();
        } else {
          break;
        }
      }

      if (!ranOnce) {
        throw new RuntimeException("never was a first");
      }

      if (nextReadValue > 0) {
        synchronized (readWaiters) {
          memstoreRead = nextReadValue;
          readWaiters.notifyAll();
        }
      }
      if (memstoreRead >= e.getWriteNumber()) {
        return true;
      }
      return false;
    }
  }
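
One consequence of the loop above is that the read point only moves over a completed prefix of the queue: a write that finishes early stays invisible until every write that started before it has finished too. The following is a simplified, self-contained model of that behaviour, not the HBase class. WriteQueueSketch, Entry, begin, advance and readPoint are hypothetical names, the invariant checks and the readWaiters notification are left out, and the counters start at 0 for the sake of the example.

```java
import java.util.LinkedList;

/** Simplified model of the MVCC write queue; not the HBase class. */
final class WriteQueueSketch {
    static final class Entry {
        final long writeNumber;
        boolean completed;

        Entry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final LinkedList<Entry> writeQueue = new LinkedList<>();
    private long memstoreWrite = 0;
    private long memstoreRead = 0;

    /** Hands out the next write number and queues an entry for it. */
    synchronized Entry begin() {
        Entry e = new Entry(++memstoreWrite);
        writeQueue.add(e);
        return e;
    }

    /** Marks e completed, then advances the read point over the completed head of the queue. */
    synchronized boolean advance(Entry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.getFirst().completed) {
            memstoreRead = writeQueue.removeFirst().writeNumber;
        }
        // true only if the read point has caught up with this write
        return memstoreRead >= e.writeNumber;
    }

    synchronized long readPoint() { return memstoreRead; }
}

final class WriteQueueDemo {
    public static void main(String[] args) {
        WriteQueueSketch mvcc = new WriteQueueSketch();
        WriteQueueSketch.Entry w1 = mvcc.begin(); // write number 1
        WriteQueueSketch.Entry w2 = mvcc.begin(); // write number 2

        System.out.println(mvcc.advance(w2)); // false: w1 is still in progress
        System.out.println(mvcc.readPoint()); // 0
        System.out.println(mvcc.advance(w1)); // true: both entries are drained
        System.out.println(mvcc.readPoint()); // 2
    }
}
```

The false return value for w2 is the case in which the caller still has to wait for the read point to catch up before its write can be read.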


3. Example: the internalFlushcache method in HRegion then calls
mvcc.waitForRead(w);
This method waits until the memstore can be read. So when can the memstore be read?
Only when memstoreRead >= e.getWriteNumber().
  public void waitForRead(WriteEntry e) {
    boolean interrupted = false;
    synchronized (readWaiters) {
      while (memstoreRead < e.getWriteNumber()) {
        try {
          readWaiters.wait(0);
        } catch (InterruptedException ie) {
          // We were interrupted... finish the loop -- i.e. cleanup --and then
          // on our way out, reset the interrupt flag.
          interrupted = true;
        }
      }
    }
    if (interrupted) Thread.currentThread().interrupt();
  }

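In internalFlushcache, then, the main purpose of calling waitForRead is to wait, before the flush, for transactions that are still in progress to be committed to the HLog, and to prevent uncommitted transactions from being written to an HFile.
The flush then proceeds.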
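
The blocking behaviour of waitForRead can be tried out with two threads. The sketch below is a reduced model under stated assumptions, not the HBase class: ReadPointSketch, publish, readPoint and ReadPointDemo are hypothetical names, publish stands in for the part of advanceMemstore that sets memstoreRead and calls notifyAll, and the 50 ms sleep merely simulates a write that is still in flight.

```java
/** Reduced model of the read point and its waiters; not the HBase class. */
final class ReadPointSketch {
    private final Object readWaiters = new Object();
    private long memstoreRead = 0;

    /** Publishes a new read point and wakes every waiting thread. */
    void publish(long newReadPoint) {
        synchronized (readWaiters) {
            memstoreRead = newReadPoint;
            readWaiters.notifyAll();
        }
    }

    /** Blocks until the read point has reached writeNumber. */
    void waitForRead(long writeNumber) {
        boolean interrupted = false;
        synchronized (readWaiters) {
            while (memstoreRead < writeNumber) {
                try {
                    readWaiters.wait();
                } catch (InterruptedException ie) {
                    interrupted = true; // keep waiting, restore the flag on the way out
                }
            }
        }
        if (interrupted) Thread.currentThread().interrupt();
    }

    long readPoint() {
        synchronized (readWaiters) {
            return memstoreRead;
        }
    }
}

final class ReadPointDemo {
    /** Starts a slow "writer", waits for it, and returns the read point seen afterwards. */
    static long run() {
        ReadPointSketch mvcc = new ReadPointSketch();
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate a write that is still in progress
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            mvcc.publish(3);
        });
        writer.start();
        mvcc.waitForRead(3); // returns only after publish(3) has run
        return mvcc.readPoint();
    }

    public static void main(String[] args) {
        System.out.println("read point = " + run()); // read point = 3
    }
}
```

The while loop around wait() matters: a woken thread rechecks the condition, so a notifyAll for a smaller read point or a spurious wakeup does not release the waiter too early.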


HRegion's doMiniBatchMutation method applies MVCC in a similar way, so that data whose write has completed can be read promptly.

// ------------------------------------
// Acquire the latest mvcc number
// ----------------------------------
w = mvcc.beginMemstoreInsert();


// ------------------------------------------------------------------
// STEP 8. Advance mvcc. This will make this put visible to scanners and getters.
// ------------------------------------------------------------------
if (w != null) {
  mvcc.completeMemstoreInsert(w);
  w = null;
}
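
What "visible to scanners and getters" means can be illustrated with a toy memstore in which each cell carries the write number of the transaction that produced it, and a scan returns only cells at or below the read point. This is a sketch under assumptions, not HBase code: VisibilitySketch, Cell, beginInsert, put, completeInsert and scan are hypothetical names, and completeInsert uses a single-writer simplification in which the read point jumps straight to the completed write number.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy memstore whose cells are stamped with a write number; not HBase code. */
final class VisibilitySketch {
    private static final class Cell {
        final String value;
        final long writeNumber;

        Cell(String value, long writeNumber) {
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private long memstoreWrite = 0;
    private long memstoreRead = 0;

    /** Starts a write and returns its write number. */
    synchronized long beginInsert() {
        return ++memstoreWrite;
    }

    /** Adds a cell stamped with the write number of the write in progress. */
    synchronized void put(String value, long writeNumber) {
        cells.add(new Cell(value, writeNumber));
    }

    /** Single-writer simplification: the read point moves straight to writeNumber. */
    synchronized void completeInsert(long writeNumber) {
        memstoreRead = writeNumber;
    }

    /** Returns only the values whose write number is at or below the read point. */
    synchronized List<String> scan() {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (c.writeNumber <= memstoreRead) {
                visible.add(c.value);
            }
        }
        return visible;
    }
}
```

In this model a cell added with put stays out of scan results until completeInsert is called for its write number, which is the role the "Advance mvcc" step plays in the snippet above.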