How leveldb uses the skiplist

Latest recommended article published on 2019-07-03 19:55:24

Agent_Tao

Category: Data Structures. Tags: leveldb

Article link: https://blog.csdn.net/hohojiang/article/details/52528811

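The principle behind the skiplist (skip list) is simple and widely documented online, so it is not repeated here.
This article focuses on how leveldb uses the skiplist.
leveldb uses the skiplist mainly in the memtable. The relevant source is in skiplist.h, memtable.cc, and dbformat.cc.
Start with skiplist.h: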

template<typename Key, class Comparator>
class SkipList {
 private:
  struct Node;

 public:
  // Create a new SkipList object that will use "cmp" for comparing keys,
  // and will allocate memory using "*arena".  Objects allocated in the arena
  // must remain allocated for the lifetime of the skiplist object.
  explicit SkipList(Comparator cmp, Arena* arena);

  // Insert key into the list.
  // REQUIRES: nothing that compares equal to key is currently in the list.
  void Insert(const Key& key);

  // Returns true iff an entry that compares equal to key is in the list.
  bool Contains(const Key& key) const;

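As shown, the skiplist supports only insertion and membership tests; it has no operation that deletes or modifies a node. An update is performed by inserting a newer version of the key, and a deletion by inserting a record that carries a deletion marker. Because a node is never removed or changed once it is linked in, readers can traverse the skiplist without taking any lock, which accounts for much of its efficiency. Writes are not lock-free: as the comment at the top of Insert() notes, insertion must be externally synchronized.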
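The insert-only idea can be sketched in a few lines. This is a hypothetical illustration, not leveldb code: `InsertOnlyStore`, `VersionedKey`, and `Entry` are invented names, and a `std::map` stands in for the skiplist. The point is only that Put and Delete both add an entry and nothing is ever erased.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical sketch: every mutation is an insertion of a new version.
enum class EntryType : uint8_t { kDeletion = 0, kValue = 1 };

struct VersionedKey {
  std::string user_key;
  uint64_t sequence;
  // Ascending user key, then descending sequence (newest version first).
  bool operator<(const VersionedKey& o) const {
    if (user_key != o.user_key) return user_key < o.user_key;
    return sequence > o.sequence;
  }
};

struct Entry {
  EntryType type;
  std::string value;
};

class InsertOnlyStore {
 public:
  void Put(const std::string& key, const std::string& value) {
    entries_.emplace(VersionedKey{key, ++last_sequence_},
                     Entry{EntryType::kValue, value});
  }

  // A delete is just another insertion, carrying a deletion marker.
  void Delete(const std::string& key) {
    entries_.emplace(VersionedKey{key, ++last_sequence_},
                     Entry{EntryType::kDeletion, std::string()});
  }

  // Returns true and fills *value if the newest version of key is a value.
  bool Get(const std::string& key, std::string* value) const {
    const VersionedKey probe{key, std::numeric_limits<uint64_t>::max()};
    auto it = entries_.lower_bound(probe);  // newest version of key, if any
    if (it == entries_.end() || it->first.user_key != key) return false;
    if (it->second.type == EntryType::kDeletion) return false;
    *value = it->second.value;
    return true;
  }

  std::size_t size() const { return entries_.size(); }

 private:
  std::map<VersionedKey, Entry> entries_;
  uint64_t last_sequence_ = 0;
};
```

After two Put calls and one Delete on the same key, the store holds three entries and Get reports the key as absent, because the newest entry is the deletion marker.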

Data is read from the skiplist mainly through its Iterator. The main interface is:

    // Advances to the next position.
    // REQUIRES: Valid()
    void Next();

    // Advances to the previous position.
    // REQUIRES: Valid()
    void Prev();

    // Advance to the first entry with a key >= target
    void Seek(const Key& target);
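
The Seek contract ("first entry with a key >= target") is what the memtable lookup relies on. A minimal sketch of the same contract, assuming a sorted `std::vector<int>` in place of the skiplist; `SortedVectorIterator` is a hypothetical name and not part of leveldb:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SkipList::Iterator over a sorted vector.
class SortedVectorIterator {
 public:
  explicit SortedVectorIterator(const std::vector<int>* keys)
      : keys_(keys), pos_(keys->size()) {}  // starts out invalid

  bool Valid() const { return pos_ < keys_->size(); }

  int key() const {
    assert(Valid());
    return (*keys_)[pos_];
  }

  void Next() {
    assert(Valid());
    ++pos_;
  }

  // Stepping back from the first entry leaves the iterator invalid.
  void Prev() {
    assert(Valid());
    pos_ = (pos_ == 0) ? keys_->size() : pos_ - 1;
  }

  // Positions at the first entry with key >= target, or becomes invalid.
  void Seek(int target) {
    pos_ = static_cast<std::size_t>(
        std::lower_bound(keys_->begin(), keys_->end(), target) -
        keys_->begin());
  }

 private:
  const std::vector<int>* keys_;
  std::size_t pos_;
};
```

With keys {10, 20, 30}, Seek(15) lands on 20, Seek(20) also lands on 20, and Seek(31) leaves the iterator invalid.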

Now look at how a node is inserted (the skiplist has no node deletion):

template<typename Key, class Comparator>
void SkipList<Key,Comparator>::Insert(const Key& key) {
  // TODO(opt): We can use a barrier-free variant of FindGreaterOrEqual()
  // here since Insert() is externally synchronized.
  Node* prev[kMaxHeight];
  Node* x = FindGreaterOrEqual(key, prev);

  // Our data structure does not allow duplicate insertion
  assert(x == NULL || !Equal(key, x->key));

  int height = RandomHeight();
  if (height > GetMaxHeight()) {
    for (int i = GetMaxHeight(); i < height; i++) {
      prev[i] = head_;
    }
    //fprintf(stderr, "Change height from %d to %d\n", max_height_, height);

    // It is ok to mutate max_height_ without any synchronization
    // with concurrent readers.  A concurrent reader that observes
    // the new value of max_height_ will see either the old value of
    // new level pointers from head_ (NULL), or a new value set in
    // the loop below.  In the former case the reader will
    // immediately drop to the next level since NULL sorts after all
    // keys.  In the latter case the reader will use the new node.
    max_height_.NoBarrier_Store(reinterpret_cast<void*>(height));
  }

  x = NewNode(key, height);
  for (int i = 0; i < height; i++) {
    // NoBarrier_SetNext() suffices since we will add a barrier when
    // we publish a pointer to "x" in prev[i].
    x->NoBarrier_SetNext(i, prev[i]->NoBarrier_Next(i));
    prev[i]->SetNext(i, x);
  }
}

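The logic is fairly simple. FindGreaterOrEqual compares internal keys to locate the insertion position and records the predecessor at each level in prev. RandomHeight then picks a random height for the new node, and the node is linked in at every level from 0 up to that height.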
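
The same three steps can be tried out in a single-threaded sketch. Everything here is an assumption made for illustration: `TinySkipList` is a hypothetical class with `int` keys, it uses `new`/`delete` and plain pointers where leveldb uses an Arena and atomic pointers, and `std::mt19937` replaces leveldb's own random generator. The height rule (start at 1, grow with probability 1/4, cap at 12) follows the one in skiplist.h.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical single-threaded sketch of the insert path.
class TinySkipList {
 public:
  TinySkipList() : head_(new Node(0, kMaxHeight)), max_height_(1), rng_(42) {}
  ~TinySkipList() {
    Node* x = head_;
    while (x != nullptr) {
      Node* next = x->next[0];  // level 0 links every node
      delete x;
      x = next;
    }
  }
  TinySkipList(const TinySkipList&) = delete;
  TinySkipList& operator=(const TinySkipList&) = delete;

  void Insert(int key) {
    Node* prev[kMaxHeight];
    Node* x = FindGreaterOrEqual(key, prev);
    assert(x == nullptr || x->key != key);  // duplicates are not allowed

    int height = RandomHeight();
    if (height > max_height_) {
      // Levels not used so far have head_ as the only predecessor.
      for (int i = max_height_; i < height; i++) prev[i] = head_;
      max_height_ = height;
    }

    x = new Node(key, height);
    for (int i = 0; i < height; i++) {
      x->next[i] = prev[i]->next[i];
      prev[i]->next[i] = x;
    }
  }

  bool Contains(int key) const {
    Node* x = FindGreaterOrEqual(key, nullptr);
    return x != nullptr && x->key == key;
  }

 private:
  enum { kMaxHeight = 12, kBranching = 4 };

  struct Node {
    Node(int k, int height) : key(k), next(height, nullptr) {}
    int key;
    std::vector<Node*> next;  // next[i] is the successor at level i
  };

  int RandomHeight() {
    int height = 1;
    while (height < kMaxHeight && rng_() % kBranching == 0) height++;
    return height;
  }

  // Returns the first node with key >= target (nullptr if none) and, when
  // prev is non-null, stores the predecessor at every level.
  Node* FindGreaterOrEqual(int key, Node** prev) const {
    Node* x = head_;
    int level = max_height_ - 1;
    while (true) {
      Node* next = x->next[level];
      if (next != nullptr && next->key < key) {
        x = next;  // keep moving right on this level
      } else {
        if (prev != nullptr) prev[level] = x;
        if (level == 0) return next;
        level--;  // drop down one level
      }
    }
  }

  Node* head_;
  int max_height_;
  std::mt19937 rng_;
};
```

The head node's key is never compared, so its value is irrelevant; only the keys of successor nodes take part in the search.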
Next, the read path. The main logic is in the memtable:

bool MemTable::Get(const LookupKey& key, std::string* value, Status* s) {
  Slice memkey = key.memtable_key();
  Table::Iterator iter(&table_);
  iter.Seek(memkey.data());
  if (iter.Valid()) {
    // entry format is:
    //    klength  varint32
    //    userkey  char[klength]
    //    tag      uint64
    //    vlength  varint32
    //    value    char[vlength]
    // Check that it belongs to same user key.  We do not check the
    // sequence number since the Seek() call above should have skipped
    // all entries with overly large sequence numbers.
    const char* entry = iter.key();
    uint32_t key_length;
    const char* key_ptr = GetVarint32Ptr(entry, entry+5, &key_length);
    if (comparator_.comparator.user_comparator()->Compare(
            Slice(key_ptr, key_length - 8),
            key.user_key()) == 0) {
      // Correct user key
      const uint64_t tag = DecodeFixed64(key_ptr + key_length - 8);
      switch (static_cast<ValueType>(tag & 0xff)) {
        case kTypeValue: {
          Slice v = GetLengthPrefixedSlice(key_ptr + key_length);
          value->assign(v.data(), v.size());
          return true;
        }
        case kTypeDeletion:
          *s = Status::NotFound(Slice());
          return true;
      }
    }
  }
  return false;
}
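
The entry layout described in the comment inside Get can be reproduced with a small encoder and decoder. This is a sketch under stated assumptions: the helpers below are standalone re-implementations written for this example (they are not the leveldb functions of similar names), `DecodedEntry` is a hypothetical struct, and the tag is assumed to be `(sequence << 8) | type` stored as 8 little-endian bytes. Note that the length prefix covers the user key plus the 8-byte tag, which is why Get subtracts 8.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

enum ValueType : uint8_t { kTypeDeletion = 0x0, kTypeValue = 0x1 };

// Appends v in 7-bit groups, low group first, high bit set on all but the last.
void PutVarint32(std::string* dst, uint32_t v) {
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Returns a pointer just past the varint, or nullptr on malformed input.
const char* GetVarint32(const char* p, const char* limit, uint32_t* value) {
  uint32_t result = 0;
  for (uint32_t shift = 0; shift <= 28 && p < limit; shift += 7) {
    uint32_t byte = static_cast<unsigned char>(*p++);
    if (byte & 0x80) {
      result |= (byte & 0x7f) << shift;
    } else {
      *value = result | (byte << shift);
      return p;
    }
  }
  return nullptr;
}

void PutFixed64(std::string* dst, uint64_t v) {
  for (int i = 0; i < 8; i++) {
    dst->push_back(static_cast<char>((v >> (8 * i)) & 0xff));
  }
}

uint64_t DecodeFixed64(const char* p) {
  uint64_t v = 0;
  for (int i = 0; i < 8; i++) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
  }
  return v;
}

// klength | user key | tag | vlength | value
std::string EncodeEntry(const std::string& user_key, uint64_t sequence,
                        ValueType type, const std::string& value) {
  std::string buf;
  PutVarint32(&buf, static_cast<uint32_t>(user_key.size() + 8));
  buf.append(user_key);
  PutFixed64(&buf, (sequence << 8) | type);
  PutVarint32(&buf, static_cast<uint32_t>(value.size()));
  buf.append(value);
  return buf;
}

struct DecodedEntry {  // hypothetical helper type for this example
  std::string user_key;
  uint64_t sequence;
  ValueType type;
  std::string value;
};

DecodedEntry DecodeEntry(const std::string& entry) {
  const char* limit = entry.data() + entry.size();
  uint32_t klen = 0, vlen = 0;
  const char* key_ptr = GetVarint32(entry.data(), limit, &klen);
  assert(key_ptr != nullptr && klen >= 8);
  const uint64_t tag = DecodeFixed64(key_ptr + klen - 8);
  const char* val_ptr = GetVarint32(key_ptr + klen, limit, &vlen);
  assert(val_ptr != nullptr);
  DecodedEntry d;
  d.user_key.assign(key_ptr, klen - 8);
  d.sequence = tag >> 8;
  d.type = static_cast<ValueType>(tag & 0xff);
  d.value.assign(val_ptr, vlen);
  return d;
}
```

Encoding user key "foo" at sequence 7 with value "bar" gives a 16-byte entry whose first byte is 11, that is 3 bytes of user key plus the 8-byte tag.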

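As shown, after the Seek the memtable compares only the user key: of the nodes that share that user key, the first one at or after the seek position is the one that gets read. Note that klength in the entry-format comment covers the user key plus the 8-byte tag, which is why the code subtracts 8 to get the user key length. Why is the first matching node the right one? The answer is in InternalKeyComparator in dbformat.cc: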

int InternalKeyComparator::Compare(const Slice& akey, const Slice& bkey) const {
  // Order by:
  //    increasing user key (according to user-supplied comparator)
  //    decreasing sequence number
  //    decreasing type (though sequence# should be enough to disambiguate)
  int r = user_comparator_->Compare(ExtractUserKey(akey), ExtractUserKey(bkey));
  if (r == 0) {
    const uint64_t anum = DecodeFixed64(akey.data() + akey.size() - 8);
    const uint64_t bnum = DecodeFixed64(bkey.data() + bkey.size() - 8);
    if (anum > bnum) {
      r = -1;
    } else if (anum < bnum) {
      r = +1;
    }
  }
  return r;
}

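The InternalKeyComparator used on insertion first calls user_comparator_, which keeps user keys correctly ordered. Only when the user keys are equal does it compare the trailing 8 bytes, which hold the sequence number, and the entry with the larger sequence number sorts first. For a given user key the newest version therefore comes first, followed by its older versions, so a lookup by user key reaches the newest entry first. In this way, modifying and deleting the value of a user key are both achieved purely by insertion, and earlier versions stay in the list instead of being overwritten.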
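
The ordering can be checked with a standalone sketch. The assumptions: internal keys are plain `std::string`s made of the user key followed by an 8-byte little-endian tag `(sequence << 8) | type`, user keys are compared bytewise, and all function names (`MakeInternalKey`, `FindNewest`, and so on) are hypothetical helpers for this example. The seek key uses type 1, which mirrors how leveldb builds lookup keys with its highest type value so the probe sorts at or before every entry with the same sequence number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

std::string MakeInternalKey(const std::string& user_key, uint64_t sequence,
                            uint8_t type) {
  std::string k = user_key;
  const uint64_t tag = (sequence << 8) | type;
  for (int i = 0; i < 8; i++) {
    k.push_back(static_cast<char>((tag >> (8 * i)) & 0xff));
  }
  return k;
}

uint64_t TagOf(const std::string& ikey) {
  const char* p = ikey.data() + ikey.size() - 8;
  uint64_t v = 0;
  for (int i = 0; i < 8; i++) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
  }
  return v;
}

uint64_t SequenceOf(const std::string& ikey) { return TagOf(ikey) >> 8; }

std::string UserKeyOf(const std::string& ikey) {
  return ikey.substr(0, ikey.size() - 8);
}

// Same shape as InternalKeyComparator::Compare: ascending user key,
// then descending tag, so the newest version of a key sorts first.
int CompareInternalKeys(const std::string& a, const std::string& b) {
  int r = UserKeyOf(a).compare(UserKeyOf(b));
  if (r == 0) {
    const uint64_t anum = TagOf(a);
    const uint64_t bnum = TagOf(b);
    if (anum > bnum) {
      r = -1;
    } else if (anum < bnum) {
      r = +1;
    }
  }
  return r;
}

bool InternalKeyLess(const std::string& a, const std::string& b) {
  return CompareInternalKeys(a, b) < 0;
}

// Index of the newest entry for user_key with sequence <= snapshot, or -1.
// `sorted` must already be ordered by InternalKeyLess.
int FindNewest(const std::vector<std::string>& sorted,
               const std::string& user_key, uint64_t snapshot) {
  const std::string target = MakeInternalKey(user_key, snapshot, 1);
  auto it = std::lower_bound(sorted.begin(), sorted.end(), target,
                             InternalKeyLess);
  if (it == sorted.end() || UserKeyOf(*it) != user_key) return -1;
  return static_cast<int>(it - sorted.begin());
}
```

Sorting the versions a@1, a@3, b@1, a@2 with `InternalKeyLess` yields a@3, a@2, a@1, b@1. A lookup of "a" at snapshot 2 skips a@3 and lands on a@2, which is the behavior the comment in Get describes as skipping entries with overly large sequence numbers.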