Introduction to LevelDB

vhomes

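This morning I happened to see someone on Weibo repost an article about LevelDB's implementation, and it reminded me that I had come across this k/v database before: as I recall, Taobao's Tair is built on it (when I have time I will write up my brief notes from reading through Tair's architecture). So I looked up some material online and am jotting it down here for quick reference later.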

The following is a rough translation of the introduction on the official site:

I. LevelDB is a fast key/value storage library open-sourced by Google. It provides an ordered mapping from string keys to string values. (Official site: http://code.google.com/p/leveldb/)

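II. Features
1. Keys and values are arbitrary-length byte arrays.
2. Data is stored sorted by key.
3. Callers can supply a custom comparison function to override the key sort order.
4. The basic operations are Put(key, value), Get(key), and Delete(key).
5. Multiple changes can be committed in one atomic batch.
6. Snapshots are supported, giving a consistent view of the data.
7. Data can be iterated forward or backward.
8. Data is compressed automatically (using Google's Snappy compression library, which performs quite well; I have used it in a project).
(For the official documentation, see: http://leveldb.googlecode.com/svn/trunk/doc/index.html)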
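
To make a few of these features concrete, here is a minimal in-memory sketch of an ordered key/value store with put/get/delete, an all-or-nothing batch, and forward/backward iteration. It is a toy model, not LevelDB or its API: the class `ToyOrderedStore` and its method names are hypothetical, and custom comparators, snapshots, and compression are left out.

```python
import bisect


class ToyOrderedStore:
    """In-memory stand-in for an ordered key/value store (not LevelDB itself)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted (bytewise) order
        self._values = {}  # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: bytes):
        return self._values.get(key)

    def delete(self, key: bytes) -> None:
        if key in self._values:
            del self._values[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def write_batch(self, ops) -> None:
        """Apply ("put", k, v) / ("delete", k) ops; reject the whole batch if any op is invalid."""
        for op in ops:  # validate everything before touching any state
            if op[0] not in ("put", "delete"):
                raise ValueError(f"unknown op: {op[0]!r}")
        for op in ops:
            if op[0] == "put":
                self.put(op[1], op[2])
            else:
                self.delete(op[1])

    def scan(self, reverse: bool = False):
        """Yield (key, value) pairs in key order, or in reverse key order."""
        keys = reversed(self._keys) if reverse else iter(self._keys)
        for key in keys:
            yield key, self._values[key]


store = ToyOrderedStore()
store.put(b"banana", b"2")
store.put(b"apple", b"1")
store.write_batch([("put", b"cherry", b"3"), ("delete", b"banana")])
forward = list(store.scan())
backward = list(store.scan(reverse=True))
```

Keys come back in sorted order regardless of insertion order, which is what makes range scans cheap in a store of this kind.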

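III. Limitations
1. It is not a SQL database: there is no relational data model, no SQL queries, and no support for indexes.

2. Only a single process (possibly multi-threaded) can access a given database at a time.

3. The library has no built-in client-server support; an application that needs it has to wrap its own server around the library.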

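From the features above, LevelDB's distinguishing points are that it keeps keys sorted and supports atomic batch writes. Whether it outperforms Redis with AOF enabled is an open question; from a quick look at the implementation, writes to a single database are also serialized, so I would guess the two do not differ much. Because you have to build your own client and server on top of the library, it is inconvenient to test, and since I am not using it yet, I have not run any benchmarks. (Why isn't one provided? -.-)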

There is plenty of material online, and one write-up is particularly detailed. The content below is reposted from: http://hideto.iteye.com/blog/1328921

LevelDB introductions
http://code.google.com/p/leveldb/
http://en.wikipedia.org/wiki/LevelDB
http://highscalability.com/blog/2011/8/10/leveldb-fast-and-lightweight-keyvalue-database-from-the-auth.html
http://news.ycombinator.com/item?id=2526032
http://basho.com/blog/technical/2011/07/01/Leveling-the-Field/
http://blog.yufeng.info/archives/1327
http://www.slideshare.net/sunzhidong/google-leveldb-study-discuss

LevelDB official documentation
http://leveldb.googlecode.com/svn/trunk/doc/index.html
http://leveldb.googlecode.com/svn/trunk/doc/benchmark.html
http://leveldb.googlecode.com/svn/trunk/doc/impl.html
http://leveldb.googlecode.com/svn/trunk/doc/table_format.txt
http://leveldb.googlecode.com/svn/trunk/doc/log_format.txt

LevelDB internals and source-code walkthroughs
http://blog.xiaoheshang.info/?cat=26
http://rdc.taobao.com/blog/cs/?p=1378
http://www.cnblogs.com/haippy/archive/2011/12/04/2276064.html

Bigtable/MapReduce/GFS/LSM-tree/skip list papers
http://blademaster.ixiezi.com/2010/03/27/bigtable：一个分布式的结构化数据存储系统中文版/
http://blademaster.ixiezi.com/2010/03/27/google-mapreduce中文版/
http://blademaster.ixiezi.com/2010/03/27/the-google-file-system中文版/
http://staff.ustc.edu.cn/~jpq/paper/flash/1996-The%20Log-Structured%20Merge-Tree%20%28LSM-Tree%29.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15.9072&rep=rep1&type=pdf

Tair ldb
http://rdc.taobao.com/blog/cs/?p=1394
http://code.taobao.org/p/tair/wiki/index/
http://code.taobao.org/p/tair/src/branches/ldb/src/storage/ldb/

Related material
http://www.quora.com/What-is-an-SSTable-in-Googles-internal-infrastructure
http://www.ningoo.net/html/tag/dynamo
http://wiki.apache.org/cassandra/MemtableSSTable
http://wiki.apache.org/cassandra/ArchitectureSSTable
http://en.wikipedia.org/wiki/Queuing_theory
http://rdc.taobao.com/team/jm/archives/1344

Notes
LevelDB Write/Delete:
DB::Put/Delete (DB::Open sets *dbptr = impl) => DBImpl::Write => (1) write the log: log_->AddRecord (2) write the memtable: WriteBatchInternal::InsertInto(updates, mem_)
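
The order of the two steps (log first, memtable second) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyWriter`, the tab-separated record format, and the plain dict standing in for the memtable are hypothetical and are not LevelDB's actual log format or data structures.

```python
import io


class ToyWriter:
    """Toy write path: append every update to a log, then apply it to the memtable."""

    def __init__(self, log_file):
        self.log = log_file  # plays the role of log_
        self.mem = {}        # plays the role of mem_ (hypothetical: a plain dict)

    def write(self, updates):
        """updates: list of (key, value) pairs; value None marks a deletion."""
        # Step 1: make the updates durable in the log before anything else.
        for key, value in updates:
            kind = "D" if value is None else "P"
            text = "" if value is None else value
            self.log.write(f"{kind}\t{key}\t{text}\n")
        self.log.flush()
        # Step 2: apply the same updates to the in-memory table.
        for key, value in updates:
            self.mem[key] = value  # None acts as a deletion marker

    def put(self, key, value):
        self.write([(key, value)])

    def delete(self, key):
        self.write([(key, None)])


log = io.StringIO()
db = ToyWriter(log)
db.put("k1", "v1")
db.delete("k1")
log_lines = log.getvalue().splitlines()
```

Note that a delete is itself a write: it appends a record and leaves a marker in the memtable rather than removing anything in place.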

LevelDB Get:
DBImpl::Get => (1) check the memtable: mem->Get (2) check the immutable memtable: imm->Get (3) check the files: versions_->current() => current->Get => Version::Get
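
The lookup order above can be sketched as a search through sources from newest to oldest that stops at the first hit. This is a simplification under stated assumptions: the function `lookup`, the `TOMBSTONE` marker, and the dicts standing in for the memtables and on-disk tables are hypothetical, and the flat newest-first list glosses over how Version::Get actually walks the files.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted key


def lookup(key, mem, imm, tables):
    """Return the newest value for key, or None if it is absent or deleted.

    mem:    the active memtable (dict)
    imm:    the immutable memtable being flushed, or None
    tables: on-disk tables as dicts, ordered newest first
    """
    sources = [mem]
    if imm is not None:
        sources.append(imm)
    sources.extend(tables)
    for source in sources:
        if key in source:
            value = source[key]
            # The first hit wins, even when it says "deleted".
            return None if value is TOMBSTONE else value
    return None


mem = {"a": "new"}
imm = {"b": TOMBSTONE}
tables = [{"a": "old", "b": "stale"}, {"c": "3"}]
results = [lookup(k, mem, imm, tables) for k in ("a", "b", "c", "d")]
```

Here `"a"` resolves to the memtable's value and the deletion marker for `"b"` hides the stale value in an older table.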

LevelDB Compaction:
LevelDB may trigger a compaction on Open, Get, or Write: DB::Open/DBImpl::Get/DBImpl::Write(DBImpl::MakeRoomForWrite) => DBImpl::MaybeScheduleCompaction => env_->Schedule(&DBImpl::BGWork, this) => (1) start the background thread: PosixEnv::Schedule (2) DBImpl::BGWork => DBImpl::BackgroundCall => DBImpl::BackgroundCompaction
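
A rough sketch of this scheduling pattern: callers ask for a compaction, at most one is queued at a time, and a background thread runs it and then checks whether more work is pending. The names `ToyEnv` and `ToyDB`, the `_scheduled` flag, and the list of pending jobs are all hypothetical, and the "compaction" is only a placeholder that moves a job label from one list to another.

```python
import queue
import threading


class ToyEnv:
    """Runs scheduled callables on one lazily started background thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._started = False

    def schedule(self, fn):
        if not self._started:
            self._started = True
            threading.Thread(target=self._run, daemon=True).start()
        self._queue.put(fn)

    def _run(self):
        while True:
            fn = self._queue.get()
            fn()
            self._queue.task_done()

    def wait_idle(self):
        self._queue.join()  # returns once every scheduled item has finished


class ToyDB:
    def __init__(self, env, pending):
        self.env = env
        self.pending = pending  # placeholder for "work that needs compacting"
        self.compacted = []
        self._lock = threading.Lock()
        self._scheduled = False

    def maybe_schedule_compaction(self):
        with self._lock:
            if self._scheduled or not self.pending:
                return  # already queued, or nothing to do
            self._scheduled = True
        self.env.schedule(self._background_call)

    def _background_call(self):
        with self._lock:
            job = self.pending.pop(0)
        self.compacted.append(job)  # stand-in for the real merge work
        with self._lock:
            self._scheduled = False
        self.maybe_schedule_compaction()  # more work may be waiting


env = ToyEnv()
toy_db = ToyDB(env, ["job-1", "job-2", "job-3"])
toy_db.maybe_schedule_compaction()
toy_db.maybe_schedule_compaction()  # a repeat call never queues work twice
env.wait_idle()
```

The flag is what lets Open, Get, and Write all call the scheduling function freely without piling up duplicate background jobs.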

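LevelDB multi-threaded writes:
The bottleneck in DBImpl::Write is AcquireLoggingResponsibility: when multiple threads write to the same db they contend for logger_, and throughput ends up lower than with a single writer thread. To scale, shard LevelDB: hash each key and route it to one of several dbs, so that reader and writer threads do not contend with one another. In testing, a num_threads : num_dbs ratio of 1:1 performed best and made full use of the cores.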
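
The sharding idea can be sketched as follows, with plain dicts standing in for the separate dbs. The names `shard_index` and `ShardedStore` are hypothetical, and CRC32 is an arbitrary choice of hash for the sketch; the notes do not say which hash was used.

```python
import threading
import zlib


def shard_index(key: bytes, num_shards: int) -> int:
    """Map a key to a shard with a stable hash (CRC32 is an assumption here)."""
    return zlib.crc32(key) % num_shards


class ShardedStore:
    """Routes each key to one of several independent stores, each with its own lock."""

    def __init__(self, num_shards: int):
        self._shards = [{} for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def put(self, key: bytes, value: bytes) -> None:
        i = shard_index(key, len(self._shards))
        with self._locks[i]:  # contention is limited to a single shard
            self._shards[i][key] = value

    def get(self, key: bytes):
        i = shard_index(key, len(self._shards))
        with self._locks[i]:
            return self._shards[i].get(key)

    def sizes(self):
        return [len(shard) for shard in self._shards]


def writer(store, start, count):
    for n in range(start, start + count):
        store.put(f"key{n}".encode(), str(n).encode())


sharded = ShardedStore(4)
workers = [threading.Thread(target=writer, args=(sharded, t * 250, 250)) for t in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
total_keys = sum(sharded.sizes())
```

In this sketch every thread may touch every shard; the 1:1 setup described in the notes goes further by giving each thread its own db.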

LevelDB performance tuning:
Performance can be improved through sharding, batch writes, a larger block_size (size per data block, default 4KB), a larger block_cache (LRUCache, default 8MB), and a larger write_buffer_size (memtable size, default 4MB). In testing, a single 24-core machine with 16 threads, 16 shards, a batch size of 1000, block_size 8K, and write_buffer_size 32MB reached over 700,000 ops/sec on writes.
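
As a small illustration of the batch-writes part, the sketch below groups a stream of updates into fixed-size batches so that each batch could be handed over as one write. The helper `batches` and the `TUNED` dict are hypothetical; the dict only restates the numbers quoted above and is not a LevelDB configuration object.

```python
from itertools import islice

# The settings quoted in the notes, restated as plain data (not a real options API).
TUNED = {
    "num_threads": 16,
    "num_shards": 16,
    "batch_size": 1000,
    "block_size": 8 * 1024,                 # bytes
    "write_buffer_size": 32 * 1024 * 1024,  # bytes
}


def batches(updates, batch_size):
    """Yield lists of at most batch_size updates, preserving order."""
    iterator = iter(updates)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


updates = [(f"key{n}", n) for n in range(2500)]
batch_sizes = [len(batch) for batch in batches(updates, TUNED["batch_size"])]
```

With 2,500 updates and a batch size of 1000, this produces three batches, the last one partial.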