A First Look at MapDB

mahui_1980

Article link: https://blog.csdn.net/mahui_1980/article/details/110424254


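MapDB is an open-source embedded database engine for Java. It is billed as one of the fastest Java databases, with performance comparable to java.util collections.
Maven dependency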

<dependency>
    <groupId>org.mapdb</groupId>
    <artifactId>mapdb</artifactId>
    <version>VERSION</version>
</dependency>

Snapshot repository

<repositories>
    <repository>
        <id>sonatype-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.mapdb</groupId>
        <artifactId>mapdb</artifactId>
        <version>VERSION</version>
    </dependency>
</dependencies>

Hello World

// import org.mapdb.*;
// import java.util.concurrent.ConcurrentMap;
// In-memory database
DB db = DBMaker.memoryDB().make();
ConcurrentMap map = db.hashMap("map").createOrOpen();
map.put("something", "here");


// File-backed database
DB db = DBMaker.fileDB("file.db").make();
ConcurrentMap map = db.hashMap("map").createOrOpen();
map.put("something", "here");
db.close();

Using generics and typed serializers

DB db = DBMaker.fileDB("file.db").fileMmapEnable().make(); 
ConcurrentMap<String,Long> map = 
    db.hashMap("map", Serializer.STRING, Serializer.LONG)
        .createOrOpen(); 
map.put("something", 111L);
db.close();

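Using DBMaker
DBMaker offers many static *DB methods, such as DBMaker.fileDB(). MapDB supports many storage formats and modes, and each xxxDB() method uses a different one: memoryDB() opens an in-memory database backed by a byte[] array, appendFileDB() opens a database that uses an append-only log file, and so on.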

DB db = DBMaker.fileDB("/some/file")
//TODO encryption API
//.encryptionEnable("password")
.make();

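Opening and creating collections
create() creates a new collection. It throws an exception if the collection already exists.
open() opens an existing collection. It throws an exception if the collection does not exist.
createOrOpen() opens the collection if it exists, otherwise creates it.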

NavigableSet<String> treeSet = 
    db.treeSet("treeSet").maxNodeSize(112).
        createOrOpen();

Transactions
DB has methods that handle the transaction lifecycle: commit(), rollback() and close().

ConcurrentNavigableMap<Integer,String> map = 
    db.treeMap("collectionName", Serializer.INTEGER, Serializer.STRING).
        createOrOpen();
map.put(1,"one");
map.put(2,"two");
//map.keySet() is now [1,2] even before commit
db.commit(); //persist changes into disk
map.put(3,"three");
//map.keySet() is now [1,2,3]
db.rollback(); //revert recent changes
//map.keySet() is now [1,2]
db.close();

HTreeMap

With serializers defined

HTreeMap<String, Long> map = 
    db.hashMap("name_of_map").
    keySerializer(Serializer.STRING).
    valueSerializer(Serializer.LONG).
    create();

//or shorter form 
HTreeMap<String, Long> map2 = 
    db.hashMap("some_other_map", Serializer.STRING, Serializer.LONG).
    create();

Without serializers defined

HTreeMap map = db.hashMap("name_of_map").create();

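Hash codes
Most hash maps use the 32-bit hash produced by Object.hashCode() and check equality with Object.equals(other). However, many types (byte[], int[]) do not implement these methods by content: arrays inherit the identity-based versions from Object.
MapDB uses the key serializer to generate hash codes and to compare keys. For example, with Serializer.BYTE_ARRAY as the key serializer, byte[] can be used directly as a key in HTreeMap: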

HTreeMap<byte[], Long> map = 
    db.hashMap("map").
    keySerializer(Serializer.BYTE_ARRAY).
    valueSerializer(Serializer.LONG).
    create();
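
To see why arrays make poor keys in an ordinary hash map, here is a minimal sketch that uses only the JDK. It is not MapDB code: the class name ByteArrayKeyDemo and the ByteBuffer wrapping are illustrative assumptions, chosen to show the content-based hashing that a key serializer provides.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class ByteArrayKeyDemo {
    // Looks up a content-equal but distinct byte[] in a plain HashMap.
    public static Long lookupWithRawArray() {
        Map<byte[], Long> map = new HashMap<>();
        map.put(new byte[]{1, 2, 3}, 42L);
        // Arrays inherit identity-based hashCode()/equals() from Object,
        // so a second array with the same content is a different key.
        return map.get(new byte[]{1, 2, 3});
    }

    // Same lookup, but keys are wrapped so hashing and equality use the content.
    public static Long lookupWithWrappedArray() {
        Map<ByteBuffer, Long> map = new HashMap<>();
        map.put(ByteBuffer.wrap(new byte[]{1, 2, 3}), 42L);
        return map.get(ByteBuffer.wrap(new byte[]{1, 2, 3}));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithRawArray());     // null
        System.out.println(lookupWithWrappedArray()); // 42
    }
}
```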

hashCode() in some classes is weak, which causes collisions and degrades performance.

Weak hashCode

//this will use weak `String.hashCode()` 
HTreeMap<String, Long> map2 = db.hashMap("map2")
// use weak String.hashCode()
.keySerializer(Serializer.STRING_ORIGHASH)
.valueSerializer(Serializer.LONG)
.create();
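
The weakness of String.hashCode() is easy to reproduce with the JDK alone. The sketch below (the class name StringHashCollisionDemo is hypothetical) builds distinct strings that all share one hash code, relying on the fact that "Aa" and "BB" both hash to 2112.

```java
import java.util.ArrayList;
import java.util.List;

public class StringHashCollisionDemo {
    // Builds 2^blocks distinct strings by concatenating "Aa" or "BB" blocks.
    // Both blocks have the same length and the same String.hashCode(),
    // so every combination of them hashes to the same value.
    public static List<String> collidingKeys(int blocks) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String k : keys) {
                next.add(k + "Aa");
                next.add(k + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        // Prints 8 different strings, each followed by the same hash code.
        for (String k : collidingKeys(3)) {
            System.out.println(k + " -> " + k.hashCode());
        }
    }
}
```

In a hash map, keys like these all land in the same bucket, which is the collision cost described above.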

Strong hashCode

Serializer.STRING uses the stronger XXHash, which produces fewer collisions.
//this will use strong XXHash for Strings 
HTreeMap<String, Long> map = db.hashMap("map")
// by default it uses strong XXHash
.keySerializer(Serializer.STRING)
.valueSerializer(Serializer.LONG)
.create();

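Hash maps are vulnerable to hash collision attacks. HTreeMap adds protection in the form of a hash seed, which is generated randomly when the collection is created and is persisted together with its definition. Users can also supply their own hash seed: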

HTreeMap<String, Long> map = 
    db.hashMap("map", Serializer.STRING, Serializer.LONG)
    .hashSeed(111) //force Hash Seed value
    .create();

Size counter

HTreeMap<String, Long> map = 
    db.hashMap("map", Serializer.STRING, Serializer.LONG).
    counterEnable().
    create();

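Value loader
A value loader is a function that loads a value when an existing key is not found. The newly created key/value pair is inserted into the map, so map.get(key) never returns null.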

HTreeMap<String,Long> map = db.hashMap("map", Serializer.STRING, Serializer.LONG).valueLoader(s -> 1L).create();
//return 1, even if key does not exist 
Long one = map.get("Non Existent");
// Value Creator output was added to Map 
map.size(); // => 1
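
For comparison, the same load-on-miss behaviour can be sketched with the JDK's ConcurrentHashMap.computeIfAbsent. This is only an analogy, not how MapDB implements value loaders, and the class name ValueLoaderDemo is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ValueLoaderDemo {
    private final Map<String, Long> map = new ConcurrentHashMap<>();

    // Returns the stored value, or computes and stores 1L on a miss.
    public Long get(String key) {
        return map.computeIfAbsent(key, k -> 1L);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ValueLoaderDemo demo = new ValueLoaderDemo();
        System.out.println(demo.get("Non Existent")); // 1
        System.out.println(demo.size());              // 1
    }
}
```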

Sharded store

HTreeMap<String, byte[]> map = DBMaker
//param is number of Stores (concurrency factor)
.memoryShardedHashMap(8)
.keySerializer(Serializer.STRING)
.valueSerializer(Serializer.BYTE_ARRAY)
.create();

//DB does not exist, so close map directly 
map.close();

Time-based expiration

HTreeMap cache = 
    db.hashMap("cache").
    expireAfterUpdate(10, TimeUnit.MINUTES).
    expireAfterCreate(10, TimeUnit.MINUTES).
    expireAfterGet(1, TimeUnit.MINUTES).
    create();

Store size limit of 2 GB

// Off-heap map with max size 2GB
Map cache = db.hashMap("map").
    expireStoreSize(2L * 1024*1024*1024).
    expireAfterGet().
    create();
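
A size limit written as a product of int literals is evaluated in 32-bit arithmetic, so a 2 GB expression needs a long operand to avoid wrapping around. The sketch below shows the difference in plain Java; the class name StoreSizeDemo is hypothetical.

```java
public class StoreSizeDemo {
    // All operands are int, so the product wraps around in 32-bit arithmetic
    // before it is widened to long.
    public static long twoGbAsInt() {
        return 2 * 1024 * 1024 * 1024;
    }

    // The long literal forces 64-bit arithmetic for the whole expression.
    public static long twoGbAsLong() {
        return 2L * 1024 * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(twoGbAsInt());  // -2147483648
        System.out.println(twoGbAsLong()); // 2147483648
    }
}
```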

Maximum size limit (number of entries)

HTreeMap cache = db.hashMap("cache").
    expireMaxSize(128).
    expireAfterGet().
    create();
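
The idea of a size-bounded map that evicts entries based on reads can be sketched with the JDK's LinkedHashMap in access order. This is an analogy only: MapDB's expiration queues work differently, and the class name MaxSizeCacheDemo is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MaxSizeCacheDemo<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public MaxSizeCacheDemo(int maxSize) {
        // accessOrder = true: get() and put() move an entry to the newest end.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        MaxSizeCacheDemo<String, Integer> cache = new MaxSizeCacheDemo<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```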

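HTreeMap maintains a LIFO expiration queue for each segment. Not every entry is placed in the queue: in this example new entries never expire, and an entry is placed in the expiration queue only after it has been updated (its value changed).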

HTreeMap cache = db.hashMap("cache").
    expireAfterUpdate(1000).
    create();

Expiration triggers: expireAfterCreate(), expireAfterUpdate() and expireAfterGet(). Expiration can also run in a background executor:

DB db = DBMaker.memoryDB().make();

ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);

HTreeMap cache = db
.hashMap("cache")
.expireMaxSize(1000)
.expireAfterGet()
.expireExecutor(executor)
.expireExecutorPeriod(10000)
.create();

//once we are done, background threads needs to be stopped
db.close();

Combining expiration with sharding for better concurrency

HTreeMap cache = DBMaker
.memoryShardedHashMap(16)
.expireAfterUpdate()
.expireStoreSize(128*1024*1024)
.create();

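Compaction
With a sharded map the stored data becomes fragmented and freed space is not reclaimed on its own, so compaction is triggered once enough of the store is free: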

HTreeMap cache = DBMaker
.memoryShardedHashMap(16)
.expireAfterUpdate()
.expireStoreSize(128*1024*1024)

//entry expiration in 3 background threads
.expireExecutor(
Executors.newScheduledThreadPool(3))

//trigger Store compaction if 40% of space is free
.expireCompactThreshold(0.4)

.create();

Overflow to disk on expiration

DB dbDisk = DBMaker
.fileDB(file)
.make();

DB dbMemory = DBMaker
.memoryDB()
.make();

// Big map populated with data expired from cache 
HTreeMap onDisk = dbDisk.hashMap("onDisk").create();

// fast in-memory collection with limited size 
HTreeMap inMemory = dbMemory.hashMap("inMemory").expireAfterGet(1, TimeUnit.SECONDS)
//this registers overflow to `onDisk`
.expireOverflow(onDisk)
//good idea is to enable background expiration
.expireExecutor(Executors.newScheduledThreadPool(2))
.create();

Calling remove() on inMemory also removes the entry from onDisk.

//insert entry manually into both maps for demonstration 
inMemory.put("key", "map");

//first remove from inMemory 
inMemory.remove("key"); 
onDisk.get("key"); // -> not found

If inMemory.get(key) is called and the value is not present, the value loader tries to find it in the onDisk map. If the value is found in onDisk, it is added to inMemory.

onDisk.put(1,"one");
inMemory.size();
//onDisk has content, inMemory is empty
//> 0

// get method will not find value inMemory, and will get value from onDisk 
inMemory.get(1); //> "one"
// inMemory now caches result, it will later expire and move to onDisk
inMemory.size(); //> 1
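
The read-through and overflow flow described above can be sketched with two plain JDK maps. This is a simplified model under stated assumptions: the class OverflowDemo and its expire() method are hypothetical, and real MapDB expiration is driven by its own queues and executors rather than by an explicit call.

```java
import java.util.HashMap;
import java.util.Map;

public class OverflowDemo {
    public final Map<Integer, String> onDisk = new HashMap<>();   // stands in for the disk map
    public final Map<Integer, String> inMemory = new HashMap<>(); // stands in for the cache

    // Read-through: on a cache miss, look in the backing map and cache the hit.
    public String get(Integer key) {
        String value = inMemory.get(key);
        if (value == null) {
            value = onDisk.get(key);
            if (value != null) {
                inMemory.put(key, value);
            }
        }
        return value;
    }

    // Expiration: move an entry from the cache to the backing map.
    public void expire(Integer key) {
        String value = inMemory.remove(key);
        if (value != null) {
            onDisk.put(key, value);
        }
    }

    public static void main(String[] args) {
        OverflowDemo demo = new OverflowDemo();
        demo.onDisk.put(1, "one");
        System.out.println(demo.inMemory.size()); // 0
        System.out.println(demo.get(1));          // one
        System.out.println(demo.inMemory.size()); // 1
        demo.expire(1);
        System.out.println(demo.inMemory.size()); // 0
    }
}
```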

To clear the entire primary map and move all of its data to disk:

inMemory.put(1,11); inMemory.put(2,11);
//expire entire content of inMemory Map 
inMemory.clearWithExpire();

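BTreeMap

BTreeMap provides TreeMap and TreeSet for MapDB. It is based on a lock-free concurrent B-Linked-Tree. It offers great performance for small keys and has good vertical scalability.
Optional parameters can be specified through the maker: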

BTreeMap<Long, String> map = 
    db.treeMap("map")
        .keySerializer(Serializer.LONG)
        .valueSerializer(Serializer.STRING)
        .createOrOpen();

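The maximum node size is 32 entries by default and can be changed with maxNodeSize(), as in the treeSet example earlier. A size counter can be enabled in the same way as for HTreeMap: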

BTreeMap<Long, String> map = 
    db.treeMap("map", Serializer.LONG, Serializer.STRING)
        .counterEnable()
        .createOrOpen();

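Values are also stored as part of the BTree leaf nodes. Large values mean huge overhead: a single map.get("key") deserializes 32 values but returns only one. In that case it is better to store the values outside the leaf node, in separate records. The leaf node then holds only a 6-byte recid that points to the value.
Large values can also be compressed to save space. This example stores values outside the BTree leaf node and applies compression to each value: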

BTreeMap<Long, String> map = 
    db.treeMap("map")
        .valuesOutsideNodesEnable()
        .valueSerializer(new SerializerCompressionWrapper(Serializer.STRING))
        .createOrOpen();

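BTreeMap needs some way to sort its keys. By default it relies on the Comparable interface, which most Java classes implement. If a key type does not implement this interface, a key serializer must be provided. For example, object arrays can be compared this way: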

BTreeMap<Object[], Long> map = db.treeMap("map")
    // use array serializer for unknown objects
    .keySerializer(new SerializerArray())
    // or use wrapped serializer for specific objects such as String
    .keySerializer(new SerializerArray(Serializer.STRING))
    .createOrOpen();
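
The underlying problem is that Java arrays are not Comparable, so any sorted map needs an explicit ordering for them. A minimal JDK-only sketch follows, assuming Java 9 or later for Arrays.compare. The class name ArrayKeyOrderDemo is hypothetical, and this illustrates element-by-element ordering in general, not SerializerArray itself.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

public class ArrayKeyOrderDemo {
    // String[] does not implement Comparable, so the sorted map needs an
    // explicit comparator. Arrays.compare orders arrays element by element.
    public static TreeMap<String[], Long> newMap() {
        Comparator<String[]> byElements = (a, b) -> Arrays.compare(a, b);
        return new TreeMap<>(byElements);
    }

    public static void main(String[] args) {
        TreeMap<String[], Long> map = newMap();
        map.put(new String[]{"b", "x"}, 1L);
        map.put(new String[]{"a", "z"}, 2L);
        map.put(new String[]{"a", "y"}, 3L);
        // Keys come back sorted by first element, then by second element.
        for (String[] key : map.keySet()) {
            System.out.println(Arrays.toString(key));
        }
    }
}
```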

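Using byte[] keys instead of String can give better performance. When a specialized key serializer is used, the related optimizations are applied automatically: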

BTreeMap<byte[], Long> map = db.treeMap("map")
    .keySerializer(Serializer.BYTE_ARRAY)
    .valueSerializer(Serializer.LONG)
    .createOrOpen();

Prefix submap (lazily loaded)

BTreeMap<byte[], Integer> map = db
    .treeMap("towns", Serializer.BYTE_ARRAY, Serializer.INTEGER)
    .createOrOpen();

map.put("New York".getBytes(), 1);
map.put("New Jersey".getBytes(), 2);
map.put("Boston".getBytes(), 3);

//get all New* cities
Map<byte[], Integer> newCities = map.prefixSubMap("New".getBytes());
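
A prefix query on a sorted map is just a range view, which is why it can be lazy. The sketch below reproduces the idea with a JDK TreeMap and String keys. The class name PrefixSubMapDemo and its helper are hypothetical, and the upper bound assumes that keys do not contain the character Character.MAX_VALUE.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixSubMapDemo {
    // Returns a live view of all entries whose key starts with the prefix.
    // The view covers the half-open range from prefix (inclusive) to
    // prefix + Character.MAX_VALUE (exclusive); no entries are copied.
    public static SortedMap<String, Integer> prefixSubMap(
            TreeMap<String, Integer> map, String prefix) {
        return map.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> towns = new TreeMap<>();
        towns.put("New York", 1);
        towns.put("New Jersey", 2);
        towns.put("Boston", 3);
        System.out.println(prefixSubMap(towns, "New")); // {New Jersey=2, New York=1}
    }
}
```

Because the result is a view, entries added to the underlying map later show up in it as long as they fall inside the range.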

Composite keys and tuples

// initialize db and map
DB db = DBMaker.memoryDB().make(); 
BTreeMap<Object[], Integer> map = db.treeMap("towns")
    .keySerializer(new SerializerArrayTuple(
        Serializer.STRING, Serializer.STRING, Serializer.INTEGER))
    .valueSerializer(Serializer.INTEGER)
    .createOrOpen();