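MapDB is a fast, easy-to-use embedded database engine for Java. It provides concurrent Maps, Sets, and Queues backed by either disk or off-heap storage (off-heap lets Java work with memory outside the garbage-collected heap directly, similar to malloc and free in C).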
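To make the off-heap idea concrete, here is a minimal sketch that uses only the JDK's java.nio.ByteBuffer.allocateDirect, which reserves memory outside the garbage-collected heap. It is not MapDB's API and says nothing about how MapDB is implemented; the class OffHeapCoords and its methods are hypothetical names made up for this illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration: a fixed-capacity off-heap store of (x, y) pairs.
final class OffHeapCoords {
    private static final int BYTES_PER_POINT = 2 * Double.BYTES;
    private final ByteBuffer buffer;
    private int size;

    OffHeapCoords(int capacity) {
        // allocateDirect reserves native memory outside the Java heap
        this.buffer = ByteBuffer.allocateDirect(capacity * BYTES_PER_POINT);
    }

    /** Appends a point and returns its index; throws if the capacity is exceeded. */
    int add(double x, double y) {
        int offset = size * BYTES_PER_POINT;
        buffer.putDouble(offset, x);
        buffer.putDouble(offset + Double.BYTES, y);
        return size++;
    }

    double x(int index) {
        return buffer.getDouble(checkIndex(index) * BYTES_PER_POINT);
    }

    double y(int index) {
        return buffer.getDouble(checkIndex(index) * BYTES_PER_POINT + Double.BYTES);
    }

    int size() {
        return size;
    }

    boolean isDirect() {
        return buffer.isDirect();
    }

    private int checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return index;
    }
}
```

The points themselves never become Java objects on the heap, which is the property that keeps a large address library from inflating the garbage collector's workload.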
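Business scenario:
A friend's company needs to find, in a 200M address library, the latitude/longitude coordinate closest to a given coordinate. There are three main points to deal with:
1. Quickly placing a coordinate into a region of the two-dimensional plane. For example, (-1,-1) should land in the lower-left part of the xy plane. I used a KD-tree for this.
2. Once the tree has been built, I don't want to rebuild it every time, so it has to be cached. With a distributed cache such as Redis I felt network bandwidth would be the bottleneck, and our address library may be updated frequently. With an in-JVM map as the cache, memory would be exhausted almost immediately. I then switched to MapDB, found that it offers several storage modes, and in my comparison it was both faster and smaller in space usage.
3. Computing the distance between points. On a two-dimensional plane this is not hard: with vectors and the usual tools such as sin and cos, the result can be computed quickly.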
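As a sketch of step 3: on a flat plane the distance is plain Euclidean, and if the coordinates are latitude/longitude in degrees, one common choice is the haversine great-circle distance. The post does not say which formula was actually used, so the class GeoDistance, its method names, and the spherical-Earth radius of 6371 km are all assumptions made for illustration.

```java
// Hypothetical helper; not taken from the original project.
final class GeoDistance {
    // Mean Earth radius in metres (assumes a spherical Earth)
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Straight-line distance between two points on a flat x/y plane. */
    static double euclidean(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    /** Great-circle distance in metres between two lat/lon points given in degrees. */
    static double haversine(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to 1.0 to guard against rounding slightly above the asin domain
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

For candidates that are only a few hundred metres apart, the Euclidean form on raw degrees is often good enough for ranking, while haversine gives a distance in metres that can be reported to the user.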
Configuration in Spring
<bean id="dbFile" class="java.io.File">
    <constructor-arg value="/usr/local/DB/monitor.DB"></constructor-arg>
</bean>
<bean id="dbFactory" class="org.mapdb.DBMaker"
      factory-method="newFileDB">
    <constructor-arg ref="dbFile" />
</bean>
<bean id="shutdownHook"
      factory-bean="dbFactory"
      factory-method="closeOnJvmShutdown">
</bean>
<bean id="database"
      factory-bean="dbFactory"
      factory-method="make">
</bean>
Loading at Spring application startup
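import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import org.mapdb.BTreeMap;
import org.mapdb.DB;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.web.context.support.WebApplicationContextUtils;

public class StartupListener implements ServletContextListener {
    private static final Logger LOG = LoggerFactory.getLogger(StartupListener.class);

    @Override
    public void contextInitialized(ServletContextEvent e) {
        ApplicationContext ctx = WebApplicationContextUtils.getWebApplicationContext(e.getServletContext());
        // Check the context before using it: getWebApplicationContext() returns null
        // when no root application context has been registered
        if (ctx == null) {
            LOG.error("app start fail!");
            throw new RuntimeException("WebApplicationContextUtils.getWebApplicationContext() Fail!");
        }
        // AddressInfoMapper addressInfoMapper = (AddressInfoMapper) ctx.getBean("addressInfoMapper");
        DB db = (DB) ctx.getBean("database");
        BTreeMap<String, String> monitorDataMap = db.getTreeMap("monitorDataMap");
        // monitorDataMap.put("name", "Young");
        // Address information can be loaded into MapDB here
        db.commit();
        LOG.info("app start success.");
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // Nothing to do: the database is closed by the closeOnJvmShutdown hook
    }
}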
Usage in a Service
// The DB is injected; the maps are obtained from it.
private DB database;
private BTreeMap<String, String> monitorDataMap;

public void setDatabase(DB database) {
    this.database = database;
}

@PostConstruct
public void init() throws Exception {
    this.monitorDataMap = database.getTreeMap("monitorDataMap");
}
Building the KD-tree
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class KDTree {
    private KDTreeNode root;

    // Private constructor: instances are created through build()
    private KDTree() {}

    public static KDTree build(List<? extends Point> points) {
        KDTree tree = new KDTree();
        tree.root = build(points, 0);
        return tree;
    }

    // Note: this sorts the given list in place, so the caller's list must be modifiable
    private static KDTreeNode build(List<? extends Point> points, int depth) {
        if (points.isEmpty()) return null;
        // Alternate the splitting axis (coordinate index 0 or 1) with the depth
        final int axis = depth % 2;
        Collections.sort(points, new Comparator<Point>() {
            public int compare(Point p1, Point p2) {
                double coord1 = p1.getCoords()[axis];
                double coord2 = p2.getCoords()[axis];
                return Double.compare(coord1, coord2);
            }
        });
        // The median becomes this node; the two halves around it become the subtrees
        int index = points.size() / 2;
        KDTreeNode leftChild = build(points.subList(0, index), depth + 1);
        KDTreeNode rightChild = build(points.subList(index + 1, points.size()), depth + 1);
        Point point = points.get(index);
        return new KDTreeNode(point, axis, leftChild, rightChild);
    }

    @SuppressWarnings({"unchecked"})
    public <T extends Point> T findNearest(Point point) {
        List<? extends Point> nearest = findNearest(point, 1);
        // An empty tree has no nearest point
        return nearest.isEmpty() ? null : (T) nearest.get(0);
    }

    public List<? extends Point> findNearest(Point point, int amount) {
        // Guard against a tree built from an empty list
        if (root == null) return Collections.emptyList();
        return root.findNearest(point, amount);
    }

    @SuppressWarnings({"unchecked"})
    public <T extends Point> T getRootPoint() {
        return (T) root.getPoint();
    }
}
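The KDTreeNode and Point classes used above are not shown in this post. To give a rough idea of what a nearest-neighbour search over such a tree can look like, here is a self-contained sketch. It is not the original implementation: KdSketch, Node, and sqDist are hypothetical names, and plain double[] arrays of length 2 stand in for Point.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical 2D KD-tree with k-nearest-neighbour search.
final class KdSketch {
    /** A node holds one point, its splitting axis, and two subtrees. */
    static final class Node {
        final double[] coords;
        final int axis;
        final Node left;
        final Node right;

        Node(double[] coords, int axis, Node left, Node right) {
            this.coords = coords;
            this.axis = axis;
            this.left = left;
            this.right = right;
        }
    }

    static Node build(List<double[]> points, int depth) {
        if (points.isEmpty()) return null;
        int axis = depth % 2;
        // Sort a copy so the caller's list is not reordered
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(p -> p[axis]));
        int mid = sorted.size() / 2;
        return new Node(sorted.get(mid), axis,
                build(sorted.subList(0, mid), depth + 1),
                build(sorted.subList(mid + 1, sorted.size()), depth + 1));
    }

    /** Returns up to 'amount' points closest to target, nearest first. */
    static List<double[]> findNearest(Node root, double[] target, int amount) {
        // Max-heap on squared distance: the head is the worst of the current candidates
        PriorityQueue<double[]> best = new PriorityQueue<>(
                Comparator.comparingDouble((double[] p) -> sqDist(p, target)).reversed());
        search(root, target, amount, best);
        List<double[]> result = new ArrayList<>(best);
        result.sort(Comparator.comparingDouble(p -> sqDist(p, target)));
        return result;
    }

    private static void search(Node node, double[] target, int amount,
                               PriorityQueue<double[]> best) {
        if (node == null || amount <= 0) return;
        best.add(node.coords);
        if (best.size() > amount) best.poll(); // drop the farthest candidate
        double diff = target[node.axis] - node.coords[node.axis];
        Node near = diff < 0 ? node.left : node.right;
        Node far = diff < 0 ? node.right : node.left;
        search(near, target, amount, best);
        // Visit the far side only if the splitting line is closer than the worst candidate
        if (best.size() < amount || diff * diff < sqDist(best.peek(), target)) {
            search(far, target, amount, best);
        }
    }

    static double sqDist(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }
}
```

The pruning test in search is what makes the tree worthwhile: whole subtrees are skipped whenever the splitting line is already farther away than the worst candidate found so far. Squared distances are compared throughout, which avoids a square root per comparison.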
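Personal conclusion:
I used MapDB as a stopgap and have not looked into how it works internally, so I expect plenty of bugs to surface later. Even so, performance improved noticeably after adopting it: the nearest point can be found within 3-5 ms, and memory consumption is a little over 40 MB.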