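Here is a brief introduction to a commercial product that bills itself as a "third-generation" graph database and claims to support both OLAP and OLTP workloads. The vendor has coined a new term, NPG (Native Parallel Graph), which feels like marketing copy inventing new vocabulary... o(╯□╰)o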
A Native Distributed Graph
Its data store holds nodes, links, and their attributes. Some graph database products on the market are really wrappers built on top of a more generic NoSQL data store. This virtual graph strategy has a double penalty when it comes to performance.
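Also, Neo4j is a native graph too, using index-free adjacency. It seems there is no other shortcut: for a graph database engine to run fast, disk I/O and network I/O have to be minimized as far as possible, and native storage plus in-memory processing is the most natural way to get there, unless the industry comes up with a breakthrough in storage technology.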
Compact Storage with Fast Access
Internally hash indices are used to reference nodes and links. In Big-O terms, our average access time is O(1) and our average index update time is also O(1).
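To make the O(1) claim concrete, here is a minimal sketch of a graph whose nodes and links are both referenced through hash indices. It is not the vendor's implementation: the class and field names (`HashIndexedGraph`, `Node`, `Link`, `out_links`) are made up for illustration, and Python dicts stand in for the hash indices.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    attrs: dict = field(default_factory=dict)
    out_links: list = field(default_factory=list)  # ids of outgoing links


@dataclass
class Link:
    link_id: int
    src: str
    dst: str
    attrs: dict = field(default_factory=dict)


class HashIndexedGraph:
    """Toy in-memory graph; dicts play the role of the hash indices (assumption)."""

    def __init__(self):
        self.nodes = {}  # node_id -> Node
        self.links = {}  # link_id -> Link

    def add_node(self, node_id, **attrs):
        # Index update: a single hash-table insert, O(1) on average.
        self.nodes[node_id] = Node(node_id, attrs)

    def add_link(self, src, dst, **attrs):
        # One hash insert plus one list append, O(1) on average.
        # Link ids are sequential because this toy never deletes links.
        link_id = len(self.links)
        self.links[link_id] = Link(link_id, src, dst, attrs)
        self.nodes[src].out_links.append(link_id)
        return link_id

    def neighbors(self, node_id):
        # Every hop is a hash lookup whose average cost does not grow
        # with the size of the graph.
        return [self.links[l].dst for l in self.nodes[node_id].out_links]


g = HashIndexedGraph()
for name in ("alice", "bob", "carol"):
    g.add_node(name)
g.add_link("alice", "bob", kind="follows")
g.add_link("alice", "carol", kind="follows")
print(g.neighbors("alice"))  # ['bob', 'carol']
```

Note that the O(1) figure is an average: a hash table can still degrade on heavy collisions or during a resize, which is why the quoted text says "average" for both access and update.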
Users can set parameters that specify how much of the available memory may be used for holding the graph. If the full graph does not fit in memory, then the excess is stored on disk. Best performance is achieved when the full graph fits in memory, of course.
Data values are stored in encoded formats that effectively compress the data. The compression factor varies with the graph structure and data, but typical compression factors are between 2x and 10x. Compression has two advantages: First, a larger amount of graph data can fit in memory and in CPU cache. Such compression reduces not only the memory footprint, but also CPU cache misses, speeding up overall query performance. Second, for users with very large graphs, hardware costs are reduced.
In general, decompression is needed only for displaying the data. When values are used internally, often they may remain encoded and compressed.
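The quoted text does not say which encoding is used, so the sketch below assumes plain dictionary encoding purely as an illustration of how values can stay encoded during internal processing. `DictionaryEncoder` and the sample `cities` column are hypothetical.

```python
class DictionaryEncoder:
    """Maps each distinct value to a small integer code (assumed scheme)."""

    def __init__(self):
        self.code_of = {}   # value -> code
        self.value_of = []  # code -> value

    def encode(self, value):
        code = self.code_of.get(value)
        if code is None:
            # First time this value is seen: assign the next free code.
            code = len(self.value_of)
            self.code_of[value] = code
            self.value_of.append(value)
        return code

    def decode(self, code):
        return self.value_of[code]


cities = ["Shanghai", "Beijing", "Shanghai", "Shanghai", "Beijing"]
enc = DictionaryEncoder()

# The attribute column stores one small integer per node instead of a string.
column = [enc.encode(c) for c in cities]

# Internal use: encode the query literal once, then compare integers only.
target = enc.code_of.get("Shanghai")
matches = [i for i, code in enumerate(column) if code == target]

# Decompression happens only when results are displayed.
shown = [enc.decode(column[i]) for i in matches]
print(column, matches, shown)
```

Repeated strings collapse to repeated small integers, which is where the space saving comes from in this scheme; the filter never touches the original strings, matching the point that values can often remain encoded when used internally.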