A Survey of Databases on the Market (Part 2)

Latest recommended article published on 2024-05-30 17:26:40

masteryhx

Category: Databases. Tags: Key-value, Document, Column, Time-series, Graph

Article link: https://blog.csdn.net/hehedaheheda/article/details/87891422

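Continuing from the previous post, this is an overview of the remaining five categories of databases. (The material comes from an English presentation given for a survey; this part has not yet been translated or expanded, and a later post will fill it in.)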

Key-value database

What :
a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash table.

Concept :
schema-free
Sharding;Distributed;CAP;BASE
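
To make the schema-free and sharding concepts concrete, here is a minimal single-process sketch. The names `shard_for` and `ShardedKV` are hypothetical, and hash-modulo placement is only one possible strategy (an assumption here, not something the products listed later are claimed to use). Values can be any shape because the store never inspects them.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a digest is used to keep placement reproducible.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedKV:
    """Toy key-value store that spreads keys over several dicts ("shards")."""

    def __init__(self, num_shards: int = 4):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        # The value is opaque to the store: no schema is enforced.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


kv = ShardedKV(num_shards=4)
kv.put("user:1", {"name": "alice"})   # a nested document
kv.put("counter:visits", 42)          # a plain integer
```

In a real deployment each shard would live on a different node, and changing the shard count with plain modulo placement would move most keys, which is why schemes such as consistent hashing exist.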

Why :

Reduce application server CPU and memory pressure
Reduce IO read operations and IO stress
Relational databases are hard to extend, and changing a table structure is difficult

Feature :

Flexible data modeling
Fast write performance
Fast query performance

Use for :

Session
Distributed cache (high-frequency data) (CAP) (BASE)
Distributed lock
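
The three uses above share two primitives: values that expire, and an atomic "set only if absent". The sketch below shows both in a single process, so it is not actually distributed; `TTLStore` and its method names are hypothetical, and the clock is injectable only to keep the example deterministic.

```python
import time

_MISSING = object()  # sentinel so that a stored None is not mistaken for "absent"


class TTLStore:
    """Toy in-memory key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value

    def set_if_absent(self, key, value, ttl=None):
        """Acquire-style write: succeeds only if the key has no live value."""
        if self.get(key, _MISSING) is not _MISSING:
            return False
        self.set(key, value, ttl)
        return True

    def delete_if_equal(self, key, value):
        """Release-style delete: only the holder that wrote the value may remove it."""
        if self.get(key, _MISSING) == value:
            del self._data[key]
            return True
        return False
```

A session is then `store.set("session:<id>", data, ttl=1800)`, and a lock is `store.set_if_absent("lock:<name>", owner_id, ttl=10)` followed by `store.delete_if_equal("lock:<name>", owner_id)`. The TTL on the lock is what frees it if the holder crashes.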

Comparison :

Storage : map,tree,…

Implement :
DynamoDB;Riak KV;Oracle NoSQL Database(table-style);
Oracle Berkeley DB(ACID);ArangoDB;FoundationDB
Memcached;Redis;Hazelcast(ACID)
LevelDB;RocksDB

Problem :

Poor support for OLTP

Document-oriented database

What :
a computer program designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.

Why(Comparison with Key-value) :

Data transparency(extract metadata that the database engine uses for further optimization)
Designed to offer a richer experience with modern programming techniques

Concept :
Collections;Document
Inverted index
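
As a rough illustration of how these concepts fit together, the sketch below keeps documents in a collection and maintains an inverted index from each term to the ids of the documents containing it. The `Collection` class and `tokenize` helper are hypothetical names, and only string fields are indexed to keep the example short.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class Collection:
    """Toy document collection with an inverted index over string fields."""

    def __init__(self):
        self.docs = {}                  # doc id -> document (a dict)
        self.index = defaultdict(set)   # term -> ids of documents containing it

    def insert(self, doc_id, doc):
        self.docs[doc_id] = doc
        # Unlike a key-value store, the engine looks inside the document.
        for value in doc.values():
            if isinstance(value, str):
                for term in tokenize(value):
                    self.index[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(matches)


logs = Collection()
logs.insert(1, {"title": "Disk full", "level": "error", "code": 507})
logs.insert(2, {"title": "Disk check passed", "level": "info"})
```

Here `logs.search("disk error")` only intersects two small sets instead of scanning every document, which is the property that makes this structure attractive for search and log use cases.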

Use for :

Search engines
Log

Comparison :

Encoding type : XML;JSON;YAML;BSON;PDF;WORD;EXCEL

Implement :
MongoDB;DynamoDB;CouchDB
Elasticsearch
Terrastore;RavenDB(ACID);Thrudb
OrientDB

Column-oriented database

What :
a database management system (DBMS) that stores data tables by column rather than by row.

Feature :

Extremely high loading speed
For large data sets rather than small data sets
Efficient storage space utilization
High compression rate
High CPU and memory utilization
No need for views
Invisible index
Efficient projection
Efficient relational computing
Efficient aggregation operations
Efficient use of cache
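
Several of the features above (projection, aggregation, compression) follow from the layout itself. The sketch below pivots a few made-up rows into per-column lists; `to_columns` and `run_length_encode` are hypothetical helpers, every row is assumed to have the same fields, and run-length encoding stands in for whatever compression a real engine applies.

```python
from itertools import groupby

# Illustrative data only; every row is assumed to have the same fields.
rows = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 20},
    {"region": "east", "product": "a", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
total = sum(columns["amount"])                      # aggregation reads one column only
region_runs = run_length_encode(columns["region"])  # repeated values compress well
```

The aggregate never touches `region` or `product`, and a column of repeated values shrinks to a handful of runs. The same layout also shows the cost: reading or updating one complete row means visiting every column.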

Use for :
Data warehouse (multiple data sources, multidimensional statistics, large volumes of stored data, many queries and few updates)

Implement :
HBase;Cassandra;Hypertable

Problem :

Not suitable for scanning small amounts of data
Not suitable for real-time operations with deletes and updates

Time-series database

What :
A time series database (TSDB) is a software system that is optimized for handling time series data, arrays of numbers indexed by time (a datetime or a datetime range).

If we don't use a TSDB, a history table may be a suitable alternative.

Why :

Large data size (the data history matters; mostly create and read, with few updates and deletes, i.e. "CRud")
Availability(Time operation)
Relational primary keys and foreign keys are a poor fit for time series data

Use for :

Monitor
Real-time data processing(IOT)

Feature :

More writes than reads
Sequential inserts
Few updates
Bulk delete

Distributed
Time index
Time function
Time accuracy
Timeliness
Data compression
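
The write pattern and the time index above can be sketched with two parallel lists kept in timestamp order. `Series` and its methods are hypothetical names; rejecting out-of-order writes is a simplification made for this example, not a claim about how any particular TSDB behaves.

```python
from bisect import bisect_left, bisect_right


class Series:
    """Toy time series: append-only points kept sorted by timestamp."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def append(self, ts, value):
        # Sequential insert: new points must not be older than the last one.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("out-of-order write")
        self.timestamps.append(ts)
        self.values.append(value)

    def range(self, start, end):
        """Return points with start <= ts <= end using binary search."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def drop_before(self, cutoff):
        """Bulk delete (retention): remove every point older than cutoff."""
        n = bisect_left(self.timestamps, cutoff)
        del self.timestamps[:n]
        del self.values[:n]
        return n


cpu = Series()
for t in range(0, 50, 10):
    cpu.append(t, float(t))
```

Because the data is already ordered by time, both the range query and the retention delete reduce to finding a boundary and slicing, with no per-row index maintenance.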

Storage engine :
HDFS, HBase, Cassandra, LevelDB, …

Implement :

InfluxDB, OpenTSDB, KDB+, Prometheus, KairosDB, HiTSDB
Druid,Pinot
RRDtool, Graphite

Focus :

Scalability, Query language, Downsampling, High compression ratio
OLAP
Visualization
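
Of the focus points above, downsampling is the easiest to show in a few lines. The sketch below groups points into fixed-width time buckets and keeps the mean of each; `downsample` is a hypothetical helper, timestamps are assumed to be integer seconds, and the mean is just one possible aggregate.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Reduce (timestamp, value) points to one averaged point per time bucket.

    Each point falls into the bucket starting at the largest multiple of
    bucket_seconds that is <= its timestamp.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


raw = [(0, 1.0), (30, 3.0), (60, 10.0), (90, 20.0), (125, 7.0)]
per_minute = downsample(raw, 60)
```

With this input, five raw points become three one-minute points. Older data is typically stored at coarser resolution this way, trading precision for storage.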

Graph database

What :
a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data.

Feature :

Better at handling relationships than relational databases
Graph theory algorithms (shortest path, centrality, …)
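
As an example of the kind of algorithm meant here, the sketch below finds a shortest path by breadth-first search over an adjacency list. The graph data and the `shortest_path` name are made up for illustration, edges are unweighted and directed, and a real graph database would expose this through its own query language instead.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path (fewest edges) from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}   # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue       # already reached by a path at least as short
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# "follows" relationships in a tiny social network
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}
```

Each hop is a direct lookup of a node's neighbors, whereas the same question in a relational schema would need one self-join per hop.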

Use for :

OLTP
OLAP

Search engine
Social network
Financial fraud prevention
Risk early warning
Business relationship
Knowledge graph (emerging, highly business-specific abstraction, few production deployments outside social networks and search engines, more attempts in the financial field)

Storage :