A Survey of Databases on the Market (Part 2)

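Continuing from the previous post, this one gives an overview of the remaining 5 categories of databases (the material comes from an English-language presentation prepared for a survey; this part has not yet been translated and expanded on, and a new post will fill it in later)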

Key-value database

What :
a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash table.

Concept :
schema-free
Sharding;Distributed;CAP;BASE
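
As a rough illustration of the sharding concept listed above, the following sketch routes each key to one of several in-memory shards by hashing the key. The names `shard_for` and `ShardedStore` are hypothetical and the shards are plain dictionaries in one process; real systems typically use consistent hashing or range partitioning so that adding a shard does not remap most keys.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hash-modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy schema-free key-value store split across in-memory shards."""

    def __init__(self, num_shards: int = 4):
        # Each shard stands in for a separate node (an assumption of this sketch).
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)
```

Because the values are schema-free, a string, a number, and a nested dictionary can all be stored under different keys without any table definition.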

Why :

  • Reduce application server CPU and memory pressure
  • Reduce IO read operations and IO stress
  • Relational databases are hard to extend, and changing a table structure is difficult

Feature :

  • Flexible data modeling
  • Fast write performance
  • Fast query performance

Use for :

  • Session
  • Distributed cache(High frequency data)(CAP)(BASE)
  • A distributed lock
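
The three use cases above share one primitive: a key with an expiry time. The sketch below is a single-process stand-in, not a distributed implementation; `TTLStore`, `set_if_absent`, and `delete_if_equals` are hypothetical names chosen for illustration. A session or cache entry is a key with a TTL, and a lock is a key that is only written when absent and only deleted by its owner.

```python
import time

_MISSING = object()  # sentinel so that stored None values are still found


class TTLStore:
    """In-memory key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable clock, useful for testing

    def _lookup(self, key):
        item = self._data.get(key)
        if item is None:
            return _MISSING
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return _MISSING
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        value = self._lookup(key)
        return default if value is _MISSING else value

    def set_if_absent(self, key, value, ttl=None):
        """Acquire-style write: succeeds only if the key is not present."""
        if self._lookup(key) is not _MISSING:
            return False
        self.set(key, value, ttl)
        return True

    def delete_if_equals(self, key, value):
        """Release-style delete: only the holder's token removes the key."""
        if self._lookup(key) == value:
            del self._data[key]
            return True
        return False
```

The TTL on the lock key matters: if the holder crashes, the lock expires on its own instead of blocking every other worker.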

Comparison :

Storage : map,tree,…

Implement :
DynamoDB;Riak KV;Oracle NoSQL Database(table-style);
Oracle Berkeley DB(ACID);ArangoDB;FoundationDB
Memcached;Redis;Hazelcast(ACID)
LevelDB;RocksDB

Problem :

  • Supports OLTP poorly

Document-oriented database

What :
a computer program designed for storing, retrieving and managing document-oriented information, also known as semi-structured data.

Why(Comparison with Key-value) :

  • Data transparency(extract metadata that the database engine uses for further optimization)
  • Designed to offer a richer experience with modern programming techniques

Concept :
Collections;Document
Inverted index
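
To make the inverted index concept concrete, here is a minimal sketch that indexes the string fields of JSON-like documents and answers AND queries. The class `InvertedIndex`, the tokenizer, and the sample field names are assumptions made for illustration; production engines add ranking, stemming, and on-disk posting lists.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self):
        self.documents = {}               # doc_id -> original document
        self.postings = defaultdict(set)  # term -> {doc_id, ...}

    def add(self, doc_id, document):
        self.documents[doc_id] = document
        for value in document.values():
            if isinstance(value, str):    # only string fields are indexed here
                for term in tokenize(value):
                    self.postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)
```

A lookup touches only the posting sets of the query terms, which is why this structure suits the search-engine and log use cases better than scanning every document.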

Use for :

  • Search engines
  • Log

Comparison :

Encoding type : XML;JSON;YAML;BSON;PDF;WORD;EXCEL

Implement :
MongoDB;DynamoDB;CouchDB
Elasticsearch
Terrastore;RavenDB(ACID);Thrudb
OrientDB

Column-oriented database

What :
a database management system (DBMS) that stores data tables by column rather than by row.

Feature :

  • Extremely high loading speed

  • For large data sets rather than small data sets

  • Efficient storage space utilization

  • High compression rate

  • High CPU and memory utilization

  • No need for views

  • Invisible index

  • Efficient projection

  • Efficient relational computing

  • Efficient aggregation operations

  • Efficient use of cache
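
Several of the features above (projection, aggregation, compression) follow from the storage layout. The sketch below converts a handful of made-up rows into per-column lists; `to_columns` and `run_length_encode` are hypothetical helpers, and the sample data is invented purely for illustration.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 50.0},
    {"order_id": 4, "region": "west", "amount": 25.5},
]


def to_columns(rows):
    """Pivot rows into one list per column (assumes identical keys per row)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress runs of equal adjacent values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [(value, count) for value, count in encoded]


columns = to_columns(rows)

# Projection and aggregation read a single column, not every field of every row.
total = sum(columns["amount"])

# Values in one column share a type and often repeat, so they compress well.
region_rle = run_length_encode(columns["region"])
```

In the row layout, summing `amount` would require reading every field of every record; in the column layout only one contiguous list is touched.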

Use for :
Data warehouse(Multiple data sources,Multidimensional data statistics,Large amount of stored data,More queries and fewer updates)

Implement :
HBase;Cassandra;Hypertable

Problem :

  • Not suitable for scanning small amounts of data
  • Not suitable for real-time operations with deletes and updates

Time-series database

What :
A time series database (TSDB) is a software system that is optimized for handling time series data, arrays of numbers indexed by time (a datetime or a datetime range).

If a TSDB is not used, a history table in a relational database may be a suitable alternative.

Why :

  • Large data volume (history of important data; mostly create and read, few updates or deletes)
  • Availability (time operations)
  • Relational primary keys and foreign keys handle time-series data poorly

Use for :

  • Monitor
  • Real-time data processing(IOT)

Feature :

  • More writes than reads
  • Sequential inserts
  • Few updates
  • Bulk deletes

  • Distributed
  • Time index
  • Time function
  • Time accuracy
  • Timeliness
  • Data compression
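
The access pattern described by the features above can be sketched with two parallel lists kept in time order. The class `TimeSeries` and its methods are hypothetical and run in a single process; timestamps are assumed to be integers (for example, seconds), and out-of-order writes are simply rejected.

```python
import bisect
from collections import defaultdict


class TimeSeries:
    """Append-only series of (timestamp, value) points sorted by time."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def append(self, ts, value):
        # Sequential insert: new points always go at the end.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("out-of-order write")
        self.timestamps.append(ts)
        self.values.append(value)

    def range(self, start, end):
        """Time index: binary search for the half-open interval [start, end)."""
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_left(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def downsample(self, start, end, bucket):
        """Time function: average the points in each bucket-wide window."""
        sums = defaultdict(float)
        counts = defaultdict(int)
        for ts, value in self.range(start, end):
            bucket_start = ts - (ts - start) % bucket
            sums[bucket_start] += value
            counts[bucket_start] += 1
        return [(b, sums[b] / counts[b]) for b in sorted(sums)]

    def drop_before(self, cutoff):
        """Bulk delete: discard every point older than the cutoff."""
        idx = bisect.bisect_left(self.timestamps, cutoff)
        del self.timestamps[:idx]
        del self.values[:idx]
        return idx
```

Because data arrives in time order, range queries and retention both reduce to finding one or two positions in a sorted list rather than updating individual rows.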

Storage engine :
HDFS, HBase, Cassandra, LevelDB, …

Implement :

  • InfluxDB,OpenTSDB,KDB+,Prometheus,KairosDB,HiTSDB
  • Druid,Pinot
  • RRDtool, Graphite

Focus :

  • Scalability, Query language, Downsampling, High compression ratio
  • OLAP
  • Visualization

Graph database

What :
a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data.

Feature :

  • Better at handling relationships than relational databases
  • Graph-theory algorithms (shortest path, centrality, …)

Use for :

  • OLTP
  • OLAP

  • Search engine
  • Social network
  • Financial fraud prevention
  • Risk early warning
  • Business relationship
  • Knowledge graph (emerging, high level of business abstraction, few production deployments outside social networks and search engines, more attempts in the financial field)

Storage :

  • Native graph (adjacency list, adjacency matrix) (good at traversal)
  • Non-native graph (MySQL,…)(good at non-traversal operation)
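
As a small illustration of why an adjacency list is good at traversal, the sketch below stores a directed graph as a dictionary of neighbor lists and finds a shortest path (by number of edges) with breadth-first search. The function name `shortest_path` and the sample node names are made up for this example; it is not the algorithm of any particular product.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency list.

    Returns the list of nodes on a shortest path, or None if unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Following an edge is a direct lookup of the node's neighbor list.
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical "follows" relationships in a small social network.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["frank"],
    "erin": ["frank"],
    "frank": [],
}
```

In a non-native store such as a relational table of edges, each hop in this loop would become a join or a separate query, which is the traversal cost the comparison above refers to.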

Processing engine :
Cassovary,Pegasus,Giraph

Comparison :
Implement :
Neo4j(Neo4j Bloom),AllegroGraph
FlockDB,GraphDB

Problem :

  • not applicable to large-scale data
  • does not apply to binary stored data
  • not suitable for large amounts of event-driven data
  • super node problem
  • distributed large graph problem