http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
While
But
MongoDB
-
Written
in: C++ -
Main
point: Retains some friendly properties of SQL. (Query, index) -
License:
AGPL (Drivers: Apache) -
Protocol:
Custom, binary (BSON) -
Master/slave
replication (auto failover with replica sets) -
Sharding
built-in -
Queries
are javascript expressions -
Run
arbitrary javascript functions server-side -
Better
update-in-place than CouchDB -
Uses
memory mapped files for data storage -
Performance
over features -
Journaling
(with --journal) is best turned on -
On
32bit systems, limited to ~2.5Gb -
An
empty database takes up 192Mb -
GridFS
to store big data + metadata (not actually an FS) -
Has
geospatial indexing
Best
For
Riak
-
Written
in: Erlang & C, some Javascript -
Main
point: Fault tolerance -
License:
Apache -
Protocol:
HTTP/REST or custom binary -
Tunable
trade-offs for distribution and replication (N, R, W) -
Pre-
and post-commit hooks in JavaScript or Erlang, for validation and security. -
Map/reduce
in JavaScript or Erlang -
Links
& link walking: use it as a graph database -
Secondary
indices: but only one at once -
Large
object support (Luwak) -
Comes
in "open source" and "enterprise" editions -
Full-text
search, indexing, querying with Riak Search server (beta) -
In
the process of migrating the storing backend from "Bitcask" to Google's "LevelDB" -
Masterless
multi-site replication replication and SNMP monitoring are commercially licensed
Best
For
CouchDB
-
Written
in: Erlang -
Main
point: DB consistency, ease of use -
License:
Apache -
Protocol:
HTTP/REST -
Bi-directional
(!) replication, -
continuous
or ad-hoc, -
with
conflict detection, -
thus,
master-master replication. (!) -
MVCC
- write operations do not block reads -
Previous
versions of documents are available -
Crash-only
(reliable) design -
Needs
compacting from time to time -
Views:
embedded map/reduce -
Formatting
views: lists & shows -
Server-side
document validation possible -
Authentication
possible -
Real-time
updates via _changes (!) -
Attachment
handling -
thus,
CouchApps (standalone js apps) -
jQuery
library included
Best
For
Redis
-
Written
in: C/C++ -
Main
point: Blazing fast -
License:
BSD -
Protocol:
Telnet-like -
Disk-backed
in-memory database, -
Currently
without disk-swap (VM and Diskstore were abandoned) -
Master-slave
replication -
Simple
values or hash tables by keys, -
but
complex operations like ZREVRANGEBYSCORE. -
INCR
& co (good for rate limiting or statistics) -
Has
sets (also union/diff/inter) -
Has
lists (also a queue; blocking pop) -
Has
hashes (objects of multiple fields) -
Sorted
sets (high score table, good for range queries) -
Redis
has transactions (!) -
Values
can be set to expire (as in a cache) -
Pub/Sub
lets one implement messaging (!)
Best
For
HBase
-
Written
in: Java -
Main
point: Billions of rows X millions of columns -
License:
Apache -
Protocol:
HTTP/REST (also Thrift) -
Modeled
after Google's BigTable -
Uses
Hadoop's HDFS as storage -
Map/reduce
with Hadoop -
Query
predicate push down via server side scan and get filters -
Optimizations
for real time queries -
A
high performance Thrift gateway -
HTTP
supports XML, Protobuf, and binary -
Cascading,
hive, and pig source and sink modules -
Jruby-based
(JIRB) shell -
Rolling
restart for configuration changes and minor upgrades -
Random
access performance is like MySQL -
A
cluster consists of several different types of nodes
Best
For
Neo4j
-
Written
in: Java -
Main
point: Graph database - connected data -
License:
GPL, some features AGPL/commercial -
Protocol:
HTTP/REST (or embedding in Java) -
Standalone,
or embeddable into Java applications -
Full
ACID conformity (including durable data) -
Both
nodes and relationships can have metadata -
Integrated
pattern-matching-based query language ("Cypher") -
Also
the "Gremlin" graph traversal language can be used -
Indexing
of nodes and relationships -
Nice
self-contained web admin -
Advanced
path-finding with multiple algorithms -
Indexing
of keys and relationships -
Optimized
for reads -
Has
transactions (in the Java API) -
Scriptable
in Groovy -
Online
backup, advanced monitoring and High Availability is AGPL/commercial licensed
Best
For
Cassandra
-
Written
in: Java -
Main
point: Best of BigTable and Dynamo -
License:
Apache -
Protocol:
Custom, binary (Thrift) -
Tunable
trade-offs for distribution and replication (N, R, W) -
Querying
by column, range of keys -
BigTable-like
features: columns, column families -
Has
secondary indices -
Writes
are much faster than reads (!) -
Map/reduce
possible with Apache Hadoop -
All
nodes are similar, as opposed to Hadoop/HBase
Best
For
Membase
-
Written
in: Erlang & C -
Main
point: Memcache compatible, but with persistence and clustering -
License:
Apache 2.0 -
Protocol:
memcached plus extensions -
Very
fast (200k+/sec) access of data by key -
Persistence
to disk -
All
nodes are identical (master-master replication) -
Provides
memcached-style in-memory caching buckets, too -
Write
de-duplication to reduce IO -
Very
nice cluster-management web GUI -
Software
upgrades without taking the DB offline -
Connection
proxy for connection pooling and multiplexing (Moxi)
Best
For