CouchDB vs Redis vs MongoDB vs Riak vs Membase vs Neo4j vs Cassandra vs HBase comparison
by krishna
CouchDB | Redis | MongoDB | Riak | Membase | Neo4j | Cassandra | HBase |
Written in: Erlang | Written in: C/C++ | Written in: C++ | Written in: Erlang & C, some JavaScript | Written in: Erlang & C | Written in: Java | Written in: Java | Written in: Java |
Main point: DB consistency, ease of use | Main point: Blazing fast | Main point: Retains some friendly properties of SQL. (Query, index) | Main point: Fault tolerance | Main point: Memcache compatible, but with persistence and clustering | Main point: Graph database – connected data | Main point: Best of BigTable and Dynamo | Main point: Billions of rows X millions of columns |
License: Apache | License: BSD | License: AGPL (Drivers: Apache) | License: Apache | License: Apache 2.0 | License: GPL, some features AGPL/commercial | License: Apache | License: Apache |
Protocol: HTTP/REST | Protocol: Telnet-like | Protocol: Custom, binary (BSON) | Protocol: HTTP/REST or custom binary | Protocol: memcached plus extensions | Protocol: HTTP/REST (or embedding in Java) | Protocol: Custom, binary (Thrift) | Protocol: HTTP/REST (also Thrift) |
Bi-directional (!) replication, | Disk-backed in-memory database, | Master/slave replication (auto failover with replica sets) | Tunable trade-offs for distribution and replication (N, R, W) | Very fast (200k+/sec) access of data by key | Standalone, or embeddable into Java applications | Tunable trade-offs for distribution and replication (N, R, W) | Modeled after BigTable |
continuous or ad-hoc, | Currently without disk-swap (VM and Diskstore were abandoned) | Sharding built-in | Pre- and post-commit hooks in JavaScript or Erlang, for validation and security. | Persistence to disk | Full ACID conformity (including durable data) | Querying by column, range of keys | Map/reduce with Hadoop |
with conflict detection, | Master-slave replication | Queries are JavaScript expressions | Map/reduce in JavaScript or Erlang | All nodes are identical (master-master replication) | Both nodes and relationships can have metadata | BigTable-like features: columns, column families | Query predicate push down via server side scan and get filters |
thus, master-master replication. (!) | Simple values or hash tables by keys, | Run arbitrary JavaScript functions server-side | Links & link walking: use it as a graph database | Provides memcached-style in-memory caching buckets, too | Integrated pattern-matching-based query language (“Cypher”) | Writes are much faster than reads (!) | Optimizations for real time queries |
MVCC – write operations do not block reads | but complex operations like ZREVRANGEBYSCORE. | Better update-in-place than CouchDB | Secondary indices: search in metadata | Write de-duplication to reduce IO | Also the “Gremlin” graph traversal language can be used | Map/reduce possible with Apache Hadoop | A high performance Thrift gateway |
Previous versions of documents are available | INCR & co (good for rate limiting or statistics) | Uses memory mapped files for data storage | Large object support (Luwak) | Very nice cluster-management web GUI | Indexing of nodes and relationships | I admit being a bit biased against it because of its bloat and complexity, partly due to Java (configuration, seeing exceptions, etc.) | HTTP supports XML, Protobuf, and binary |
Crash-only (reliable) design | Has sets (also union/diff/inter) | Performance over features | Comes in “open source” and “enterprise” editions | Software upgrades without taking the DB offline | Nice self-contained web admin | | Cascading, Hive, and Pig source and sink modules |
Needs compacting from time to time | Has lists (also a queue; blocking pop) | Journaling (with --journal) is best turned on | Full-text search, indexing, querying with Riak Search server (beta) | Connection proxy for connection pooling and multiplexing (Moxi) | Advanced path-finding with multiple algorithms | | JRuby-based (JIRB) shell |
Views: embedded map/reduce | Has hashes (objects of multiple fields) | On 32-bit systems, limited to ~2.5GB | In the process of migrating the storage backend from “Bitcask” to Google’s “LevelDB” | | Indexing of keys and relationships | | No single point of failure |
Formatting views: lists & shows | Sorted sets (high score table, good for range queries) | An empty database takes up 192MB | Masterless multi-site replication and SNMP monitoring are commercially licensed | | Optimized for reads | | Rolling restart for configuration changes and minor upgrades |
Server-side document validation possible | Redis has transactions (!) | GridFS to store big data + metadata (not actually an FS) | | | Has transactions (in the Java API) | | Random access performance is like MySQL |
Authentication possible | Values can be set to expire (as in a cache) | | | | Scriptable in Groovy | | |
Real-time updates via _changes (!) | Pub/Sub lets one implement messaging (!) | | | | Online backup, advanced monitoring and High Availability are AGPL/commercial licensed | | |
Attachment handling | | | | | | | |
thus, CouchApps (standalone JS apps) | | | | | | | |
jQuery library included | | | | | | | |
http://couchapp.org/page/index | http://redis.io/commands | | | | | | |
Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important. | Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory). | Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks. | Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication. | Best used: Any application where low-latency data access, high concurrency support and high availability are requirements. | Best used: For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense. | Best used: When you write more than you read (logging). If every component of the system must be in Java. (“No one gets fired for choosing Apache’s stuff.”) | Best used: If you’re in love with BigTable. 🙂 And when you need random, realtime read/write access to your Big Data. |
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments. | For example: Stock prices. Analytics. Real-time data collection. Real-time communication. | For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back. | For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server. | For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga). | For example: Social relations, public transport links, road maps, network topologies. | For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis. | For example: Facebook Messaging Database (more general example coming soon) |
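
The "tunable trade-offs (N, R, W)" item in the Riak and Cassandra columns can be made concrete with a little arithmetic. In Dynamo-style systems, N is the number of replicas kept per key, R the number of replicas that must answer a read, and W the number that must acknowledge a write. The sketch below is only an illustration of that arithmetic: `QuorumConfig` and its methods are hypothetical names made up for this post, not part of any Riak or Cassandra client, and real deployments have further details (hinted handoff, read repair) that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper for reasoning about N/R/W; not a real client API."""

    n: int  # replicas kept for each key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("R and W must each be between 1 and N")

    def reads_overlap_writes(self) -> bool:
        # Any set of R replicas and any set of W replicas are guaranteed
        # to share at least one node exactly when R + W > N.
        return self.r + self.w > self.n

    def read_failures_tolerated(self) -> int:
        # Replicas that can be down while a read can still gather R answers.
        return self.n - self.r

    def write_failures_tolerated(self) -> int:
        # Replicas that can be down while a write can still gather W acks.
        return self.n - self.w


if __name__ == "__main__":
    for cfg in (QuorumConfig(3, 2, 2), QuorumConfig(3, 1, 1), QuorumConfig(3, 1, 3)):
        print(cfg, "overlap:", cfg.reads_overlap_writes(),
              "reads survive", cfg.read_failures_tolerated(), "down,",
              "writes survive", cfg.write_failures_tolerated(), "down")
```

With N=3, R=2, W=2 every read set overlaps the latest write set and one replica can be down for both reads and writes. With N=3, R=1, W=1 both operations touch a single replica, which is faster but allows a read to miss the most recent write.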