Machine Learning Quiz (134 Objective Questions) Start ML Quiz

Deep Learning Quiz (205 Objective Questions) Start DL Quiz

Deep Learning Free eBook Download

Saturday, 4 January 2014

12 Best Free and Open Source NoSQL Databases

12 Best Free and Open Source NoSQL Databases

NoSQL databases are becoming popular day by day. I have come up with the list of best, free and open source NoSQL databases. MongoDB tops the list of Open Source NoSQL databases. This list of free and open source databases comprises of MongoDB, Cassandra, CouchDB, Hypertable, Redis, Riak, Neo4j, HBASE, Couchbase, MemcacheDB, RevenDB and Voldemort. These free and open source NoSQL databases are really highly scale-able, flexible and good for big data storage and processing. These open source NoSQL databases are far ahead in terms of performance as compared to traditional relational databases. However, these may not be always the best choice for you. Most of common applications can still be developed using traditional relational databases. NoSQL databases are still not the best option for a mission critical transaction needs. I have listed down a small description of all these best free and oper source NoSQL databases. Lets have a look..

1. MongoDB

MongoDB is a document-oriented database that uses a JSON-style data format. It's ideal for website data storage, content management and caching applications, and can be configured for replication and high availability.

This highly scale-able and agile NoSQL database is a amazing performing system. This open source database written in C++ comes with a storage that is document oriented. Also, you will be provided with benefits like full index support, high availability across WANs and LANs along with easy replication, horizontal scaling, rich queries that are document based, flexibility in data processing and aggregation along with proper training, support and consultation.

2. Cassandra

An Apache Software Foundation project, Cassandra is a distributed database that allows for decentralized data storage that is fault tolerant and has no single point of failure. In other words, "Cassandra is suitable for applications that can't afford to lose data."

3. CouchDB

A product of the Apache Software Foundation, CouchDB is another document-oriented database that stores data in JSON format. It's ACID compliant, and like MongoDB, can be used to store data and content for websites, and to provide caching. You can use JavaScript to run MapReduce Queries on CouchDB. It also provides a very convenient web based administration console. This database could be really handy for web applications.

4. Hypertable

Modeled after Google's BigTable database system, Hypertable's creators aim for it to be the "open source standard for highly available, petabyte scale, database systems." In other words, Hypertable is designed for storing massive amounts of data reliably across many cheap servers.

5. Redis

This is an open source, key value store of an advanced level. Owing to the presence of hashes, sets, strings, sorted sets and lists in a key; Redis is also called as a data structure server. This system will help you in running atomic operations like incrementing value present in a hash, set intersection computation, string appending, difference and union. Redis makes use of in-memory dataset to achieve high performance. Also, this system is compatible with most of the programming languages.

6. Riak

This is one of the most powerful, distributed databases ever to be introduced. It provides for easy and predictable scaling and equips users with the ability for quick testing, prototyping and application deployment so as to simplify development.

7. Neo4j

This is a NoSQL graph database which exhibits a high level of performance. It comes well equipped with all the features of a robust and mature system. It provides the programmers with a flexible and object oriented network structure and allows them to enjoy all the benefits of a database that is fully transactional. Compared to RDBMS, Neo4j will also provide you with performance improvements on some of the applications.

8. Hadoop HBASE

HBase can be easily considered as a scalable, distributed and a big data store. This database can be used when you are looking for real time and random access to your data. It comes with modular and linear scalability along with reads and writes that are strictly consistent. Other features include Java API that has an easy client access, table sharding that is configurable and automatic, Bloom filters and block caches and much more.

9. Couchbase

While Couchbase was a fork of CouchDB, it has become more of a full-fledged data product and less of a ball of framework than CouchDB. Its transition to a document database will give MongoDB a run for its money. It is multithreaded per node, which can be a major scalability benefit -- especially when hosted on custom or bare-metal hardware. With some nice integration features, including with Hadoop, Couchbase is a great choice for an operational data store.

10. MemcacheDB

This is a distributed storage system of key value. It should not be confused with a cache solution; rather, it is a persistent storage engine which is meant for data storage and retrieval in a fast and reliable manner. Confirmation to memcache protocol is provided for. The storing backend that is used is the Berkeley DB which supports features like replication and transaction.


RAVENDB is a second generation open source DB. This DB is document oriented and schema free such as you simply have to dump in your objects into it. It provides extremely flexible and fast queries. This application makes scaling extremely easy by providing out-of-the-box support for replication, multi tenancy and sharding. There is full support for ACID transactions along with safety of your data. Easy extensibility via bundles is provided along with high performance.

12. Voldemort

This is an automatically replicating distributed storage system. It provides for automatic partitioning of data, transparent handling of server failure, pluggable serialization, independence of nodes and versioning of data items along with support for data distribution across various centers.

No comments:

Post a Comment

About the Author

I have more than 10 years of experience in IT industry. Linkedin Profile

I am currently messing up with neural networks in deep learning. I am learning Python, TensorFlow and Keras.

Author: I am an author of a book on deep learning.

Quiz: I run an online quiz on machine learning and deep learning.