Lijin Joseji is a Senior IT Specialist who has worked with IBM Global Business Services since 2008. He has been involved in various projects that use WebSphere eXtreme deployment components as well as other open source technologies. His areas of expertise include the design and development of J2EE applications, WebSphere eXtreme deployment components such as IBM Object Grid and Compute Grid, SOA architecture, open source frameworks such as Spring and Hibernate, web service frameworks, NoSQL and SQL databases, and mobile development. He currently specializes in projects related to WebSphere eXtreme Scale and WebSphere Extended Deployment Compute Grid and Object Grid, as well as NoSQL databases, Android development and cloud computing. He shares his technical views and experience on his blog, OrangeSlate.com. Lijin is a DZone MVB, is not an employee of DZone, and has posted 6 posts at DZone.

11 OPEN NoSQL Document-Oriented Databases

07.22.2012
A document-oriented database is designed for storing, retrieving, and managing document-oriented, or semi-structured, data. Document-oriented databases are one of the main categories of NoSQL databases. The central concept of a document-oriented database is the notion of a document. While each implementation differs on the details of this definition, they all generally assume that documents encapsulate and encode data in some standard format or encoding. Encodings in use include XML, YAML, JSON and BSON, as well as binary forms like PDF and Microsoft Office documents (MS Word, Excel, and so on).
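
To make the notion of a document concrete, here is a minimal sketch in Python that uses only the standard `json` module. The field names and values are hypothetical; the sketch only illustrates that a single document carries its own nested structure in a standard encoding and survives a round trip through that encoding.

```python
import json

# A hypothetical document: field names and values are illustrative only.
document = {
    "title": "Getting started",
    "author": {"name": "Ann", "email": "ann@example.com"},
    "tags": ["nosql", "documents"],
    "views": 42,
}

# Encode the document into a standard text format (JSON)...
encoded = json.dumps(document, sort_keys=True)

# ...and decode it again; the nested structure survives the round trip.
decoded = json.loads(encoded)

print(encoded)
```

Nothing outside the document describes its shape, which is what "semi-structured" means in practice.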

  • MongoDB:  MongoDB is a collection-oriented, schema-free document database. Data is grouped into sets called ‘collections’. Each collection has a unique name in the database and can contain an unlimited number of documents. Collections are analogous to tables in an RDBMS, except that they don’t have any defined schema.

It stores data in BSON (Binary JSON) format: each document is a structured collection of key-value pairs, where keys are strings and values are any of a rich set of data types, including arrays and nested documents.

Home: http://www.mongodb.org/
Quick Start: http://www.mongodb.org/display/DOCS/Quickstart
Download: http://www.mongodb.org/downloads
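
The idea of named, schema-free collections can be sketched with a toy in-memory model. This is not MongoDB's API: the `insert` and `find` helpers, the collection name and the sample documents are all hypothetical and only mimic the concept in plain Python.

```python
from collections import defaultdict

# Toy in-memory stand-in for a database: collection name -> list of documents.
# This is NOT MongoDB's API; all names here are hypothetical.
database = defaultdict(list)

def insert(collection, document):
    """Add a document to the named collection; no schema is checked."""
    database[collection].append(document)

def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in database[collection]
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

# Two documents in the same collection with different fields.
insert("users", {"name": "Ann", "languages": ["Java", "C#"]})
insert("users", {"name": "Bob", "address": {"city": "Pune"}, "age": 30})

matches = find("users", name="Bob")
print(matches)
```

Both documents live in the same collection even though they share only the `name` field, which a fixed table schema would not allow without nullable columns or extra tables.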

  • CouchDB:  CouchDB is a document database server, accessible via a RESTful JSON API. It is ad hoc and schema-free with a flat address space. It is queryable and indexable, featuring a table-oriented reporting engine that uses JavaScript as a query language. A CouchDB document is an object that consists of named fields. Field values may be strings, numbers, dates, or even ordered lists and associative maps.

Home: http://couchdb.apache.org/
Quick Start: http://couchdb.apache.org/docs/intro.html
Download: http://couchdb.apache.org/downloads.html
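
As a rough illustration of what a RESTful JSON API means here, the sketch below builds (but does not send) the kind of HTTP request that stores a document: a `PUT` to a URL made of the database name and the document id, with the JSON document as the body. The server address, database name, document id and field values are assumptions made up for this example.

```python
import json
import urllib.request

# Hypothetical server address, database name and document id.
base_url = "http://localhost:5984"
database = "articles"
doc_id = "nosql-overview"

# A document is an object with named fields; values may be strings,
# numbers, ordered lists or nested maps.
document = {
    "title": "Document stores",
    "rating": 4,
    "tags": ["nosql", "couchdb"],
    "author": {"name": "Ann"},
}

# Build (but do not send) the HTTP request that would store the document.
request = urllib.request.Request(
    url=f"{base_url}/{database}/{doc_id}",
    data=json.dumps(document).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

print(request.get_method(), request.full_url)
# To actually send it against a running server:
# urllib.request.urlopen(request)
```

Because the interface is plain HTTP and JSON, any language with an HTTP client can talk to the server without a dedicated driver.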

  • Terrastore: Terrastore is a modern document store which provides advanced scalability and elasticity features without sacrificing consistency. It is based on Terracotta, so it relies on an industry-proven, fast clustering technology.

Home: http://code.google.com/p/terrastore/
Quick Start: http://code.google.com/p/terrastore/wiki/Documentation
Download: http://code.google.com/p/terrastore/downloads/list

  • RavenDB: Raven is a LINQ-enabled document database for .NET, focused on providing a high-performance, schema-less, flexible and scalable NoSQL data store for the .NET and Windows platforms.
    Raven stores any JSON document inside the database. It is a schema-less database in which you can define indexes using C#’s LINQ syntax.

Home: http://ravendb.net/
Quick Start: http://ravendb.net/tutorials
Download: http://ravendb.net/download

  • OrientDB: OrientDB is an open source NoSQL database management system written in Java. Although it is a document-based database, relationships are managed as in graph databases, with direct connections between records. It supports schema-less, schema-full and schema-mixed modes. It has a strong security profiling system based on users and roles, and supports SQL as a query language.

Home: http://www.orientechnologies.com/
Quick Start: http://code.google.com/p/orient/wiki/Tutorials
Download: http://code.google.com/p/orient/wiki/Download
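
The phrase "direct connections between records" can be pictured with a toy sketch: each record stores the identifiers of the records it links to, and a traversal simply follows those links instead of joining on a foreign key. This is not OrientDB's API; the record ids, field names and the `friends_of` helper are hypothetical.

```python
# Toy records keyed by a hypothetical record id; not OrientDB's API.
records = {
    "#1": {"name": "Ann", "friends": ["#2", "#3"]},
    "#2": {"name": "Bob", "friends": ["#3"]},
    "#3": {"name": "Cy", "friends": []},
}

def friends_of(record_id):
    """Follow the stored links directly instead of joining on a foreign key."""
    return [records[link]["name"] for link in records[record_id]["friends"]]

print(friends_of("#1"))
```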

  • ThruDB: Thrudb is a set of simple services built on top of the Apache Thrift framework that provides indexing and document storage services for building and scaling websites. Its purpose is to offer web developers flexible, fast and easy-to-use services that can enhance or replace traditional data storage and access layers.
    It supports multiple storage backends such as BerkeleyDB, disk and MySQL, and also offers Memcache and Spread integration.

Home: http://code.google.com/p/thrudb/
Quick Start: http://thrudb.googlecode.com/svn/trunk/doc/Thrudb.pdf
Download: http://code.google.com/p/thrudb/source/checkout

  • SisoDB:  SisoDb is a document-oriented DB provider for SQL Server written in C#. It lets you store object graphs of POCOs (plain old CLR objects) without having to configure any mappings. Each entity is treated as an aggregate root and gets separate tables created on the fly.

Home: http://www.sisodb.com
Quick Start: http://www.sisodb.com/Wiki
Download: https://github.com/danielwertheim/SisoDb-Provider/

  • RaptorDB: RaptorDB is an extremely small and fast embedded NoSQL persisted dictionary database that uses B+ tree or MurmurHash indexing. It was primarily designed to store JSON data (see the fastJSON implementation by the same author), but it can store any type of data that you give it.

Home: http://www.codeproject.com/KB/database/RaptorDB.aspx
Quick Start: http://www.codeproject.com/KB/database/RaptorDB.aspx
Download: http://www.codeproject.com/KB/database/RaptorDB.aspx

  • CloudKit: CloudKit provides schema-free, auto-versioned, RESTful JSON storage with optional OpenID and OAuth support, including OAuth Discovery.

Home: http://getcloudkit.com/
Quick Start: http://getcloudkit.com/api/
Download: https://github.com/jcrosby/cloudkit

  • Persevere: Persevere is an open source set of tools for persistence and distributed computing using intuitive, standards-based JSON interfaces: HTTP REST, JSON-RPC, JSONPath, and REST Channels. The core of the Persevere project is the Persevere Server. The Persevere server includes a Persevere JavaScript client, but the standards-based interface is intended to be used with any framework or client.

Home: http://code.google.com/p/persevere-framework/
Quick Start: http://code.google.com/p/persevere-framework/w/list
Download: http://code.google.com/p/persevere-framework/downloads/list

  • Jackrabbit: The Apache Jackrabbit™ content repository is a fully conforming implementation of the Content Repository for Java Technology API (JCR, specified in JSR 170 and 283). A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more.

Home: http://jackrabbit.apache.org
Quick Start: http://jackrabbit.apache.org/getting-started-with-apache-jackrabbit.html
Download: http://jackrabbit.apache.org/downloads.html

Conclusion:
Document databases store and retrieve documents, and the basic atomic unit of storage is a document. As always, your requirements drive the decision. You need to think about your data-access patterns and use cases to create a smart document model. When your domain model can be split and partitioned across documents, a document database will suit you. For example, a document database works extremely well for blog software, a CMS or wiki software. At the same time, a non-relational database is not better than a relational one when your data has many relations and relies on normalization.
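
The blog example can be sketched as follows. The post below is a hypothetical document in which the post, its tags and its comments are stored together because they are read together; the same data is then shown in relational style for contrast. All names and values are made up for illustration.

```python
# Hypothetical blog post modelled as one document: the post, its tags and
# its comments are stored together because they are read together.
post = {
    "_id": "post-1",
    "title": "Choosing a document database",
    "tags": ["nosql", "modelling"],
    "comments": [
        {"author": "Ann", "text": "Nice overview."},
        {"author": "Bob", "text": "What about key-value stores?"},
    ],
}

# The same data in relational style: separate row sets linked by a key.
post_rows = [{"id": "post-1", "title": "Choosing a document database"}]
comment_rows = [
    {"post_id": "post-1", "author": "Ann", "text": "Nice overview."},
    {"post_id": "post-1", "author": "Bob", "text": "What about key-value stores?"},
]

# Document access: one lookup returns the whole aggregate.
comment_authors = [c["author"] for c in post["comments"]]

# Relational access: the comments have to be joined back to the post.
joined_authors = [
    row["author"] for row in comment_rows if row["post_id"] == post_rows[0]["id"]
]

print(comment_authors == joined_authors)
```

The document form pays off when the access pattern is "load one post with everything on it". If comments had to be queried on their own across many posts, the relational layout would likely be the better fit.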

The following Stack Overflow question also covers the pros and cons of relational versus document-based databases:
http://stackoverflow.com/questions/337344/pros-cons-of-document-based-databases-vs-relational-databases

Published at DZone with permission of Lijin Joseji, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Mr Eragon replied on Mon, 2012/07/23 - 10:52pm

Great article! Thank you.

I think you should add GT.M. It's a very mature NoSQL database that also supports transactions.

Lijin Joseji replied on Tue, 2012/07/24 - 9:07am in response to: Mr Eragon

Thanks, Eragon, for your comment! In the above article I have included only the document-oriented NoSQL DBs. GT.M falls mainly under the key-value store category of NoSQL.
