
Amresh is a software designer/developer based in New Delhi, India. He has worked with renowned software service providers. His areas of interest are server-side web development, NoSQL databases and Java programming. He loves reading and sharing knowledge. Amresh is a DZone MVB and is not an employee of DZone and has posted 8 posts at DZone.

SQLifying NoSQL – Are ORM tools relevant to NoSQL?

09.12.2012

Introduction

If you reached this page, it’s fair to assume that you must have worked on at least one relational database in your lifetime. They have been in use for a quarter of a century and are found in almost all business applications.

NoSQL databases, however, are gaining traction these days. Often called “Not only SQL” databases, they fall under an umbrella term for a loosely defined class of non-relational data stores.

They exhibit the following main characteristics:

  • They don’t use SQL as their query language.
  • They may not give full ACID guarantees.
  • They have a distributed, fault-tolerant architecture.

In this article, I explore whether ORM tools make sense in the NoSQL world, and whether they can solve problems that are specific to NoSQL. We’ll then delve into the approaches and challenges involved in building such a tool. This article assumes you are already familiar with, and have worked on, at least one NoSQL database.

ORM Solutions

ORM (Object-Relational Mapping) solutions came into existence to solve the object-relational impedance mismatch problem. The most popular among them are Hibernate, TopLink and EclipseLink. They worked beautifully with relational databases such as Oracle and MySQL.

Each ORM solution had its own API and object query language (like HQL for Hibernate), which made it difficult for programmers to switch from one framework to another. As a result, efforts were made to create standards and specifications. The most popular ORM standards are:

  1. EJB – Enterprise Java Beans (Entity Beans to be specific)
  2. JPA – Java Persistence API
  3. JDO – Java Data Objects
  4. SDO – Service Data Objects

Problems in working with NoSQL

The problem with NoSQL databases is that there is NOT EVEN ONE existing industry standard (like SQL) for them. The very idea of being “something opposed to SQL”, and the resulting deviation from standards and rules, is going to be suicidal if not corrected at the right time. As a result, learning to work with a new NoSQL database is always cumbersome.

Apart from that, people lack in-depth knowledge of NoSQL. Even when they have it, it is confined to one or two databases. In the relational world, people rely on their knowledge of SQL and JDBC for basic and intermediate database work, so switching to another database requires little or almost no effort. The same switch is painful in the NoSQL world.

ORM for NoSQL?

“ORM for NoSQL” is a somewhat misleading term. People prefer to call it an “OM tool for NoSQL” or maybe an “ODM” (Object data-store Mapping) tool. ORM frameworks have been around for well over a decade and are a de facto industry standard. People are very clear about what ORM tools are supposed to do. There are no surprises.

The key here is to let people stop worrying about the complexities inherent in NoSQL databases. Let them do things in a way they already know and are comfortable with. Why not use an approach that has existed in this problem domain for years and has proven its usefulness?

A good use case for ORM tools is migrating applications (built using an ORM tool) from an RDBMS to a NoSQL database, or even from one NoSQL database to another. This requires (at least in theory) little or no programming effort in the business domain.

Challenges in Making ORM Solution

Here are some real challenges that ORM solution providers are going to face:

  • Making one solution for many similar-looking problems requires some really tough generalizations and abstractions. Design is going to be a challenge, as abstractions sometimes take away (or at least make it difficult to expose) many of the powers of NoSQL.
  • NoSQL databases are built for performance and scalability. ORM tools have to employ techniques that don’t add much overhead compared with the performance users would get from the plain vanilla driver that ships with the database.
  • Many ORM standards (like JPA and JDO) were written for relational databases and don’t always provide ways of doing certain things in NoSQL. In fact, most ORM tools (like Hibernate) were built to solve only 80% of the frequently encountered mapping problems. Low-level tweaking may prove crucial for applications built on NoSQL databases. The challenge here is to provide developers with hooks for solving the remaining 20% of the mapping problem.
  • Each NoSQL database has its own architecture and network topology. Each makes specific assumptions about data centers, racks and disk connectivity, and about how data is stored and distributed across them. In simple terms, NoSQL databases are “born distributed”. It’s a real challenge to expose all of this configuration through ORM specifications.
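The “hooks for the remaining 20%” idea can be sketched roughly as follows. This is only an illustration under stated assumptions: NativeClient and MappedSession are hypothetical names invented here, and the in-memory map merely stands in for a real driver. The unwrap method is loosely modelled on the intent of JPA’s EntityManager.unwrap, not on any particular NoSQL tool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical low-level driver client, standing in for a real NoSQL driver.
final class NativeClient {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    void writeColumn(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, key -> new HashMap<>()).put(column, value);
    }

    String readColumn(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }
}

// Hypothetical mapping session: covers the common cases and exposes a hook for the rest.
final class MappedSession {
    private final NativeClient client;

    MappedSession(NativeClient client) {
        this.client = client;
    }

    // The "80%" path: callers never deal with rows or columns directly.
    void saveUserName(String userId, String name) {
        client.writeColumn(userId, "name", name);
    }

    String loadUserName(String userId) {
        return client.readColumn(userId, "name");
    }

    // The hook for the remaining "20%": hand out the underlying client on request.
    <T> T unwrap(Class<T> type) {
        if (type.isInstance(client)) {
            return type.cast(client);
        }
        throw new IllegalArgumentException("Unsupported native type: " + type.getName());
    }
}
```

A caller would use saveUserName and loadUserName for everyday work, and call session.unwrap(NativeClient.class) only when it needs something the mapping layer does not model.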

Approach

There are some guidelines worth sharing that help in developing ORM tools for NoSQL:

  • Map annotations in the ORM framework to the NoSQL structures they are most naturally related to. An example is mapping @Embedded in JPA to a super column in Cassandra.
  • If the framework provides a way to operate on the database directly, leverage it. An example is mapping native queries in JPA to CQL in Cassandra.
  • NoSQL is built for performance. Noticeable overhead over plain drivers is going to be a big turn-off for users. Make a special effort to keep the ORM’s overhead to a minimum, even if that requires sleepless nights and weeks of design-discussion fights.
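To make the first guideline a little more concrete, here is a minimal sketch of annotation-driven mapping. It assumes hypothetical annotations (SketchColumn and SketchEmbedded) that only imitate the role of JPA’s @Column and @Embedded, and it uses a nested Map as a stand-in for a row containing a super-column-like group. It is not how Kundera or any other tool is actually implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for JPA's @Column and @Embedded (not the real javax.persistence types).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchColumn { String name(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchEmbedded { String name(); }

// Example embeddable value object.
class Address {
    @SketchColumn(name = "city") String city;
    @SketchColumn(name = "zip") String zip;

    Address(String city, String zip) {
        this.city = city;
        this.zip = zip;
    }
}

// Example entity with one plain column and one embedded group.
class Person {
    @SketchColumn(name = "name") String name;
    @SketchEmbedded(name = "address") Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

final class RowMapper {
    private RowMapper() {}

    // Turns an annotated object into a row; an embedded object becomes a nested group of columns.
    static Map<String, Object> toRow(Object entity) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(entity);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
            SketchColumn column = field.getAnnotation(SketchColumn.class);
            SketchEmbedded embedded = field.getAnnotation(SketchEmbedded.class);
            if (column != null) {
                row.put(column.name(), value);
            } else if (embedded != null && value != null) {
                row.put(embedded.name(), toRow(value)); // nested map plays the super column role
            }
        }
        return row;
    }
}
```

Calling RowMapper.toRow(new Person("Asha", new Address("Delhi", "110001"))) yields a row with a "name" column and an "address" group holding "city" and "zip". Reflection on every call is exactly the kind of overhead the third guideline warns about, so a real tool would likely cache this metadata.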

Other Benefits

There are many other benefits of using a standard ORM tool over a plain low-level driver library:

  1. Ease of use, faster development, increased productivity.
  2. NoSQL databases don’t provide a mechanism to maintain relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.
  3. ORM solutions (like Kundera and Spring Data) may provide polyglot persistence transparently, which would otherwise be very hard to achieve. This may prove to be a boon for complex applications requiring storage in multiple databases.
  4. Most NoSQL databases lack transaction management capabilities because they were not built that way (they were built to solve problems that didn’t require it in the first place). Many ORM specifications (like JPA) mandate this capability. If your application requires the likes of atomicity, they will come to your rescue.
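The polyglot persistence point (item 3) can be illustrated with a small sketch. Everything here is an assumption made for illustration: KeyValueStore, InMemoryStore and PolyglotSession are hypothetical names, and the in-memory maps stand in for real databases. It does not reflect the actual APIs of Kundera or Spring Data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store abstraction; a real tool would wrap a database driver here.
interface KeyValueStore {
    void put(String key, Object value);
    Object get(String key);
}

// In-memory stand-in for a concrete database.
final class InMemoryStore implements KeyValueStore {
    private final Map<String, Object> data = new HashMap<>();

    @Override
    public void put(String key, Object value) {
        data.put(key, value);
    }

    @Override
    public Object get(String key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }
}

// Routes each entity type to the store registered for it, so callers use one API.
final class PolyglotSession {
    private final Map<Class<?>, KeyValueStore> routes = new HashMap<>();

    <T> void register(Class<T> entityType, KeyValueStore store) {
        routes.put(entityType, store);
    }

    void persist(String id, Object entity) {
        storeFor(entity.getClass()).put(id, entity);
    }

    <T> T find(Class<T> entityType, String id) {
        return entityType.cast(storeFor(entityType).get(id));
    }

    private KeyValueStore storeFor(Class<?> type) {
        KeyValueStore store = routes.get(type);
        if (store == null) {
            throw new IllegalArgumentException("No store registered for " + type.getName());
        }
        return store;
    }
}

// Two example entities that might live in different databases.
final class UserProfile {
    final String name;
    UserProfile(String name) { this.name = name; }
}

final class ClickEvent {
    final String url;
    ClickEvent(String url) { this.url = url; }
}
```

Business code calls persist and find without knowing that, say, UserProfile objects go to one database and ClickEvent objects to another. Only the register calls at start-up encode that decision.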

As with everything in life, the discipline and rules you commit to pay off in the long run. Those NoSQL databases that strike the best balance between features and ease of use are going to be successful eventually. ORM tools could be one facilitator in that pursuit.

References

1. http://java.dzone.com/articles/martin-fowler-orm-hate

2. http://architects.dzone.com/news/non-sense-nosql-orm-frameworks

3. https://github.com/impetus-opensource/Kundera

4. http://gora.apache.org/


Published at DZone with permission of Amresh Singh, author and DZone MVB. (source)


Comments

Jilles Van Gurp replied on Thu, 2012/09/13 - 1:54am

ORM tools are a problem, not a solution. ORM tools are basically intended to cover up for the fact that the storage layer is not an object oriented database. Object oriented databases were briefly popular in the nineties, until ORM tooling came around. Basically they work well for simple use cases but start to impose a lot of complexity when things get a bit less straightforward and scaling becomes hard/impossible. ORM never quite solved this problem and often the step before switching to NoSQL involves switching to using pure SQL without the ORM crap in between, which in my experience can be quite refreshing.

NoSQL is the solution to this problem. The fact that you can have some business logic without all of the traditional enterprise crap around your storage layer is one of the primary attractions of NoSQL for a lot of people. There is a big danger of these people over simplifying their architectures and removing important things like ACID that traditional solutions provide, which has been the main criticism on NoSQL.

Besides, NoSQL and ORM mostly don't make sense together at all, because NoSQL basically means anything but SQL. That implies a wide variety of storing and querying solutions that mostly don't have that much in common. And even "anything but SQL" is a loose notion, as there are some NoSQL solutions that support SQL-like features and even SQL-like languages. Covering such a system up with a generic abstraction layer so you can pretend it is something else sounds like a huge anti-pattern to me.

If you look at the space, there are a couple of groups of systems here:

Big table style data stores. At first glance these look a lot like a good old SQL database. You have tables and columns. However, de-normalization is a key philosophy behind using them: you are not supposed to have a lot of different tables. How would you use an ORM tool here? Have a domain model of one class?

Key value stores. These break down into those with some means of indexing and querying (couchdb) and those without such capability (e.g. memcache, voldemort). Especially if the values are JSON or XML, an ORM tool is kind of redundant since there are plenty of good marshalling tools that can marshal and unmarshal to and from whatever domain model.

Graph databases (neo4j). These store graphs of nodes and edges, each with properties and allow for graph traversal based querying.

There are a few more but you get the point: these systems vary wildly in how they store data and probably pretending they are object oriented databases doesn't make a lot of sense. Mostly the whole point of these solutions is that they are not object oriented solutions (or that they are in which case ORM is redundant).

Chris Travers replied on Sat, 2012/09/22 - 6:35am

First of all, I am not a fan of ORMs. If you want to cover for the fact that your data store isn't object-oriented, use something like Oracle Objects or PostgreSQL (where everything is an object) and encapsulate your database that way. It's actually quite possible (Liskov-related weirdness aside) to build SOLID database interfaces in these RDBMSs. I am working on an upcoming series on this regarding PostgreSQL. ORMs are good for some simple things, where the data is already at least somewhat encapsulated in the db. They fail big-time when this is not the case. Encapsulation which keeps the ORM in mind pushes the limits further. However, my approach is the opposite. I make discoverable interfaces in the db, and libraries which detect and use them. From there the db access is simple, it is encapsulated, and it is discoverable. Keep in mind that the impedance mismatch arises rather directly from the separation of concerns between layers, and so it is something that has to be addressed.

 However that is if you are using an RDBMS, which you should be using in some capacity for any important information.  (This is not to say you can't also be using NoSQL db's of course.)

However, what if you are connecting to a more object-friendly db instead?  Then the question is how to make the transition work.  You don't need an ORM per se.  What you need is something to convert the returned formats into the objects of your app.  This is kind of like an ORM but not really.

 So what I hear you saying is that NoSQL db's need object-generating interfaces too.  That's a fair point, but basing it on a tool that has to traverse boundaries which separate concerns the way they do with an RDBMS is a poor starting point IMHO. 

Amresh Singh replied on Tue, 2012/10/16 - 12:46pm in response to: Jilles Van Gurp

Hi Jilles,

Your comments make a lot of sense to me. In fact, while working on an object-datastore mapping library called "Kundera", I find all these challenges problematic. Till now, I have found JPA, a standard, to be more or less expressive enough to model your NoSQL data in the form of entities, at least in my work so far with Cassandra, HBase and MongoDB. But sometimes I wish JPA had constructs specific to common concepts found in many NoSQL databases, or that we had a separate standard for NoSQL altogether.

Leaving ORM aside, I would still say that a standard that intends to solve interoperability among NoSQL databases is going to be useful. In fact, there has been an effort in the past (UnQL) in that direction.

These standards have a lot of potential in simplifying and opening a new range of possibilities. (Think switching between nosql data-store or even more important - Polyglot-persistence) 

Amresh Singh replied on Tue, 2012/10/16 - 1:09pm in response to: Chris Travers

Chris,

That's true as long as we are working with OO databases. Impedance mismatch problem arises when this is not the case.

Compared to RDBMS, NoSQLs are even further from object orientation. They are strongly unstructured (which is not a bad thing considering the problem they intend to solve). So the impedance mismatch is even more severe for NoSQLs.

So, although we don't have a foolproof and mature solution yet, it's going to help if we have one in the future. In my experience with Cassandra, I have seen people prefer high-level clients that abstract things over low-level native Thrift clients. So I find a desire and a use case for such libraries and tools. But I agree that this topic is debatable, people have differing views, and things are grey at this stage.
