Mike is currently working as a senior LAMP architect for a San Francisco start-up. With over 30 years of experience in the programming field, his only comment is "I should know better by now." Micheal is a DZone MVB and is not an employee of DZone and has posted 7 posts at DZone. You can read more from them at their website.

Searching MongoDB Sub-Documents…

02.08.2012

I’ve recently finished a MongoDB collection that stores all of the auditing data from my application. Specifically, it records every database transaction, conducted in either MySQL or MongoDB, assigns an event identifier to each one, and stores the data under that event ID within a single sessionManager object.

Sounds good?

Well, I like it. This design eliminates the need to maintain metadata in my data tables, since I can pull the transaction history for any record that I’ve accessed.

The problem is that I’m new to MongoDB, so getting at what I’ve put into it isn’t (yet) as intuitive to me as querying MySQL.

Sub-documents within a MongoDB document are analogous to the results of a MySQL join. One of the key motivators for storing this information in MongoDB to begin with was that I could de-normalize the data by storing the sub-document with its parent instead of incurring the expense of a search-join-fetch later.

Traditionally, any data objects in a one-to-many (1:m) relationship were stored in multiple MySQL tables and accessed via some sort of join mechanism.

MongoDB breaks that traditional mold by allowing you to store a sub-document (the “m” part of the 1:m relationship) within the same document in which you’re currently working.

My sessionManager document looks something like this:

{
    _id : somevalue,
    foo : bar,
    event : {},
    argle : bargle
}

For every database event that is recorded, I want to store information about that event within the sub-document that I’ve wittily named “event”.

In my PHP code, I’ve written a sequence manager for MongoDB that maintains a document containing sequence values for various tables. Think of it as the functional equivalent of MySQL’s auto-increment feature. For the sessionManager events, I decided to use this sequence to obtain unique values and use those as my sub-document index. I then store whatever data I need under the sequence value, which serves as the sub-document key, or index:

{
    _id : somevalue,
    foo : bar,
    event : {
        n : {
            created : dateval,
            table : tableName,
            schema : dbSchema,
            query : lastQuery
        }
    },
    argle : bargle
}

So, when I need to add another event, I just create a new sub-document under the event key, then add the data I need to store under the sub-document index key.
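
As a rough sketch of what adding an event could look like from the mongo shell: dot notation in a $set update creates the new sub-document under its sequence key without touching the events already stored. The helper name buildEventUpdate, the sample values, and the db.mytable.update call shown in the comment are assumptions for illustration, not the PHP code described above.

```javascript
// Build the update document that stores one event under event.<n>.
// `n` is assumed to be the integer handed out by the sequence manager.
function buildEventUpdate(n, details) {
  const update = { $set: {} };
  // Dot notation targets a single key inside the "event" sub-document,
  // so existing events are left alone.
  update.$set['event.' + n] = details;
  return update;
}

// Hypothetical sample values.
const update = buildEventUpdate(7, {
  created: new Date(0),
  table: 'users',
  schema: 'app',
  query: 'UPDATE users SET ...'
});

// The only key in $set is 'event.7'.
console.log(Object.keys(update.$set));

// Assumed shell usage:
//   db.mytable.update({ foo: 'bar' }, update);
```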

Worked like a champ!

And then I asked myself: “So, Brainiac, how would you go about extracting event -n- from your collection?”

I went through a lot of failed query attempts, bugged a lot of people, and googled my way down many plush ratholes until I finally, through some serious trial and error, got the answer…

> db.mytable.find( { foo : bar }, { 'event.n' : 1 } );

where n = the number of the event I want to find.
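
One thing worth spelling out: typed literally, 'event.n' asks for a key that is actually named n, so in practice the key has to be built from the event number. A minimal sketch, assuming a helper called eventProjection and an event number of 42 (both made up for illustration):

```javascript
// Build a projection that returns only event <n>
// (plus _id, which MongoDB includes by default).
function eventProjection(n) {
  const projection = {};
  projection['event.' + n] = 1;
  return projection;
}

const projection = eventProjection(42);
console.log(JSON.stringify(projection)); // {"event.42":1}

// Assumed shell usage:
//   db.mytable.find({ foo: 'bar' }, projection);
```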

If I want to get all of the events for a particular document (sessionManager object), I would write something like:

> db.mytable.find( {foo : bar}, { event : 1});

If I wanted to return all of the events for all of the objects, then I would write this:

> db.mytable.find( {}, {event : 1});

What I’ve not been able to figure out, so far, is how to use $slice to grab a range of events within a document. Everything I try returns the full set of sub-documents. The documentation tells me that $slice returns a subrange of array elements, which is what I thought “event.n” was but, apparently, it’s not. (I think it’s an object, i.e. a sub-document, which is why $slice fails for me.)
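
That hunch matches how $slice is documented: as a projection operator it applies to array fields. One possible way around it, sketched here under the assumption that the events could be stored as an array instead of as keyed sub-documents, is to carry the sequence value inside each array element. The names eventsToArray, seq, and sliceProjection are hypothetical, and the sample data is made up.

```javascript
// Convert { "3": {...}, "1": {...} } into [ { seq: 1, ... }, { seq: 3, ... } ],
// ordered by sequence value.
function eventsToArray(eventObj) {
  return Object.keys(eventObj)
    .map(Number)
    .sort((a, b) => a - b)
    .map((seq) => Object.assign({ seq: seq }, eventObj[seq]));
}

// Projection that skips `skip` array elements and returns the next `limit`.
function sliceProjection(skip, limit) {
  return { event: { $slice: [skip, limit] } };
}

const asArray = eventsToArray({
  3: { table: 'orders' },
  1: { table: 'users' },
  2: { table: 'carts' }
});

// Sequence values come back in ascending order.
console.log(asArray.map((e) => e.seq));

// Assumed shell usage once "event" is stored as an array:
//   db.mytable.find({ foo: 'bar' }, sliceProjection(1, 2));
```

The trade-off is that a single event can no longer be addressed as event.n; it would have to be matched on the seq value instead.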

It’s not a big deal because, programmatically, I can grab the entire sub-document from its parent and parse it in memory to get the desired record. And, if I know what the value for -n- is, then I can fetch just that one sub-document. So, I’m ok for now. However, please feel free to enlighten me with your expertise and experience should you see where I am failing here, ok?
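
For the in-memory route, something along these lines would do: fetch the whole event sub-document, then keep only the keys that fall inside the wanted range. This is a sketch with made-up data; eventsInRange is a hypothetical helper, and the event keys are assumed to be integer sequence values.

```javascript
// Return the events whose sequence key lies in [from, to], as a new object.
function eventsInRange(eventObj, from, to) {
  const result = {};
  for (const key of Object.keys(eventObj)) {
    const seq = Number(key);
    if (seq >= from && seq <= to) {
      result[key] = eventObj[key];
    }
  }
  return result;
}

// Stand-in for a document fetched with
//   db.mytable.find({ foo: 'bar' }, { event: 1 });
const doc = {
  _id: 1,
  event: {
    10: { table: 'users' },
    11: { table: 'orders' },
    12: { table: 'carts' },
    13: { table: 'users' }
  }
};

console.log(Object.keys(eventsInRange(doc.event, 11, 12))); // [ '11', '12' ]
```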

Published at DZone with permission of Micheal Shallop, author and DZone MVB.
