NoSQL Zone is brought to you in partnership with:

Jason Whaley is a Java developer specializing in service oriented architectures, enterprise integration, cloud computing, and continuous integration. Previously, Jason has worked in multiple roles for both public companies as well as government institutions in a variety of roles for several broad ranging Java based projects. Presently, Jason works at Nodeable as a platform developer where he is helping build a next generation infrastructure monitoring and analytics tool. He is also a contributor to The Basement Coders Podcast. Jason is a DZone MVB and is not an employee of DZone and has posted 17 posts at DZone. You can read more from them at their website. View Full User Profile

Index Sub-Documents in MongoDB

02.18.2013
| 2109 views |
  • submit to reddit

Let’s say I have a collection of documents that all kind of look like this:

> db.foos.find().pretty()
{
    "_id" : ObjectId("511fe286777e76dfdddbf440"),
    "foo" : "bar",
    "timestamp" : ISODate("2013-02-16T20:02:39.417Z")
}

Then, all of a sudden I get a requirement handed to me that this collection needs to be searched for all documents with a timestamp within a given hour, on a given month, within a given year. Writing a query for that is going be a bit gnarly, so we create a sub-document in our documents that makes this query easier.

{
    "_id" : ObjectId("511fe286777e76dfdddbf440"),
    "foo" : "bar",
    "timestamp" : ISODate("2013-02-16T20:02:39.417Z"),
    "timefields" : {
        "y" : 2013,
        "mo" : 2,
        "d" : 16,
        "h" : 20,
        "mi" : 02,      
    }
}

So now my query will look like the following and all will be happy in the world. And yes, I could have put each of these elements at the top level, but bear with me for demonstration purposes…

> db.foos.find({"timefields.y":2013, "timefields.mo": 2, "timefields.h":20}).pretty()
{
    "_id" : ObjectId("511fe286777e76dfdddbf440"),
    "foo" : "bar",
    "timestamp" : ISODate("2013-02-16T20:02:39.417Z"),
    "timefields" : {
        "y" : 2013,
        "mo" : 2,
        "d" : 16,
        "h" : 20,
        "mi" : 2
    }
}

Ok, but what about that index that was (hypothetically) on timestamp? Because we are searching other fields we aren’t using indexes in our query and the search is happening in linear time. The horror!

Well, mongo is flexible if it isn’t anything else. You can just as easily index fields in a sub-document like you can any other field. For instance:

> var indices = { "timefields.h" : 1,
... "timefields.y" : 1,
... "timefields.mo" : 1,
... "timefields.d" : 1,
... "timefields.m" : 1}
> db.foos.ensureIndex(indices)

> db.foos.getIndices()
[
    ...
    {
        "v" : 1,
        "key" : {
            "timefields.h" : 1,
            "timefields.y" : 1,
            "timefields.mo" : 1,
            "timefields.d" : 1,
            "timefields.m" : 1
        },
        "ns" : "test.foos",
        "name" : "timefields.h_1_timefields.y_1_timefields.mo_1_timefields.d_1_timefields.m_1"
    }
]

And now when I run explain on the cursor using the same query I performed before, you can now see those indices are being used:

> db.foos.find({"timefields.y":2013, "timefields.mo": 2, "timefields.h":20}).pretty().explain()
{
    ...
    "indexBounds" : {
        "timefields.h" : [
            [
                20,
                20
            ]
        ],
        "timefields.y" : [
            [
                2013,
                2013
            ]
        ],
        "timefields.mo" : [
            [
                2,
                2
            ]
        ],
        "timefields.d" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ],
        "timefields.m" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
}


 

Published at DZone with permission of Jason Whaley, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)