
Andreas Kollegger is a leading speaker and writer on graph databases and Neo4j and the bridge between community and developer efforts. He works actively in the community, speaking around the world and promoting the larger Neo4j ecosystem of projects. Author of Fair Trade Software, and the lead for Neo4j in the cloud, Andreas plays a valuable role for progressive happenings within Neo4j. Andreas is a DZone MVB and is not an employee of DZone and has posted 68 posts at DZone.

Nodes are People, Too

04.09.2013

 

Milestone Release: Neo4j 2.0.0-M01 lets you define sets of nodes within the graph
Philip Rathle
Product Manager Neo4j


Today we are releasing Neo4j 2.0.0-M01, the first milestone of the Neo4j 2.0 series, which we expect to be generally available (GA) in the next couple of months. This release is significant in that it is the first time since the inception of Neo4j thirteen years ago that we are making a change to the property graph model. Specifically, we will be adding a new construct: labels.
We’ve completed a first cut at what we believe will be a significant improvement in the data model, one that we believe nearly every graph will benefit from. Because it opens up quite a few questions, we are opening the code up now for early comment. Please therefore consider 2.0 to be an experimental release. This first milestone is intended to solicit thinking and feedback. In addition to nailing down the good (and not so good) practices, some terminology is also being sorted out. We’ll do this together, based on your feedback, and the uses you discover. But enough with introductions... let’s talk about the release!

It’s a What?
Let’s say you created a node for a person named Joe. Joe is not just any node: he is a person. Therefore you would probably want to designate the node for Joe as being a “Person”. If you’ve worked with Neo4j before, chances are that you’ve done this by adding a property called “Type” with value “Person”.
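As a minimal sketch of that convention, the node for Joe might be created like this. The `name` property is an assumption made for illustration; only the “Type” property comes from the description above.

```cypher
// Pre-2.0 convention: record the node's kind in an ordinary property.
// "name" is an illustrative assumption; "Type" follows the text above.
CREATE (joe {name: "Joe", Type: "Person"})
RETURN joe;
```

Note that nothing here tells the database that “Type” is special. It is a property like any other.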

This is useful, because now you can differentiate Joe from things in your graph that are quite different, such as “household goods” nodes and “geo location” nodes. Rightly so, these things should receive very different treatment.
Now let’s say you also want to give Joe a party affiliation (e.g. Democrat or Republican). While you could do this with a property as well, you may decide that you want to easily find all people of a given party affiliation (in this case, Joe is a Democrat). Knowing this, you might decide to break the parties into nodes, and then associate Joe with the appropriate party.
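One way to picture this modeling is the sketch below. The relationship type `AFFILIATED_WITH` and the `name` properties are hypothetical names chosen for illustration, not anything prescribed by Neo4j.

```cypher
// Hypothetical pre-2.0 modeling: the party is a node of its own,
// and membership is expressed as a relationship to that node.
// AFFILIATED_WITH and the "name" properties are illustrative assumptions.
CREATE (joe {name: "Joe", Type: "Person"}),
       (party {name: "Democrat", Type: "Party"}),
       (joe)-[:AFFILIATED_WITH]->(party)
RETURN joe, party;
```

Every additional member adds one more relationship to the same party node.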
One thing you’d now naturally want the graph to do is automatically index the “Person” nodes (and no other nodes) according to the unique identifier for “Person”. (Let’s oversimplify and say this is the name.) If you’re using Cypher, this is a challenge today. In fact it’s not possible at all, because Neo4j doesn’t inherently know that a “Person” is different from a geo location. If you want to index “name”, you end up doing it for everything in the graph, which mixes concerns. Geo location names aren’t the same as person names, any more than a city is like a person. As for the “Democrat” node, it ends up becoming extremely dense, cluttering the graph with lots of connections whose sole purpose is to designate nodes as belonging to a group.
We’ve been looking at better ways to do this. The ideal solution would help to make one’s graph more understandable, as well as to make Cypher more powerful, by allowing it to home in on nodes (as well as to index them) according to what they are.
2.0 therefore introduces a means of grouping or categorizing nodes. Provisionally we are calling this construct a “Label”. The term “Label” speaks to its generic use, and to the fact that nodes can have multiple labels. However, their foremost use is very likely going to be assigning types to nodes. This makes the word “Type” also quite compelling.
Because we are still unsure of the best name, and because this release is experimental, we are going to use the term “Type-Labels” in this blog post. Yes, this sounds a little silly. And yes, the documentation calls these “Labels” for now. But we’d rather be extra clear that this decision hasn’t been made yet.

Type-Labels
A graph is a graph because it has relationships in the data. In a Property Graph, a relationship always has a type, describing how two nodes are related. Type-Labels expand on that idea, describing how entire sets of nodes are related. This is a grouping mechanism for nodes. How does it work? It is very simple: in the example above, rather than adding a “Type” property and connecting Joe to a Party node, you would add two Type-Labels: one for “Person”, and one for “Democrat”.
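A minimal sketch of the labeled version of Joe, written in the same style as the Cypher examples later in this post. The `name` property is an assumption for illustration, and the exact syntax may still change during the milestone series.

```cypher
// One node, two Type-Labels, no "Type" property and no party node.
// "name" is an illustrative assumption.
CREATE (joe:Person:Democrat {name: "Joe"})
RETURN joe, labels(joe);
```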

This opens up quite a few possibilities, and probably stirs up a lot of ideas in your head. Rather than color your thinking about how to use Type-Labels, let’s look at an example using different color sets.
Color me happy
Let’s say we have an arbitrary domain of loosely related stuff, within which we at least know that things can be red, green, or blue. We could just add a “color” property to each node, or relate them to a value node for each color. But because we want to always work within this group, we’ll use Type-Labels to identify members of the sets.

First, create something red:

CREATE a node with a Type-Label


CREATE (thing:Red {uid: "TK-421", make: 191860 })
RETURN thing;
To find the thing we just created, we can search within just the Red nodes, then return the labels:

Find the Type-Labels on a node

MATCH (thing:Red) WHERE thing.uid = "TK-421" RETURN labels(thing);
Why labels, plural? Because nodes can have multiple labels. Let's say that "TK-421" also belongs to the blue set. Add a blue label like this:

Add a Type-Label to a node

MATCH (thing:Red) WHERE thing.uid = "TK-421" SET thing :Blue;
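Once a node carries both labels, it should be possible to ask for the intersection of the two sets by listing both labels in the pattern. This is a sketch that assumes the multi-label pattern syntax behaves in MATCH as it does in CREATE.

```cypher
// Find nodes that are members of both the Red and the Blue set.
MATCH (thing:Red:Blue)
RETURN thing.uid, labels(thing);
```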
The benefits of intentional (type) labeling
While some Danes may be nervous about labels, much good comes from their use. Applying a Type-Label to a set of nodes makes your intention obvious — "these nodes are accessed frequently and thought of as a group." The database itself can gain benefit from having your intention be explicit, because it can now do things with this information.
For starters, Neo4j can create indexes that will improve the performance when looking for nodes within the set. (Note the new Cypher syntax for index creation!):

CREATE INDEXES to speed up finding Red and Blue nodes

CREATE INDEX ON :Red(uid);
CREATE INDEX ON :Blue(uid);
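The same syntax addresses the earlier wish to index people by name without indexing everything else in the graph. The sketch below assumes nodes labeled `Person` with a `name` property, as in the Joe example; whether a given query actually uses the index is up to the database.

```cypher
// Index only the nodes in the Person set, by name.
CREATE INDEX ON :Person(name);

// A lookup scoped to the Person set, which is the kind of query
// such an index is meant to speed up.
MATCH (p:Person)
WHERE p.name = "Joe"
RETURN p;
```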

Create a second labeled node and a relationship

CREATE (other_thing:Blue {uid: "TURK-182", make: 181663})
WITH other_thing
MATCH (thing:Red)
WHERE thing.uid = "TK-421"
CREATE (thing)-[:HONORS]->(other_thing)
RETURN thing, other_thing;
There is much more fun to be had. Details are, as always, in the Neo4j Manual. Again, this simple change can have profound impact. As we're exploring the possibilities and tuning the language and APIs, we'd love for you to play around with Type-Labels. Let us know how you want to use them, by providing feedback on the Google Group. (That way other people can see your feedback and respond with their own opinions and observations.)
One more thing...

Just in CASE


Cypher has a new CASE expression for mapping inputs to result values: a cousin to similar constructs found in every common programming language.  
  • In its simple form, CASE uses a direct comparison of a property for picking the result value from the first matching WHEN:
MATCH (r:Red)
RETURN CASE r.uid
  WHEN "TK-421" THEN "Why aren’t you at your post?"
  WHEN "TURK-182" THEN "the work of one man"
  ELSE "..."
END
  • In the general form, each WHEN uses an arbitrary predicate for picking the result:

MATCH (r:Red)
RETURN CASE
  WHEN r.make > 180000 THEN "reddish"
  WHEN r.make < 180000 THEN "purplish"
  ELSE "simply red"
END
Summary
Enjoy this preview milestone! Use the Neo4j Google Group to tell the Neo4j team and other members of the Neo4j community what you think. There are a few other improvements baked into this release as well, including to the shell, which we'll cover in upcoming blog posts. And of course you'll be seeing more in upcoming milestones of Neo4j 2.0. Meanwhile, we have upgraded a preview of the online console so you can test the new features; it now features the Matrix graph enhanced with labels. One final note: if you are planning to go into production soon, we strongly recommend developing against 1.9, which we expect to go GA in the next couple of weeks (look for an RC this week).
Published at DZone with permission of Andreas Kollegger, author and DZone MVB. (source)
