NoSQL Zone is brought to you in partnership with:

Max De Marzi, is a seasoned web developer. He started building websites in 1996 and has worked with Ruby on Rails since 2006. The web forced Max to wear many hats and master a wide range of technologies. He can be a system admin, database developer, graphic designer, back-end engineer and data scientist in the course of one afternoon. Max is a graph database enthusiast. He built the Neography Ruby Gem, a rest api wrapper to the Neo4j Graph Database. He is addicted to learning new things, loves a challenge and finding pragmatic solutions. Max is very easy to work with, focuses under pressure and has the patience of a rock. Max is a DZone MVB and is not an employee of DZone and has posted 60 posts at DZone. You can read more from them at their website. View Full User Profile

# Using Gremlin and Transaction to Solve the Problem of Maximum Flow

02.24.2012
| 2568 views |

The maximum flow problem was formulated by T.E. Harris as follows:

Consider a rail network connecting two cities by way of a number of intermediate cities, where each link of the network has a number assigned to it representing its capacity. Assuming a steady state condition, a nd a maximal flow from one given city to the other.

Back in the mid 1950s the US Military had an interest in finding out how much capacity the Soviet railway network had to move cargo from the Western Soviet Union to Eastern Europe. This lead to the Maximum Flow problem and the Ford–Fulkerson algorithm to solve it.

If you’ve been reading the Neo4j Gremlin Plugin documentation, you’ll remember it has a section on Flow algorithms with Gremlin. Let’s add a couple of things and bring this example to life.

We’re going to be modeling a simple railway system that needs to transport cargo from California to Illinois. A couple of direct routes exist, and additionally a route going through Texas. First step is to create our graph:

```def create_graph
neo = Neography::Rest.new
graph_exists = neo.get_node_properties(1)
return if graph_exists && graph_exists['name']

states = [{:name => "California", :coordinates => [-119.355165,35.458606]},
{:name => "Illinois",   :coordinates => [ -88.380238,41.278216]},
{:name => "Texas",      :coordinates => [ -97.388631,30.943149]}]

commands = states.map{ |n| [:create_node, n]}

states.each_index.map do  |n|
commands << [:add_node_to_index, "states_index", "name", states[n][:name], "{#{n}}"]
end

commands << [:create_relationship, "connected", "{#{0}}", "{#{1}}", {:capacity => 1}]
commands << [:create_relationship, "connected", "{#{0}}", "{#{1}}", {:capacity => 2}]
commands << [:create_relationship, "connected", "{#{0}}", "{#{2}}", {:capacity => 1}]
commands << [:create_relationship, "connected", "{#{2}}", "{#{1}}", {:capacity => 3}]

batch_result = neo.batch *commands
end
```

You’ve seen me do this a few times already, so I won’t spend too much time on it. Just notice we’re adding the states names to an index, and using the Batch REST command to create it all at once. We’ll write our max_flow method next:

```def max_flow
neo = Neography::Rest.new
neo.execute_script("source = g.idx('states_index')[[name:'California']].iterator().next();
sink = g.idx('states_index')[[name:'Illinois']].iterator().next();

max_flow = 0;
g.setMaxBufferSize(0);
g.startTransaction();

source.outE.inV.loop(2){
!it.object.equals(sink)}.
paths.each{
flow = it.capacity.min();
max_flow += flow;
it.findAll{
it.capacity}.each{
it.capacity -= flow}
};
g.stopTransaction(TransactionalGraph.Conclusion.FAILURE);

max_flow;")
end
```

Let’s take a closer look at a few things. We use the index to look up our source (start) and sink (end) nodes, and use iterator().next() to get the first node from the Gremlin Groovy Pipeline returned by the index lookup. We also create a variable max_flow where our answer will go.

```source = g.idx('states_index')[[name:'California']].iterator().next();
sink = g.idx('states_index')[[name:'Illinois']].iterator().next();
max_flow = 0;
```

We then set the transaction mode to manual by setting the MaxBufferSize to zero and start a new transaction. I’ll explain why in a minute.

```g.setMaxBufferSize(0);
g.startTransaction();
```

From our source, we go to a neighboring node looping these two outE.inV steps until we reach the sink node.

```source.outE.inV.loop(2){
!it.object.equals(sink)}.
```

For each path we find the lowest capacity along the edges we traversed using the min() function and add it to the max_flow variable we created earlier.

```paths.each{
flow = it.capacity.min();
max_flow += flow;
```

Then we subtract the flow from the capacity property of each of the edges in our path. Take note we are actually altering data in this step.

```it.findAll{
it.capacity}.each{
it.capacity -= flow}
};
```

At the end we return max_flow which has the answer to our question.

`max_flow;`

If you tried to run this method again, or tried to run a similar method using different sinks and sources that traveled over these nodes you’ll have a problem. The capacities were modified and will most likely be zero or show the residual capacity of the transportation network we built.

So to prevent this we stop the transaction with a Failure. The changes we made to capacity are not committed and the graph stays the way it was.

```g.stopTransaction(TransactionalGraph.Conclusion.FAILURE);
```

We can visualize this example using D3.js and its Geo Path projections:

As usual, all code is available on github. The max flow and related problems manifest in many ways. Water or sewage through underground pipes, passengers on a subway system, data through a network (the internet is just a series of tubes!), roads and highway planning, airline routes, even determining which sports teams have been eliminated from the playoffs.

Published at DZone with permission of Max De Marzi, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags:
"Starting from scratch" is seductive but disease ridden