NoSQL Zone is brought to you in partnership with:

Mark is a graph advocate and field engineer for Neo Technology, the company behind the Neo4j graph database. As a field engineer, Mark helps customers embrace graph data and Neo4j building sophisticated solutions to challenging data problems. When he's not with customers Mark is a developer on Neo4j and writes his experiences of being a graphista on a popular blog at http://markhneedham.com/blog. He tweets at @markhneedham. Mark is a DZone MVB and is not an employee of DZone and has posted 553 posts at DZone. You can read more from them at their website. View Full User Profile

# Neo4j / Cypher: Translating 1.9 FILTER Queries to Use 2.0 List Comprehensions

10.10.2013
| 3463 views |

I was looking back over some Cypher queries I’d written earlier in the year against my football data set to find some examples of where list comprehensions could be useful and I came across this query which is used to work out which teams were the most badly behaved in terms of accumulating red and yellow cards:

```START team = node:teams('name:*')
MATCH team<-[:for]-like_this<-[:started|as_sub]-player-[r?:sent_off_in|booked_in]->game<-[:in]-like_this
WITH team, COLLECT(r) AS cards
WITH team,
FILTER(x IN cards: TYPE(x) = "sent_off_in") AS reds,
FILTER(x IN cards: TYPE(x) = "booked_in") AS yellows
RETURN team.name, LENGTH(reds) AS redCards, LENGTH(yellows) AS yellowCards
ORDER BY (yellowCards*1 + redCards*3) DESC```

We start by getting all the teams, work out which players played in that game, then work out who got booked or sent off before separating the yellow/red cards into their own respective collections. Finally we assign 3 points for a red card and 1 point for a yellow card and order the teams based on that.

```==> +-------------------------------------------------+
==> | team.name              | redCards | yellowCards |
==> +-------------------------------------------------+
==> | "Stoke City"           | 4        | 81          |
==> | "Newcastle United"     | 4        | 74          |
==> | "Aston Villa"          | 3        | 74          |
==> | "West Ham United"      | 1        | 74          |
==> | "West Bromwich Albion" | 4        | 63          |
==> | "Sunderland"           | 3        | 63          |
==> | "Wigan Athletic"       | 2        | 66          |
==> | "Manchester City"      | 3        | 62          |
==> | "Everton"              | 3        | 62          |
==> | "Queens Park Rangers"  | 3        | 60          |
==> | "Swansea City"         | 2        | 59          |
==> | "Norwich City"         | 1        | 60          |
==> | "Chelsea"              | 3        | 53          |
==> | "Liverpool"            | 2        | 54          |
==> | "Manchester United"    | 1        | 57          |
==> | "Tottenham Hotspur"    | 2        | 54          |
==> | "Arsenal"              | 5        | 44          |
==> | "Fulham"               | 3        | 48          |
==> | "Southampton"          | 2        | 44          |
==> | "Reading"              | 1        | 45          |
==> +-------------------------------------------------+
==> 20 rows```

Unfortunately if we run that query on a 2.0.0-M05 neo4j database we’ll get the following error:

```==> SyntaxException: Invalid input '(': expected an identifier character, whitespace, NodeLabel, '.', '[', node labels, "=~", IN, IS, '*', '/', '%', '^', '+', '-', '<', '>', "<=", ">=", '=', "<>", "!=", AND, XOR, OR, WHERE, ')' or ',' (line 1, column 207)
==> "START team = node:teams('name:*')  MATCH team<-[:for]-like_this<-[:started|as_sub]-player-[r?:sent_off_in|booked_in]->game<-[:in]-like_this  WITH team, COLLECT(r) AS cards WITH team, FILTER(x IN cards: TYPE(x) = "sent_off_in") AS reds, FILTER(x IN cards: TYPE(x) = "booked_in") AS yellows RETURN team.name, LENGTH(reds) AS redCards, LENGTH(yellows) AS yellowCards ORDER BY (yellowCards*1 + redCards*3) DESC"
==>```

The syntax when using the FILTER function has changed a bit so we need to use a WHERE clause rather than a ‘:’:

```START team = node:teams('name:*')
MATCH team<-[:for]-like_this<-[:started|as_sub]-player-[r?:sent_off_in|booked_in]->game<-[:in]-like_this
WITH team, COLLECT(r) AS cards
WITH team,
FILTER(x IN cards WHERE TYPE(x) = "sent_off_in") AS reds,
FILTER(x IN cards WHERE TYPE(x) = "booked_in") AS yellows
RETURN team.name, LENGTH(reds) AS redCards, LENGTH(yellows) AS yellowCards
ORDER BY (yellowCards*1 + redCards*3) DESC```

However, as I hinted at the beginning of this post, we can also translate the query to make use of the new for comprehensions. I ended up with the following:

```START team = node:teams('name:*')
MATCH team<-[:for]-like_this<-[:started|as_sub]-player-[r?:sent_off_in|booked_in]->game<-[:in]-like_this
WITH team, COLLECT(r) AS cards
WITH team,
[card IN cards WHERE TYPE(card) = "sent_off_in"] as reds,
[card in cards WHERE TYPE(card) = "booked_in"] as yellows
RETURN team.name, LENGTH(reds) AS redCards, LENGTH(yellows) AS yellowCards
ORDER BY (yellowCards*1 + redCards*3) DESC```

The syntax is similar to Python’s list comprehesions and there are a collection of examples on the manual that you can follow.

A query I wanted to write in May, but had to park until list comprehensions were implemented, is the following one which finds the top 5 scorers for each month and returns their goal tally as well:

```START month = node:months('name:*')
MATCH month-[:in_month]-game-[:scored_in]-player
WITH month, player, COUNT(game) AS games
ORDER BY games DESC
WITH month,
[x IN COLLECT([player.name, games])[0..5] | x[0]] AS players,
[x IN COLLECT([player.name, games])[0..5] | x[1]] AS goals
ORDER BY month.position
RETURN month.name, players, goals```

My previous attempt was littered with calls to HEAD and TAIL but this version is much more concise. This is what it returns:

```==> +--------------------------------------------------------------------------------------------------------------------+
==> | month.name  | players                                                                                | goals       |
==> +--------------------------------------------------------------------------------------------------------------------+
==> | "August"    | ["Michu","Nathan Dyer","Fernando Torres","Mladen Petric","Damien Duff"]                | [3,2,2,2,2] |
==> | "September" | ["Demba Ba","Steven Fletcher","Peter Crouch","Robin Van Persie","Luis Suárez"]         | [5,5,4,4,4] |
==> | "October"   | ["Juan Mata","Wayne Rooney","Jose Fonte","Michu","Grant Holt"]                         | [3,2,2,2,2] |
==> | "November"  | ["Marouane Fellaini","Luis Suárez","Gareth Bale","Sergio Agüero","Olivier Giroud"]     | [4,4,3,3,3] |
==> | "December"  | ["Demba Ba","Wayne Rooney","Robin Van Persie","Michu","Theo Walcott"]                  | [5,5,5,5,4] |
==> | "January"   | ["Adam Le Fondre","Luis Suárez","Robin Van Persie","Frank Lampard","Leighton Baines"]  | [5,4,4,3,3] |
==> | "February"  | ["Gareth Bale","Romelu Lukaku","Moussa Sissoko","Christian Benteke","Santi Cazorla"]   | [5,3,3,3,3] |
==> | "March"     | ["Luis Suárez","Jan Vertonghen","Christian Benteke","Shinji Kagawa","Stewart Downing"] | [4,3,3,3,2] |
==> | "April"     | ["Robin Van Persie","Christian Benteke","Daniel Sturridge","Oscar","Andrew Carroll"]   | [6,4,3,2,2] |
==> | "May"       | ["Grant Holt","Romelu Lukaku","Daniel Sturridge","Kevin Nolan","Theo Walcott"]         | [3,3,3,3,2] |
==> +--------------------------------------------------------------------------------------------------------------------+
==> 10 rows```

If you like cypher and haven’t played around with list comprehensions yet I’d recommend it – this syntax will help a great deal in reducing the complexity of some queries.

Published at DZone with permission of Mark Needham, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags:
"Starting from scratch" is seductive but disease ridden