Scoping variables for a query
You have already learned how you can specify a variable for a node or relationship in a query:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks'
RETURN m.title AS movies
In this query the variable p is used to test each Person node against the value Tom Hanks. The variable m is used to return the movie titles.
You can define and initialize variables to be used in the query with a WITH
clause.
WITH 'Tom Hanks' AS actorName
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = actorName
RETURN m.title AS movies
Before the MATCH
clause, we define a variable, actorName to have a value of Tom Hanks.
The variable, actorName is in the scope of the query, so it can be used like a parameter.
The query itself can be reused with a different value for actorName.
You will learn about using parameters in Cypher later in this course.
Using WITH
to redefine scope
Let’s look at scoping variables in more detail. Suppose we have this query:
WITH 'toy story' AS mt, 'Tom Hanks' AS actorName
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = actorName
AND toLower(m.title) CONTAINS mt
RETURN m.title AS movies
For this query, mt and actorName are within scope of the MATCH
clause that also uses the WHERE
clause.
It retrieves the Person node, then all the movies that Tom Hanks acted in, then it filters and returns the movies that contain mt.
Now, let’s look at this query:
WITH 'toy story' AS mt, 'Tom Hanks' AS actorName
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WITH m, toLower(m.title) AS movieTitle
WHERE p.name = actorName
AND movieTitle CONTAINS mt
RETURN m.title AS movies, movieTitle
The variables mt, and actorName are available to the MATCH
clause and the WHERE
clause just like the previous query.
What is different here, however, is that we must add the m to the second WITH
clause so that the node can be used to return the title of the node.
A WITH
clause is used to define or redefine the scope of variables.
Because we want to redefine what is used for the WHERE
clause, we add a new WITH
clause.
This creates a new scope for the remainder of the query so that m and movieTitle can be used to return values.
If you were to remove the m in the second WITH
clause, the query would not compile.
Notice also that you can define expressions using WITH
.
When you define an expression (for example, toLower(m.title), you must specify an alias defined with the AS
keyword.
Limiting results
Suppose we have this query where we want to return only two rows:
WITH 'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
RETURN m.title AS movies LIMIT 2
Another way to write this query is:
WITH 'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
WITH m LIMIT 2
// possibly do more with the two m nodes
RETURN m.title AS movies
With this query, two Movie nodes are retrieved. What is different here is that you can use WITH
to limit how many m nodes are used later in the query.
Passing nodes on to the next MATCH
clause is called pipelining that you will learn about in the next lesson.
Ordering results
If you are limiting the nodes to process further on in the query or for the RETURN
clause, you can also order them:
WITH 'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
WITH m ORDER BY m.year LIMIT 5
// possibly do more with the five m nodes in a particular order
RETURN m.title AS movies, m.year AS yearReleased
Using map projections in a WITH
clause
Here is another example where we want to customize the properties of nodes that are returned. Using a map projection, you can specify which properties are returned. This type of customization of nodes returned is very useful when you are integrating with an application.
MATCH (n:Movie)
WHERE n.imdbRating IS NOT NULL
AND n.poster IS NOT NULL
WITH n {
.title,
.year,
.languages,
.plot,
.poster,
.imdbRating,
directors: [ (n)<-[:DIRECTED]-(d) | d { tmdbId:d.imdbId, .name } ]
}
ORDER BY n.imdbRating DESC LIMIT 4
RETURN collect(n)
This query returns a subset of the data in a Movie node. It returns the top four rated movies. Because we have specified a limit of four, only 4 objects with the specified properties are added to the list. This type of data returned is commonly used by GraphQL and JavaScript applications.
Although this is nice for processing on the client side, it takes more memory on the server as records cannot be streamed to the client but are collected into the list structure on the server.
Check your understanding
1. Scoping variables
Here is a query to return the name of the actor (Clint Eastwood) and all the movies that he acted in that contain the string 'high'. How do you complete this query so it can return the desired results?
Once you have selected your option, click the Check Results query button to continue.
WITH 'Clint Eastwood' AS a, 'high' AS t
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
/*select:WITH p, m, toLower(m.title) AS movieTitle*/
WHERE p.name = a
AND movieTitle CONTAINS t
RETURN p.name AS actor, m.title AS movie
-
❏
WITH toLower(m.title) AS movieTitle
-
❏
WITH p, toLower(m.title) AS movieTitle
-
❏
WITH m, toLower(m.title) AS movieTitle
-
✓
WITH p, m, toLower(m.title) AS movieTitle
Hint
The WITH
clause removes variables from scope for the RETURN
clause so you must add them back to the scope.
Solution
The correct answer is: WITH p, m, toLower(m.title) AS movieTitle
. movieTitle is created during the query and is passed down for use in the WHERE
clause. In order to use the variables p and m, they must also be included.
If you do not include both p and m in the WITH
clause (which re-scopes variables) these variables cannot be used later in the query and in the RETURN
clause.
2. Cypher scoping
What Cypher keyword is used to redefine the scope of variables in a query?
-
❏ MATCH
-
❏ RETURN
-
✓ WITH
-
❏ SCOPE
Hint
This clause can be used anywhere in a query to redefine scope. It is not SCOPE
.
Solution
The correct answer is: WITH
Summary
In this lesson, how WITH
is used to scope variables in a query and how you can limit scope to a subset of data.
In the next challenge, you will write a query that scopes variables.