Cypher Projections

Introduction

While the native projection requires less configuration, its filtering and aggregation capabilities aren’t as flexible as Cypher. The Cypher projection, as its name implies, uses Cypher to define the projection pattern and enables more flexibility.

Whether a Cypher or a native projection is faster depends on your query. It's easiest to compare the two against your use case.

In this lesson, we will go over the Cypher projection syntax, an applied example, where Cypher projections are useful, and common strategies for transitioning from Cypher to native projections as workflows mature.

Syntax

To create a Cypher projection, you define a Cypher query and call the gds.graph.project aggregation function inside it.

cypher
MATCH (sourceNode:Node)-[r:RELATIONSHIP]->(targetNode:Node)
WITH gds.graph.project(
    graphName: String,
    sourceNode: Node or Integer,
    targetNode: Node or Integer,
    dataConfig: Map,
    configuration: Map
) AS g
RETURN g

A Cypher projection takes two mandatory arguments: graphName and sourceNode. For many use cases, you would also define a targetNode. In addition, the optional dataConfig and configuration parameters allow us to further configure graph creation.

| Name | Optional | Description |
|---|---|---|
| graphName | no | The name under which the graph is stored in the catalog. |
| sourceNode | no | The source node of the relationship. Must not be null. |
| targetNode | yes | The target node of the relationship. The targetNode can be null (for example due to an OPTIONAL MATCH), in which case the source node is projected as an unconnected node. |
| dataConfig | yes | Properties and labels configuration for the source and target nodes as well as properties and type configuration for the relationship. |
| configuration | yes | Additional parameters to configure the projection. |
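To make the table above more concrete, here is a minimal sketch that uses a null targetNode and a dataConfig map together. It assumes GDS 2.4 or later and the Actor, Movie, and ACTED_IN names used elsewhere in this lesson. The graph name 'actors-and-movies' is hypothetical, and the dataConfig keys (sourceNodeLabels, targetNodeLabels, relationshipType) are the ones we understand GDS to accept, so check them against your installed version.

```cypher
// Sketch only: keep actors without a matching movie as unconnected nodes,
// and carry node labels and the relationship type into the projection.
MATCH (source:Actor)
OPTIONAL MATCH (source)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
    'actors-and-movies',   // hypothetical graph name
    source,
    target,                // null when the OPTIONAL MATCH finds nothing
    {
        sourceNodeLabels: labels(source),
        targetNodeLabels: labels(target),
        relationshipType: type(r)
    }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

Without the label and type entries, the projection would still work, but the in-memory graph would not distinguish Actor from Movie nodes.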

Applied Example

In the last lesson, we answered which actors were most prolific based on the number of movies they acted in. Suppose instead we wanted to know which actors are the most influential in terms of the number of other actors they have appeared with in recent, high-grossing movies.

For the sake of this example, we will call a movie “recent” if it was released in or after 1990, and “high-grossing” if it had revenue of at least $1 million.

The graph is not set up to answer this question well with a direct native projection. However, we can use a Cypher projection to filter to the appropriate nodes and perform an aggregation to create an ACTED_WITH relationship, with an actedWithCount property, that goes directly between actor nodes.

The Cypher query to create this data set would be:

cypher
MATCH (source:Actor)-[r:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(target)
WHERE m.year >= 1990 AND m.revenue >= 1000000
RETURN source.name, count(r) as actedWithCount

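The query returns each actor’s name together with the number of times they appeared alongside another actor in a recent, high-grossing movie. Because count(r) counts one row per movie and co-actor pair, a movie with several co-stars contributes once for each of them.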
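Before projecting, it can help to look at the per-pair counts that will become the relationship weights. The following is a sketch of the same query grouped by both actors; the aliases actor and coActor are ours, not part of the data set.

```cypher
// Sketch: the same filter as above, grouped by actor pair instead of
// by source actor, to preview the future ACTED_WITH weights.
MATCH (source:Actor)-[r:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(target)
WHERE m.year >= 1990 AND m.revenue >= 1000000
RETURN source.name AS actor, target.name AS coActor, count(r) AS actedWithCount
ORDER BY actedWithCount DESC
LIMIT 10
```

Each pair appears twice, once in each direction, because the pattern matches from either actor's side.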

You can use this query to create a Cypher projection.

cypher
MATCH (source:Actor)-[r:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(target)
WHERE m.year >= 1990 AND m.revenue >= 1000000
WITH source, target, count(r) as actedWithCount
WITH gds.graph.project(
    'cypher-proj',
    source,
    target,
    { relationshipProperties:
        {
            actedWithCount: actedWithCount
        }
    }
) AS g
RETURN
  g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels

Note how the source and target nodes are included as parameters. The actedWithCount property is included in the relationshipProperties in the dataConfig parameter.
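If you rerun the projection query, it will likely fail because a graph named 'cypher-proj' already exists in the catalog. A minimal sketch for clearing it first, assuming the standard gds.graph.drop procedure with its second argument (failIfMissing) set to false so that a missing graph is not an error:

```cypher
// Sketch: drop the projection if it exists, do nothing otherwise.
CALL gds.graph.drop('cypher-proj', false)
YIELD graphName
RETURN graphName
```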

Once the projection is created, we can apply degree centrality as we did in the last lesson, except that we will weight it by the actedWithCount property and stream the top 10 results directly. The resulting score counts how many times the actor has appeared alongside other actors in recent, high-grossing movies.

cypher
CALL gds.degree.stream('cypher-proj',{relationshipWeightProperty: 'actedWithCount'})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 10

The results include some big-name actors, as we would expect.

| name | score |
|---|---|
| Robert De Niro | 123.0 |
| Bruce Willis | 120.0 |
| Johnny Depp | 102.0 |
| Denzel Washington | 99.0 |
| Nicolas Cage | 90.0 |
| Julianne Moore | 87.0 |
| Brad Pitt | 87.0 |
| Samuel L. Jackson | 85.0 |
| George Clooney | 84.0 |
| Morgan Freeman | 84.0 |
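For comparison, here is a sketch of the unweighted variant. Because the projection holds one ACTED_WITH relationship per actor pair, leaving out relationshipWeightProperty should count distinct co-actors rather than co-appearances. The alias distinctCoActors is ours.

```cypher
// Sketch: unweighted degree on the same projection.
// Each relationship counts as 1, so the score is the number of co-actors.
CALL gds.degree.stream('cypher-proj')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score AS distinctCoActors
ORDER BY distinctCoActors DESC LIMIT 10
```

Comparing the two rankings shows whether an actor's score comes from working with many different people or from repeatedly working with the same few.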

Flexibility of Cypher Projections

In the above example, there were two things that prevented us from directly using a native projection. They also happen to be two of the most common cases for using Cypher Projections.

  1. Complex Filtering: Using node and/or relationship property conditions or other more complex MATCH/WHERE conditions to filter the graph, rather than just node label and relationship types.

  2. Aggregating Multi-Hop Paths with Weights: The relationship projection required aggregating the (Actor)-[:ACTED_IN]->(Movie)<-[:ACTED_IN]-(Actor) pattern into an (Actor)-[:ACTED_WITH {actedWithCount}]->(Actor) pattern, where actedWithCount is a relationship weight property. This type of projection, where we need to transform multi-hop paths into an aggregated relationship that connects the source and target node, is a commonly occurring pattern in graph analytics.

Further options enabled by Cypher projections include merging different node labels and relationship types and defining virtual relationships between nodes based on property conditions or other query logic.
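As an illustration of merging relationship types, the sketch below projects two types under a single name. It assumes the graph also contains a Person label and a DIRECTED relationship type, neither of which appears in this lesson. The graph name 'worked-on-sketch' and the merged type WORKED_ON are hypothetical.

```cypher
// Sketch: merge ACTED_IN and DIRECTED into one projected type.
// Person and DIRECTED are assumptions about the underlying data set.
MATCH (source:Person)-[r:ACTED_IN|DIRECTED]->(target:Movie)
WITH gds.graph.project(
    'worked-on-sketch',                  // hypothetical graph name
    source,
    target,
    { relationshipType: 'WORKED_ON' }    // hypothetical merged type
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

A person who both acted in and directed the same movie would get two parallel WORKED_ON relationships here, which may or may not be what your algorithm expects.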

Check your understanding

1. Creating a Cypher Projection

Select the correct Cypher statement to create a Cypher projection between Customer and Product nodes:

cypher
MATCH (customer:Customer)-[:PURCHASED]->(product:Product)
/* select one of the options below */ AS g
RETURN g

  • WITH gds.graph.project('proj', source)

  • WITH gds.graph.project('proj', source, target)

  • WITH gds.graph.project('proj', customer, product)

  • WITH gds.graph.project('proj', ['Customer', 'Product'])

Hint

gds.graph.project expects the source and target nodes as defined in the MATCH clause.

Solution

The answer is WITH gds.graph.project('proj', customer, product) as the MATCH clause defines the customer and product nodes.

2. Cypher Projection Use Cases

Which of the situations below may be good use cases for Cypher projections? Select all that apply.

  • ❏ Need to filter the projection to only specific node labels and relationship types

  • ✓ Need to aggregate multi-hop paths into weighted relationships

  • ❏ Need to change the orientation of relationships

  • ✓ Need to filter the projection to a small community of nodes

Hint

Cypher projections allow you to create more specific graph projections including aggregations and advanced filtering.

Solution

The correct answers are:

Need to aggregate multi-hop paths into weighted relationships
Need to filter the projection to a small community of nodes

3. Cypher Projection Usage

Which of the statements below are true? Select all that apply.

  • ✓ Cypher projections can offer more customization and flexibility than native projections

  • ❏ Cypher projections are not as flexible as native projections, but they are faster and scale better to larger graphs

  • ✓ Cypher projections enable projecting small subsets of the graph

  • ❏ Cypher projections are the only option for projecting graphs while aggregating parallel relationships.

Hint

Cypher projections provide greater customization and flexibility compared to native projections.

Solution

The correct answers are:

Cypher projections can offer more customization and flexibility than native projections
Cypher projections enable projecting small subsets of the graph

Summary

In this lesson, we learned what Cypher projections are, and how and when to use them.

In the next lesson, you will be challenged to create your own Cypher projection.