Projection Practice

Introduction

Now it’s time to put your knowledge into practice. You’ll create both types of projections you learned about:

Monopartite transformations — connecting nodes through shared neighbours
Labelled bipartite projections — preserving two-partition structures

Diagram showing monopartite and bipartite projections with node connections.

What You’ll Learn

By the end of this lesson, you’ll be able to:

Transform bipartite graphs into monopartite networks by traversing intermediate nodes
Create labelled bipartite projections that preserve node types
Match projection strategies to algorithm requirements
Recognize why projecting "everything" rarely produces meaningful results

Exercise 1: Monopartite Transformation

Your database has a bipartite structure: (:Actor)-[:ACTED_IN]->(:Movie)

Transform this into a monopartite Actor-to-Actor network by connecting actors through shared movies.

You have five minutes to try this. If you need help, pop a message in the chat.

The pattern you need traverses Movie nodes without capturing them, creating direct Actor-to-Actor connections. The result is a true monopartite graph in which any actor can potentially connect to any other actor.

Exercise 1: Solution

The following query traverses Movie nodes but passes only the Actor nodes at either end to the projection as source and target.

cypher

MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project('actor-collab', source, target) AS g // (2)
RETURN g.graphName, g.nodeCount, g.relationshipCount // (3)

(1) Match Actors connected through Movies, creating Actor-to-Actor pairs
(2) Project a graph named 'actor-collab' using the matched source and target nodes
(3) Return statistics about the created projection
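To see which pairs the pattern in (1) produces, here is a minimal Python sketch on made-up data. The actor and movie names and the project_actor_pairs helper are hypothetical and only illustrate the idea; this is not how GDS builds the projection. The source != target check stands in for Cypher's rule that a single relationship is not traversed twice within one pattern.

```python
from collections import defaultdict

# Hypothetical toy data: (actor, movie) pairs standing in for ACTED_IN relationships.
acted_in = [
    ("Keanu", "The Matrix"),
    ("Carrie-Anne", "The Matrix"),
    ("Keanu", "Speed"),
    ("Sandra", "Speed"),
]

def project_actor_pairs(acted_in):
    """Return one (source, target) pair per shared movie, in both directions."""
    cast = defaultdict(list)
    for actor, movie in acted_in:
        cast[movie].append(actor)
    pairs = []
    for actors in cast.values():
        for source in actors:
            for target in actors:
                # Approximates Cypher's relationship uniqueness within a pattern.
                if source != target:
                    pairs.append((source, target))
    return pairs

pairs = project_actor_pairs(acted_in)
print(sorted(pairs))
```

Each pair appears in both directions, and the Movie names never reach the output, which mirrors how the Cypher pattern matches Movies without returning them.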

Why This Works for PageRank

In your Actor-to-Actor network:

Actors connect directly to other Actors
Importance can flow freely between nodes

Actor-to-Actor network with direct connections and importance flow.
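The idea of importance flowing between nodes can be sketched with a textbook power-iteration PageRank in Python. This is an illustration under simplifying assumptions, not the GDS implementation: the pagerank function, the toy names, and the handling of nodes without outgoing links are all hypothetical choices made for this example. The damping factor of 0.85 and 20 iterations are used here as conventional values.

```python
def pagerank(edges, damping=0.85, iterations=20):
    """Textbook power iteration; score reaching a node with no out-links is dropped."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [t for s, t in edges if s == node] for node in nodes}
    scores = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the same base score.
        new_scores = {node: (1 - damping) / len(nodes) for node in nodes}
        for source, targets in out_links.items():
            for target in targets:
                # A node splits its damped score evenly among its out-links.
                new_scores[target] += damping * scores[source] / len(targets)
        scores = new_scores
    return scores

# Hypothetical monopartite network: each collaboration appears in both directions.
collab = [("Keanu", "Carrie-Anne"), ("Keanu", "Sandra"), ("Keanu", "Hugo")]
collab_scores = pagerank(collab + [(t, s) for s, t in collab])

# Hypothetical bipartite network: relationships only run Actor -> Movie.
acted_in = [("Keanu", "The Matrix"), ("Carrie-Anne", "The Matrix"), ("Sandra", "Speed")]
bipartite_scores = pagerank(acted_in)

print(collab_scores)
print(bipartite_scores)
```

In the monopartite sketch the best-connected actor ends up with the highest score. In the bipartite sketch no relationship points at an actor, so every actor keeps only the base score and the ranking says nothing about them.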

Exercise 2: Run PageRank

Run PageRank on the actor-collab graph in write mode, storing the result as a pageRank property.

Then query the database to find the top 10 actors by PageRank score.

Try this on your own for five minutes.

Exercise 2: Solution

The first query runs PageRank on the actor-collab graph and stores the results in the database:

cypher

CALL gds.pageRank.write('actor-collab', { // (1)
  writeProperty: 'pageRank' // (2)
})
YIELD nodePropertiesWritten // (3)
RETURN nodePropertiesWritten

(1) Run PageRank in write mode on the actor-collab projection
(2) Store the PageRank score in a property named 'pageRank'
(3) Return the count of properties written to the database

Exercise 2: Solution (continued)

The second query retrieves the top 10 actors by PageRank score:

cypher

MATCH (a:Actor) // (1)
WHERE a.pageRank IS NOT NULL // (2)
RETURN a.name AS actor, a.pageRank AS score // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

(1) Match all Actor nodes
(2) Filter to actors with a PageRank score
(3) Return the actor name and their PageRank score
(4) Order by score from highest to lowest
(5) Return only the top 10 results

You should see meaningful rankings—actors ranked by their importance in the collaboration network.

Compare this to the bipartite structure where PageRank produced nearly identical scores for all nodes.

Exercise 3: Movie Network

Create a monopartite Movie-to-Movie network by connecting movies through shared actors.

Exercise 3: Solution

The following query traverses through Actor nodes to connect Movies.

cypher

MATCH (source:Movie)<-[:ACTED_IN]-(:Actor)-[:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project('movie-collab', source, target) AS g // (2)
RETURN g.graphName, g.nodeCount, g.relationshipCount // (3)

(1) Match Movies connected through Actors, creating Movie-to-Movie pairs
(2) Project a graph named 'movie-collab' using the matched source and target nodes
(3) Return statistics about the created projection

Exercise 4: Labelled Bipartite Projection

Create a bipartite projection of (:User)-[:RATED]->(:Movie), preserving node labels.

Try this by yourself for five minutes.

If you need help, pop a message in the chat.

Bipartite projection showing User-Movie connections with preserved labels.

In the solution, the sourceNodeLabels, targetNodeLabels and relationshipType parameters preserve the User and Movie labels and the RATED type. GDS then knows which nodes belong to which partition.

Exercise 4: Solution

cypher

MATCH (source:User)-[r:RATED]->(target:Movie) // (1)
WITH gds.graph.project(
  'user-movie-bipartite', // (2)
  source, // (3)
  target, // (4)
  { // (5)
    sourceNodeLabels: labels(source),
    targetNodeLabels: labels(target),
    relationshipType: type(r)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (6)

(1) Match Users connected to Movies via RATED relationships
(2) Name the projection 'user-movie-bipartite'
(3) Use User nodes as source
(4) Use Movie nodes as target
(5) Configure the projection to preserve node labels and relationship type
(6) Return statistics about the created projection

What Have We Modelled?

What insights can we glean from a graph of (:User)-[:RATED]->(:Movie)?

User-Movie graph with connections and annotated insights.

Take a moment to discuss.

Expected answers: user preferences, taste profiles, recommendation foundations, collaborative filtering data.

Node Similarity

Node Similarity compares nodes based on shared neighbours across the bipartite structure.

In this case: finding users with overlapping taste in movies.
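Node Similarity uses the Jaccard metric by default: the number of shared neighbours divided by the number of distinct neighbours across both nodes. The following Python sketch computes that score for a few made-up users. The names, the rated sets, and the jaccard helper are hypothetical and exist only to illustrate the arithmetic.

```python
def jaccard(a, b):
    """Shared neighbours divided by all distinct neighbours of both nodes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical users and the movies each one rated.
rated = {
    "Alice": {"Toy Story", "Heat", "Casino"},
    "Bob": {"Toy Story", "Heat"},
    "Carol": {"Jumanji"},
}

alice_bob = jaccard(rated["Alice"], rated["Bob"])      # 2 shared out of 3 distinct
alice_carol = jaccard(rated["Alice"], rated["Carol"])  # nothing shared
print(alice_bob, alice_carol)
```

Alice and Bob share two of three distinct movies, so they score 2/3. Alice and Carol share none and score 0.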

Exercise 5: Run Node Similarity

Run Node Similarity on the user-movie-bipartite graph in write mode.

Use SIMILAR as the relationship type and score as the property name.

Exercise 5: Solution

cypher

CALL gds.nodeSimilarity.write('user-movie-bipartite', { // (1)
  writeRelationshipType: 'SIMILAR', // (2)
  writeProperty: 'score' // (3)
})
YIELD nodesCompared, relationshipsWritten // (4)
RETURN nodesCompared, relationshipsWritten

(1) Run Node Similarity in write mode on the user-movie-bipartite projection
(2) Create SIMILAR relationships between similar nodes
(3) Store the similarity score as a 'score' property on the relationships
(4) Return statistics about the nodes compared and relationships created

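This creates SIMILAR relationships between Users who rated the same Movies. Node Similarity only compares nodes that have outgoing relationships, so in this projection, where every relationship runs from a User to a Movie, Movies are not compared with each other.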

Verify the Results

Check the similar users that Node Similarity found:

cypher

MATCH (u1:User)-[s:SIMILAR]->(u2:User)
RETURN u1.name AS user1, u2.name AS user2, s.score AS similarity
ORDER BY s.score DESC
LIMIT 10

These are users with similar movie rating patterns—the foundation of collaborative filtering.

What Happens with Multipartite?

Node Similarity works well on bipartite structures. But what if you project the entire graph?

Comparison of Node Similarity on bipartite vs full graph projections.

Let’s find out.

Project the Full Graph

Project everything: Actors, Movies, Users, and their relationships:

cypher

MATCH (source)-[r]->(target)
WHERE source:Actor OR source:Movie OR source:User OR target:Actor OR target:Movie OR target:User
LIMIT 100000
WITH gds.graph.project(
  'everything',
  source,
  target
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

This projects every node type in the graph. The LIMIT caps the projection at 100,000 relationships so that Node Similarity runs quickly for this demonstration.

Run Node Similarity on Everything

cypher

CALL gds.nodeSimilarity.stream('everything', {topK: 3})
YIELD node1, node2, similarity
// Compare the numeric node IDs so each pair is considered only once
WHERE node1 < node2 AND similarity < 0.3
WITH gds.util.asNode(node1) AS n1,
     gds.util.asNode(node2) AS n2,
     similarity
// Keep only pairs that share no label; the check is symmetric, so one test is enough
WHERE none(l IN labels(n1) WHERE l IN labels(n2))
RETURN labels(n1) AS label1,
       coalesce(n1.name, n1.title) AS node1,
       labels(n2) AS label2,
       coalesce(n2.name, n2.title) AS node2,
       similarity
ORDER BY similarity DESC
LIMIT 10

Unexpected Results

Problems with the "everything" projection:

Spurious links between Users, Actors and Genres
The results are conceptually meaningless
The algorithm takes significantly longer to run
The query to interpret results is unnecessarily complex

Flowchart showing issues with projecting the entire graph.

The algorithm has no concept of node types. It just sees nodes with shared neighbours—regardless of whether those comparisons are meaningful.

Why This Happens

Node Similarity compares nodes based on shared neighbours. On a multipartite graph:

An Actor and a User might both connect to the same Movie
The algorithm sees them as "similar"—even though that comparison is nonsensical

An Actor and User both connected to a Movie
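The effect can be reproduced with a small Python sketch. The nodes, labels, and neighbour sets below are made up for illustration, and the jaccard helper is a hypothetical stand-in for the similarity metric. Note that the score calculation never looks at the labels.

```python
from itertools import combinations

# Hypothetical nodes keyed by (label, name), each with the Movies it points to.
out_neighbours = {
    ("Actor", "Tom Hanks"): {"Toy Story", "Cast Away"},
    ("User", "Alice"): {"Toy Story", "Heat"},
    ("User", "Bob"): {"Heat"},
}

def jaccard(a, b):
    """Shared neighbours divided by all distinct neighbours of both nodes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

cross_label = []
for (n1, s1), (n2, s2) in combinations(out_neighbours.items(), 2):
    score = jaccard(s1, s2)  # labels play no part in the score
    if score > 0 and n1[0] != n2[0]:
        cross_label.append((n1, n2, score))

print(cross_label)
```

The Actor and the User named Alice receive a nonzero score purely because both point at the same Movie, which is the kind of spurious pair the "everything" projection produces.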


Cleanup

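Before moving on, drop the projections we created. Note that this query drops every projection in the graph catalog, not only the ones from this lesson: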

cypher

CALL gds.graph.list()
YIELD graphName
CALL gds.graph.drop(graphName)
YIELD graphName AS droppedGraphs
RETURN droppedGraphs

Practice Summary

You practiced both projection approaches:

Monopartite transformations for algorithms like PageRank that need single-type networks
Labelled bipartite projections for algorithms like Node Similarity that work with two-partition structures

You also saw what goes wrong when you project "everything" into a single graph.

The same data can answer different questions depending on how you project it. Choose your approach based on what your algorithm expects and what insights you’re seeking.
