Introduction
Now that you understand the difference between graph structure and graph labels, and when to preserve labels in your projections, it’s time to practice creating bipartite projections.
In this lesson, you’ll work with the Movies dataset to create various labelled bipartite projections and run Node Similarity on them—an algorithm specifically designed to work with bipartite structures.
By the end of this lesson, you will be able to:
-
Create bipartite projections with preserved labels
-
Run Node Similarity on different bipartite structures
-
Understand how the bipartite structure determines what "similarity" means
Quick recap: Why preserve labels?
In the previous lessons, you learned:
-
Structure = how nodes connect (bipartite: two non-overlapping sets with connections only between them)
-
Labels = what GDS knows about node types
For Node Similarity, preserving labels matters because the algorithm compares nodes within each partition based on their shared connections across the partition. GDS needs to know which nodes belong to which partition.
The movies dataset
Your database contains several bipartite structures:
-
(:User)-[:RATED]->(:Movie): Users rate Movies
-
(:Actor)-[:ACTED_IN]->(:Movie): Actors appear in Movies
-
(:Movie)-[:IN_GENRE]->(:Genre): Movies belong to Genres
Each of these is naturally bipartite: one node type connects only to the other, never to itself.
Projection 1: User-Movie bipartite
Let’s start with the User-Movie projection from the previous lesson. Run this command to create a labelled bipartite network:
MATCH (source:User)-[r:RATED]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
'user-movie', // (3)
source, // (4)
target, // (5)
{
sourceNodeLabels: labels(source), // (6)
targetNodeLabels: labels(target), // (7)
relationshipType: type(r) // (8)
},
{}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)
Projection breakdown
-
Match User nodes connected to Movie nodes via RATED relationships
-
Call the GDS projection function
-
Name the projection 'user-movie'
-
Include source (User) nodes
-
Include target (Movie) nodes
-
Preserve source node labels (User)
-
Preserve target node labels (Movie)
-
Preserve relationship type (RATED)
-
Return projection statistics
What you’ve created:
-
A bipartite structure: Users connect to Movies, but Users never connect to other Users
-
Labels preserved: GDS knows which nodes are Users and which are Movies
-
Perfect for Node Similarity: the algorithm can compare Users based on shared Movie connections
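To confirm that the labels really were preserved, you can inspect the projection in the graph catalog. The query below is a minimal sketch; it assumes the projection name 'user-movie' from the command above and that your GDS version exposes the `schema` field on `gds.graph.list`.

```cypher
// List the 'user-movie' projection and show its node labels and relationship types
CALL gds.graph.list('user-movie')
YIELD graphName, nodeCount, relationshipCount, schema
RETURN graphName, nodeCount, relationshipCount, schema
```

If the labels were kept, the returned schema should list both User and Movie under its nodes and RATED under its relationships.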
Running Node Similarity
write() mode
In this lesson, you’ll use .write() mode to persist algorithm results to your database. This creates new relationships that you can query afterwards. Module 3 covers all execution modes in detail—for now, follow the patterns shown.
Run Node Similarity on this projection:
CALL gds.nodeSimilarity.write( // (1)
'user-movie', // (2)
{
writeRelationshipType: 'SIMILAR_USER', // (3)
writeProperty: 'score' // (4)
})
YIELD nodesCompared, relationshipsWritten // (5)
Algorithm breakdown
-
Call Node Similarity algorithm in write mode
-
Run on 'user-movie' projection
-
Write new relationships with type 'SIMILAR_USER'
-
Write similarity scores as 'score' property
-
Yield the number of nodes compared and relationships written
How Node Similarity uses the bipartite structure:
-
It identifies the two partitions (Users and Movies)
-
It compares nodes in the source partition based on shared neighbours in the other partition
-
Users who rated similar Movies get connected by SIMILAR_USER relationships
-
Movies are not connected to each other here: GDS Node Similarity compares nodes through their outgoing relationships, and in this projection only Users have any
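To make "shared neighbours" concrete: by default, GDS Node Similarity scores a pair of nodes with the Jaccard coefficient, the number of shared neighbours divided by the number of distinct neighbours across both nodes. The query below is a small sketch that works this out for two made-up users. The names and movie lists are invented for illustration and are not taken from the dataset, so it runs without touching your graph.

```cypher
// Hypothetical neighbour sets: the movies each invented user has rated
WITH ['Toy Story', 'Heat', 'Casino'] AS aliceMovies,
     ['Heat', 'Casino', 'Jumanji', 'Sabrina'] AS bobMovies
// Shared neighbours = titles that appear in both lists
WITH aliceMovies, bobMovies,
     [m IN aliceMovies WHERE m IN bobMovies] AS shared
// Jaccard = |intersection| / |union|, where |union| = |A| + |B| - |intersection|
RETURN shared,
       size(shared) AS sharedCount,
       size(aliceMovies) + size(bobMovies) - size(shared) AS unionSize,
       toFloat(size(shared)) / (size(aliceMovies) + size(bobMovies) - size(shared)) AS jaccard
```

The two lists share 2 titles out of 5 distinct ones, so the score is 2 / 5 = 0.4.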
Verify the results:
MATCH path = (u1:User)-[:SIMILAR_USER]->(u2:User)-[:RATED]->(m:Movie) // (1)
RETURN path // (2)
LIMIT 10 // (3)
Query breakdown
-
Match similar users and a movie one of them rated
-
Return the complete path
-
Limit to 10 results
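What this reveals: Users who rated many of the same movies, which is the foundation of collaborative filtering recommendation systems. Because this projection carries no relationship properties, the score reflects which movies were rated, not the rating values.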
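Because .write() mode creates new relationships every time it runs, running the algorithm twice would leave duplicate SIMILAR_USER relationships in the database. If you want to rerun the step, one option is to clear the earlier results first. This is a sketch that assumes the relationship type used above; on a very large graph you may prefer to delete in batches.

```cypher
// Remove SIMILAR_USER relationships written by a previous run
MATCH (:User)-[s:SIMILAR_USER]->(:User)
DELETE s
```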
Projection 2: Actor-Movie bipartite
Now create a different bipartite projection: Actors and Movies.
Complete the projection by replacing the ???? placeholders:
MATCH (source:????)-[r:????]->(target:????) // (1)
WITH gds.graph.project( // (2)
'actor-movie', // (3)
source, // (4)
target, // (5)
{
sourceNodeLabels: ????, // (6)
targetNodeLabels: ????, // (7)
relationshipType: ???? // (8)
},
{}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)
Projection breakdown
-
Match Actor nodes connected to Movie nodes (fill in the labels and relationship type)
-
Call the GDS projection function
-
Name the projection 'actor-movie'
-
Include source nodes
-
Include target nodes
-
Preserve source node labels (fill in)
-
Preserve target node labels (fill in)
-
Preserve relationship type (fill in)
-
Return projection statistics
Solution
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
'actor-movie', // (3)
source, // (4)
target, // (5)
{
sourceNodeLabels: labels(source), // (6)
targetNodeLabels: labels(target), // (7)
relationshipType: type(r) // (8)
},
{}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)
Projection breakdown
-
Match Actor nodes connected to Movie nodes via ACTED_IN relationships
-
Call the GDS projection function
-
Name the projection 'actor-movie'
-
Include source (Actor) nodes
-
Include target (Movie) nodes
-
Preserve source node labels (Actor)
-
Preserve target node labels (Movie)
-
Preserve relationship type (ACTED_IN)
-
Return projection statistics
What you’ve created:
-
A bipartite structure: Actors connect to Movies, never directly to other Actors
-
Labels preserved: GDS knows which nodes are Actors and which are Movies
Now run Node Similarity:
CALL gds.nodeSimilarity.write( // (1)
'actor-movie', // (2)
{
writeRelationshipType: 'SIMILAR_ACTOR', // (3)
writeProperty: 'score' // (4)
})
YIELD nodesCompared, relationshipsWritten // (5)
Algorithm breakdown
-
Call Node Similarity algorithm in write mode
-
Run on 'actor-movie' projection
-
Write new relationships with type 'SIMILAR_ACTOR'
-
Write similarity scores as 'score' property
-
Yield the number of nodes compared and relationships written
Verify the results:
MATCH (a1:Actor)-[s:SIMILAR_ACTOR]->(a2:Actor) // (1)
RETURN a1.name AS actor1, a2.name AS actor2, s.score AS similarity // (2)
ORDER BY s.score DESC // (3)
LIMIT 10 // (4)
Query breakdown
-
Match pairs of Actor nodes connected by SIMILAR_ACTOR relationships
-
Return actor names and similarity scores
-
Sort by score in descending order
-
Limit to top 10 most similar pairs
What this reveals: Actors who frequently appear in the same movies—collaboration clusters in the film industry.
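One way to see where these scores come from is to recompute a few of them by hand. The sketch below takes the five highest-scoring pairs, collects each actor's movies, and computes the Jaccard coefficient (shared movies divided by distinct movies across both actors) next to the written score. It assumes the default Jaccard metric and that each actor has at most one ACTED_IN relationship per movie; if either assumption does not hold, the two columns may differ.

```cypher
// Take the five most similar actor pairs written by the algorithm
MATCH (a1:Actor)-[s:SIMILAR_ACTOR]->(a2:Actor)
WITH a1, a2, s
ORDER BY s.score DESC
LIMIT 5
// Collect each actor's neighbour set (the movies they acted in)
WITH a1, a2, s,
     [(a1)-[:ACTED_IN]->(m:Movie) | m] AS movies1,
     [(a2)-[:ACTED_IN]->(m:Movie) | m] AS movies2
// Count the movies that appear in both sets
WITH a1, a2, s, movies1, movies2,
     size([m IN movies1 WHERE m IN movies2]) AS shared
RETURN a1.name AS actor1,
       a2.name AS actor2,
       shared,
       size(movies1) + size(movies2) - shared AS unionSize,
       toFloat(shared) / (size(movies1) + size(movies2) - shared) AS jaccardByHand,
       s.score AS gdsScore
```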
Projection 3: Movie-Genre bipartite
Now create a bipartite projection of Movies and Genres, then run Node Similarity. Complete both steps yourself.
Step 1: Create the projection by replacing the ????? placeholders:
MATCH (source:?????)-[r:?????]->(target:?????) // (1)
WITH gds.graph.project( // (2)
'movie-genre', // (3)
source, // (4)
target, // (5)
{
sourceNodeLabels: ?????, // (6)
targetNodeLabels: ?????, // (7)
relationshipType: ????? // (8)
},
{}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)
Projection breakdown
-
Match Movie nodes connected to Genre nodes (fill in labels and relationship type)
-
Call the GDS projection function
-
Name the projection 'movie-genre'
-
Include source nodes
-
Include target nodes
-
Preserve source node labels (fill in)
-
Preserve target node labels (fill in)
-
Preserve relationship type (fill in)
-
Return projection statistics
Step 2: Run Node Similarity on your projection:
CALL gds.nodeSimilarity.write( // (1)
'movie-genre', // (2)
{
writeRelationshipType: '?????', // (3)
writeProperty: 'score' // (4)
})
YIELD nodesCompared, relationshipsWritten // (5)
Algorithm breakdown
-
Call Node Similarity algorithm in write mode
-
Run on 'movie-genre' projection
-
Choose a relationship type name (e.g., 'SIMILAR_MOVIE')
-
Write similarity scores as 'score' property
-
Yield the number of nodes compared and relationships written
Solution
Step 1: Create the projection
MATCH (source:Movie)-[r:IN_GENRE]->(target:Genre) // (1)
WITH gds.graph.project( // (2)
'movie-genre', // (3)
source, // (4)
target, // (5)
{
sourceNodeLabels: labels(source), // (6)
targetNodeLabels: labels(target), // (7)
relationshipType: type(r) // (8)
},
{}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)
Projection breakdown
-
Match Movie nodes connected to Genre nodes via IN_GENRE relationships
-
Call the GDS projection function
-
Name the projection 'movie-genre'
-
Include source (Movie) nodes
-
Include target (Genre) nodes
-
Preserve source node labels (Movie)
-
Preserve target node labels (Genre)
-
Preserve relationship type (IN_GENRE)
-
Return projection statistics
Step 2: Run Node Similarity
CALL gds.nodeSimilarity.write(
'movie-genre',
{
writeRelationshipType: 'SIMILAR_MOVIE',
writeProperty: 'score'
})
YIELD nodesCompared, relationshipsWritten
Verify the results:
MATCH (m1:Movie)-[s:SIMILAR_MOVIE]->(m2:Movie) // (1)
RETURN m1.title AS movie1, m2.title AS movie2, s.score AS similarity // (2)
ORDER BY s.score DESC // (3)
LIMIT 10 // (4)
Query breakdown
-
Match pairs of Movie nodes connected by SIMILAR_MOVIE relationships
-
Return movie titles and similarity scores
-
Sort by score in descending order
-
Limit to top 10 most similar pairs
What this reveals: Movies that share genre classifications—content-based similarity for recommendations.
How bipartite structure determines similarity
Each projection you created uses the same algorithm but produces different results because the bipartite structure determines what "shared neighbours" means:
| Projection | Partitions | Similarity means… |
|---|---|---|
| User-Movie | Users ↔ Movies | Users who rated the same movies |
| Actor-Movie | Actors ↔ Movies | Actors who appeared in the same movies |
| Movie-Genre | Movies ↔ Genres | Movies that belong to the same genres |
The key insight: Node Similarity doesn’t define what "similar" means—your projection structure does. The algorithm simply finds nodes that share neighbours across the bipartite divide.
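To compare the three runs side by side, you could summarise the relationships each one wrote. This sketch assumes you used the relationship type names from this lesson, including 'SIMILAR_MOVIE' for the third projection; adjust the names if you chose different ones.

```cypher
// Count the written similarity relationships per type and average their scores
MATCH ()-[s:SIMILAR_USER|SIMILAR_ACTOR|SIMILAR_MOVIE]->()
RETURN type(s) AS relationshipType,
       count(s) AS relationships,
       avg(s.score) AS avgScore
ORDER BY relationshipType
```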
Bipartite projections vs monopartite transformations
In Lesson 4, you created monopartite projections by transforming bipartite structures:
-
(Actor)-[:ACTED_IN]->(Movie)<-[:ACTED_IN]-(Actor) → Actor-to-Actor network
In this lesson, you created labelled bipartite projections that preserve both node types:
-
(Actor)-[:ACTED_IN]->(Movie) with labels preserved
When to use each approach:
| Monopartite transformation | Labelled bipartite projection |
|---|---|
| Algorithms that expect single-type networks (PageRank, many centrality measures) | Algorithms designed for bipartite structures (Node Similarity) |
| You want direct Type-to-Type connections | You want to compare nodes based on shared cross-type connections |
| The intermediate nodes are just "bridges" | Both node types are meaningful to your analysis |
Both approaches are valid—choose based on your algorithm and analytical goals.
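For contrast with the labelled bipartite projections in this lesson, a monopartite transformation of the same data might look like the sketch below. The graph name 'actor-actor-sketch' is hypothetical, and the query is an illustration of the pattern rather than a copy of the Lesson 4 command.

```cypher
// Match pairs of actors who share a movie; the Movie node acts only as a bridge
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
// Project Actor-to-Actor relationships only; no Movie nodes enter the projection
WITH gds.graph.project(
  'actor-actor-sketch',
  source,
  target,
  {},
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

Here the Movie nodes disappear from the projection entirely, which is why this form suits algorithms that expect a single node type.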
What’s next
You’ve practiced creating labelled bipartite projections and running Node Similarity on each one.
Each projection used the same bipartite-aware algorithm but revealed different insights:
-
User-Movie: Similar users based on rating patterns (collaborative filtering)
-
Actor-Movie: Similar actors based on collaboration patterns
-
Movie-Genre: Similar movies based on genre classifications (content-based filtering)
In the next lesson, you’ll put this knowledge to the test with a challenge that requires you to design your own projection and choose the appropriate approach—monopartite transformation or labelled bipartite—based on your analytical goal.
Check your understanding
How Projections Affect Similarity
You ran node similarity on three different bipartite projections: User-Movie, Actor-Movie, and Movie-Genre.
Why did each projection produce different similarity results?
-
✓ The projection structure determines what "similarity" means—shared ratings vs. shared cast vs. shared genres
-
❏ Node similarity uses different algorithms depending on the node types in the projection
-
❏ The three projections had different numbers of nodes, changing the algorithm’s behavior
-
❏ GDS automatically adjusts similarity calculations based on the relationship types
Hint
Think about what node similarity does: it connects nodes that have shared neighbors. What are the neighbors in each projection?
Solution
The projection structure determines what "similarity" means—shared ratings vs. shared cast vs. shared genres.
Node similarity always does the same thing: it connects nodes on the same side of a bipartite graph based on shared neighbors on the other side.
However, what those neighbors represent changes the meaning of similarity:
-
User-Movie: Users are similar if they rated the same movies (collaborative filtering)
-
Actor-Movie: Actors are similar if they appeared in the same movies (collaboration patterns)
-
Movie-Genre: Movies are similar if they belong to the same genres (content-based similarity)
The algorithm doesn’t change—the analytical context does. This is why thoughtful projection design is fundamental to GDS work.
Summary
Labelled bipartite projections preserve the two-partition structure and enable algorithms like Node Similarity to compare nodes within each partition based on shared connections across the partition.
You practiced creating three bipartite projections:
-
User-Movie: Found similar users based on shared movie ratings
-
Actor-Movie: Found similar actors based on shared movie appearances
-
Movie-Genre: Found similar movies based on shared genre classifications
Key insight: The same algorithm reveals different insights depending on your projection structure. Node Similarity doesn’t define what "similar" means—your bipartite structure does.
Choosing your approach:
-
Use monopartite transformations for algorithms that expect single-type networks
-
Use labelled bipartite projections for algorithms designed for two-partition structures
Understanding when to use each approach is a fundamental GDS skill for designing effective graph analytics.