Practice bipartite projections

Introduction

Now that you understand the difference between graph structure and graph labels, and when to preserve labels in your projections, it’s time to practice creating bipartite projections.

In this lesson, you’ll work with the Movies dataset to create various labelled bipartite projections and run Node Similarity on them—an algorithm specifically designed to work with bipartite structures.

By the end of this lesson, you will be able to:

  • Create bipartite projections with preserved labels

  • Run Node Similarity on different bipartite structures

  • Understand how the bipartite structure determines what "similarity" means

Quick recap: Why preserve labels?

In the previous lessons, you learned:

  • Structure = how nodes connect (bipartite: two non-overlapping sets with connections only between them)

  • Labels = what GDS knows about node types

For Node Similarity, preserving labels matters because the algorithm compares nodes within each partition based on their shared connections across the partition. GDS needs to know which nodes belong to which partition.

The movies dataset

Your database contains several bipartite structures:

  • (:User)-[:RATED]→(:Movie) — Users rate Movies

  • (:Actor)-[:ACTED_IN]→(:Movie) — Actors appear in Movies

  • (:Movie)-[:IN_GENRE]→(:Genre) — Movies belong to Genres

Each of these is naturally bipartite: one node type connects only to the other, never to itself.
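
One way to check this claim before projecting is to count the label combinations found at each end of a relationship type. The query below is a minimal sketch that assumes only the three relationship types listed above. If the structures are bipartite, no label should appear on both sides of the same row. Nodes may carry more than one label, so a relationship type can return several rows.

```cypher
// Sketch: list the label combinations at each end of every relationship
// of the three types named above, with a count per combination.
MATCH (a)-[r:RATED|ACTED_IN|IN_GENRE]->(b)
RETURN type(r) AS relType,
       labels(a) AS sourceLabels,
       labels(b) AS targetLabels,
       count(*) AS rels
ORDER BY relType, rels DESC
```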

Projection 1: User-Movie bipartite

Let’s start with the User-Movie projection from the previous lesson. Run this command to create a labelled bipartite network:

cypher
Project user-movie bipartite graph
MATCH (source:User)-[r:RATED]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'user-movie', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)

Projection breakdown

  1. Match User nodes connected to Movie nodes via RATED relationships

  2. Call the GDS projection function

  3. Name the projection 'user-movie'

  4. Include source (User) nodes

  5. Include target (Movie) nodes

  6. Preserve source node labels (User)

  7. Preserve target node labels (Movie)

  8. Preserve relationship type (RATED)

  9. Return projection statistics

What you’ve created:

  • A bipartite structure: Users connect to Movies, but Users never connect to other Users

  • Labels preserved: GDS knows which nodes are Users and which are Movies

  • Perfect for Node Similarity: the algorithm can compare Users based on shared Movie connections
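
Running the projection command a second time fails, because a graph named 'user-movie' already exists in the graph catalog. The sketch below shows one way to inspect the catalog entry and, if needed, drop it before projecting again. It assumes a GDS 2.x installation; the `schema` column in particular is an assumption about your version, and some releases also expose `schemaWithOrientation`.

```cypher
// Inspect the catalog entry. If labels were preserved, `schema` should list
// the User and Movie labels and the RATED relationship type.
CALL gds.graph.list('user-movie')
YIELD graphName, nodeCount, relationshipCount, schema
RETURN graphName, nodeCount, relationshipCount, schema;

// Drop the projection before re-running the projection command.
// The second argument (failIfMissing = false) avoids an error if the graph is absent.
CALL gds.graph.drop('user-movie', false)
YIELD graphName
RETURN graphName;
```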

Running Node Similarity

.write() mode

In this lesson, you’ll use .write() mode to persist algorithm results to your database. This creates new relationships that you can query afterwards. Module 3 covers all execution modes in detail—for now, follow the patterns shown.

Run Node Similarity on this projection:

cypher
Run Node Similarity on user-movie
CALL gds.nodeSimilarity.write( // (1)
  'user-movie', // (2)
  {
    writeRelationshipType: 'SIMILAR_USER', // (3)
    writeProperty: 'score' // (4)
  })
YIELD nodesCompared, relationshipsWritten // (5)

Algorithm breakdown

  1. Call Node Similarity algorithm in write mode

  2. Run on 'user-movie' projection

  3. Write new relationships with type 'SIMILAR_USER'

  4. Write similarity scores as 'score' property

  5. Yield the number of nodes compared and relationships written
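
If you want to preview scores before persisting anything, a stream-mode call is one option. This is a hedged sketch rather than part of the lesson's required steps: `topK` and `similarityCutoff` are optional Node Similarity settings, and the values 5 and 0.5 are illustrative only.

```cypher
// Sketch: stream mode returns results without writing to the database.
// topK limits how many similar nodes are kept per node;
// similarityCutoff filters out pairs scoring below the threshold.
CALL gds.nodeSimilarity.stream('user-movie', {
  topK: 5,
  similarityCutoff: 0.5
})
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1) AS user1,
       gds.util.asNode(node2) AS user2,
       similarity
ORDER BY similarity DESC
LIMIT 10
```

Because nothing is written, you can change the settings and re-run the call without creating duplicate relationships.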

How Node Similarity uses the bipartite structure:

  1. It identifies the two partitions (Users and Movies)

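  2. It compares nodes that have outgoing relationships (here, Users) based on the neighbours they share in the other partition

  3. Users who rated similar Movies get connected by SIMILAR_USER relationships

  4. Movies have no outgoing relationships in this projection, so they are not compared with each other; to compare Movies, you would need a projection in which the relationships start at Movies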
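
By default, Node Similarity scores a pair with the Jaccard index: the number of shared neighbours divided by the size of the combined neighbour set. The worked example below uses hypothetical movie titles (not taken from the dataset) and plain Cypher lists, so it runs without a projection.

```cypher
// Hypothetical toy data: the movies rated by two users.
WITH ['Heat', 'Alien', 'Jaws'] AS aliceMovies,
     ['Alien', 'Jaws', 'Up', 'Big'] AS bobMovies
// Shared neighbours: movies that appear in both lists.
WITH aliceMovies, bobMovies,
     [m IN aliceMovies WHERE m IN bobMovies] AS shared
// Jaccard = |intersection| / |union|, where |union| = |A| + |B| - |intersection|.
RETURN size(shared) AS intersection,
       size(aliceMovies) + size(bobMovies) - size(shared) AS unionSize,
       toFloat(size(shared))
         / (size(aliceMovies) + size(bobMovies) - size(shared)) AS jaccard
// intersection = 2, unionSize = 5, jaccard = 0.4
```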

Verify the results:

cypher
View similar users
MATCH path = (u1:User)-[:SIMILAR_USER]->(u2:User)-[:RATED]->(m:Movie) // (1)
RETURN path // (2)
LIMIT 10 // (3)

Query breakdown

  1. Match similar users and a movie one of them rated

  2. Return the complete path

  3. Limit to 10 results

What this reveals: Users with similar movie rating patterns—the foundation of collaborative filtering recommendation systems.

Projection 2: Actor-Movie bipartite

Now create a different bipartite projection: Actors and Movies.

Complete the projection by replacing the ???? placeholders:

cypher
Complete the actor-movie projection (replace ????)
MATCH (source:????)-[r:????]->(target:????) // (1)
WITH gds.graph.project( // (2)
  'actor-movie', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: ????, // (6)
    targetNodeLabels: ????, // (7)
    relationshipType: ???? // (8)
  },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)

Projection breakdown

  1. Match Actor nodes connected to Movie nodes (fill in the labels and relationship type)

  2. Call the GDS projection function

  3. Name the projection 'actor-movie'

  4. Include source nodes

  5. Include target nodes

  6. Preserve source node labels (fill in)

  7. Preserve target node labels (fill in)

  8. Preserve relationship type (fill in)

  9. Return projection statistics

Solution
cypher
Solution: Project actor-movie bipartite graph
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actor-movie', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)

Projection breakdown

  1. Match Actor nodes connected to Movie nodes via ACTED_IN relationships

  2. Call the GDS projection function

  3. Name the projection 'actor-movie'

  4. Include source (Actor) nodes

  5. Include target (Movie) nodes

  6. Preserve source node labels (Actor)

  7. Preserve target node labels (Movie)

  8. Preserve relationship type (ACTED_IN)

  9. Return projection statistics

What you’ve created:

  • A bipartite structure: Actors connect to Movies, never directly to other Actors

  • Labels preserved: GDS knows which nodes are Actors and which are Movies

Now run Node Similarity:

cypher
Run Node Similarity on actor-movie
CALL gds.nodeSimilarity.write( // (1)
  'actor-movie', // (2)
  {
    writeRelationshipType: 'SIMILAR_ACTOR', // (3)
    writeProperty: 'score' // (4)
  })
YIELD nodesCompared, relationshipsWritten // (5)

Algorithm breakdown

  1. Call Node Similarity algorithm in write mode

  2. Run on 'actor-movie' projection

  3. Write new relationships with type 'SIMILAR_ACTOR'

  4. Write similarity scores as 'score' property

  5. Yield the number of nodes compared and relationships written

Verify the results:

cypher
View similar actors
MATCH (a1:Actor)-[s:SIMILAR_ACTOR]->(a2:Actor) // (1)
RETURN a1.name AS actor1, a2.name AS actor2, s.score AS similarity // (2)
ORDER BY s.score DESC // (3)
LIMIT 10 // (4)

Query breakdown

  1. Match pairs of Actor nodes connected by SIMILAR_ACTOR relationships

  2. Return actor names and similarity scores

  3. Sort by score in descending order

  4. Limit to top 10 most similar pairs

What this reveals: Actors who frequently appear in the same movies—collaboration clusters in the film industry.
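
To see what sits behind a score, you can list the movies each similar pair has in common. The query below is a sketch that reuses the `name` and `title` properties from the queries in this lesson and assumes the SIMILAR_ACTOR relationships have already been written.

```cypher
// For each similar pair, collect the movies both actors appeared in.
MATCH (a1:Actor)-[s:SIMILAR_ACTOR]->(a2:Actor)
MATCH (a1)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2)
RETURN a1.name AS actor1,
       a2.name AS actor2,
       s.score AS similarity,
       collect(m.title) AS sharedMovies
ORDER BY similarity DESC
LIMIT 5
```

A pair can reach a high score with only one shared movie if neither actor appears in anything else, so it is worth reading the score alongside the list.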

Projection 3: Movie-Genre bipartite

Now create a bipartite projection of Movies and Genres, then run Node Similarity. Complete both steps yourself.

Step 1: Create the projection by replacing the ????? placeholders:

cypher
Complete the movie-genre projection (replace ?????)
MATCH (source:?????)-[r:?????]->(target:?????) // (1)
WITH gds.graph.project( // (2)
  'movie-genre', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: ?????, // (6)
    targetNodeLabels: ?????, // (7)
    relationshipType: ????? // (8)
  },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)

Projection breakdown

  1. Match Movie nodes connected to Genre nodes (fill in labels and relationship type)

  2. Call the GDS projection function

  3. Name the projection 'movie-genre'

  4. Include source nodes

  5. Include target nodes

  6. Preserve source node labels (fill in)

  7. Preserve target node labels (fill in)

  8. Preserve relationship type (fill in)

  9. Return projection statistics

Step 2: Run Node Similarity on your projection:

cypher
Run Node Similarity on movie-genre (replace ?????)
CALL gds.nodeSimilarity.write( // (1)
  'movie-genre', // (2)
  {
    writeRelationshipType: '?????', // (3)
    writeProperty: 'score' // (4)
  })
YIELD nodesCompared, relationshipsWritten // (5)

Algorithm breakdown

  1. Call Node Similarity algorithm in write mode

  2. Run on 'movie-genre' projection

  3. Choose a relationship type name (e.g., 'SIMILAR_MOVIE')

  4. Write similarity scores as 'score' property

  5. Yield the number of nodes compared and relationships written

Solution

Step 1: Create the projection

cypher
Solution: Project movie-genre bipartite graph
MATCH (source:Movie)-[r:IN_GENRE]->(target:Genre) // (1)
WITH gds.graph.project( // (2)
  'movie-genre', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (9)

Projection breakdown

  1. Match Movie nodes connected to Genre nodes via IN_GENRE relationships

  2. Call the GDS projection function

  3. Name the projection 'movie-genre'

  4. Include source (Movie) nodes

  5. Include target (Genre) nodes

  6. Preserve source node labels (Movie)

  7. Preserve target node labels (Genre)

  8. Preserve relationship type (IN_GENRE)

  9. Return projection statistics

Step 2: Run Node Similarity

cypher
Solution: Run Node Similarity on movie-genre
CALL gds.nodeSimilarity.write(
  'movie-genre',
  {
    writeRelationshipType: 'SIMILAR_MOVIE',
    writeProperty: 'score'
  })
YIELD nodesCompared, relationshipsWritten

Verify the results:

cypher
View similar movies by genre
MATCH (m1:Movie)-[s:SIMILAR_MOVIE]->(m2:Movie) // (1)
RETURN m1.title AS movie1, m2.title AS movie2, s.score AS similarity // (2)
ORDER BY s.score DESC // (3)
LIMIT 10 // (4)

Query breakdown

  1. Match pairs of Movie nodes connected by SIMILAR_MOVIE relationships

  2. Return movie titles and similarity scores

  3. Sort by score in descending order

  4. Limit to top 10 most similar pairs

What this reveals: Movies that share genre classifications—content-based similarity for recommendations.

How bipartite structure determines similarity

Each projection you created uses the same algorithm but produces different results because the bipartite structure determines what "shared neighbours" means:

| Projection  | Partitions      | Similarity means…                       |
|-------------|-----------------|-----------------------------------------|
| User-Movie  | Users ↔ Movies  | Users who rated the same movies         |
| Actor-Movie | Actors ↔ Movies | Actors who appeared in the same movies  |
| Movie-Genre | Movies ↔ Genres | Movies that belong to the same genres   |

The key insight: Node Similarity doesn’t define what "similar" means—your projection structure does. The algorithm simply finds nodes that share neighbours across the bipartite divide.

Bipartite projections vs monopartite transformations

In Lesson 4, you created monopartite projections by transforming bipartite structures:

  • (Actor)-[:ACTED_IN]→(Movie)←[:ACTED_IN]-(Actor) → Actor-to-Actor network

In this lesson, you created labelled bipartite projections that preserve both node types:

  • (Actor)-[:ACTED_IN]→(Movie) with labels preserved

When to use each approach:

| Monopartite transformation | Labelled bipartite projection |
|----------------------------|-------------------------------|
| Algorithms that expect single-type networks (PageRank, many centrality measures) | Algorithms designed for bipartite structures (Node Similarity) |
| You want direct Type-to-Type connections | You want to compare nodes based on shared cross-type connections |
| The intermediate nodes are just "bridges" | Both node types are meaningful to your analysis |

Both approaches are valid—choose based on your algorithm and analytical goals.
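
For contrast, a monopartite transformation of the Actor-Movie structure might look like the sketch below. It is not necessarily the exact query from Lesson 4, and the graph name 'actor-actor-sketch' is hypothetical. It uses the same projection function as this lesson, with the Movie matched in the pattern but left out of the projection.

```cypher
// Sketch only: project co-starring actors as a direct Actor-to-Actor network.
// The Movie node acts as a bridge in the MATCH pattern and is not projected.
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
WITH gds.graph.project(
  'actor-actor-sketch', // hypothetical graph name
  source,
  target,
  {},                   // no labels or relationship types preserved
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

Actors who never share a movie with anyone do not match the pattern, so they are absent from this projection, whereas the labelled bipartite projection keeps every actor that has an ACTED_IN relationship.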

What’s next

You’ve practiced creating labelled bipartite projections and running Node Similarity on each one.

Each projection used the same bipartite-aware algorithm but revealed different insights:

  • User-Movie: Similar users based on rating patterns (collaborative filtering)

  • Actor-Movie: Similar actors based on collaboration patterns

  • Movie-Genre: Similar movies based on genre classifications (content-based filtering)

In the next lesson, you’ll put this knowledge to the test with a challenge that requires you to design your own projection and choose the appropriate approach—monopartite transformation or labelled bipartite—based on your analytical goal.

Check your understanding

How projections affect similarity

You ran node similarity on three different bipartite projections: User-Movie, Actor-Movie, and Movie-Genre.

Why did each projection produce different similarity results?

  • ✓ The projection structure determines what "similarity" means—shared ratings vs. shared cast vs. shared genres

  • ❏ Node similarity uses different algorithms depending on the node types in the projection

  • ❏ The three projections had different numbers of nodes, changing the algorithm’s behavior

  • ❏ GDS automatically adjusts similarity calculations based on the relationship types

Hint

Think about what node similarity does: it connects nodes that have shared neighbors. What are the neighbors in each projection?

Solution

The projection structure determines what "similarity" means—shared ratings vs. shared cast vs. shared genres.

Node similarity always does the same thing: it connects nodes on the same side of a bipartite graph based on shared neighbors on the other side.

However, what those neighbors represent changes the meaning of similarity:

  • User-Movie: Users are similar if they rated the same movies (collaborative filtering)

  • Actor-Movie: Actors are similar if they appeared in the same movies (collaboration patterns)

  • Movie-Genre: Movies are similar if they belong to the same genres (content-based similarity)

The algorithm doesn’t change—the analytical context does. This is why thoughtful projection design is fundamental to GDS work.

Summary

Labelled bipartite projections preserve the two-partition structure and enable algorithms like Node Similarity to compare nodes within each partition based on shared connections across the partition.

You practiced creating three bipartite projections:

  • User-Movie: Found similar users based on shared movie ratings

  • Actor-Movie: Found similar actors based on shared movie appearances

  • Movie-Genre: Found similar movies based on shared genre classifications

Key insight: The same algorithm reveals different insights depending on your projection structure. Node Similarity doesn’t define what "similar" means—your bipartite structure does.

Choosing your approach:

  • Use monopartite transformations for algorithms that expect single-type networks

  • Use labelled bipartite projections for algorithms designed for two-partition structures

Understanding when to use each approach is a fundamental GDS skill for designing effective graph analytics.
