Practice monopartite projections

Introduction

Now that you understand the difference between graph structure and graph labels, it’s time to practice creating true monopartite projections.

In this lesson, you’ll work with the Movies dataset to create various monopartite projections—transforming bipartite and multipartite structures into single-type networks that reveal different patterns within the same data.

By the end of this lesson, you will be able to:

  • Create true monopartite projections from bipartite data

  • Project different node types as monopartite networks

  • Understand how projection structure affects algorithm results

The movies dataset

Your database contains a multipartite structure with several node types and relationship types:

  • Actor nodes with properties like name and born

  • Movie nodes with properties like title and released

  • User nodes with properties like name

  • Genre nodes with properties like name

  • ACTED_IN relationships (Actor → Movie)

  • DIRECTED relationships (Person → Movie)

  • RATED relationships (User → Movie)

  • IN_GENRE relationships (Movie → Genre)

Each of these relationships creates a bipartite structure. For example, (:Actor)-[:ACTED_IN]->(:Movie) is bipartite because Actors only connect to Movies, never directly to other Actors.

In this lesson, you’ll transform these bipartite structures into true monopartite networks.

Projection 1: Actor collaboration network

Remember this projection from the previous lesson? It connects actors directly to other actors through their shared movie appearances.

Run it again to create an actor-to-actor network:

cypher
Project actor collaboration network
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

  1. Match Actor nodes connected through Movie nodes (Movies are traversed but not captured)

  2. Call the GDS projection function

  3. Name the projection 'actor-network'

  4. Include source (Actor) nodes

  5. Include target (Actor) nodes

  6. Data configuration map for node labels, properties, and relationship types (empty, so defaults apply)

  7. Configuration map for projection settings such as concurrency (empty, so defaults apply)

  8. Return projection statistics
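If 'actor-network' is still in memory from the previous lesson, projecting under the same name fails because a graph with that name already exists in the catalog. A minimal sketch, assuming you are happy to discard the earlier copy, is to drop it first. The second argument (failIfMissing) is set to false so the call does not raise an error when no such graph exists.

```cypher
// Drop the earlier projection if it exists (assumption: you no longer need it)
CALL gds.graph.drop('actor-network', false)
YIELD graphName
RETURN graphName AS dropped
```

After this, the projection query above can be run again under the same name.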

Why this is a true monopartite projection:

  • Only Actor nodes exist in the graph

  • Actors connect directly to other Actors

  • Any Actor can potentially connect to any other Actor (through shared movies)

  • The nodes cannot be separated into distinct non-overlapping sets

What this reveals: Actor communities, collaboration patterns, degrees of separation between actors.

Projection 2: Movie network

Now create a monopartite network of movies connected through shared actors. This is the inverse of the actor network—instead of connecting actors through movies, we connect movies through actors.

Complete the query below by replacing the ????? placeholders:

cypher
Complete the movie network projection (replace ?????)
MATCH (source:?????)<-[:ACTED_IN]-(:?????)-[:ACTED_IN]->(target:?????) // (1)
WITH gds.graph.project( // (2)
  'movies-only', // (3)
  ?????, // (4)
  ????? // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

  1. Match Movie nodes connected through Actor nodes (fill in the labels)

  2. Call the GDS projection function

  3. Name the projection 'movies-only'

  4. Include source nodes (fill in the variable)

  5. Include target nodes (fill in the variable)

  6. Return projection statistics

Solution
cypher
Solution: Project movies connected through actors
MATCH (source:Movie)<-[:ACTED_IN]-(:Actor)-[:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'movies-only', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

  1. Match Movie nodes connected through Actor nodes (Actors are traversed but not captured)

  2. Call the GDS projection function

  3. Name the projection 'movies-only'

  4. Include source (Movie) nodes

  5. Include target (Movie) nodes

  6. Data configuration map for node labels, properties, and relationship types (empty, so defaults apply)

  7. Configuration map for projection settings such as concurrency (empty, so defaults apply)

  8. Return projection statistics

Key points:

  • source:Movie and target:Movie — both endpoints are Movie nodes

  • The pattern traverses Actor nodes to create direct Movie-to-Movie connections

  • Actors become invisible "bridges" in the projection

What this reveals: Movie clusters, franchise connections, films linked by shared cast members.
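To see what the MATCH pattern feeds into the projection, it can help to run it as plain Cypher first. The sketch below counts shared actors per Movie pair; the variable a and the column names are only illustrative. Each pair shows up twice (once in each direction), and because the projection creates one relationship per matched row, two movies that share three actors end up with three parallel relationships in each direction.

```cypher
// Inspect the Movie pairs that the projection pattern produces
MATCH (source:Movie)<-[:ACTED_IN]-(a:Actor)-[:ACTED_IN]->(target:Movie)
RETURN source.title AS movie,
       target.title AS connectedMovie,
       count(a) AS sharedActors   // one matched row per shared actor
ORDER BY sharedActors DESC
LIMIT 10
```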

Projection 3: User network through shared ratings

Your database has User nodes with RATED relationships to Movie nodes. This is another bipartite structure: Users connect to Movies, but Users never connect directly to other Users.

Create a true monopartite graph of Users connected through their shared movie ratings.

Remember: you need the Movie nodes to find shared connections, but you don’t want Movie nodes in the final projection.

Replace ???? with the appropriate Cypher pattern:

cypher
Complete the user network projection (replace ????)
MATCH ???? // (1)
WITH gds.graph.project( // (2)
  'user-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

  1. Match User nodes connected through Movie nodes (fill in the pattern)

  2. Call the GDS projection function

  3. Name the projection 'user-network'

  4. Include source (User) nodes

  5. Include target (User) nodes

  6. Data configuration map for node labels, properties, and relationship types (empty, so defaults apply)

  7. Configuration map for projection settings such as concurrency (empty, so defaults apply)

  8. Return projection statistics

Solution
cypher
Solution: Project users connected through ratings
MATCH (source:User)-[:RATED]->(:Movie)<-[:RATED]-(target:User) // (1)
WITH gds.graph.project( // (2)
  'user-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

  1. Match User nodes connected through Movie nodes (Movies are traversed but not captured)

  2. Call the GDS projection function

  3. Name the projection 'user-network'

  4. Include source (User) nodes

  5. Include target (User) nodes

  6. Data configuration map for node labels, properties, and relationship types (empty, so defaults apply)

  7. Configuration map for projection settings such as concurrency (empty, so defaults apply)

  8. Return projection statistics

Key points:

  • source:User and target:User — both endpoints are User nodes

  • The pattern traverses Movie nodes to create direct User-to-User connections

  • Users who rated the same movie become directly connected

What this reveals: User communities with similar tastes, potential recommendation clusters.
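The projection above connects any two users who rated the same movie, whether they liked it or not. One possible refinement, sketched here under the assumption that RATED relationships carry a numeric rating property, is to connect users only when both rated the movie highly. The threshold of 4 and the graph name 'user-network-liked' are illustrative choices, not part of the lesson.

```cypher
// Connect users only through movies they both rated highly
// (assumes a numeric `rating` property on RATED; threshold is arbitrary)
MATCH (source:User)-[r1:RATED]->(:Movie)<-[r2:RATED]-(target:User)
WHERE r1.rating >= 4 AND r2.rating >= 4
WITH gds.graph.project(
  'user-network-liked', // hypothetical graph name
  source,
  target,
  {},
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

Filtering in the MATCH clause keeps the projection itself unchanged: it is still a User-to-User graph, just with fewer and arguably more meaningful connections.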

Projection 4: Genre co-occurrence network

Now create a network of genres that appear together in movies.

The original structure is bipartite: (:Movie)-[:IN_GENRE]->(:Genre). Transform this into a monopartite Genre-to-Genre network.

Complete the query below by replacing the ????? placeholders:

cypher
Complete the genre co-occurrence projection (replace ?????)
MATCH ????? // (1)
WITH gds.graph.project( // (2)
  '?????', // (3)
  ?????, // (4)
  ????? // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

  1. Match Genre nodes connected through Movie nodes (fill in the pattern)

  2. Call the GDS projection function

  3. Name the projection (fill in a descriptive name)

  4. Include source nodes (fill in the variable)

  5. Include target nodes (fill in the variable)

  6. Return projection statistics

Solution
cypher
Solution: Project genres connected through movies
MATCH (source:Genre)<-[:IN_GENRE]-(:Movie)-[:IN_GENRE]->(target:Genre) // (1)
WITH gds.graph.project( // (2)
  'genre-cooccurrence', // (3)
  source, // (4)
  target // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

  1. Match Genre nodes connected through Movie nodes (Movies are traversed but not captured)

  2. Call the GDS projection function

  3. Name the projection 'genre-cooccurrence'

  4. Include source (Genre) nodes

  5. Include target (Genre) nodes

  6. Return projection statistics

Key points:

  • source:Genre and target:Genre — both endpoints are Genre nodes

  • The pattern traverses Movie nodes to connect genres that appear together

  • Two genres are connected if any movie belongs to both

What this reveals: Common genre combinations, genre clustering (e.g., Action-Adventure-Sci-Fi often appear together).
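The same pattern can be run as plain Cypher to list the most frequent genre pairs before any algorithm is involved. This is only a sketch: the WHERE clause keeps one row per pair instead of two, and the column names are illustrative.

```cypher
// Count how many movies each pair of genres shares
MATCH (source:Genre)<-[:IN_GENRE]-(m:Movie)-[:IN_GENRE]->(target:Genre)
WHERE source.name < target.name   // keep each pair once, not in both directions
RETURN source.name AS genre,
       target.name AS coGenre,
       count(m) AS movies
ORDER BY movies DESC
LIMIT 10
```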

Running algorithms on monopartite projections

Each of the graphs you created remains in memory. You can run algorithms on any of them to compare results.
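To confirm which projections are currently in memory, you can list the graph catalog. A minimal sketch; the graphs returned depend on which of the exercises above you have run.

```cypher
// List the projected graphs currently held in memory
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
ORDER BY graphName
```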

stream() mode

Throughout these exercises, you’ll use .stream() mode to run algorithms and view results without writing them to the database. Don’t worry if the syntax isn’t fully clear yet—Module 3 covers execution modes in detail. For now, focus on understanding what each projection structure reveals.

Degree centrality on movies

Run degree centrality on the 'movies-only' graph to find movies with the most connections to other movies (through shared actors):

cypher
Run degree centrality on movies-only
CALL gds.degree.stream('movies-only') // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).title AS movie, score AS connections // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

  1. Call degree centrality in stream mode on 'movies-only' projection

  2. Yield node IDs and degree scores

  3. Convert node IDs to movie titles and return with scores

  4. Sort by score in descending order

  5. Limit results to top 10 movies

The top result is likely "True Romance"—an important film, yes, but is it truly the most important?

Remember what degree centrality measures: the number of connections each node has. In our Movie-to-Movie network, connections represent shared actors. "True Romance" ranks highly because it shares cast members with many other movies.

"True Romance" may well be culturally important and critically acclaimed, but that is not what degree centrality measures on this projection. A high score here means only that the film’s cast appeared in many other films in our database. Looking at the cast explains this:

| Actor | Degree in actor-network |
|---|---|
| Samuel L. Jackson | 666.0 |
| Christopher Walken | 526.0 |
| Dennis Hopper | 414.0 |
| Gary Oldman | 396.0 |
| Val Kilmer | 368.0 |
| Christian Slater | 358.0 |
| Brad Pitt | 304.0 |
| Michael Rapaport | 249.0 |
| Patricia Arquette | 177.0 |
| Bronson Pinchot | 60.0 |

The film features several highly-connected actors, creating many Movie-to-Movie connections.
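A cast breakdown like the one above could be produced with a query along these lines. It is a sketch that assumes the 'actor-network' projection is still in memory and that the movie is stored with the exact title 'True Romance'.

```cypher
// Degree of each "True Romance" cast member in the actor-network projection
CALL gds.degree.stream('actor-network')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS actor, score
// keep only actors who appear in the film (title is an assumption)
WHERE EXISTS { MATCH (actor)-[:ACTED_IN]->(:Movie {title: 'True Romance'}) }
RETURN actor.name AS actor, score AS degree
ORDER BY degree DESC
```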

The key takeaway is that 'importance' and 'centrality' are only meaningful relative to the relationships you analyze. This ranking reflects cast overlap; it says nothing about which movies are most important in terms of user ratings or cultural influence.

PageRank on movies

PageRank also measures importance, but weights each node’s score based on the importance of its neighbours. Run PageRank on the same graph:

cypher
Run PageRank on movies-only
CALL gds.pageRank.stream('movies-only') // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).title AS movie, score AS importance // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

  1. Call PageRank in stream mode on 'movies-only' projection

  2. Yield node IDs and PageRank scores

  3. Convert node IDs to movie titles and return with scores

  4. Sort by score in descending order

  5. Limit results to top 10 movies

PageRank ranks a movie not just by how many connections it has, but by how important the connected movies are, so its scores are not simple connection counts.

Why PageRank works here: This is a true monopartite graph. Movies connect to other movies, so PageRank’s "importance flow" can circulate through the network. Compare this to the previous lesson, where PageRank on a bipartite Actor→Movie structure produced meaningless results because importance flowed into Movies and got trapped.

However, note that on this particular graph PageRank produces essentially the same ranking as degree centrality. The MATCH pattern finds every Movie pair in both directions, so the projection is symmetric, and on a symmetric graph a node’s PageRank tends to track its degree closely. In a network where links are directed and some connections are inherently more valuable than others (like web pages linking to authoritative sources), PageRank would diverge more significantly from degree centrality.
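One way to check how closely the two rankings agree is to count how many movies appear in both top-10 lists. The following is a rough sketch, assuming 'movies-only' is still in memory; ties in either ranking can shift the result slightly.

```cypher
// Top 10 movies by degree centrality
CALL gds.degree.stream('movies-only')
YIELD nodeId, score
WITH nodeId, score
ORDER BY score DESC
LIMIT 10
WITH collect(nodeId) AS topByDegree
// Top 10 movies by PageRank
CALL gds.pageRank.stream('movies-only')
YIELD nodeId, score
WITH topByDegree, nodeId, score
ORDER BY score DESC
LIMIT 10
WITH topByDegree, collect(nodeId) AS topByPageRank
// How many movies appear in both lists?
RETURN size([id IN topByDegree WHERE id IN topByPageRank]) AS sharedTop10
```

A value close to 10 would support the observation that the two algorithms rank this graph almost identically.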

Compare algorithms on user network

You have another graph in memory: 'user-network'. Run PageRank to find the most important users:

cypher
Run PageRank on user-network (replace ?????)
CALL gds.pageRank.stream( // (1)
  '?????' // (2)
)
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS user, score AS importance // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

  1. Call PageRank in stream mode

  2. Specify the projection name (fill in 'user-network')

  3. Yield node IDs and PageRank scores

  4. Convert node IDs to user names and return with scores

  5. Sort by score in descending order

  6. Limit results to top 10 users

Now run degree centrality on the same graph to compare:

cypher
Run degree centrality on user-network (replace ?????)
CALL gds.degree.stream( // (1)
  '?????' // (2)
)
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS user, score AS connections // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

  1. Call degree centrality in stream mode

  2. Specify the projection name (fill in 'user-network')

  3. Yield node IDs and degree scores

  4. Convert node IDs to user names and return with scores

  5. Sort by score in descending order

  6. Limit results to top 10 users

Comparing the results:

  • Degree centrality finds users who rated movies in common with the most other users

  • PageRank finds users who are connected to other important users

A user like "Angela Garcia" might have the highest degree (rated movies that many others also rated), while "Zachary Carey" might have the highest PageRank (connected to other highly-connected users).

Both are valid measures of "importance"—they just measure different things.
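Both algorithms above treat every connection equally, even though two users who share fifty rated movies are arguably closer than two who share one. As a sketch of one alternative, the projection below stores the number of shared movies as a relationship property, which the algorithm can then use as a weight. The graph name 'user-network-weighted' and the property name sharedMovies are hypothetical.

```cypher
// Project users with one relationship per pair, weighted by shared movies
MATCH (source:User)-[:RATED]->(m:Movie)<-[:RATED]-(target:User)
WITH source, target, count(m) AS sharedMovies
WITH gds.graph.project(
  'user-network-weighted', // hypothetical graph name
  source,
  target,
  { relationshipProperties: { sharedMovies: sharedMovies } },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

```cypher
// Weighted degree: sums sharedMovies over each user's relationships
CALL gds.degree.stream(
  'user-network-weighted',
  { relationshipWeightProperty: 'sharedMovies' }
)
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS user, score AS weightedConnections
ORDER BY score DESC
LIMIT 10
```

Aggregating in Cypher before projecting also removes parallel relationships, so the unweighted degree on this graph would count distinct neighbouring users rather than matched rows.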

The pattern: Bipartite to monopartite transformation

Notice the common pattern across all these projections:

| Original bipartite structure | Monopartite projection | Bridge nodes |
|---|---|---|
| Actor → Movie | Actor ↔ Actor | Movies (traversed, not captured) |
| Movie ← Actor | Movie ↔ Movie | Actors (traversed, not captured) |
| User → Movie | User ↔ User | Movies (traversed, not captured) |
| Movie → Genre | Genre ↔ Genre | Movies (traversed, not captured) |

In each case:

  1. Start with a bipartite structure (Type A connects to Type B)

  2. Traverse the "bridge" nodes (Type B) without capturing them

  3. Create direct connections between nodes of the same type (Type A ↔ Type A)

  4. Result: a true monopartite graph suitable for algorithms like PageRank

Check your understanding

Understanding Contextual Importance

You run degree centrality on two different monopartite projections of the same Movies dataset:

Projection A: Movies connected through shared actors

Projection B: Movies connected through shared genres

The movie "Inception" ranks highly in Projection A but poorly in Projection B.

What does this tell you about "Inception"?

  • ❏ The projection calculations are incorrect—a movie should rank the same regardless of projection

  • ✓ "Inception" shares many actors with other movies but belongs to uncommon genre combinations

  • ❏ Degree centrality is not a reliable algorithm for measuring movie importance

  • ❏ "Inception" is more popular with audiences than with critics

Hint

Think about what connections each projection is measuring. What does a high degree in one projection vs. low degree in another tell you about the movie’s actual relationships?

Solution

The correct answer is: "Inception" shares many actors with other movies but belongs to uncommon genre combinations.

Centrality is contextual: it depends on the relationships in your projection.

High degree in the actor projection means "Inception" has many cast members who appear in other movies. Low degree in the genre projection means few other movies share its genres.

Different projections answer different questions: actor networks vs. genre patterns. Neither ranking is wrong; they measure different aspects of the same movie.

Summary

You’ve practiced creating multiple monopartite projections from the Movies dataset by transforming bipartite structures into single-type networks.

Each projection serves different analytical goals:

  • Actor network → collaboration patterns, actor communities

  • Movie network → cast overlap, franchise connections

  • User network → taste similarity, recommendation clusters

  • Genre network → genre combinations, content categorisation

In the next lesson, you’ll put this knowledge to the test with a challenge that requires you to design your own monopartite projection.
