Practice monopartite projections

Introduction

Now that you understand what monopartite graphs are, it’s time to practice creating different types of monopartite projections.

In this lesson, you’ll work with the Movies dataset to create various monopartite projections that reveal different network structures within the same data.

By the end of this lesson, you will be able to:

Create monopartite projections from multipartite data
Project different node types as monopartite networks
Understand how projection patterns reveal different insights

The movies dataset

Your database contains:

Actor nodes with properties like name and born
Movie nodes with properties like title and released
User nodes with properties like name
Genre nodes with properties like name
ACTED_IN relationships (Actor → Movie)
DIRECTED relationships (Person → Movie)
RATED relationships (User → Movie)
IN_GENRE relationships (Movie → Genre)

Projection 1: Actor collaboration network

You created this projection in the previous lesson to connect actors to actors.

Run it again to create an actor-to-actor network:

cypher

Project actor collaboration network

MATCH (source:Actor)-[:ACTED_IN]->
  (:Movie)
    <-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

(1) Match Actor nodes connected through Movie nodes (Movies are not captured)
(2) Call the GDS projection function
(3) Name the projection 'actor-network'
(4) Include source (Actor) nodes
(5) Include target (Actor) nodes
(6) First configuration map (empty - using defaults)
(7) Second configuration map (empty - using defaults)
(8) Return projection statistics
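
If 'actor-network' is still in memory from the previous lesson, running the projection again under the same name will fail, because GDS does not overwrite an existing graph. A minimal sketch of clearing it first, assuming the earlier graph was also named 'actor-network':

```cypher
// Drop the earlier projection if it is still in the graph catalog.
// The second argument (failIfMissing = false) means no error is raised
// when no graph with that name exists.
CALL gds.graph.drop('actor-network', false)
YIELD graphName
RETURN graphName
```

After this, the projection query above can be run again without a name clash.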

This projection creates:

A monopartite network of only Actor nodes
Actors connected if they appeared in the same movie
Relationships representing collaborations

What this reveals: Actor communities, collaboration patterns, degrees of separation between actors.

Movie network

Now create a monopartite network of movies connected through shared actors.

Complete the query below by pasting it into the sandbox and replacing the ????? placeholders:

cypher

Complete the movie network projection (replace ?????)

MATCH (source:?????)<-[:ACTED_IN]-
  (:????)
    -[:ACTED_IN]->(target:?????) // (1)
WITH gds.graph.project( // (2)
  'movies-only', // (3)
  ????, // (4)
  ???? // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

(1) Match Movie nodes connected through Actor nodes (fill in the labels)
(2) Call the GDS projection function
(3) Name the projection 'movies-only'
(4) Include source nodes (fill in the variable)
(5) Include target nodes (fill in the variable)
(6) Return projection statistics

Details

cypher

Solution: Project movies connected through actors

MATCH (source:Movie)<-[:ACTED_IN]-(:Actor)-[:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'movies-only', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

(1) Match Movie nodes connected through Actor nodes (Actors are not captured)
(2) Call the GDS projection function
(3) Name the projection 'movies-only'
(4) Include source (Movie) nodes
(5) Include target (Movie) nodes
(6) First configuration map (empty - using defaults)
(7) Second configuration map (empty - using defaults)
(8) Return projection statistics

The movies-only graph now contains only the Movies in the network and their shared connections through actors.

Key points:

source:Movie and target:Movie - both nodes are Movie type
Pattern goes through Actor nodes to connect movies
The query has no filter such as source.title < target.title, so every pair of movies is matched in both directions, once per shared actor
The projection is named 'movies-only'

What this reveals: Movie clusters, franchise connections, similar films based on shared cast.
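
The solution pattern matches a pair of movies once for every actor they share, so movies with several cast members in common produce several matches. One way to keep that information in a single relationship per direction is to count the shared actors before projecting and store the count as a relationship property. The sketch below is an optional variant and not part of the lesson: the graph name 'movies-weighted' and the property name sharedActors are illustrative, and it assumes the data configuration map accepts a relationshipProperties entry.

```cypher
// Count shared actors per ordered movie pair, then project one
// weighted relationship per pair instead of one per shared actor.
MATCH (source:Movie)<-[:ACTED_IN]-(actor:Actor)-[:ACTED_IN]->(target:Movie)
WITH source, target, count(actor) AS sharedActors
WITH gds.graph.project(
  'movies-weighted',                                   // hypothetical graph name
  source,
  target,
  { relationshipProperties: { sharedActors: sharedActors } },
  {}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

The later exercises in this lesson run on 'movies-only', so keep that projection in memory if you try this one.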

Projection 2: User network through shared ratings

In the graph, you have User nodes with RATED relationships to Movie nodes. See if you can create a monopartite graph of only Users and their relationships to each other.

Remember, you need the Movie nodes to find users' shared connections. However, you do not want to include Movie nodes in the projection.

Replace ???? with the appropriate Cypher pattern to complete the projection.

cypher

Complete the user network projection (replace ????)

MATCH ???? // (1)
WITH gds.graph.project( // (2)
  'user-interaction-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

(1) Match User nodes connected through Movie nodes (fill in the pattern)
(2) Call the GDS projection function
(3) Name the projection 'user-interaction-network'
(4) Include source (User) nodes
(5) Include target (User) nodes
(6) First configuration map (empty - using defaults)
(7) Second configuration map (empty - using defaults)
(8) Return projection statistics

Details

cypher

Solution: Project users connected through ratings

MATCH (source:User)-[:RATED]->(:Movie)<-[:RATED]-(target:User) // (1)
WITH gds.graph.project( // (2)
  'user-interaction-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

(1) Match User nodes connected through Movie nodes (Movies are not captured)
(2) Call the GDS projection function
(3) Name the projection 'user-interaction-network'
(4) Include source (User) nodes
(5) Include target (User) nodes
(6) First configuration map (empty - using defaults)
(7) Second configuration map (empty - using defaults)
(8) Return projection statistics

Key points:

source:User and target:User - both nodes are User type
Pattern goes through Movie nodes to connect users
Users are directly connected through the movies they have both rated

This projection creates:

A monopartite network of only User nodes
Users connected through shared movie ratings

What this reveals: User communities with similar tastes, recommendation opportunities.
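
Before projecting, it can help to see how much the pattern actually matches. The query below is a rough sketch in plain Cypher (no GDS call) that counts the ordered user pairs and the total matched paths for the solution pattern; the column names are illustrative. On a large set of ratings it may take a while to run.

```cypher
// Group matches by ordered user pair, then summarise.
MATCH (source:User)-[:RATED]->(:Movie)<-[:RATED]-(target:User)
WITH source, target, count(*) AS sharedMovies   // one row per ordered pair
RETURN count(*)          AS orderedPairs,       // (a, b) and (b, a) both count
       sum(sharedMovies) AS matchedPaths,       // one path per pair per shared movie
       max(sharedMovies) AS mostSharedMovies    // largest overlap between two users
```

If matchedPaths is much larger than orderedPairs, many user pairs are linked through more than one movie.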

Projection 3: Genre co-occurrence network

Now create a network of genres that appear together in movies.

Complete the query below by replacing the ????? placeholders:

cypher

Complete the genre co-occurrence projection (replace ?????)

MATCH ???? // (1)
WITH gds.graph.project( // (2)
  '?????', // (3)
  ?????, // (4)
  ????? // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

(1) Match Genre nodes connected through Movie nodes (fill in the pattern)
(2) Call the GDS projection function
(3) Name the projection (fill in a descriptive name)
(4) Include source nodes (fill in the variable)
(5) Include target nodes (fill in the variable)
(6) Return projection statistics

If you require help, open the dropdown below to view the full solution.

Details

cypher

Solution: Project genres connected through movies

MATCH (source:Genre)<-[:IN_GENRE]-(:Movie)-[:IN_GENRE]->(target:Genre) // (1)
WITH gds.graph.project( // (2)
  'genre-cooccurrence', // (3)
  source, // (4)
  target // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

(1) Match Genre nodes connected through Movie nodes (Movies are not captured)
(2) Call the GDS projection function
(3) Name the projection 'genre-cooccurrence'
(4) Include source (Genre) nodes
(5) Include target (Genre) nodes
(6) Return projection statistics

Key points:

source:Genre and target:Genre - both nodes are Genre type
Pattern goes through Movie nodes to connect genres
Name the projection 'genre-cooccurrence'
Pass source and target to gds.graph.project()

What this reveals: Common genre combinations, genre clustering (e.g., Action-Adventure-Sci-Fi).
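
To get a feel for the genre combinations behind this projection, you can rank genre pairs directly in Cypher. This is a sketch that does not use GDS; it assumes Genre nodes carry the name property listed in the dataset description, and the column names are illustrative.

```cypher
// List the ten genre pairs that share the most movies.
MATCH (source:Genre)<-[:IN_GENRE]-(m:Movie)-[:IN_GENRE]->(target:Genre)
WHERE source.name < target.name        // keep each unordered pair once
RETURN source.name AS genreA,
       target.name AS genreB,
       count(m)    AS sharedMovies
ORDER BY sharedMovies DESC
LIMIT 10
```

The WHERE clause only affects this listing; the projection itself keeps both directions of each pair.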

Understanding algorithm results

Each of the graphs you just projected continues to exist in memory.

You can run multiple algorithms on each one to compare results.
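
A quick way to confirm which projections are currently in memory is to list the graph catalog. A minimal sketch, returning only a few of the available columns:

```cypher
// List the projected graphs in the catalog with their sizes.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
ORDER BY graphName
```

The graph names returned here are the ones you pass to the algorithm calls in the rest of this lesson.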

stream()

Throughout these exercises, you’ll use .stream() mode to run algorithms and view results. Don’t worry if the syntax isn’t fully clear yet—Module 3 will teach execution modes in detail. For now, focus on understanding the projection patterns and what insights each graph structure can reveal.

Run the command below to see how degree centrality ranks the Movies in the 'movies-only' graph.

cypher

Run degree centrality on movies-only

CALL gds.degree.stream('movies-only') // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).title AS movie, score AS collaborations // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

(1) Call degree centrality in stream mode on 'movies-only' projection
(2) Yield node IDs and degree scores
(3) Convert node IDs to movie titles and return with scores
(4) Sort by score in descending order
(5) Limit results to top 10 movies

The first movie returned is called "Backdraft". There’s a good chance you have never heard of that movie. So, you may wonder why degree centrality ranks it as 'most important'.

Up to now, we have used 'most important' to explain the intention of centrality algorithms, and that is not entirely incorrect. It is, however, missing context.

'Backdraft' is the most important movie in terms of what its relationships represent. In this projection, they represent shared actors.

So we might predict that, although 'Backdraft' may not be as famous as, say, 'The Matrix', its cast should contain a lot more famous faces.

That turns out to be true: its cast includes several actors with high degree centrality, including:

Actor	Degree rank
Robert De Niro	1
Kurt Russell	19
Donald Sutherland	44
Val Kilmer	119
Jennifer Jason Leigh	139
Scott Glenn	447
William Baldwin	769

There are many more cast members who may have less star power themselves, but who have more connections than most to movies whose casts do.

Take a look for yourself on IMDb.

The key takeaway here is that 'importance' or 'centrality' only has meaning in terms of the relationships analyzed. In this case, we have not returned the most important or central movies in terms of user ratings or cultural influence.

Instead, the result is more relevant to the movie business itself. It answers the question 'Which movies have the most connections to other movies through shared actors?'

If you asked someone on the street, they would be very unlikely to guess 'Backdraft'.

Remember, degree centrality measures the number of outgoing relationships each node has.

PageRank goes a step further: a node's score depends on the scores of the nodes that link to it, so a connection from a well-connected node counts for more than one from a poorly connected node. Run PageRank on this graph, and you might be less surprised by the result.

cypher

Run PageRank on movies-only

CALL gds.pageRank.stream('movies-only') // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).title AS movie, score AS pageRank // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

(1) Call PageRank in stream mode on 'movies-only' projection
(2) Yield node IDs and PageRank scores
(3) Convert node IDs to movie titles and return with scores
(4) Sort by score in descending order
(5) Limit results to top 10 movies

You have another graph waiting in memory: 'user-interaction-network'.

It is composed of User to User relationships through Movie nodes.

Replace the graph name below to run the PageRank algorithm on this graph:

cypher

Run PageRank on user-interaction-network (replace ?????)

CALL gds.pageRank.stream( // (1)
  '?????' // (2)
)
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS user, score AS centrality // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

(1) Call PageRank in stream mode
(2) Specify the projection name (fill in 'user-interaction-network')
(3) Yield node IDs and PageRank scores
(4) Convert node IDs to user names and return with scores
(5) Sort by score in descending order
(6) Limit results to top 10 users

The resulting table provides you with a breakdown of the most 'important' users in terms of their movie ratings.

This is also a great example of the difference between PageRank and degree centrality.

Run degree centrality on the same graph to compare the results—the algorithm syntax is provided, just add the graph name:

cypher

Run degree centrality on user-interaction-network (replace ?????)

CALL gds.degree.stream( // (1)
  '?????' // (2)
)
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS user, score AS centrality // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

(1) Call degree centrality in stream mode
(2) Specify the projection name (fill in 'user-interaction-network')
(3) Yield node IDs and degree scores
(4) Convert node IDs to user names and return with scores
(5) Sort by score in descending order
(6) Limit results to top 10 users

Remember, degree centrality merely counts each node's outgoing relationships, whereas PageRank weights those relationships by the scores of the connected nodes.

So, while 'Angela Garcia' may have the most connections to other users overall, 'Zachary Carey' is the most important user when those connections are weighted by the importance of the users at the other end.

What’s next

You’ve now practiced creating multiple monopartite projections from the Movies dataset — simply by modifying the initial Cypher query used to generate the graph.

Each projection transforms multipartite data (Actor-Movie, User-Movie, Movie-Genre) into monopartite networks for specific analyses.

In the next lesson, you’ll put this knowledge to the test with a challenge that requires you to create your own monopartite projection.

Check your understanding

Understanding Contextual Importance

You run degree centrality on two different monopartite projections of the same Movies dataset:

Projection A: Movies connected through shared actors

Projection B: Movies connected through shared genres

The movie "Inception" ranks highly in Projection A but poorly in Projection B.

What does this tell you about "Inception"?

❏ The projection calculations are incorrect—a movie should rank the same regardless of projection
✓ "Inception" shares many actors with other movies but belongs to uncommon genre combinations
❏ Degree centrality is not a reliable algorithm for measuring movie importance
❏ "Inception" is more popular with audiences than with critics

Hint

Think about what connections each projection is measuring. What does a high degree in one projection vs. low degree in another tell you about the movie’s actual relationships?

Solution

"Inception" shares many actors with other movies but belongs to uncommon genre combinations is correct.

Centrality is contextual—it depends on the relationships in your projection.

High degree in the actor projection means "Inception" has many cast members who appear in other movies. Low degree in the genre projection means it has a unique genre combination.

Different projections answer different questions: actor networks vs. genre patterns. Neither ranking is wrong; they measure different aspects of the same movie.

Summary

Monopartite projections extract single node types from complex data by connecting them through intermediate nodes. You practiced creating projections for:

Actors connected through shared movies
Movies connected through shared actors
Users connected through shared ratings
Genres connected through shared movies

Each projection serves different analytical goals. Understanding how to design meaningful projections is a fundamental GDS skill—the same data can reveal completely different insights depending on how you project it.
