Projecting monopartite graphs

Introduction

In the previous lesson, you learned how to create basic Cypher projections. You projected actors and movies from your database into an in-memory graph.

Now let’s understand what type of graph you actually created—and why it matters for the algorithms you’ll run on it.

Here’s a critical detail: the projection you created may not behave the way you expect.

Run this projection now:

cypher
The actors-graph projection
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-graph', // (3)
  source, // (4)
  target // (5)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (6)

Projection breakdown

  1. Match Actor nodes connected to Movie nodes via ACTED_IN relationships

  2. Call the GDS projection function

  3. Name the projection 'actors-graph'

  4. Include source (Actor) nodes

  5. Include target (Movie) nodes

  6. Return projection statistics
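GDS requires graph names in the catalog to be unique, so this query fails if a projection called 'actors-graph' already exists, for example from the previous lesson. A minimal sketch for clearing it first, assuming the earlier projection used the same name. The second argument (failIfMissing) is set to false so the call simply returns nothing when no such graph exists:

```cypher
// Drop an existing 'actors-graph' projection, if there is one.
// Assumption: a projection with this name may remain from the previous lesson.
CALL gds.graph.drop('actors-graph', false)
YIELD graphName
RETURN graphName
```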

Intuitively, you may think you projected a data model that looks like this:

A node

This is what’s known as a bipartite graph—a graph whose nodes can be divided into two distinct, non-overlapping sets where connections only occur between sets (Actors connect to Movies, but Actors never connect directly to other Actors).

What you actually projected was this:

An unidentified node

The graph structure is still bipartite—Actors still only connect to Movies in the projection. But GDS has stripped away the labels, so it no longer knows which nodes are Actors and which are Movies.

This is an unlabelled projection. Understanding this distinction between structure and labels is essential for getting meaningful algorithm results.
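One way to see this for yourself is to inspect the projection's entry in the graph catalog. The sketch below assumes the schema field returned by gds.graph.list in your GDS version. For an unlabelled projection you would expect a single generic node entry and a single generic relationship entry, rather than Actor, Movie, and ACTED_IN:

```cypher
// List the catalog entry for the projection and show what GDS recorded about it.
CALL gds.graph.list('actors-graph')
YIELD graphName, nodeCount, relationshipCount, schema
RETURN graphName, nodeCount, relationshipCount, schema
```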

By the end of this lesson, you will understand:

  • What monopartite graphs are and how they differ from bipartite graphs

  • The difference between graph structure and graph labels

  • Why GDS creates unlabelled projections by default

  • How graph structure affects algorithm results

  • When and how to project true monopartite graphs

Structure vs Labels: The key distinction

Before diving deeper, let’s clarify two concepts that are easy to conflate:

Graph structure refers to how nodes connect to each other:

  • In a monopartite structure, nodes cannot be separated into distinct non-overlapping sets—any node can potentially connect to any other node

  • In a bipartite structure, nodes fall into exactly two non-overlapping sets, and connections only occur between sets

Graph labels refer to what GDS knows about node and relationship types:

  • A labelled projection preserves node labels (Actor, Movie) and relationship types (ACTED_IN)

  • An unlabelled projection treats all nodes as generic "Node" and all relationships as generic connections

By default, GDS creates unlabelled projections. This means:

  • A bipartite structure remains bipartite (Actors still only connect to Movies)

  • But GDS doesn’t know which nodes are which type

This distinction matters because some algorithms behave poorly on bipartite structures, regardless of whether labels are present.
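The structural half of this distinction can be checked in the database itself, independently of any projection. A sketch, assuming ACTED_IN relationships always start at an Actor, as in the projection query: group the relationship endpoints by their labels.

```cypher
// Which labels sit at the far end of ACTED_IN relationships that start at an Actor?
// In a bipartite Actor-Movie structure, no row should list Actor among the target labels.
MATCH (:Actor)-[:ACTED_IN]->(target)
RETURN labels(target) AS targetLabels, count(*) AS relationships
ORDER BY relationships DESC
```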

What is a monopartite graph?

A monopartite graph is one whose nodes cannot be separated into distinct non-overlapping sets based on their connection patterns.

Consider a social network where all nodes are people connected by friendships:

cypher
Project a social network
MATCH (source:Person)-[r:FRIENDS_WITH]->(target:Person) // (1)
WITH gds.graph.project( // (2)
  'social-network', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)

Projection breakdown

  1. Match Person nodes connected to other Person nodes via FRIENDS_WITH relationships

  2. Call the GDS projection function

  3. Name the projection 'social-network'

  4. Include source (Person) nodes

  5. Include target (Person) nodes

  6. Data configuration map (empty: no labels, relationship types, or properties are projected)

  7. Projection configuration map (empty: using defaults)

  8. Return projection statistics

This creates an in-memory projection that looks like this:

A network of nodes all of the same type

This is a true monopartite graph because:

  • Any Person can be friends with any other Person

  • You cannot divide the nodes into separate groups where connections only occur between groups

  • The structure itself is monopartite

For this graph, stripping labels doesn’t change anything meaningful—it was already a single node type connecting to itself. Running algorithms on this unlabelled projection will produce the same results as a labelled version.

Homogeneous graphs

A graph with a single node type and single relationship type is also called a homogeneous graph.

The default projection behaviour

Now let’s return to our Movies dataset. When we projected it using this command:

cypher
Default projection behaviour
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-graph', // (3)
  source, // (4)
  target // (5)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (6)

Projection breakdown

  1. Match Actor nodes connected to Movie nodes via ACTED_IN relationships

  2. Call the GDS projection function

  3. Name the projection 'actors-graph'

  4. Include source (Actor) nodes

  5. Include target (Movie) nodes

  6. Return projection statistics

We created an unlabelled bipartite projection:

A graph of nodes without labels and relationships without types

What GDS sees:

  • All nodes appear as a generic "Node" type (no distinction between Actor and Movie)

  • All relationships appear as a generic type (no "ACTED_IN" label)

  • Algorithms cannot distinguish between node types

What the structure still is:

  • Bipartite—the original Actor nodes still only connect to original Movie nodes

  • No Actor-to-Actor connections exist

  • No Movie-to-Movie connections exist

The structure hasn’t changed — only GDS’s awareness of node types has been removed.

Why structure matters for algorithms

Earlier, we ran degree centrality on this graph. Degree centrality simply counts connections for each node. It doesn’t care about graph structure—it produces valid results on any graph.
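As a reminder of what that looks like, here is a minimal sketch of degree centrality in stream mode on the same projection. It assumes Actor nodes carry a name property and Movie nodes a title property, as the PageRank queries in this lesson do. With default settings, degree counts outgoing relationships, so each Actor scores the number of ACTED_IN relationships leaving it and each Movie scores 0.

```cypher
// Stream degree centrality for every node in the projection.
CALL gds.degree.stream('actors-graph')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
// Actors have a name, Movies have a title (assumed property names).
RETURN coalesce(n.name, n.title) AS name, score
ORDER BY score DESC
LIMIT 10
```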

However, other algorithms behave differently depending on structure.

Take PageRank for example. PageRank ranks nodes by "importance" based on the importance of nodes that link to them. It works by simulating a "random walk" through the graph, where importance flows along relationships.

Let’s run PageRank on our current projection:

cypher
Run PageRank on actors-graph
CALL gds.pageRank.stream( // (1)
  'actors-graph',         // (2)
  {}                      // (3)
)
YIELD nodeId, score       // (4)
RETURN coalesce(gds.util.asNode(nodeId).name, gds.util.asNode(nodeId).title) AS name, score // (5)
ORDER BY score DESC // (6)

Algorithm breakdown

  1. Call PageRank algorithm in stream mode

  2. Run on the 'actors-graph' projection

  3. Configuration map (empty - using defaults)

  4. Yield node IDs and PageRank scores

  5. Convert node IDs to names and return with scores

  6. Sort by score in descending order

You should notice that the scores tell you very little. Every Actor receives the same minimum score, and only Movies score higher. This isn’t useful.

Why does this happen?

In our bipartite structure:

Two clusters showing the bipartite structure with unlabelled nodes

PageRank’s "importance" flows along relationship directions. In our Actor→Movie structure:

  1. Importance flows from Actors into Movies

  2. Movies have no outgoing relationships (in this projection)

  3. Movies become "rank sinks"—accumulating importance with nowhere to pass it

The result: Movies accumulate all the PageRank score based purely on how many Actors appeared in them. This tells us nothing meaningful about actor importance.

The problem isn’t the missing labels—it’s the bipartite structure itself. Even if we preserved the Actor and Movie labels, PageRank would still flow into Movies and get trapped there.
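The effect can be worked through by hand. The sketch below assumes the unnormalised PageRank formulation described in the GDS documentation, with damping factor d (0.85 by default), and ignores iteration and convergence details. Here out(u) is the number of outgoing relationships of node u.

```latex
% General update rule: v collects a share of the score of every node u pointing at it
PR(v) = (1 - d) + d \sum_{u \to v} \frac{PR(u)}{\mathrm{out}(u)}

% An Actor a has no incoming relationships in this projection
PR(a) = 1 - d = 0.15

% A Movie m receives score from each Actor in its cast
PR(m) = 0.15 + 0.85 \sum_{a \in \mathrm{cast}(m)} \frac{0.15}{\mathrm{out}(a)}

% Example: three cast members who each appeared in this movie only
PR(m) = 0.15 + 0.85 \times 3 \times 0.15 = 0.5325
```

Under these assumptions every Actor ends up with the same baseline score, and a Movie's score depends only on the size of its cast and how many other movies those cast members appeared in.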

Projecting a true monopartite graph

To get meaningful PageRank results for actors, we need to change the structure, not just the labels.

If we want to know which actors are most important, we need a graph where actors connect to other actors. We can create this by treating shared movies as implicit connections:

cypher
Project actors connected through movies
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actors-only', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (8)

Projection breakdown

  1. Match Actor nodes connected through Movie nodes (Movies are traversed but not captured)

  2. Call the GDS projection function

  3. Name the projection 'actors-only'

  4. Include source (Actor) nodes

  5. Include target (Actor) nodes

  6. Data configuration map (empty: no labels, relationship types, or properties are projected)

  7. Projection configuration map (empty: using defaults)

  8. Return projection statistics

This creates a fundamentally different structure:

Actors connected directly to other actors

Now we have a true monopartite graph:

  • Only Actor nodes exist in the projection

  • Actors connect directly to other Actors (through shared movie appearances)

  • Any Actor can potentially connect to any other Actor

  • The nodes cannot be separated into distinct non-overlapping sets
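To see which connections this pattern produces, the same MATCH can be run as a plain Cypher query. This is only an inspection sketch, assuming Actor nodes have a name property; it does not create or change any projection. Note that each pair appears twice, once in each direction, and that the pattern matches once per shared movie.

```cypher
// List the actor pairs that share the most movies.
MATCH (source:Actor)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(target:Actor)
RETURN source.name AS actor, target.name AS coActor, count(m) AS sharedMovies
ORDER BY sharedMovies DESC
LIMIT 10
```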

Let’s run PageRank on this new projection:

cypher
Run PageRank on actors-only projection
CALL gds.pageRank.stream( // (1)
  'actors-only', // (2)
  {} // (3)
)
YIELD nodeId, score // (4)
RETURN gds.util.asNode(nodeId).name, score // (5)
ORDER BY score DESC // (6)

Algorithm breakdown

  1. Call PageRank algorithm in stream mode

  2. Run on the 'actors-only' projection

  3. Configuration map (empty - using defaults)

  4. Yield node IDs and PageRank scores

  5. Convert node IDs to names and return with scores

  6. Sort by score in descending order

stream() mode

In this example, we’re illustrating the impact of projections on algorithms with .stream() mode—this runs the algorithm and shows results without writing them. You’ll learn to use all five execution modes in detail in Module 3.

Now PageRank returns meaningful results—Gérard Depardieu as the most important actor, with other actors ranked by their relative importance in the collaboration network.

What changed:

  • Before: Bipartite structure (Actor→Movie) caused importance to flow into Movies and get trapped

  • After: Monopartite structure (Actor↔Actor) allows importance to flow between actors based on collaboration patterns

When to project monopartite graphs

Project a true monopartite graph when:

  • You want to analyse relationships within a single entity type

  • You’re using algorithms that expect monopartite structure (PageRank, many centrality measures)

  • The intermediate nodes (like Movies) are just "bridges" for your analysis

Keep bipartite structure when:

  • You’re using algorithms designed for bipartite graphs (Node Similarity, some recommendation algorithms)

  • The relationship between different types is what you’re analysing

  • You need to distinguish between node types for filtering

We’ll explore bipartite projections and label preservation in a later lesson.

Check your understanding

Identifying monopartite behaviour

You project a graph of customers and products using this query:

cypher
MATCH (source:Customer)-[r:PURCHASED]->(target:Product)
WITH gds.graph.project(
  'customer-products',
  source,
  target,
  {},
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

How does GDS treat this projection by default?

  • ❏ As a bipartite graph with Customer and Product as distinct types

  • ✓ As a monopartite graph where all nodes are treated as the same type

  • ❏ As a normal graph with whatever labels are in the Cypher statement

  • ❏ As an invalid projection because it contains two label types

Hint

Remember GDS’s default behaviour when no label configuration is specified.

Solution

As a monopartite graph where all nodes are treated as the same type is correct.

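By default, GDS ignores node labels and treats all nodes in a projection as a single, generic type, even if your source data has multiple labels. The connection structure is unchanged, though: Customers still only connect to Products, so the graph remains structurally bipartite. To preserve labels, you must explicitly configure sourceNodeLabels and targetNodeLabels.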
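A minimal sketch of what that configuration could look like for the quiz query. The graph name 'customer-products-labelled' is hypothetical. The label keys go in the fourth argument, and relationshipType is included here so that the PURCHASED type is kept as well.

```cypher
MATCH (source:Customer)-[r:PURCHASED]->(target:Product)
WITH gds.graph.project(
  'customer-products-labelled', // hypothetical graph name
  source,
  target,
  {
    sourceNodeLabels: labels(source), // keep Customer
    targetNodeLabels: labels(target), // keep Product
    relationshipType: type(r)         // keep PURCHASED
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
```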

Summary

You now understand the critical difference between graph structure (monopartite vs bipartite) and graph labels (what GDS knows about types).

You’ve seen how bipartite structures can cause algorithms like PageRank to produce meaningless results—not because of missing labels, but because of how the structure traps the algorithm’s computations.

When you need to analyse relationships within a single entity type, project a true monopartite graph by connecting nodes through their shared neighbours.

In the next lesson, you’ll learn about bipartite and multipartite graphs, and when preserving labels matters.
