Introduction
In the previous lessons, you learned that GDS creates unlabelled projections by default—preserving graph structure but stripping away node labels and relationship types.
You also learned that some algorithms, like PageRank, produce poor results on bipartite structures because the structure itself traps their computations.
But bipartite structures aren’t always a problem. Some algorithms are designed for bipartite graphs and produce excellent results on them.
In this lesson, you’ll learn about bipartite and multipartite graphs, when to preserve labels in your projections, and which algorithms work well with multi-type structures.
By the end of this lesson, you will understand:
- What bipartite and multipartite graphs are
- The difference between multipartite and heterogeneous graphs
- When preserving node labels matters
- How to configure projections to preserve labels
- Which algorithms work well with bipartite structures
What is a bipartite graph?
A bipartite graph is one whose nodes can be divided into exactly two distinct, non-overlapping sets, where connections only occur between sets—never within them.
Think of it as a graph "of two parts."
Example: Customers purchasing products:
(:Customer)-[:PURCHASED]->(:Product)
(:Customer)-[:PURCHASED]->(:Product)
(:Customer)-[:PURCHASED]->(:Product)
In this structure:
- Customers connect to Products
- Customers never connect directly to other Customers
- Products never connect directly to other Products
The same is true of our (:Actor)-[:ACTED_IN]->(:Movie) projection.
The defining characteristic is that you can cleanly separate all nodes into two groups where every relationship crosses the divide and no node connects to any other node on the same side.
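As a rough illustration outside Neo4j, the "two groups, every relationship crosses the divide" rule can be tested by trying to two-colour the graph. The sketch below is plain Python rather than anything GDS provides; the function name `two_colour` and the toy purchase data are assumptions made for the example.

```python
from collections import deque


def two_colour(edges):
    """Return a {node: 0 or 1} colouring if the graph is bipartite, else None."""
    neighbours = {}
    for a, b in edges:
        # Direction is irrelevant for this check, so store both ways.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    colour = {}
    for start in neighbours:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node]:
                if other not in colour:
                    # Put each neighbour in the opposite set.
                    colour[other] = 1 - colour[node]
                    queue.append(other)
                elif colour[other] == colour[node]:
                    return None  # a relationship stays inside one set
    return colour


# Toy data (hypothetical): customers purchasing products.
purchases = [("alice", "laptop"), ("alice", "phone"), ("bob", "phone")]
print(two_colour(purchases) is not None)                       # True
print(two_colour(purchases + [("alice", "bob")]) is not None)  # False
```

Adding a single customer-to-customer relationship is enough to break the property, which is the same reason a within-set relationship disqualifies a graph from being bipartite.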
What is a multipartite graph?
A multipartite graph is simply an extension of the concept to three, four, or more non-overlapping sets.
Here’s an example from the Movies dataset.
This graph has four node types and multiple relationship types:
- (:Actor)-[:ACTED_IN]->(:Movie)
- (:User)-[:RATED]->(:Movie)
- (:Movie)-[:IN_GENRE]->(:Genre)
Each node type forms a distinct set, and every relationship crosses between sets—no Actor connects to another Actor, no Movie to another Movie, and so on.
What is a heterogeneous graph?
A heterogeneous graph is a graph with multiple node types and/or relationship types, where nodes of the same type may connect to each other.
The Movies example above is both multipartite and heterogeneous—it has multiple node and relationship types, and happens to have no within-set connections.
But if we added (:Actor)-[:FRIENDS_WITH]->(:Actor) relationships, the graph would still be heterogeneous but no longer multipartite, because actors would now connect to other actors.
All multipartite graphs are heterogeneous, but not all heterogeneous graphs are multipartite.
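The distinction can be sketched as a check over relationship patterns. This is a simplified Python illustration, not a GDS feature: it assumes every node carries exactly one label, treats each label as one node set, and uses a hypothetical helper named `describe`.

```python
def describe(patterns):
    """Classify (source_label, rel_type, target_label) patterns by label sets."""
    labels = {label for s, _, t in patterns for label in (s, t)}
    within_set = [p for p in patterns if p[0] == p[2]]
    if len(labels) == 1:
        return "monopartite"
    if within_set:
        # Several node types, but at least one type connects to itself.
        return "heterogeneous only"
    return "bipartite" if len(labels) == 2 else "multipartite"


movies = [
    ("Actor", "ACTED_IN", "Movie"),
    ("User", "RATED", "Movie"),
    ("Movie", "IN_GENRE", "Genre"),
]
print(describe(movies))                                          # multipartite
print(describe(movies + [("Actor", "FRIENDS_WITH", "Actor")]))   # heterogeneous only
```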
Recap: Structure vs Labels
Remember the key distinction from the previous lessons:
Structure = how nodes actually connect
- Monopartite: a single set of nodes that connect directly to each other (for example, Actor-to-Actor)
- Bipartite: exactly two non-overlapping sets, connections only between them
- Multipartite: three or more non-overlapping sets, connections only between sets
Labels = what GDS knows about node/relationship types
- Labelled: GDS preserves Actor, Movie, ACTED_IN, etc.
- Unlabelled: GDS sees generic "Node" and generic relationships
By default, GDS creates unlabelled projections. The structure remains unchanged, but GDS loses awareness of which nodes are which type.
When labels matter
In the previous lessons, we saw that PageRank fails on bipartite structures—not because of missing labels, but because of the structure itself. Adding labels wouldn’t fix PageRank on a bipartite graph.
So when do labels matter?
Labels matter when:
- You need to filter algorithm execution by type: run PageRank only on Actor nodes, for example
- The algorithm uses labels for its logic: some algorithms behave differently based on node types
- You’re writing results back and need type awareness: knowing which nodes received which scores
Labels don’t matter when:
- You’ve already projected a single-type subgraph: your Actor-to-Actor projection only contains Actors anyway
- The algorithm ignores types entirely: Degree centrality just counts connections
- Structure is your only concern: the algorithm’s behaviour depends on connection patterns, not labels
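To make the "type awareness" point concrete, here is a toy Python sketch, not GDS code. It computes a simple degree count for every node, then uses a label lookup to report scores for one type only. The node names, the `degree` and `scores_for` helpers, and the choice to count relationships in both directions are all assumptions made for illustration.

```python
from collections import Counter

# Toy data (hypothetical): two actors and two movies.
relationships = [
    ("actor_a", "movie_x"),
    ("actor_b", "movie_x"),
    ("actor_a", "movie_y"),
]
labels = {
    "actor_a": "Actor",
    "actor_b": "Actor",
    "movie_x": "Movie",
    "movie_y": "Movie",
}


def degree(relationships):
    """Count the connections of each node, ignoring types entirely."""
    counts = Counter()
    for source, target in relationships:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


def scores_for(label, scores, labels):
    """Keep only the scores of nodes carrying the given label."""
    return {node: score for node, score in scores.items() if labels[node] == label}


scores = degree(relationships)
print(scores_for("Actor", scores, labels))  # {'actor_a': 2, 'actor_b': 1}
```

The degree computation never looks at `labels`, but separating Actor scores from Movie scores afterwards is impossible without it.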
Preserving labels in projections
To preserve labels, use the configuration parameters in your projection:
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-movies-labelled', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (9)
Projection breakdown
(1) Match Actor nodes connected to Movie nodes via ACTED_IN relationships
(2) Call the GDS projection function
(3) Name the projection 'actors-movies-labelled'
(4) Include source (Actor) nodes
(5) Include target (Movie) nodes
(6) Preserve source node labels (Actor)
(7) Preserve target node labels (Movie)
(8) Preserve relationship type (ACTED_IN)
(9) Return projection statistics
Key configuration parameters:
- sourceNodeLabels: labels(source) preserves the label(s) from source nodes
- targetNodeLabels: labels(target) preserves the label(s) from target nodes
- relationshipType: type(r) preserves the relationship type
Now GDS creates a labelled bipartite projection.
The structure is unchanged—still bipartite. But now GDS knows which nodes are Actors and which are Movies.
Creating multipartite projections
The same configuration works for three or more node types:
MATCH (source)-[r]->(target) // (1)
WHERE (source:Actor OR source:Movie OR source:User OR source:Genre)
  AND (target:Actor OR target:Movie OR target:User OR target:Genre) // (2)
WITH gds.graph.project( // (3)
  'movies-multipartite', // (4)
  source, // (5)
  target, // (6)
  {
    sourceNodeLabels: labels(source), // (7)
    targetNodeLabels: labels(target), // (8)
    relationshipType: type(r) // (9)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (10)
Projection breakdown
(1) Match any source and target nodes with relationships
(2) Filter to include only Actor, Movie, User, or Genre nodes
(3) Call the GDS projection function
(4) Name the projection 'movies-multipartite'
(5) Include source nodes
(6) Include target nodes
(7) Preserve source node labels
(8) Preserve target node labels
(9) Preserve relationship types
(10) Return projection statistics
This preserves all four node types and their relationship types in a single labelled projection.
Algorithms designed for bipartite graphs
Some algorithms are specifically designed to work with bipartite structures, not against them.
Node Similarity compares nodes based on their shared neighbours—perfect for bipartite graphs where one set of nodes connects to another.
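To give a feel for what "shared neighbours" means numerically, the sketch below computes a Jaccard score (shared neighbours divided by all distinct neighbours of the pair) for each pair of users. It is a plain-Python approximation and not how GDS is implemented; Jaccard is used here on the understanding that it is Node Similarity's default metric, and the user and movie identifiers are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data (hypothetical): which movies each user rated.
rated = {
    "u1": {"m1", "m2", "m3"},
    "u2": {"m2", "m3", "m4"},
    "u3": {"m5"},
}

for first, second in combinations(sorted(rated), 2):
    print(first, second, jaccard(rated[first], rated[second]))
# u1 u2 0.5
# u1 u3 0.0
# u2 u3 0.0
```

Users u1 and u2 share two of the four distinct movies between them, so they score 0.5; u3 shares nothing with either and scores 0.0.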
Let’s see it in action. First, project Users and Movies:
MATCH (source:User)-[r:RATED]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'user-rated-movie', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (9)
Projection breakdown
(1) Match User nodes connected to Movie nodes via RATED relationships
(2) Call the GDS projection function
(3) Name the projection 'user-rated-movie'
(4) Include source (User) nodes
(5) Include target (Movie) nodes
(6) Preserve source node labels
(7) Preserve target node labels
(8) Preserve relationship types
(9) Return projection statistics
This creates a labelled bipartite projection of Users and Movies.
Running Node Similarity
Node Similarity finds nodes that share many neighbours. In a bipartite graph, it compares nodes on one side based on their connections to the other side.
CALL gds.nodeSimilarity.write( // (1)
  'user-rated-movie', // (2)
  {
    writeRelationshipType: 'SIMILAR', // (3)
    writeProperty: 'score' // (4)
  }
)
YIELD nodesCompared, relationshipsWritten // (5)
Algorithm breakdown
(1) Call Node Similarity algorithm in write mode
(2) Run on 'user-rated-movie' projection
(3) Write new relationships with type 'SIMILAR'
(4) Write similarity scores as 'score' property
(5) Yield the number of nodes compared and relationships written
write() mode
You’re using .write() mode to persist results to the database. This differs from .stream(), which only returns results without storing them. You’ll learn when to use each execution mode in Module 3.
Understanding the results
Let’s verify what Node Similarity created. First, check if it created any User-to-Movie similarity relationships:
MATCH (u:User)-[r:SIMILAR]->(m:Movie) // (1)
RETURN u, r, m // (2)
LIMIT 1 // (3)
Query breakdown
(1) Try to match User nodes with SIMILAR relationships to Movie nodes
(2) Return the matched users, relationships, and movies
(3) Limit to 1 result (should return nothing)
This returns no results. Why?
Node Similarity only considers nodes similar if they share neighbours. In our bipartite structure:
- Users connect to Movies, so a User’s neighbours are always Movies
- Users never directly neighbour other Users, and Movies have no outgoing relationships in this projection
- A User and a Movie therefore never share a neighbour, so Users can only be similar to other Users (based on shared Movie connections)
The algorithm respects the bipartite structure: it compares nodes within the same partition based on their shared connections to the other partition.
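The reasoning above can be checked on a tiny example. This Python sketch is an illustration only: it assumes neighbours are the targets of outgoing relationships, as in the User-to-Movie projection, and uses made-up identifiers and a hypothetical `shared` helper.

```python
# Toy data (hypothetical): RATED relationships pointing from users to movies.
rated = [("u1", "m1"), ("u1", "m2"), ("u2", "m2")]

outgoing = {}
for user, movie in rated:
    outgoing.setdefault(user, set()).add(movie)
    outgoing.setdefault(movie, set())  # movies have no outgoing relationships


def shared(a, b):
    """Neighbours that both nodes point to."""
    return outgoing[a] & outgoing[b]


print(shared("u1", "u2"))  # {'m2'}: two users can share a movie
print(shared("u1", "m2"))  # set(): a user and a movie share nothing
```

Because a User and a Movie never have a neighbour in common, no User-to-Movie pair can receive a similarity score, which matches the empty result of the query above.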
Now let’s see the relationships it did create:
MATCH path = (u1:User)-[s:SIMILAR]->(u2:User)-[:RATED]->(m:Movie) // (1)
RETURN path // (2)
LIMIT 10 // (3)
Query breakdown
(1) Match paths showing similar users and movies they’ve rated
(2) Return the complete path
(3) Limit to 10 results
You should see User nodes connected by SIMILAR relationships.
Node Similarity found Users who rated similar Movies and created direct User-to-User relationships based on that pattern.
Before running an algorithm, always check whether it expects a monopartite graph or is designed to handle bipartite structures.
Check your understanding
Identifying Graph Types
Which network is multipartite?
- ❏ A network with only Actors connected by WORKED_WITH relationships
- ❏ A network with Users connected to Movies
- ✓ A network with Actors, Movies, Users, and Genres interconnected
- ❏ A network with only Movies connected by SIMILAR_TO relationships
Hint
Bipartite graphs have two node types; multipartite graphs have three or more.
Solution
A network with Actors, Movies, Users, and Genres interconnected is correct.
This is multipartite because it has four node types.
The other options are either monopartite (one type) or bipartite (two types).
Label Preservation Configuration
You want to project customers and products while preserving their labels. Which configuration is correct?
- ❏ {nodeLabels: ['Customer', 'Product']}
- ✓ {sourceNodeLabels: labels(source), targetNodeLabels: labels(target)}
- ❏ {preserveLabels: true}
- ❏ {bipartite: true, types: ['Customer', 'Product']}
Hint
Look at the specific parameter names GDS uses for preserving source and target node labels.
Solution
{sourceNodeLabels: labels(source), targetNodeLabels: labels(target)} is correct.
GDS uses these specific configuration parameters to preserve node labels in projections:
- sourceNodeLabels for the labels of source nodes in your MATCH pattern
- targetNodeLabels for the labels of target nodes in your MATCH pattern
The labels() function extracts the label(s) from the matched nodes.
Summary
You now understand bipartite and multipartite graphs—structures with two or more distinct, non-overlapping node sets.
You’ve learned:
- How to preserve labels using configuration parameters
- That some algorithms (Node Similarity) are designed for bipartite structures
- How to choose between monopartite projection and label-preserved bipartite projection
In the next lesson, you’ll practice creating projections with multiple node types from the Movies dataset.