Projecting bipartite and multipartite graphs

Introduction

You’ve learned that GDS creates monopartite projections by default, treating all nodes as a single type regardless of their original labels.

But what if you need to preserve the distinction between node types? What if it matters that some nodes are Actors and others are Movies?

In this lesson, you’ll learn about bipartite and multipartite graphs—graphs with two or more distinct node types—and how to create projections that preserve these distinctions in GDS.

By the end of this lesson, you will understand:

What bipartite and multipartite graphs are
The difference between multipartite and heterogeneous graphs
Why preserving node labels matters for certain analyses
How to configure projections to preserve multiple node types in GDS

What is a bipartite graph?

A bipartite graph has two distinct types of nodes, with relationships only connecting nodes of different types—never nodes of the same type.

Think of it as a graph "of two parts."

Example: Customers purchasing products:

(:Customer)-[:PURCHASED]->(:Product)
(:Customer)-[:PURCHASED]->(:Product)
(:Customer)-[:PURCHASED]->(:Product)

(No Customer-to-Customer connections)
(No Product-to-Product connections)

In a true bipartite graph:

Type A nodes connect only to Type B nodes
Type B nodes connect only to Type A nodes
No connections exist between nodes of the same type
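
One way to check whether stored data really is bipartite is to count the relationships that connect two nodes of the same type. The sketch below assumes that the Customer and Product labels from the example above exist in your database. A result of 0 suggests the graph is bipartite with respect to those two labels.

```cypher
// Count relationships whose endpoints share a type.
// Customer and Product are the example labels from this lesson;
// they are an assumption about your data, not part of the Movies dataset.
MATCH (a)-[r]->(b)
WHERE (a:Customer AND b:Customer)
   OR (a:Product AND b:Product)
RETURN count(r) AS sameTypeRelationships
```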

A bipartite graph. A row of Actor nodes in a top row connect to a row of Movie nodes in a bottom row. No nodes within the same row connect to each other.

What is a multipartite graph?

A multipartite graph contains three or more distinct node types.

You could also call this a 'heterogeneous graph' if it has multiple relationship types as well.

In practice:

Multipartite = multiple node types (3+)
Heterogeneous = more than one node type and/or multiple relationship types

Here’s an example from the Movies dataset you’re experimenting with:

In this image we have the following nodes and relationships:

(:Actor)-[:ACTED_IN]->(:Movie)
(:User)-[:RATED]->(:Movie)
(:Movie)-[:IN_GENRE]->(:Genre)

This graph has four node types (Actor, User, Movie, Genre) and three relationship types, making it both multipartite and heterogeneous.
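
To see which node types and relationship types a database actually contains, you can group relationships by the labels at each end. This is a minimal sketch in plain Cypher (no GDS required). It scans every relationship, so it may be slow on a large database.

```cypher
// List each (source labels, relationship type, target labels) combination
// together with how many relationships match it.
MATCH (a)-[r]->(b)
RETURN labels(a) AS sourceLabels,
       type(r)   AS relationshipType,
       labels(b) AS targetLabels,
       count(*)  AS relationships
ORDER BY relationships DESC
```

More than two distinct label sets in the output indicates a multipartite graph; more than one relationship type indicates a heterogeneous one.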

Understanding what you’re actually projecting

Remember when we projected actors and movies?

cypher

Default projection without label preservation

MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-movies', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)

Projection breakdown

Match Actor nodes connected to Movie nodes via ACTED_IN relationships
Call the GDS projection function
Name the projection 'actors-movies'
Include source (Actor) nodes
Include target (Movie) nodes
First configuration map (empty - labels not preserved)
Second configuration map (empty - using defaults)
Return projection statistics

Even though this graph is naturally bipartite, GDS sees this projection as a monopartite graph—treating all nodes as a single generic type, ignoring the Actor and Movie labels.

This happens because we have not specified that labels should be preserved. GDS is projecting exactly what we asked for.
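
If you want to confirm what a projection contains, the graph catalog can report it. The sketch below assumes the 'actors-movies' projection from above still exists in the catalog. The exact shape of the schema column can vary between GDS versions, so treat the output as indicative.

```cypher
// Inspect the projection's size and schema in the GDS graph catalog.
CALL gds.graph.list('actors-movies')
YIELD graphName, nodeCount, relationshipCount, schema
RETURN graphName, nodeCount, relationshipCount, schema
```

For a projection made without label settings, expect the schema to show one generic node label and one generic relationship type rather than Actor, Movie, and ACTED_IN.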

For certain algorithms, this doesn’t matter. For others, you may actually want to treat different node types as if they are the same.

In most cases, when your data has multiple node types, you’ll want to project a monopartite subgraph.

However, in some cases, you’ll want to project multiple node types, and preserve the distinctions between node types in your projection.

Preserving labels in projections

In earlier lessons, you saw the empty configuration settings in graph projection syntax.

To preserve labels, use these configuration parameters:

cypher

Project bipartite graph with label preservation

MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-movies-bipartite', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (9)

Projection breakdown

Match Actor nodes connected to Movie nodes via ACTED_IN relationships
Call the GDS projection function
Name the projection 'actors-movies-bipartite'
Include source (Actor) nodes
Include target (Movie) nodes
Preserve source node labels
Preserve target node labels
Preserve relationship types
Return projection statistics

Key configuration parameters:

sourceNodeLabels: labels(source): Preserves the label(s) from source nodes
targetNodeLabels: labels(target): Preserves the label(s) from target nodes
relationshipType: type(r): Preserves the relationship type

Now GDS creates a properly labeled bipartite graph:

An Actor node connects to a Movie node via an ACTED_IN relationship.

Actor nodes retain their Actor label
Movie nodes retain their Movie label
Relationships retain their types

You do not have to use all three of these settings at any given time — but there’s no harm in doing so.

Creating multipartite projections

The same configuration works for three or more node types:

cypher

Project multipartite graph with multiple node types

MATCH (source)-[r]->(target) // (1)
WHERE (source:Actor OR source:Movie OR source:User OR source:Genre)
  AND (target:Actor OR target:Movie OR target:User OR target:Genre) // (2)
WITH gds.graph.project( // (3)
  'movies-network-multi', // (4)
  source, // (5)
  target, // (6)
  {
    sourceNodeLabels: labels(source), // (7)
    targetNodeLabels: labels(target), // (8)
    relationshipType: type(r) // (9)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (10)

Projection breakdown

Match any source and target nodes with relationships
Filter to include only Actor, Movie, User, or Genre nodes
Call the GDS projection function
Name the projection 'movies-network-multi'
Include source nodes
Include target nodes
Preserve source node labels
Preserve target node labels
Preserve relationship types
Return projection statistics

This preserves all four node types (Actor, Movie, User, Genre) and their relationship types in a single projection.
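
A filter that mixes OR and AND, like the WHERE clause above, depends on operator precedence. Cypher evaluates AND before OR, so each side of the AND needs its own parentheses to be read as a group. This small query uses only literal values and shows the difference:

```cypher
// AND binds more tightly than OR.
RETURN true OR false AND false   AS withoutParentheses, // true OR (false AND false) = true
       (true OR false) AND false AS withParentheses     // true AND false = false
```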

When label preservation matters

Use default monopartite (no label configuration) when:

All nodes genuinely represent the same entity type
Node labels don’t affect your analysis goals
You intentionally want to treat different node types as the same

Preserve labels (bipartite/multipartite) when:

You need to distinguish between node types
You’ll filter algorithms by node type (e.g., run only on Actors)
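
As an illustration of filtering by node type, the sketch below runs degree centrality on only the User and Movie nodes of a labeled projection. It assumes the 'movies-network-multi' projection from the previous section is still in the graph catalog. Degree centrality is used only because it is a simple algorithm to demonstrate with; it is not otherwise part of this lesson.

```cypher
// Restrict the algorithm to two node labels and one relationship type.
// A relationship is only kept if both of its endpoints carry a listed label,
// so both sides of RATED (User and Movie) are included.
CALL gds.degree.stream('movies-network-multi', {
  nodeLabels: ['User', 'Movie'],
  relationshipTypes: ['RATED']
})
YIELD nodeId, score
RETURN labels(gds.util.asNode(nodeId)) AS nodeLabels, score
ORDER BY score DESC
LIMIT 10
```

This kind of filter is only available when the projection preserved labels and relationship types in the first place.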

Algorithm considerations

Not all algorithms work well with multiple node types.

Throughout this course, as you encounter each algorithm, we’ll show you how to decide on your projection type and explain the reasoning.

For now, to illustrate this, let’s project our Users and Movies to a graph.

cypher

Project user-movie bipartite graph

MATCH (source:User)-[r:RATED]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'user-rated-movie', // (3)
  source, // (4)
  target, // (5)
  {
    sourceNodeLabels: labels(source), // (6)
    targetNodeLabels: labels(target), // (7)
    relationshipType: type(r) // (8)
  },
  {}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (9)

Projection breakdown

Match User nodes connected to Movie nodes via RATED relationships
Call the GDS projection function
Name the projection 'user-rated-movie'
Include source (User) nodes
Include target (Movie) nodes
Preserve source node labels
Preserve target node labels
Preserve relationship types
Return projection statistics

Now, you have projected a graph that really does look like this:

a graph model of a User node connecting to a Movie node via a RATED relationship.

The labels persist, and we can now use them when running an algorithm.

Running node similarity

Node similarity is an algorithm that compares nodes based on the nodes they are connected to. When you run it in write mode, it writes new relationships back to the database.

Two nodes are considered similar if they share many of the same neighbors.
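
By default, GDS node similarity measures shared neighbors with the Jaccard score: the number of shared neighbors divided by the number of distinct neighbors across both nodes. The sketch below works the arithmetic by hand for two hypothetical users. The lists are illustrative values, not data read from the Movies dataset.

```cypher
// Two hypothetical users and the movies each one rated.
WITH ['Toy Story', 'Heat', 'Casino'] AS aliceMovies,
     ['Heat', 'Casino', 'Jumanji']   AS bobMovies
// Movies rated by both users.
WITH aliceMovies, bobMovies,
     [m IN aliceMovies WHERE m IN bobMovies] AS shared
// Union size = |A| + |B| - |A ∩ B|.
WITH size(shared) AS sharedCount,
     size(aliceMovies) + size(bobMovies) - size(shared) AS unionCount
RETURN sharedCount,                                // 2
       unionCount,                                 // 4
       toFloat(sharedCount) / unionCount AS jaccard // 0.5
```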

You can learn more about the node similarity algorithm in the GDS docs.

For now, all you need to know is that node similarity works extremely well on bipartite graphs.

Let’s run node similarity on our bipartite user-rated-movie graph.

cypher

Run node similarity on bipartite graph

CALL gds.nodeSimilarity.write( // (1)
  'user-rated-movie', // (2)
  {
    writeRelationshipType: 'SIMILAR', // (3)
    writeProperty: 'score' // (4)
  })
YIELD nodesCompared, relationshipsWritten // (5)

Algorithm breakdown

Call node similarity algorithm in write mode
Run on 'user-rated-movie' projection
Write new relationships with type 'SIMILAR'
Write similarity scores as 'score' property
Yield the number of nodes compared and relationships written

write() mode

You’re using .write() mode to persist results to the database. This is different from .stream() which only displays results. You’ll learn when to use each execution mode in Module 3.
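
For comparison, here is a hedged sketch of the same algorithm in stream mode, which returns results without changing the database. It assumes the 'user-rated-movie' projection exists. The topK value is an illustrative choice, not a recommendation.

```cypher
// Stream similarity pairs instead of writing them.
CALL gds.nodeSimilarity.stream('user-rated-movie', {topK: 3})
YIELD node1, node2, similarity
RETURN labels(gds.util.asNode(node1)) AS firstNodeLabels,
       labels(gds.util.asNode(node2)) AS secondNodeLabels,
       similarity
ORDER BY similarity DESC
LIMIT 10
```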

Now, to verify what it has done — or rather, what it has not done — run the following query:

cypher

Check for User-to-Movie similarity (should be empty)

MATCH (u:User)-[r:SIMILAR]->(m:Movie) // (1)
RETURN u, r, m // (2)
LIMIT 1 // (3)

Query breakdown

Try to match User nodes with SIMILAR relationships to Movie nodes
Return the matched users, relationships, and movies
Limit to 1 result (should return nothing)

This query should return no records.

Node similarity only considers two nodes similar if they share neighbors. In our projection, a User’s only neighbors are the Movies they rated, so a User and a Movie never have a neighbor in common, and the algorithm never creates a SIMILAR relationship between them.

In a bipartite graph, every node on one side exclusively connects to nodes on the other side of the divide.

A bipartite graph. A row of User nodes in a top row connect to a row of Movie nodes in a bottom row. No nodes within the same row connect to each other.

The relationships that we created with the algorithm only exist between nodes on the same side of the graph.
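
To check this across the whole database rather than one pattern at a time, you can group the SIMILAR relationships by the labels at each end. This is a minimal sketch in plain Cypher. It assumes the write step above has completed.

```cypher
// Summarize which node types the SIMILAR relationships connect.
MATCH (a)-[:SIMILAR]->(b)
RETURN labels(a) AS fromLabels,
       labels(b) AS toLabels,
       count(*)  AS relationships
```

If the algorithm behaved as described, every row should have User on both ends.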

Run the following query to see some of your new SIMILAR relationships.

cypher

View similar users and movies

MATCH path = (:User)-[:SIMILAR]->(:User)-[:RATED]->(:Movie) // (1)
RETURN path // (2)
LIMIT 10 // (3)

Query breakdown

Match a path showing similar users and the movies they have rated.
Return the complete path.
Limit to 10 results.

You should now find that User nodes have received SIMILAR relationships to each other, based on their movie rating behaviors.

You could use these newly inferred relationships in multiple ways — not least, as part of a content recommendation system.
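
As a rough sketch of the recommendation idea, the query below suggests movies rated by similar users that the chosen user has not rated yet. It assumes the SIMILAR relationships and their score property were written as shown earlier, and that Movie nodes have a title property, which is an assumption about the dataset. The user is picked arbitrarily for illustration, so the query may return nothing if that user has no SIMILAR relationships.

```cypher
MATCH (u:User)
WITH u LIMIT 1                          // arbitrary user, for illustration only
MATCH (u)-[s:SIMILAR]->(other:User)-[:RATED]->(m:Movie)
WHERE NOT (u)-[:RATED]->(m)             // skip movies the user already rated
RETURN m.title AS movie,                // title is an assumed property name
       sum(s.score) AS weight           // add up similarity of users who rated it
ORDER BY weight DESC
LIMIT 5
```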

What’s next

You now understand bipartite and multipartite graphs—structures with two or more distinct node types. You’ve learned how to preserve these distinctions using label and relationship configuration settings.

In the next lesson, you’ll practice creating projections with multiple node types from the Movies dataset.

Check your understanding

Identifying Graph Types

Which network is multipartite?

❏ A network with only Actors connected by WORKED_WITH relationships
❏ A network with Users connected to Movies
✓ A network with Actors, Movies, Users, and Genres interconnected
❏ A network with only Movies connected by SIMILAR_TO relationships

Hint

Bipartite graphs have two node types; multipartite graphs have three or more.

Solution

A network with Actors, Movies, Users, and Genres interconnected is correct.

This is multipartite because it has four node types.

The other options are either monopartite (one type) or bipartite (two types).

Label Preservation Configuration

You want to project customers and products while preserving their labels. Which configuration is correct?

❏ {nodeLabels: ['Customer', 'Product']}
✓ {sourceNodeLabels: labels(source), targetNodeLabels: labels(target)}
❏ {preserveLabels: true}
❏ {bipartite: true, types: ['Customer', 'Product']}

Hint

Look at the specific parameter names GDS uses for preserving source and target node labels.

Solution

{sourceNodeLabels: labels(source), targetNodeLabels: labels(target)} is correct.

GDS uses these specific configuration parameters to preserve node labels in projections:

sourceNodeLabels for the labels of source nodes in your MATCH pattern
targetNodeLabels for the labels of target nodes in your MATCH pattern

The labels() function extracts the label(s) from the matched nodes.

Summary

Bipartite graphs have two distinct node types; multipartite graphs have three or more. When multiple relationship types are also preserved, these are often called heterogeneous graphs.

By default, GDS ignores node labels, creating monopartite projections. To preserve multiple node types, use:

cypher

Configuration for label preservation

{ // (1)
  sourceNodeLabels: labels(source), // (2)
  targetNodeLabels: labels(target), // (3)
  relationshipType: type(r) // (4)
}

Configuration breakdown

Configuration map for label preservation
Preserve source node labels
Preserve target node labels
Preserve relationship types

Label preservation matters when algorithms need to distinguish between node types, when you’ll filter by type, or when you’re analyzing relationships between different entities. Choose your projection structure based on your analytical goals.
