Introduction
The way you project your graph determines what algorithms can "see" and analyze.
In this lesson, you’ll learn how to create projections, how to recognize graph structures, and why structure matters for your algorithm results.
What You’ll Learn
By the end of this lesson, you’ll be able to:
- Create Cypher projections using gds.graph.project()
- Distinguish between graph structure and node labels in GDS
- Identify monopartite, bipartite, multipartite, and heterogeneous graph structures
- Choose appropriate projection strategies based on your target algorithm
Running a Cypher Projection
The most basic Cypher projection command looks like this:
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
'actors-graph', // (3)
source, // (4)
target // (5)
) AS g // (6)
RETURN g.graphName AS graph,
g.nodeCount AS nodes,
g.relationshipCount AS rels // (7)

1. Match Actor nodes connected to Movie nodes via ACTED_IN relationships
2. Create a graph projection using the matched nodes
3. Name the projection actors-graph
4. Use the source nodes (Actors) from the MATCH statement
5. Use the target nodes (Movies) from the MATCH statement
6. Assign the resulting graph to variable g
7. Return the graph name, node count, and relationship count
Projecting Graph Models
You are not limited to using the relationships available in the main graph. For example, you can use intermediate nodes in your MATCH statement to create new relationships that exist only in the projection.
MATCH (source:Actor)-[r:ACTED_IN]-> // (1)
(:Movie)
<-[:ACTED_IN]-(target:Actor) // (2)
WITH gds.graph.project( // (3)
'actors-graph', // (4)
source, // (5)
target // (6)
) AS g // (7)
RETURN g.graphName AS graph, // (8)
g.nodeCount AS nodes,
g.relationshipCount AS rels

1. Start with Actors who acted in movies
2. Find other Actors who acted in the same movies
3. Create a graph projection from the matched pattern
4. Name the projection actors-graph
5. Use the first set of Actors as source nodes
6. Use the second set of Actors as target nodes
7. Assign the resulting graph to variable g
8. Return the metadata and graph statistics
Actor to Actor Collaboration Graph
Running the previous projection will create a graph connecting actors directly to actors who worked on the same movies.
Before building that collaboration graph, run the following basic Actor-to-Movie projection to see how a projection works in practice.
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
'actors-graph',
source,
target
) AS g
RETURN g.graphName AS graph,
g.nodeCount AS nodes,
g.relationshipCount AS rels
What You Projected
Let’s focus on the first projection you ran:
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
'actors-graph',
source,
target
) AS g
RETURN g.graphName AS graph,
g.nodeCount AS nodes,
g.relationshipCount AS rels
What You Expected
You may have expected to project a graph that looks like this:
graph LR
Actor(("Actor"))
Movie(("Movie"))
Actor -- "ACTED_IN" --> Movie
This is a bipartite graph: a graph whose nodes fall into two distinct, non-overlapping sets.
What GDS Actually Sees
By default, GDS strips away labels but preserves structure:
graph LR
Actor(("Node"))
Movie(("Node"))
Actor -- `__ALL__` --> Movie
The graph is still structurally bipartite: Actors still only connect to Movies, never to other Actors. But GDS no longer knows which nodes are Actors and which are Movies.
Structure vs Labels
Two separate concepts:
- Structure: How nodes connect (bipartite, monopartite, etc.)
- Labels: What GDS knows about node types
Your projection kept the bipartite structure but lost the labels.
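To make the distinction concrete, here is a minimal Python sketch (not GDS code). It models an unlabelled projection as a plain list of (source, target) id pairs and checks whether the nodes can still be split into two sets using only the connections. The ids and the helper name `two_colour` are made up for illustration.

```python
from collections import defaultdict, deque


def two_colour(edges):
    """Try to split nodes into two sets so that every edge crosses between them.

    Returns a {node: 0 or 1} mapping if the graph is bipartite, otherwise None.
    """
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)  # direction is irrelevant for this check

    colour = {}
    for start in neighbours:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node]:
                if other not in colour:
                    colour[other] = 1 - colour[node]
                    queue.append(other)
                elif colour[other] == colour[node]:
                    return None  # an edge inside one set: not bipartite
    return colour


# Hypothetical unlabelled projection: only ids and connections survive.
# Ids 0-2 stand in for actors and 10-11 for movies, but the code never uses that.
unlabelled_edges = [(0, 10), (1, 10), (1, 11), (2, 11)]
sides = two_colour(unlabelled_edges)

# A hypothetical friendship triangle cannot be split into two sets.
triangle = [(0, 1), (1, 2), (2, 0)]
print(sides)
print(two_colour(triangle))
```

The first edge list splits cleanly into two sides even though no labels are present, while the triangle does not. The structure lives in the connections, not in the labels.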
Graph Structures: Monopartite
A monopartite graph has nodes that cannot be separated into distinct non-overlapping sets.
Example: A social network where (:Person)-[:FRIENDS_WITH]→(:Person)
Any person can be friends with any other person. You cannot divide the nodes into separate groups where connections only occur between groups.
Graph Structures: Bipartite
A bipartite graph has nodes that fall into exactly two non-overlapping sets, where connections only occur between sets.
Example: (:Actor)-[:ACTED_IN]→(:Movie)
Actors connect to Movies. Actors never connect directly to other Actors. Movies never connect directly to other Movies.
Graph Structures: Multipartite
A multipartite graph has three or more non-overlapping node sets, where connections only occur between sets, never within the same set.
Example: (:User)-[:RATES]→(:Movie)-[:IN_GENRE]→(:Genre)
In a true multipartite structure, each set is non-overlapping.
Graph Structures: Heterogeneous
A heterogeneous graph has multiple node types and/or relationship types, but nodes within the same type can connect to each other.
Our full Movies dataset has Actors, Movies, Users, and Genres. Movies can connect to all node types, meaning the connections can overlap.
Why Structure Matters: PageRank Example
PageRank ranks nodes by "importance" based on incoming connections from other important nodes.
Let’s see what happens when we run it on our unlabelled bipartite projection:
CALL gds.pageRank.stream('actors-graph', {})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).title, score
ORDER BY score DESC
PageRank on Bipartite Structure
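In our Actor → Movie bipartite structure, PageRank flows into Movies but has nowhere to go from there, because Movies have no outgoing relationships. A node like this is called a dead end, or sink.
In other graphs, rank can instead circulate endlessly within a closed loop of nodes, which is known as a spider trap.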
Modelling, Not Algorithms
It’s important to remember that the focus here is on graph structures; the algorithms only provide a frame for the discussion.
Do not worry too much about the intricacies of PageRank or any other algorithm for now—that comes later.
For now, try to see how the signal flows from node to node in the graph structures we’re examining.
Rank Sink
In our bipartite graph, Movie nodes become "rank sinks"—accumulating high scores simply because they receive connections, not because they’re meaningfully important.
Almost all nodes receive the same score on either side of the structure. The bipartite structure traps the algorithm’s ranking signal.
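The effect can be reproduced with a simplified PageRank iteration in plain Python. This is a sketch under stated assumptions, not the GDS implementation: it uses a damping factor of 0.85, does not normalise scores, and runs on a made-up Actor → Movie edge list.

```python
def pagerank(edges, damping=0.85, iterations=20):
    """Simplified, non-normalised PageRank.

    Each round, every node gets a base score of (1 - damping) plus a damped
    share of the score of each node that points at it.
    """
    nodes = {node for edge in edges for node in edge}
    out_degree = {node: 0 for node in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    scores = {node: 1.0 - damping for node in nodes}
    for _ in range(iterations):
        incoming = {node: 0.0 for node in nodes}
        for source, target in edges:
            incoming[target] += scores[source] / out_degree[source]
        scores = {
            node: (1.0 - damping) + damping * incoming[node] for node in nodes
        }
    return scores


# Hypothetical Actor -> Movie projection (names are illustrative only).
acted_in = [
    ("Alice", "Movie A"),
    ("Bob", "Movie A"),
    ("Carol", "Movie A"),
    ("Carol", "Movie B"),
]
scores = pagerank(acted_in)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph every actor ends with the same base score, because nothing points at an actor, and each movie's score reflects only how many cast members it has. No rank ever travels beyond the movies.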
Solution 1: Project a True Monopartite Graph
Now let’s return to that second projection—the Actor-to-Actor collaboration graph:
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
WITH gds.graph.project('actors-only', source, target) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
This creates direct Actor-to-Actor connections through shared Movies. The Movies become invisible "bridges."
True Monopartite Result
The projected graph is now monopartite. All actors connect to other actors.
There is no meaningful way of separating the nodes into non-overlapping sets.
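What the two-hop MATCH pattern does can be imitated in plain Python. The sketch below assumes a tiny made-up cast list, and `co_actor_pairs` is a hypothetical helper, not part of GDS. Like the Cypher pattern, it yields one ordered pair per shared movie, so each collaboration appears in both directions.

```python
from itertools import permutations


def co_actor_pairs(acted_in):
    """Turn (actor, movie) pairs into ordered (actor, actor) pairs for every
    two cast entries of the same movie. Movies act only as bridges."""
    cast = {}
    for actor, movie in acted_in:
        cast.setdefault(movie, []).append(actor)

    pairs = []
    for actors in cast.values():
        pairs.extend(permutations(actors, 2))
    return pairs


# Hypothetical source data (names are illustrative only).
acted_in = [
    ("Alice", "Movie A"),
    ("Bob", "Movie A"),
    ("Carol", "Movie A"),
    ("Carol", "Movie B"),
    ("Dave", "Movie B"),
]
pairs = co_actor_pairs(acted_in)
nodes = {actor for pair in pairs for actor in pair}
print(sorted(nodes))
print(len(pairs))
```

Only actors remain as nodes in the result. Alice and Dave are not paired because they never shared a movie, while Carol links the two casts together.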
PageRank on Monopartite
Now PageRank can flow between nodes of the same type, producing meaningful importance rankings.
Monopartite Structures
Bear in mind, the projection still does not retain node labels.
It is the graph structure, not its labels, that affects the algorithm’s results.
Preserve Labels
For some algorithms, you will want to retain node labels.
Use configuration to preserve labels:
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
'actors-movies-labelled',
source,
target,
{
sourceNodeLabels: labels(source),
targetNodeLabels: labels(target),
relationshipType: type(r)
},
{}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
When to Preserve Labels
Preserve labels when:
- You need to filter algorithms by node type
- Node type distinctions affect your analysis
Use default (unlabelled) when:
- You’re projecting a true monopartite or bipartite subgraph
- The algorithm ignores node labels (most do)
Node Similarity
Node Similarity compares nodes based on shared neighbours.
It works best on graphs that can be separated into distinct sets of nodes, such as bipartite graphs.
Project User-Movie Bipartite Graph
The following projection creates a bipartite graph with preserved labels.
MATCH (source:User)-[r:RATED]->(target:Movie)
WITH gds.graph.project(
'user-rated-movie',
source, target,
{ sourceNodeLabels: labels(source),
targetNodeLabels: labels(target) },
{}
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
Node Similarity Result
Node Similarity finds Users who rated many of the same Movies. Its results can be used to create new User-to-User relationships.
The algorithm respects the bipartite structure: it compares nodes on one side based on their connections to the other side.
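The comparison can be sketched in a few lines of Python, assuming Jaccard similarity as the metric (the size of the overlap divided by the size of the union). The ratings and the helper name `jaccard_similarities` are made up for illustration, and the sketch ignores the tuning options a real GDS run would offer.

```python
from itertools import combinations


def jaccard_similarities(rated):
    """Compare every pair of users by the overlap of the movies they rated.

    Jaccard similarity = |shared movies| / |movies rated by either user|.
    Pairs with nothing in common are skipped.
    """
    movies_by_user = {}
    for user, movie in rated:
        movies_by_user.setdefault(user, set()).add(movie)

    results = {}
    for first, second in combinations(sorted(movies_by_user), 2):
        shared = movies_by_user[first] & movies_by_user[second]
        if shared:
            either = movies_by_user[first] | movies_by_user[second]
            results[(first, second)] = len(shared) / len(either)
    return results


# Hypothetical User -> Movie ratings (ids are illustrative only).
rated = [
    ("u1", "Movie A"), ("u1", "Movie B"),
    ("u2", "Movie A"), ("u2", "Movie B"), ("u2", "Movie C"),
    ("u3", "Movie C"),
]
similarities = jaccard_similarities(rated)
for pair, score in similarities.items():
    print(pair, round(score, 3))
```

Only users appear in the output. Movies serve purely as the basis for comparison, which is why a clean two-sided structure suits this kind of algorithm.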
Quick Reference: Choosing Your Projection
| Algorithm Type | Graph Structure | Projection Strategy |
|---|---|---|
| PageRank, Betweenness | Works best on monopartite | Project single node type (e.g., Actor-to-Actor) |
| Node Similarity | Designed for bipartite | Preserve labels, include both types |
| Community Detection | Varies by algorithm | Check documentation for each |
Common Terminology
| Term | Meaning |
|---|---|
| Monopartite | Nodes cannot be separated into distinct non-overlapping sets |
| Bipartite | Exactly two non-overlapping node sets; connections only between sets |
| Multipartite | Three or more non-overlapping node sets |
| Heterogeneous | Multiple node types and/or relationship types (may overlap) |
| Unlabelled | GDS doesn’t know node/relationship types (default behaviour) |
Lesson Summary
In this lesson, you learned:
- How to create Cypher projections with gds.graph.project()
- How to transform graph structures by changing your MATCH pattern
- Structure (how nodes connect) is separate from labels (what GDS knows about types)
- GDS strips labels by default but preserves structure
- Bipartite structures can trap algorithms like PageRank
- Project true monopartite graphs for algorithms that expect them
- Preserve labels when using bipartite-aware algorithms like Node Similarity
In the next lesson, you’ll practice projecting different graph types.