Configuring projections for undirected relationships

Introduction

In the previous lesson, you ran algorithms on directed relationships. But some algorithms work differently—or better—when relationships are undirected.

In this lesson, you’ll configure your projection to use undirected relationships and run PageRank and Leiden to see how the results change.

By the end of this lesson, you will understand:

How to project undirected relationships
How PageRank behaves on undirected graphs
How Leiden differs from Louvain
When to use undirected projections

Why undirected relationships matter

Some algorithms are designed for undirected graphs. Others can run on both but produce different results.

PageRank originally assumed directed relationships (web pages linking to each other).

Leiden is a community detection algorithm that improves on Louvain. Unlike Louvain, Leiden requires undirected relationships.

For collaboration networks, undirected makes sense—if two actors worked together, the connection goes both ways.

In this lesson, you will learn:

How to configure and project a directed graph into an undirected graph
How directionality impacts an algorithm’s results

Create both directed and undirected projections

To compare how relationship orientation affects results, let’s create both projections side by side.

First, drop any existing graphs:

cypher

Drop all graphs

CALL gds.graph.list() // (1)
YIELD graphName // (2)
CALL gds.graph.drop(graphName) // (3)
YIELD graphName AS droppedGraph // (4)
RETURN droppedGraph // (5)

Query breakdown

Call the graph list procedure
Yield each graph name
Drop each graph by piping the name to the drop procedure
Yield the dropped graph name
Return the list of all dropped graphs
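
If you would rather not clear every projection in the sandbox, a narrower variant can drop only the two graphs this lesson creates. This is a minimal sketch, assuming the two graph names used later in this lesson; passing false as the second argument (failIfMissing) asks GDS to return an empty result instead of raising an error when a graph does not exist.

```cypher
// Drop only the projections used in this lesson.
// The second argument (failIfMissing = false) avoids an error
// when a graph with that name has not been created yet.
UNWIND ['actor-network-directed', 'actor-network-undirected'] AS name
CALL gds.graph.drop(name, false)
YIELD graphName
RETURN graphName AS droppedGraph
```

On a fresh sandbox this is expected to return no rows, because neither graph exists yet.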

Move over to the sandbox and create an actor collaboration network called 'actor-network-directed'.

You’ve created this type of projection many times—actors connected through shared movies.

Solution

cypher

Create directed actor network

MATCH (source:Actor)-[:ACTED_IN]->
  (:Movie)
    <-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network-directed', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)

Projection breakdown

Match Actor nodes connected through Movie nodes
Call the GDS projection function
Name the projection 'actor-network-directed'
Include source (Actor) nodes
Include target (Actor) nodes
First configuration map (empty - using defaults)
Second configuration map (empty - directed by default)
Return projection statistics

Now let’s create an undirected version of the same projection called 'actor-network-undirected'.

You’ll use the exact same Cypher pattern but add this configuration to the second set of curly brackets:

undirectedRelationshipTypes: ['*']

cypher

Create undirected actor network

MATCH (source:Actor)-[:ACTED_IN]->
  (:Movie)
    <-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network-undirected', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {
    undirectedRelationshipTypes: ['*'] // (7)
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)

Projection breakdown

Match Actor nodes connected through Movie nodes
Call the GDS projection function
Name the projection 'actor-network-undirected'
Include source (Actor) nodes
Include target (Actor) nodes
First configuration map (empty - using defaults)
Configure all relationships as undirected
Return projection statistics

Understanding the configuration:

undirectedRelationshipTypes: ['*'] tells GDS to treat all relationships as undirected
To make only specific relationship types undirected, replace '*' with a list of those types: ['ACTED_IN']
This configuration goes in the second set of brackets
You must include both sets of curly brackets—even if the first set remains empty
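
The type-specific form can be sketched as follows. The projection in this lesson does not assign a relationship type, so this example assumes you first name one with relationshipType in the first configuration map. The graph name 'actor-network-typed' and the type 'COLLABORATED_WITH' are hypothetical names chosen for illustration only.

```cypher
// Same actor-to-actor pattern as above.
MATCH (source:Actor)-[:ACTED_IN]->
  (:Movie)
    <-[:ACTED_IN]-(target:Actor)
WITH gds.graph.project(
  'actor-network-typed',                  // hypothetical graph name
  source,
  target,
  {
    relationshipType: 'COLLABORATED_WITH' // hypothetical type assigned to projected relationships
  },
  {
    // Only the named type is projected as undirected.
    undirectedRelationshipTypes: ['COLLABORATED_WITH']
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
```

With a single relationship type the outcome should match the '*' version; naming types only starts to matter when a projection mixes several types and some of them should stay directed.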

Both projections have the same nodes and relationships—only the orientation differs.
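
To check the two projections side by side, you can list them from the graph catalog. This is a minimal sketch, assuming both graphs from the steps above exist. The node counts should match. The reported relationshipCount is worth inspecting rather than assuming, because GDS may count an undirected relationship once per direction.

```cypher
// List both lesson projections with their sizes.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
WHERE graphName STARTS WITH 'actor-network-'
RETURN graphName, nodeCount, relationshipCount
ORDER BY graphName
```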

In the following sections, we’ll run two algorithms on both graphs to see how directionality can affect output.

PageRank: Comparing directed vs undirected

PageRank measures importance by:

Counting the number of incoming connections
Weighting those incoming connections by the popularity of the linker
Dividing the popularity of the linker by their out-degree

Basically, if node-A has many incoming 'votes' from other nodes, and those nodes are both popular and not too frivolous with their votes, node-A will have a high PageRank.
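
The three steps above correspond to the classic PageRank formula. It is shown here as a reference sketch: T_1 to T_n are the nodes that link to A, C(T) is the out-degree of T, and d is the damping factor (the dampingFactor setting explored later in this lesson). The GDS implementation may differ in details such as scaling and convergence handling.

```latex
PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \dots + \frac{PR(T_n)}{C(T_n)} \right)
```

Each term PR(T_i) / C(T_i) is one incoming 'vote': the linker's own score, split evenly across everything it links to.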

Let’s run PageRank on both projections and compare.

Run PageRank in stream mode on the directed graph ('actor-network-directed'), returning the top 10 actors by score. Use gds.util.asNode() to access actor names.

Solution

cypher

PageRank on directed graph

CALL gds.pageRank.stream('actor-network-directed', {}) // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).name AS actor, score // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

Call PageRank in stream mode on directed graph with default configuration
Yield node IDs and PageRank scores
Convert node IDs to actor names and return with scores
Sort by score in descending order
Limit to top 10 actors

Now run the same query on the undirected graph ('actor-network-undirected'):

Solution

cypher

PageRank on undirected graph

CALL gds.pageRank.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).name AS actor, score // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

Call PageRank in stream mode on undirected graph with default configuration
Yield node IDs and PageRank scores
Convert node IDs to actor names and return with scores
Sort by score in descending order
Limit to top 10 actors

Compare the two result sets.

They should be identical — same as with degree centrality.

Despite this, there is a subtle difference between the two algorithms. Degree centrality counts a node's own relationships one-for-one. PageRank looks at the relationships pointing to the node being ranked, and divides each source node's contribution by the number of outgoing relationships that source node has.

In this case, Robert De Niro gets the highest score because he has the most authority in the collaboration network overall, not simply because he has the highest number of movies.
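
To see the contrast with degree centrality for yourself, you can run it on the same projection and compare the top 10 with the PageRank list. This is a minimal sketch, assuming the 'actor-network-directed' graph created earlier and default configuration.

```cypher
// Degree centrality: counts each actor's relationships one-for-one,
// without weighting them by the importance of the neighbour.
CALL gds.degree.stream('actor-network-directed', {})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS actor, score AS degree
ORDER BY degree DESC
LIMIT 10
```

Actors who appear high in this list but lower in the PageRank list have many collaborations with comparatively less influential co-stars.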

PageRank: Experimenting with dampingFactor

We can test this difference by playing around with the dampingFactor.

The dampingFactor controls how randomly the algorithm will jump around the graph vs how strictly it will follow links.

Low: The algorithm will follow some links but will also 'teleport' around the graph more often
High: The algorithm will follow most links but will occasionally 'teleport' around the graph

The dampingFactor provides a couple of benefits:

Stops the algorithm from getting stuck inside a dead-end or cyclical relationship structure
Helps to avoid one high-degree node overly dominating the results

Move to the sandbox and run PageRank with dampingFactor: float in the algorithm’s configuration brackets. Experiment with different dampingFactor values on both graphs. Try values like 0.15, 0.50, 0.85, and 0.99.

Track how specific actors like Robert De Niro, Jackie Chan and Vincent Price move up and down the rankings as you change the configuration.

Solution

cypher

PageRank on directed graph

CALL gds.pageRank.stream('actor-network-directed', // (1)
  {
    dampingFactor: 0.85 // (2)
  })
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call PageRank in stream mode on directed graph
Include dampingFactor in the configuration brackets
Yield node IDs and PageRank scores
Convert node IDs to actor names and return with scores
Sort by score in descending order
Limit to top 10 actors

The default dampingFactor is 0.85. Normally, this provides a good mix of link-following vs random teleportation.

However, when you run with a higher or lower dampingFactor, you should notice actors moving up or down the list.

You can review all of the available config settings on the PageRank docs.
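
Rather than editing the value by hand for each run, the experiment can be sketched as a single query that sweeps several dampingFactor values. It assumes the 'actor-network-directed' graph from earlier, and it relies on collect() keeping the row order produced by the preceding ORDER BY, which is common practice but worth checking against your results.

```cypher
// Run PageRank once per dampingFactor value and keep the top 10 names for each.
UNWIND [0.15, 0.50, 0.85, 0.99] AS dampingFactor
CALL gds.pageRank.stream('actor-network-directed', {dampingFactor: dampingFactor})
YIELD nodeId, score
WITH dampingFactor, gds.util.asNode(nodeId).name AS actor, score
ORDER BY dampingFactor ASC, score DESC
// Collect names in score order, then slice the first ten per value.
WITH dampingFactor, collect(actor)[0..10] AS topActors
RETURN dampingFactor, topActors
ORDER BY dampingFactor
```

Swap in 'actor-network-undirected' to repeat the sweep on the undirected projection.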

Compare PageRank on both graphs

Run PageRank with dampingFactor: 0.95 on both graphs and compare how orientation affects the rankings.

Remember to use stream mode and return the top 10 actors by score.

Solution

cypher

PageRank on directed graph with dampingFactor 0.95

CALL gds.pageRank.stream('actor-network-directed', { // (1)
  dampingFactor: 0.95 // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call PageRank in stream mode on directed graph
Set damping factor to 0.95 (higher than default 0.85)
Yield node IDs and PageRank scores
Convert node IDs to actor names and return with scores
Sort by score in descending order
Limit to top 10 actors

Solution

cypher

PageRank on undirected graph with dampingFactor 0.95

CALL gds.pageRank.stream('actor-network-undirected', { // (1)
  dampingFactor: 0.95 // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call PageRank in stream mode on undirected graph
Set damping factor to 0.95 (higher than default 0.85)
Yield node IDs and PageRank scores
Convert node IDs to actor names and return with scores
Sort by score in descending order
Limit to top 10 actors

Compare the two result sets. They should be identical.

Key takeaway: Directionality did not affect the output, because our graph is already symmetrical.

[Figure: two nodes connected by bidirectional relationships.]

When we switch to undirected, we do not change the balance of the relationships.
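
The symmetry can be checked directly against the source data. This is a minimal sketch, assuming Neo4j 5 or later for elementId() (on older versions, id() plays the same role). It splits the matched actor pairs by which node has the smaller identifier, which is an arbitrary but consistent way to tell the two directions apart.

```cypher
// Every collaboration is matched once as (a, b) and once as (b, a),
// so the two directional counts should be equal.
MATCH (a:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Actor)
RETURN
  count(*) AS matchedPairs,
  sum(CASE WHEN elementId(a) < elementId(b) THEN 1 ELSE 0 END) AS oneDirection,
  sum(CASE WHEN elementId(a) > elementId(b) THEN 1 ELSE 0 END) AS otherDirection
```

Equal values for oneDirection and otherDirection mean the directed projection already contains every relationship in both directions, which is why PageRank sees the same structure in both graphs.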

Leiden: Comparing directed vs undirected

Leiden is an improvement over Louvain that guarantees well-connected communities. Unlike PageRank, Leiden requires undirected relationships.

Let’s see what happens if we try to run it on our directed graph.

Try running Leiden on the directed graph ('actor-network-directed'):

cypher

Leiden on directed graph (will fail)

CALL gds.leiden.stream('actor-network-directed', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, count(*) AS size // (3)
RETURN communityId, size // (4)
ORDER BY size DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Attempt to call Leiden on directed graph (will produce error)
Yield node IDs and community assignments
Group by community and count members
Return community ID and size
Sort by size in descending order
Limit to top 10 communities

What happened?

Details

You should receive an error message indicating that Leiden requires undirected relationships.

Leiden’s algorithm is designed to work on undirected graphs because it analyzes bidirectional connections to guarantee well-connected communities. When you project directed relationships, Leiden cannot properly evaluate community cohesion.

Run Leiden on undirected graph

Now run Leiden on the undirected graph ('actor-network-undirected'):

cypher

Leiden on undirected graph

CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, count(*) AS size // (3)
RETURN communityId, size // (4)
ORDER BY size DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call Leiden in stream mode on undirected graph
Yield node IDs and community assignments
Group by community and count members
Return community ID and size
Sort by size in descending order
Limit to top 10 communities

Like Louvain, this shows community sizes. But Leiden tweaks the Louvain algorithm to produce more balanced, granular communities.

Key difference from PageRank: PageRank can adapt to both directed and undirected graphs, while Leiden has a hard requirement for undirected relationships.

You can check this in the attributes header on Leiden’s page in the GDS docs.

[Figure: Leiden’s attributes header in the GDS docs.]

Leiden: Analyzing results

Let’s look at actors in the largest Leiden community.

Complete the query to stream Leiden results and return sample actors from the largest community:

cypher

View sample actors from largest Leiden community (replace ????)

CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, collect(????) AS actors, count(*) AS size // (3)
ORDER BY size DESC // (4)
LIMIT 1 // (5)
RETURN communityId, size, actors[0..10] AS sampleActors // (6)

Algorithm breakdown

Call Leiden in stream mode on undirected graph
Yield node IDs and community assignments
Group by community, collect actor nodes (fill in conversion to names), and count members
Sort by size in descending order
Limit to the largest community only
Return community ID, size, and first 10 actors as sample

Solution

cypher

View sample actors from largest Leiden community

CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, collect(gds.util.asNode(nodeId).name) AS actors, count(*) AS size // (3)
ORDER BY size DESC // (4)
LIMIT 1 // (5)
RETURN communityId, size, actors[0..10] AS sampleActors // (6)

Algorithm breakdown

Call Leiden in stream mode on undirected graph
Yield node IDs and community assignments
Group by community, collect actor names, and count members
Sort by size in descending order
Limit to the largest community only
Return community ID, size, and first 10 actor names as sample

These actors form a tightly-knit collaboration group based on shared movie appearances.

Leiden: Custom configuration

Leiden also supports maxLevels like Louvain. Let’s experiment with different configurations.

Run Leiden with maxLevels: 1 to see how many communities it detects:

cypher

Leiden with maxLevels: 1

CALL gds.leiden.stats('actor-network-undirected', { // (1)
  maxLevels: 1 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)

Algorithm breakdown

Call Leiden in stats mode on undirected graph
Set maximum hierarchy levels to 1
Yield community statistics
Return community count, modularity, and levels run

Review the results for communityCount and ranLevels.

Now try with maxLevels: 20:

cypher

Leiden with maxLevels: 20

CALL gds.leiden.stats('actor-network-undirected', { // (1)
  maxLevels: 20 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)

Algorithm breakdown

Call Leiden in stats mode on undirected graph
Set maximum hierarchy levels to 20
Yield community statistics
Return community count, modularity, and levels run

The second run is unlikely to run to the full 20 levels.

While higher maxLevels allows for more granular community detection, like Louvain, Leiden will stop early if communities have already converged.

However, think back to when you ran Louvain on this graph. It returned more than 1,000 communities.

Leiden should have returned many more than that — Leiden generally excels at producing more modular and granular communities than Louvain.
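
To make the comparison on equal footing, you can run Louvain in stats mode on the same undirected projection and set its numbers next to the Leiden output above. This is a minimal sketch with default configuration; the exact counts depend on your dataset and GDS version.

```cypher
// Louvain on the same projection, returning the same statistics as the Leiden runs.
CALL gds.louvain.stats('actor-network-undirected', {})
YIELD communityCount, modularity, ranLevels
RETURN communityCount, modularity, ranLevels
```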

When to use undirected projections

Use undirected relationships when:

Relationships are naturally bidirectional (collaborated with, friends with, similar to)
You’re running algorithms that require undirected graphs (like Leiden)
Direction doesn’t add meaningful information to your analysis

Use directed relationships when:

Direction has real meaning (follows, purchased, influenced)
You want to distinguish incoming from outgoing connections
Algorithms specifically need directional information

What’s next

You’ve learned how to project undirected relationships and run PageRank and Leiden on them. You’ve seen how configuration options like dampingFactor and maxLevels impact results.

In the next lesson, you’ll add relationship weights to your projection and see how algorithms can use those weights to produce even more nuanced results.

Check your understanding

Spot Robert De Niro: Undirected Graph

Run PageRank on actor-network-undirected with these dampingFactor values: 0.15, 0.50, 0.85

Where is Robert De Niro at dampingFactor: 0.15?

❏ Position 1
❏ Position 5
✓ Not in the top 10
❏ Position 10

Hint

Low dampingFactor values emphasize connection count over connection quality.

Solution

Robert De Niro is not in the top 10 at dampingFactor: 0.15.

With a low dampingFactor, PageRank essentially becomes a connection count algorithm. Actors with many collaborations dominate—regardless of who they worked with.

Robert De Niro has fewer total collaborations than actors like Jackie Chan (153 films), so he drops out of the top 10 when we only count quantity.

But watch what happens as you increase dampingFactor:

0.50: Robert De Niro appears at position 5
0.60: Robert De Niro climbs to position 3
0.85: Robert De Niro reaches position 1

As dampingFactor increases, PageRank weighs connection quality more heavily. Robert De Niro worked with highly influential actors, so he rises to the top when quality matters more than quantity.

Spot Vincent Price: Directed Graph

Run PageRank on actor-network-directed with these dampingFactor values: 0.15, 0.50, 0.85

Where is Vincent Price at dampingFactor: 0.85?

❏ Position 1
❏ Position 6
✓ Not in the top 10
❏ Position 5

Hint

High dampingFactor values emphasize connection quality over connection count.

Solution

Vincent Price is not in the top 10 at dampingFactor: 0.85.

This is the opposite pattern from Robert De Niro on the undirected graph:

0.15: Vincent Price is at position 1
0.50: Vincent Price drops to position 6
0.85: Vincent Price falls out of the top 10

At low dampingFactor, Vincent Price ranks #1 because he has many collaborations. But as dampingFactor increases and connection quality becomes more important, he drops out entirely.

This suggests Vincent Price, while prolific, didn’t work with as many highly-connected actors as Robert De Niro did. When the algorithm prioritizes "who you know" over "how many you know," different actors rise to prominence.

Key insight: The same algorithm with different configuration can reveal different dimensions of importance in your network—quantity vs. quality, breadth vs. depth.
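
The positions quoted in these solutions can be checked with one query instead of scanning result lists by eye. This is a minimal sketch, assuming the 'actor-network-undirected' graph and that each name is unique in the dataset. It relies on collect() keeping the row order from the preceding ORDER BY, and it returns null when a name is not found.

```cypher
// Rank every actor for each dampingFactor, then look up two specific names.
UNWIND [0.15, 0.50, 0.85] AS dampingFactor
CALL gds.pageRank.stream('actor-network-undirected', {dampingFactor: dampingFactor})
YIELD nodeId, score
WITH dampingFactor, gds.util.asNode(nodeId).name AS actor, score
ORDER BY dampingFactor ASC, score DESC
WITH dampingFactor, collect(actor) AS ranked
UNWIND ['Robert De Niro', 'Vincent Price'] AS actor
RETURN
  dampingFactor,
  actor,
  // 1-based position of the first matching name in the ranked list
  [i IN range(0, size(ranked) - 1) WHERE ranked[i] = actor | i + 1][0] AS position
ORDER BY actor, dampingFactor
```

Replace the graph name with 'actor-network-directed' to check the directed projection.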

Summary

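Undirected projections treat relationships as bidirectional. Some algorithms like Leiden require undirected graphs, while others like PageRank run on either orientation. On a projection that is already symmetrical, like the actor network in this lesson, PageRank returns the same results both ways.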

PageRank’s dampingFactor controls whether the algorithm emphasizes connection quantity or quality. Leiden improves on Louvain with guaranteed well-connected communities. Both benefit from proper projection configuration.
