Configuring projections for undirected relationships

Introduction

In the previous lesson, you ran algorithms on directed relationships. But some algorithms work differently—or better—when relationships are undirected.

In this lesson, you’ll configure your projection to use undirected relationships and run PageRank and Leiden to see how the results change.

By the end of this lesson, you will understand:

  • How to project undirected relationships

  • How PageRank behaves on undirected graphs

  • How Leiden differs from Louvain

  • When to use undirected projections

Why undirected relationships matter

Some algorithms are designed for undirected graphs. Others can run on both but produce different results.

PageRank was originally designed for directed relationships (web pages linking to each other), but it can also run on undirected graphs.

Leiden is a community detection algorithm that improves on Louvain. Unlike Louvain, Leiden requires undirected relationships.

For collaboration networks, undirected makes sense—if two actors worked together, the connection goes both ways.

In this lesson, you will learn:

  • How to configure and project a directed graph into an undirected graph

  • How directionality impacts an algorithm’s results

Create both directed and undirected projections

To compare how relationship orientation affects results, let’s create both projections side by side.

First, drop any existing graphs:

cypher
Drop existing actor-network
CALL gds.graph.drop('actor-network', false) YIELD graphName // (1)
RETURN graphName // (2)
  1. Drop 'actor-network' with error suppression (won’t fail if missing)

  2. Return the dropped graph name

Create a directed actor collaboration network called 'actor-network-directed'.

You’ve created this type of projection many times—actors connected through shared movies.

Solution

cypher
Solution: Create directed actor network
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network-directed', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {} // (7)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)
  1. Match Actor nodes connected through Movie nodes

  2. Call the GDS projection function

  3. Name the projection 'actor-network-directed'

  4. Include source (Actor) nodes

  5. Include target (Actor) nodes

  6. First configuration map (empty - using defaults)

  7. Second configuration map (empty - directed by default)

  8. Return projection statistics

Now let’s create an undirected version of the same projection called 'actor-network-undirected'.

We use the exact same Cypher pattern but add this configuration to the second set of curly brackets:

undirectedRelationshipTypes: ['*']

cypher
Create undirected actor network
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor) // (1)
WITH gds.graph.project( // (2)
  'actor-network-undirected', // (3)
  source, // (4)
  target, // (5)
  {}, // (6)
  {
    undirectedRelationshipTypes: ['*'] // (7)
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount // (8)
  1. Match Actor nodes connected through Movie nodes

  2. Call the GDS projection function

  3. Name the projection 'actor-network-undirected'

  4. Include source (Actor) nodes

  5. Include target (Actor) nodes

  6. First configuration map (empty - using defaults)

  7. Configure all relationships as undirected

  8. Return projection statistics

Understanding the configuration:

  • undirectedRelationshipTypes: ['*'] tells GDS to treat all relationships as undirected

  • To treat only specific relationship types as undirected, replace '*' with a list of those types, for example ['ACTED_IN']

  • This configuration goes in the second set of brackets (projection config, not relationship config)

  • You must include both sets of curly brackets—even if the first set remains empty

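Both projections contain the same actors and the same collaborations; only the orientation differs. The reported relationshipCount is likely to be higher for the undirected graph, because an undirected relationship is stored in both directions.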
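As a rough illustration of what the two projections contain, the plain-Python sketch below (not GDS code) mimics the MATCH pattern on a made-up cast list and then stores every relationship in both directions, which is one way an undirected view can be represented. The cast data and the helper names directed_pairs and as_undirected are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: which actors appeared in which movie.
cast = {
    "Movie A": ["Ann", "Bob", "Cy"],
    "Movie B": ["Ann", "Bob"],
}


def directed_pairs(cast):
    """Mimic the MATCH pattern: every ordered pair of distinct co-actors per movie."""
    return [
        (source, target)
        for actors in cast.values()
        for source in actors
        for target in actors
        if source != target
    ]


def as_undirected(pairs):
    """Keep each relationship and add its reverse, so both directions exist."""
    return pairs + [(target, source) for source, target in pairs]


directed = directed_pairs(cast)
undirected = as_undirected(directed)

# The pattern already yields (Ann, Bob) and (Bob, Ann), so the directed
# list is symmetric before any undirected configuration is applied.
is_symmetric = Counter(directed) == Counter((t, s) for s, t in directed)
print(len(directed), len(undirected), is_symmetric)  # 8 16 True
```

In this sketch the undirected list is twice as long as the directed one, yet the balance between any two actors is the same in both.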

In the following sections, we’ll run two algorithms on both graphs to see how directionality can affect output.

PageRank: Comparing directed vs undirected

PageRank measures importance by:

  • Counting the number of incoming connections

  • Weighting those incoming connections by the popularity of the linker

  • Dividing the popularity of the linker by their out-degree

Basically, if node-A has many incoming 'votes' from other nodes, and those nodes are both popular and not too frivolous with their votes, node-A will have a high PageRank.
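The three ingredients above can be sketched in plain Python. This is a textbook-style iteration, not the GDS implementation, which may differ in initialization, convergence checks, and scaling. The function name pagerank and the three-node graph are made up for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank sketch; edges is a list of (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_degree = {node: 0 for node in nodes}
    for source, _ in edges:
        out_degree[source] += 1

    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        incoming = {node: 0.0 for node in nodes}
        for source, target in edges:
            # A linker splits its current score evenly across its outgoing links.
            incoming[target] += score[source] / out_degree[source]
        score = {node: (1 - damping) + damping * incoming[node] for node in nodes}
    return score


# Hypothetical graph: B and C both "vote" for A, and A votes for B.
scores = pagerank([("B", "A"), ("C", "A"), ("A", "B")])
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(name, round(value, 3))
```

A receives two votes and ends up first. B receives only one vote, but it comes from the highly ranked A, so B still scores far above C, which receives none.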

Let’s run PageRank on both projections and compare.

Run PageRank in stream mode on the directed graph ('actor-network-directed'), returning the top 10 actors by score. Use gds.util.asNode() to access actor names.

Solution

cypher
Solution: PageRank on directed graph
CALL gds.pageRank.stream('actor-network-directed', {}) // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).name AS actor, score // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)
  1. Call PageRank in stream mode on directed graph with default configuration

  2. Yield node IDs and PageRank scores

  3. Convert node IDs to actor names and return with scores

  4. Sort by score in descending order

  5. Limit to top 10 actors

Now run the same query on the undirected graph ('actor-network-undirected'):

Solution

cypher
Solution: PageRank on undirected graph
CALL gds.pageRank.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(nodeId).name AS actor, score // (3)
ORDER BY score DESC // (4)
LIMIT 10 // (5)
  1. Call PageRank in stream mode on undirected graph with default configuration

  2. Yield node IDs and PageRank scores

  3. Convert node IDs to actor names and return with scores

  4. Sort by score in descending order

  5. Limit to top 10 actors

Compare the two result sets.

They should be identical, just as they were with degree centrality.

Even so, the two measures work differently. Degree centrality counts, one for one, the relationships going out from the node being ranked. PageRank instead looks at the relationships coming in to that node and weights each one by the source node's score divided by the number of outgoing relationships that source has.

In this case, Robert De Niro gets the highest score because he has the most authority in the collaboration network overall, not simply because he has the highest number of movies.

PageRank: Experimenting with dampingFactor

We can test this by playing around with the dampingFactor.

The dampingFactor controls how randomly the algorithm will jump around the graph vs how strictly it will follow links.

  • Low: The algorithm will follow some links but will also 'teleport' around the graph more often

  • High: The algorithm will follow most links but will occasionally 'teleport' around the graph

The dampingFactor provides a few benefits:

  1. Stops the algorithm from getting stuck inside a dead-end or cyclical relationship structure

  2. Helps to avoid one high-degree node overly dominating the results
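One way to picture the dampingFactor is the "random surfer" model. The sketch below is plain Python, not GDS code: at each step the surfer follows a random outgoing link with probability damping and otherwise teleports to a random node. The function name, the toy graph, and the choice to teleport out of dead ends are assumptions made for illustration.

```python
import random


def random_surfer(edges, damping, steps, seed=42):
    """Count node visits for a walker that follows links or teleports."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    outgoing = {node: [] for node in nodes}
    for source, target in edges:
        outgoing[source].append(target)

    visits = {node: 0 for node in nodes}
    current = rng.choice(nodes)
    for _ in range(steps):
        visits[current] += 1
        if outgoing[current] and rng.random() < damping:
            current = rng.choice(outgoing[current])  # follow a link
        else:
            current = rng.choice(nodes)  # teleport (also escapes dead ends)
    return visits


# Hypothetical graph: A and B link to each other, C links in, D is a dead end.
edges = [("A", "B"), ("B", "A"), ("C", "A"), ("B", "D")]
for damping in (0.15, 0.85):
    print(damping, random_surfer(edges, damping, steps=20_000))
```

With damping 0.15 the visit counts should come out fairly even, because most steps are teleports. With 0.85 they should concentrate on the mutually linked A and B, while the dead end D never traps the walker.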

Move to the sandbox and run PageRank with different dampingFactor values on both graphs. Try values like 0.15, 0.50, 0.85, and 0.99.

Track how specific actors like Robert De Niro, Jackie Chan and Vincent Price move up and down the rankings as you change the configuration.

The default dampingFactor is 0.85. Normally, this provides a good mix of link-following vs random teleportation.

You can review all of the available configuration settings in the PageRank docs.

Compare PageRank on both graphs

Run PageRank with dampingFactor: 0.95 on both graphs and compare how orientation affects the rankings.

Remember to use stream mode and return the top 10 actors by score.

Solution

cypher
Solution: PageRank on directed graph with dampingFactor 0.95
CALL gds.pageRank.stream('actor-network-directed', { // (1)
  dampingFactor: 0.95 // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)
  1. Call PageRank in stream mode on directed graph

  2. Set damping factor to 0.95 (higher than default 0.85)

  3. Yield node IDs and PageRank scores

  4. Convert node IDs to actor names and return with scores

  5. Sort by score in descending order

  6. Limit to top 10 actors

cypher
Solution: PageRank on undirected graph with dampingFactor 0.95
CALL gds.pageRank.stream('actor-network-undirected', { // (1)
  dampingFactor: 0.95 // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score // (4)
ORDER BY score DESC // (5)
LIMIT 10 // (6)
  1. Call PageRank in stream mode on undirected graph

  2. Set damping factor to 0.95 (higher than default 0.85)

  3. Yield node IDs and PageRank scores

  4. Convert node IDs to actor names and return with scores

  5. Sort by score in descending order

  6. Limit to top 10 actors

Compare the two result sets. They should be identical.

Key takeaway: Directionality did not affect the output, because our graph is already symmetrical.

Figure: Two nodes connected by bidirectional relationships.

When we switch to undirected, we do not change the balance of the relationships.
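A short derivation, under a simplified textbook model of PageRank rather than the exact GDS implementation, shows why a symmetrical graph is unaffected. Here $w_{ji}$ is the number of relationships from node $j$ to node $i$, $d$ is the dampingFactor, and primed symbols refer to the undirected projection. This notation is introduced only for this sketch.

```latex
\begin{align*}
PR(i) &= (1 - d) + d \sum_{j} \frac{w_{ji}}{\sum_{k} w_{jk}} \, PR(j)
  && \text{(directed graph)} \\
w'_{ji} &= w_{ji} + w_{ij} = 2\, w_{ji}
  && \text{(undirected view, using symmetry } w_{ij} = w_{ji}\text{)} \\
\frac{w'_{ji}}{\sum_{k} w'_{jk}} &= \frac{2\, w_{ji}}{2 \sum_{k} w_{jk}}
  = \frac{w_{ji}}{\sum_{k} w_{jk}}
  && \text{(same share of each vote)}
\end{align*}
```

Every share of every vote is unchanged, so both graphs lead to the same scores. If the directed graph were not symmetrical, $w_{ij} \neq w_{ji}$ for some pair of nodes and the shares would differ.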

Leiden: Comparing directed vs undirected

Leiden is an improvement over Louvain that guarantees well-connected communities. Unlike PageRank, Leiden requires undirected relationships.

Let’s see what happens if we try to run it on our directed graph.

Try running Leiden on the directed graph ('actor-network-directed'):

cypher
Leiden on directed graph (will fail)
CALL gds.leiden.stream('actor-network-directed', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, COUNT(*) AS size // (3)
RETURN communityId, size // (4)
ORDER BY size DESC // (5)
LIMIT 10 // (6)
  1. Attempt to call Leiden on directed graph (will produce error)

  2. Yield node IDs and community assignments

  3. Group by community and count members

  4. Return community ID and size

  5. Sort by size in descending order

  6. Limit to top 10 communities

What happened?

You should receive an error message indicating that Leiden requires undirected relationships.

Leiden’s algorithm is designed to work on undirected graphs because it analyzes bidirectional connections to guarantee well-connected communities. When you project directed relationships, Leiden cannot properly evaluate community cohesion.

Run Leiden on undirected graph

Now run Leiden on the undirected graph ('actor-network-undirected'):

cypher
Leiden on undirected graph
CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, COUNT(*) AS size // (3)
RETURN communityId, size // (4)
ORDER BY size DESC // (5)
LIMIT 10 // (6)
  1. Call Leiden in stream mode on undirected graph

  2. Yield node IDs and community assignments

  3. Group by community and count members

  4. Return community ID and size

  5. Sort by size in descending order

  6. Limit to top 10 communities

As with Louvain, the output shows community sizes. Leiden, however, adds a refinement step to the Louvain approach so that the resulting communities are well connected and often more granular.

Key difference from PageRank: PageRank can adapt to both directed and undirected graphs, while Leiden has a hard requirement for undirected relationships.
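To make "well-connected" more concrete: a minimal property is that every community is at least internally connected, meaning each member can reach every other member without leaving the community. The plain-Python sketch below checks that property for a given group of nodes. The function name and the toy data are hypothetical, and this is a check you could apply to results, not a description of how Leiden works internally.

```python
from collections import deque


def is_internally_connected(members, edges):
    """Return True if all members can reach each other using only
    undirected edges whose two endpoints are both in the community."""
    members = set(members)
    if not members:
        return True
    neighbours = {node: set() for node in members}
    for a, b in edges:
        if a in members and b in members:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Breadth-first search from an arbitrary member.
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbours[node] - seen:
            seen.add(other)
            queue.append(other)
    return seen == members


# Hypothetical collaborations: Ann-Bob-Cy form a chain, Dee-Eve a separate pair.
edges = [("Ann", "Bob"), ("Bob", "Cy"), ("Dee", "Eve")]
print(is_internally_connected({"Ann", "Bob", "Cy"}, edges))   # True
print(is_internally_connected({"Ann", "Bob", "Dee"}, edges))  # False
```

The second group fails the check because Dee has no collaboration with Ann or Bob, so grouping them together would not describe a real collaboration cluster.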

You can find this out by referencing Leiden’s attributes header in the docs:

Figure: Leiden’s attributes header in the GDS docs.

Leiden: Analyzing results

Let’s look at actors in the largest Leiden community.

Complete the query to stream Leiden results and return sample actors from the largest community:

cypher
View sample actors from largest Leiden community (replace ????)
CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, collect(????) AS actors, count(*) AS size // (3)
ORDER BY size DESC // (4)
LIMIT 1 // (5)
RETURN communityId, size, actors[0..10] AS sampleActors // (6)
  1. Call Leiden in stream mode on undirected graph

  2. Yield node IDs and community assignments

  3. Group by community, collect actor nodes (fill in conversion to names), and count members

  4. Sort by size in descending order

  5. Limit to the largest community only

  6. Return community ID, size, and first 10 actors as sample

Solution

cypher
Solution: View sample actors from largest Leiden community
CALL gds.leiden.stream('actor-network-undirected', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, collect(gds.util.asNode(nodeId).name) AS actors, count(*) AS size // (3)
ORDER BY size DESC // (4)
LIMIT 1 // (5)
RETURN communityId, size, actors[0..10] AS sampleActors // (6)
  1. Call Leiden in stream mode on undirected graph

  2. Yield node IDs and community assignments

  3. Group by community, collect actor names, and count members

  4. Sort by size in descending order

  5. Limit to the largest community only

  6. Return community ID, size, and first 10 actor names as sample

These actors form a tightly knit collaboration group based on shared movie appearances.

Leiden: Custom configuration

Leiden also supports maxLevels like Louvain. Let’s experiment with different configurations.

Run Leiden with maxLevels: 1 to see how many communities it detects:

cypher
Leiden with maxLevels: 1
CALL gds.leiden.stats('actor-network-undirected', { // (1)
  maxLevels: 1 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)
  1. Call Leiden in stats mode on undirected graph

  2. Set maximum hierarchy levels to 1

  3. Yield community statistics

  4. Return community count, modularity, and levels run

Review the results for communityCount and ranLevels.

Now try with maxLevels: 20:

cypher
Leiden with maxLevels: 20
CALL gds.leiden.stats('actor-network-undirected', { // (1)
  maxLevels: 20 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)
  1. Call Leiden in stats mode on undirected graph

  2. Set maximum hierarchy levels to 20

  3. Yield community statistics

  4. Return community count, modularity, and levels run

The second run is unlikely to run to the full 20 levels.

A higher maxLevels allows more rounds of community aggregation but, like Louvain, Leiden stops early once the communities have converged.

However, recall what happened when you ran Louvain on this graph: it returned over 1,000 communities.

Leiden should have returned many more than that. Leiden generally produces more modular and granular communities than Louvain.
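The stats queries in this section also yield modularity, a score for how much denser the relationships inside communities are than would be expected by chance. As a minimal sketch, assuming the standard modularity definition for an unweighted, undirected graph (GDS may compute it with additional options such as weights or a resolution parameter), it can be calculated by hand as follows. The function name and the two-triangle graph are made up for illustration.

```python
def modularity(edges, community_of):
    """Standard modularity for an unweighted, undirected edge list."""
    m = len(edges)
    internal = {}    # edges with both endpoints inside the community
    degree_sum = {}  # total degree of the community's members
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge (3-4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
one_group = {node: "all" for node in range(1, 7)}

print(round(modularity(edges, two_groups), 3))  # 0.357
print(round(modularity(edges, one_group), 3))   # 0.0
```

Splitting the graph into its two triangles scores higher than lumping everything into one community, which is the kind of difference to look for when comparing the modularity values of two runs.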

When to use undirected projections

Use undirected relationships when:

  • Relationships are naturally bidirectional (collaborated with, friends with, similar to)

  • You’re running algorithms that require undirected graphs (like Leiden)

  • Direction doesn’t add meaningful information to your analysis

Use directed relationships when:

  • Direction has real meaning (follows, purchased, influenced)

  • You want to distinguish incoming from outgoing connections

  • Algorithms specifically need directional information

What’s next

You’ve learned how to project undirected relationships and run PageRank and Leiden on them. You’ve seen how configuration options like dampingFactor and maxLevels impact results.

In the next lesson, you’ll add relationship weights to your projection and see how algorithms can use those weights to produce even more nuanced results.

Check your understanding

Spot Robert De Niro: Undirected Graph

Run PageRank on actor-network-undirected with these dampingFactor values: 0.15, 0.50, 0.85

Where is Robert De Niro at dampingFactor: 0.15?

  • ❏ Position 1

  • ❏ Position 5

  • ✓ Not in the top 10

  • ❏ Position 10

Hint

Low dampingFactor values emphasize connection count over connection quality.

Solution

Robert De Niro is not in the top 10 at dampingFactor: 0.15.

With a low dampingFactor, PageRank essentially becomes a connection count algorithm. Actors with many collaborations dominate—regardless of who they worked with.

Robert De Niro has fewer total collaborations than actors like Jackie Chan (153 films), so he drops out of the top 10 when we only count quantity.

But watch what happens as you increase dampingFactor:

  • 0.50: Robert De Niro appears at position 5

  • 0.60: Robert De Niro climbs to position 3

  • 0.85: Robert De Niro reaches position 1

As dampingFactor increases, PageRank weighs connection quality more heavily. Robert De Niro worked with highly influential actors, so he rises to the top when quality matters more than quantity.

Spot Vincent Price: Directed Graph

Run PageRank on actor-network-directed with these dampingFactor values: 0.15, 0.50, 0.85

Where is Vincent Price at dampingFactor: 0.85?

  • ❏ Position 1

  • ❏ Position 6

  • ✓ Not in the top 10

  • ❏ Position 5

Hint

High dampingFactor values emphasize connection quality over connection count.

Solution

Vincent Price is not in the top 10 at dampingFactor: 0.85.

This is the opposite pattern from Robert De Niro on the undirected graph:

  • 0.15: Vincent Price is at position 1

  • 0.50: Vincent Price drops to position 6

  • 0.85: Vincent Price falls out of the top 10

At low dampingFactor, Vincent Price ranks #1 because he has many collaborations. But as dampingFactor increases and connection quality becomes more important, he drops out entirely.

This suggests Vincent Price, while prolific, didn’t work with as many highly-connected actors as Robert De Niro did. When the algorithm prioritizes "who you know" over "how many you know," different actors rise to prominence.

Key insight: The same algorithm with different configuration can reveal different dimensions of importance in your network—quantity vs. quality, breadth vs. depth.

Summary

Undirected projections treat relationships as bidirectional. Some algorithms like Leiden require undirected graphs, while others like PageRank run on both and can produce different results when the directed graph is not already symmetrical.

PageRank’s dampingFactor controls whether the algorithm emphasizes connection quantity or quality. Leiden improves on Louvain with guaranteed well-connected communities. Both benefit from proper projection configuration.
