Projection Configuration for Algorithms

Introduction

Some algorithms need specific projection configurations to work correctly, or to run at all.

In this lesson, you’ll learn how relationship direction and weights affect algorithm behaviour, and how to configure projections accordingly.

What You’ll Learn

By the end of this lesson, you’ll be able to:

  • Create undirected projections for algorithms that require them

  • Include relationship weights to represent connection strength

  • Project node properties and handle missing values with coalesce()

  • Determine what configuration an algorithm needs by checking its documentation

Two Key Projection Settings

Beyond choosing which nodes and relationships to include, two configuration options significantly affect algorithms:

  1. Relationship direction — directed vs undirected

  2. Relationship weights — treating some connections as stronger than others

Diagram showing nodes with directed and weighted edges illustrating algorithm behavior.

Directed vs Undirected Projections

By default, GDS projections are directed — relationships flow one way, from source to target.

Some algorithms require undirected projections, where every relationship works in both directions.

This is a property of the projection itself, set at projection time — not something you change when running an algorithm.

When Direction Matters

Directed makes sense when:

  • Direction has real meaning (follows, purchased, influenced)

  • You want to distinguish incoming from outgoing connections

Undirected makes sense when:

  • Relationships are naturally bidirectional (collaborated with, friends with)

  • The algorithm requires it (Leiden)

Algorithm Requirements

Check the documentation header for each algorithm:

Leiden’s attributes showing it requires undirected relationships
  • Green = works well

  • Grey = runs but ignores that aspect

  • Red = won’t run

Leiden has a red mark for directed, so it won’t run on a directed projection; it requires undirected relationships.

Projecting an Undirected Graph

To make a projection undirected, add undirectedRelationshipTypes to the second config map:

cypher
Undirected projection of actors and movies
MATCH (source:Actor)-[:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actor-movie-undirected',
  source,
  target,
  {}, // (1)
  { undirectedRelationshipTypes: ['*'] } // (2)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. First config map (for node/relationship properties — empty here)

  2. Second config map — make all relationship types undirected

The ['*'] means all relationship types in the projection become undirected.

In this projection, every Actor is connected to each Movie they appeared in, and each Movie is connected back to its Actors.

Internally, GDS stores each of these relationships once in each direction rather than as a single direction-free edge, but in practice the effect is the same.

Projecting a Directed Graph

For comparison, let’s project the same data as a directed graph:

cypher
Directed projection of actors and movies
MATCH (source:Actor)-[:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actor-movie-directed',
  source,
  target
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

This projection only has relationships flowing from Actor to Movie, matching the ACTED_IN direction in the database.
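As a quick sanity check, the two projections can be compared side by side. This is a sketch that assumes both 'actor-movie-directed' and 'actor-movie-undirected' already exist in the graph catalog; gds.graph.list() is the GDS catalog procedure for listing projections.

```cypher
// List both projections and compare their sizes.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
WHERE graphName IN ['actor-movie-directed', 'actor-movie-undirected']
RETURN graphName, nodeCount, relationshipCount
ORDER BY graphName
```

The node counts should match, while the undirected projection should report about twice as many relationships, because each ACTED_IN relationship is stored once per direction.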

Run degree centrality in both directed and undirected projections

Now let’s run degree centrality on both directed and undirected projections and compare the results.

cypher
Degree centrality in directed projection
CALL gds.degree.stream('actor-movie-directed')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 5
cypher
Degree centrality in undirected projection
CALL gds.degree.stream('actor-movie-undirected')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 5

You’ll notice that both versions appear to return the same top five. In our graph, the most prolific actors appeared in more movies than any single movie has actors, so Actor nodes fill the top of the ranking in both projections.

Comparing Directed vs Undirected

However, let’s now look at what happens when we restrict the results to movies in each projection.

cypher
Degree centrality in directed projection
CALL gds.degree.stream('actor-movie-directed')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS node, score
WHERE node:Movie
RETURN node.title AS title, score
ORDER BY score DESC LIMIT 5
cypher
Degree centrality in undirected projection
CALL gds.degree.stream('actor-movie-undirected')
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS node, score
WHERE node:Movie
RETURN node.title AS title, score
ORDER BY score DESC LIMIT 5

This time the two versions return different results. The directed projection only contains relationships flowing from Actor to Movie, and degree centrality counts outgoing relationships by default, so every Movie scores 0.

In the undirected projection, each relationship also runs from Movie back to Actor, so each Movie’s score is the number of actors connected to it.

In other words, only Actor nodes can score above 0 in the directed projection, while every node with at least one relationship scores above 0 in the undirected one.
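One way to cross-check the undirected scores is to count the same relationships directly in the database with plain Cypher. This sketch assumes the same Actor, Movie and ACTED_IN data used in the projections above.

```cypher
// Count the actors attached to each movie in the database itself.
MATCH (m:Movie)<-[:ACTED_IN]-(a:Actor)
RETURN m.title AS title, count(a) AS actorCount
ORDER BY actorCount DESC LIMIT 5
```

If the projection captured every ACTED_IN relationship, these counts should line up with the movie scores from the undirected projection.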

Symmetrical vs Asymmetrical Graphs

For symmetrical graphs (like actor collaborations), results may be identical:

  • If A→B exists, B→A also exists

  • Making it undirected doesn’t change the structure

Diagram comparing directed and undirected projections on symmetrical and asymmetrical graphs.

For asymmetrical graphs (like followers), results will differ significantly.

Direction in the Projection vs Direction in the Algorithm

It’s important to distinguish between two different places where direction can come into play:

  1. Projection-level direction — set when you create the projection using undirectedRelationshipTypes. This determines the structure of the in-memory graph.

  2. Algorithm-level orientation — some algorithms (like degree centrality) let you choose which direction to traverse using an orientation parameter. This only works within the bounds of what the projection contains.

Degree Centrality: Measuring Direction

Degree centrality counts connections. On a directed projection, you can choose which connections to count using the orientation parameter.

cypher
Counting outgoing connections (NATURAL)
CALL gds.degree.stream('actor-movie-directed', {
  orientation: 'NATURAL' // (1)
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name,
       score
ORDER BY score DESC LIMIT 5
  1. NATURAL counts outgoing relationships (the default). For actors, this counts how many movies each actor appeared in.

Degree Centrality: Reversing Direction

On the same directed projection, you can reverse the traversal direction:

cypher
Counting incoming connections (REVERSE)
CALL gds.degree.stream('actor-movie-directed', {
  orientation: 'REVERSE' // (1)
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).title AS title,
       score
ORDER BY score DESC LIMIT 5
  1. REVERSE counts incoming relationships. For movies, this counts how many actors appeared in each movie.

The same projection, different orientation — and you’re measuring something completely different.
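Degree centrality also accepts orientation: 'UNDIRECTED', which counts relationships in both directions without changing the projection. The sketch below assumes the 'actor-movie-directed' projection from earlier; coalesce() is used here only because actors have a name and movies have a title.

```cypher
CALL gds.degree.stream('actor-movie-directed', {
  orientation: 'UNDIRECTED' // count incoming and outgoing relationships
})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
RETURN coalesce(n.name, n.title) AS label, score
ORDER BY score DESC LIMIT 5
```

This only affects how degree centrality counts. Algorithms that require an undirected projection, such as Leiden, still need undirectedRelationshipTypes set at projection time.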

Reversing direction on an undirected projection

Now let’s run degree centrality on an undirected projection, both with and without reversing the direction.

cypher
CALL gds.degree.stream('actor-movie-undirected', {
  orientation: 'NATURAL'
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 5
cypher
CALL gds.degree.stream('actor-movie-undirected', {
  orientation: 'REVERSE'
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 5

You’ll notice that the results are the same in both cases. This is because the undirected projection is already bidirectional, so reversing the direction doesn’t change the results.

Relationship Weights

Not all connections are equal. Weights let algorithms treat some relationships as stronger than others.

Examples:

  • Rating scores (1-5 stars)

  • Transaction amounts

  • Interaction frequency

  • Distance or travel time

Graph with nodes and weighted edges showing examples like ratings and transactions.

Projecting Weights

Include relationship properties in your projection:

cypher
MATCH (source:User)-[r:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-weighted',
  source,
  target,
  { relationshipProperties: r { .rating } }, // (1)
  { undirectedRelationshipTypes: ['*'] } // (2)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Capture the rating property from relationships

  2. Make relationships undirected (required for Leiden)

The r { .rating } syntax captures the rating property from relationships.
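To confirm that the weights made it into the projection, the projected property can be streamed back out. This is a sketch assuming a recent GDS 2.x release, where the procedure is named gds.graph.relationshipProperty.stream (older releases used a different name), and assuming the 'user-movie-weighted' projection above.

```cypher
// Stream a sample of projected relationships with their rating weight.
CALL gds.graph.relationshipProperty.stream('user-movie-weighted', 'rating')
YIELD sourceNodeId, targetNodeId, propertyValue
RETURN sourceNodeId, targetNodeId, propertyValue AS rating
LIMIT 10
```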

Using Weights in Algorithms

Tell the algorithm which property to use:

cypher
CALL gds.leiden.stream('user-movie-weighted', {
  relationshipWeightProperty: 'rating' // (1)
})
YIELD nodeId, communityId
  1. Point the algorithm at the projected weight property

Without this parameter, the algorithm ignores the weights even if they’re in the projection.
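The same parameter works for other weight-aware algorithms. As an illustration, degree centrality sums the weights of a node’s relationships when relationshipWeightProperty is set. This sketch assumes the 'user-movie-weighted' projection above.

```cypher
// Weighted degree: the sum of rating values on each movie's relationships.
CALL gds.degree.stream('user-movie-weighted', {
  relationshipWeightProperty: 'rating'
})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
WHERE n:Movie
RETURN n.title AS title, score AS totalRating
ORDER BY totalRating DESC LIMIT 5
```

Removing the relationshipWeightProperty line returns the plain number of ratings per movie instead.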

Weighted vs Unweighted Communities

Unweighted: Groups users who rated the same movies (regardless of how they rated them)

Weighted: Groups users around the movies they rated highly (5-star ratings contribute more than 1-star)

Weights transform "what’s connected" into "how strongly it’s connected."

Comparison of unweighted and weighted community detection with nodes and edges.
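A rough way to see the effect is to run Leiden in stats mode with and without the weight property and compare the summaries. This sketch assumes the 'user-movie-weighted' projection; Leiden is randomised, so exact numbers can vary between runs.

```cypher
// Same projection, two runs: one ignoring weights, one using rating.
CALL gds.leiden.stats('user-movie-weighted')
YIELD communityCount, modularity
RETURN 'unweighted' AS run, communityCount, modularity
UNION ALL
CALL gds.leiden.stats('user-movie-weighted', {
  relationshipWeightProperty: 'rating'
})
YIELD communityCount, modularity
RETURN 'weighted' AS run, communityCount, modularity
```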

Weights for Different Algorithms

Different algorithms interpret weights differently:

  • Community Detection — stronger connections keep nodes together

  • Pathfinding — weights become distances/costs to minimise

  • Similarity — higher weights increase similarity contribution

Check documentation for how each algorithm uses weights.

Node Properties

Sometimes algorithms need node attributes, not just connections.

Examples:

  • Initial values for community detection

  • Features for similarity calculations

  • Attributes for machine learning pipelines

Examples of node properties like initial values and features.

Projecting Node Properties

Include node properties using sourceNodeProperties and targetNodeProperties:

cypher
MATCH (source:Movie)<-[:RATED]-(:User)-[:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-with-properties',
  source,
  target,
  {
    sourceNodeProperties: source { .startYear, .imdbRating }, // (1)
    targetNodeProperties: target { .startYear, .imdbRating } // (2)
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Include properties from source nodes

  2. Include the same properties from target nodes

Handling Missing Properties

Use coalesce() to provide default values when properties might be missing:

cypher
MATCH (source:Movie)<-[:RATED]-(:User)-[:RATED]->(target:Movie)
WITH gds.graph.project(
  'movie-network-defaults',
  source,
  target,
  {
    sourceNodeProperties: source {
      imdbRating: coalesce(source.imdbRating, 5.0), // (1)
      startYear: coalesce(source.startYear, 1) // (2)
    },
    targetNodeProperties: target {
      imdbRating: coalesce(target.imdbRating, 5.0),
      startYear: coalesce(target.startYear, 1)
    }
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Default to 5.0 if imdbRating is null

  2. Default to 1 if startYear is null
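Before choosing defaults, it helps to know how many nodes they will touch. The query below is a plain-Cypher sketch against the database (not the projection), assuming Movie nodes with the imdbRating and startYear properties used above.

```cypher
// count(expr) skips nulls, so the gaps show up as differences from count(m).
MATCH (m:Movie)
RETURN count(m) AS movies,
       count(m) - count(m.imdbRating) AS missingImdbRating,
       count(m) - count(m.startYear) AS missingStartYear
```

A large missing count suggests the default will dominate that feature, which is worth keeping in mind when interpreting results.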

Without Defaults, Values Are Null

Without coalesce(), nodes missing the property will have null values, which can cause algorithms to fail.

cypher
CALL gds.fastRP.stream('user-movie-with-properties', {
  featureProperties: ['startYear', 'imdbRating'], // (1)
  embeddingDimension: 64
})
YIELD nodeId, embedding
  1. Algorithms using node properties will fail if any values are null

Without defaults, this query will fail with a null property error.

Using Node Properties in Algorithms

With defaults in place, the same algorithm runs without error:

cypher
CALL gds.fastRP.stream('movie-network-defaults', {
  featureProperties: ['startYear', 'imdbRating'], // (1)
  embeddingDimension: 64
})
YIELD nodeId, embedding
  1. featureProperties tells FastRP to incorporate node attributes into embeddings


FastRP and other embedding algorithms are covered in the GDS Python Client & Aura Graph Analytics workshop.

Configuration Checklist

Before running an algorithm, ask:

  1. Does this algorithm support my graph structure? (Check the header attributes)

  2. Does it need undirected relationships? (Leiden does)

  3. Should I use weights? (Do connection strengths matter?)

  4. Do I need node properties? (Does the algorithm use node attributes?)

  5. What direction makes sense? (What am I actually measuring?)

Quick Reference: Configuration

Setting                  | Syntax                                      | Location
Undirected relationships | undirectedRelationshipTypes: ['*']          | Second config map
Relationship weights     | relationshipProperties: r { .propertyName } | First config map
Node properties          | sourceNodeProperties: source { .prop }      | First config map
                         | targetNodeProperties: target { .prop }      |
Use relationship weights | relationshipWeightProperty: 'propertyName'  | Algorithm config
Use node properties      | featureProperties: ['prop1', 'prop2']       | Algorithm config

Summary

Projection configuration affects what algorithms can run and how they behave:

Diagram summarizing effects of direction

Direction:

  • Some algorithms require undirected relationships (Leiden)

  • Direction determines what you’re measuring (outgoing vs incoming)

  • Use undirectedRelationshipTypes: ['*'] to make relationships bidirectional

Weights:

  • Capture connection strength with relationshipProperties

  • Tell algorithms to use weights with relationshipWeightProperty

  • Transform the analysis from "connected" to "how strongly connected"

Node Properties:

  • Include node attributes with sourceNodeProperties and targetNodeProperties

  • Use coalesce() to handle missing values

  • Some algorithms use properties as features (e.g., FastRP with featureProperties)

You’re now ready to apply these concepts in the hands-on use case exercises.
