Projection Configuration for Algorithms

Introduction

Some algorithms need specific projection configurations to work correctly, or to run at all.

In this lesson, you’ll learn how relationship direction and weights affect algorithm behaviour, and how to configure projections accordingly.

What You’ll Learn

By the end of this lesson, you’ll be able to:

  • Create undirected projections for algorithms that require them

  • Include relationship weights to represent connection strength

  • Project node properties and handle missing values with coalesce()

  • Determine what configuration an algorithm needs by checking its documentation

Two Key Projection Settings

Beyond choosing which nodes and relationships to include, two configuration options significantly affect algorithms:

  1. Relationship direction — directed vs undirected

  2. Relationship weights — treating some connections as stronger than others

[Figure: nodes connected by directed and weighted edges, illustrating how each setting affects algorithm behaviour.]

Directed vs Undirected

By default, GDS projections are directed: relationships flow one way.

Some algorithms require undirected relationships, where connections work both ways.

When Direction Matters

Directed makes sense when:

  • Direction has real meaning (follows, purchased, influenced)

  • You want to distinguish incoming from outgoing connections

Undirected makes sense when:

  • Relationships are naturally bidirectional (collaborated with, friends with)

  • The algorithm requires it (Leiden, for example)

Algorithm Requirements

Check the documentation header for each algorithm:

[Figure: Leiden's documentation header, showing that it requires undirected relationships.]

  • Green = works well

  • Grey = runs but ignores that aspect

  • Red = won’t run

Leiden has a red mark for directed, so it requires undirected relationships.

Creating Undirected Projections

Add undirectedRelationshipTypes to your projection configuration:

cypher
MATCH (source:Actor)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
WITH gds.graph.project(
  'actor-network-undirected',
  source,
  target,
  {}, // (1)
  { undirectedRelationshipTypes: ['*'] } // (2)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. First config map (node/relationship properties), left empty here

  2. Second config map: make all relationship types undirected

The ['*'] means all relationship types become undirected.

Comparing Directed vs Undirected

For symmetrical graphs (like actor collaborations), results may be identical:

  • If A→B exists, B→A also exists

  • Making it undirected doesn’t change the structure

[Figure: directed and undirected projections compared on symmetrical and asymmetrical graphs.]

For asymmetrical graphs (like followers), results can differ substantially, because an undirected projection treats a one-way follow as a mutual connection.
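
To make the comparison concrete, here is a minimal sketch in plain Python (it does not call GDS). It models a projection as a set of (source, target) pairs. The helper names `undirected` and `neighbours` and the sample data are hypothetical, chosen only for illustration, and the set-based model deliberately ignores parallel relationships.

```python
def undirected(edges):
    """Return the edge set with every relationship usable in both directions."""
    return set(edges) | {(target, source) for source, target in edges}


def neighbours(edges, node):
    """Nodes reachable from `node` by following one relationship forwards."""
    return {target for source, target in edges if source == node}


# Symmetrical: every collaboration already appears in both directions.
collaborations = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}

# Asymmetrical: A follows B and B follows C, but nobody follows back.
follows = {("A", "B"), ("B", "C")}

symmetric_unchanged = undirected(collaborations) == collaborations
directed_view = neighbours(follows, "B")
undirected_view = neighbours(undirected(follows), "B")

print(symmetric_unchanged)      # True
print(sorted(directed_view))    # ['C']
print(sorted(undirected_view))  # ['A', 'C']
```

In the follower data, B gains A as a neighbour only once direction is dropped, which is the kind of change that can shift algorithm results.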

Degree Centrality and Direction

Degree centrality counts outgoing relationships by default.

cypher
CALL gds.degree.stream('actor-network', {
  orientation: 'NATURAL' // (1)
})
  1. NATURAL = outgoing (default), REVERSE = incoming, UNDIRECTED = both

Direction on Bipartite Graphs

Remember the Actor→Movie bipartite projection?

cypher
CALL gds.degree.stream('actor-movie-network', {
  orientation: 'REVERSE' // (1)
})
  1. With REVERSE, you rank Movies by incoming actor connections instead of actors by movie count

Direction determines what you’re measuring.
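
The effect of the three orientation values can be sketched in plain Python without GDS. The `degree` function and the sample relationships below are hypothetical; the sketch only counts relationship ends and makes no claim about how GDS computes its scores internally.

```python
from collections import Counter


def degree(edges, orientation="NATURAL"):
    """Count relationships per node for the given orientation."""
    if orientation not in ("NATURAL", "REVERSE", "UNDIRECTED"):
        raise ValueError(f"unknown orientation: {orientation}")
    counts = Counter()
    for source, target in edges:
        if orientation in ("NATURAL", "UNDIRECTED"):
            counts[source] += 1  # count the outgoing end
        if orientation in ("REVERSE", "UNDIRECTED"):
            counts[target] += 1  # count the incoming end
    return counts


# Hypothetical Actor -> Movie relationships.
acted_in = [
    ("Keanu", "The Matrix"),
    ("Carrie-Anne", "The Matrix"),
    ("Keanu", "Speed"),
]

natural = degree(acted_in, "NATURAL")
reverse = degree(acted_in, "REVERSE")
both = degree(acted_in, "UNDIRECTED")

print(natural.most_common(1))  # [('Keanu', 2)]
print(reverse.most_common(1))  # [('The Matrix', 2)]
```

With NATURAL the top-ranked node is an actor; with REVERSE it is a movie. The data is identical, only the question being asked has changed.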

Relationship Weights

Not all connections are equal. Weights let algorithms treat some relationships as stronger than others.

Examples:

  • Rating scores (1-5 stars)

  • Transaction amounts

  • Interaction frequency

  • Distance or travel time

[Figure: graph with weighted edges, with examples such as ratings and transactions.]

Projecting Weights

Include relationship properties in your projection:

cypher
MATCH (source:User)-[r:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-weighted',
  source,
  target,
  { relationshipProperties: r { .rating } }, // (1)
  { undirectedRelationshipTypes: ['*'] } // (2)
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Capture the rating property from relationships

  2. Make relationships undirected (required for Leiden)

The r { .rating } syntax captures the rating property from relationships.

Using Weights in Algorithms

Tell the algorithm which property to use:

cypher
CALL gds.leiden.stream('user-movie-weighted', {
  relationshipWeightProperty: 'rating' // (1)
})
YIELD nodeId, communityId
  1. Point the algorithm at the projected weight property

Without this parameter, the algorithm ignores the weights even if they’re in the projection.

Weighted vs Unweighted Communities

Unweighted: Groups users who rated the same movies (regardless of how they rated them)

Weighted: Groups users who rated movies similarly (5-star ratings contribute more than 1-star)

Weights transform "what’s connected" into "how strongly it’s connected."
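
A small worked example shows why weights change the outcome. Leiden searches for a partition that scores well on a quality function such as modularity; the sketch below does not run Leiden, it only scores two fixed partitions of a tiny user-movie graph, with and without weights. The function name, node names, and ratings are all hypothetical.

```python
def modularity(edges, communities, use_weights):
    """Modularity of a partition of an undirected graph.

    edges: list of (node_a, node_b, weight), one entry per undirected edge.
    communities: dict mapping each node to a community label.
    use_weights: if False, every edge counts as 1 and weights are ignored.
    """
    total = 0.0    # total edge weight
    inside = {}    # weight of edges with both ends in the same community
    strength = {}  # summed (weighted) degree of the nodes in each community
    for a, b, weight in edges:
        w = weight if use_weights else 1.0
        total += w
        for node in (a, b):
            label = communities[node]
            strength[label] = strength.get(label, 0.0) + w
        if communities[a] == communities[b]:
            inside[communities[a]] = inside.get(communities[a], 0.0) + w
    return sum(
        inside.get(label, 0.0) / total - (s / (2 * total)) ** 2
        for label, s in strength.items()
    )


# Two users rate the same two movies, but with opposite preferences.
ratings = [("u1", "m1", 5), ("u2", "m1", 1), ("u2", "m2", 5), ("u1", "m2", 1)]

# Partition that follows the 5-star ratings, and one that follows the 1-star ratings.
by_high_rating = {"u1": 0, "m1": 0, "u2": 1, "m2": 1}
by_low_rating = {"u1": 0, "m2": 0, "u2": 1, "m1": 1}

print(round(modularity(ratings, by_high_rating, use_weights=False), 3))  # 0.0
print(round(modularity(ratings, by_low_rating, use_weights=False), 3))   # 0.0
print(round(modularity(ratings, by_high_rating, use_weights=True), 3))   # 0.333
print(round(modularity(ratings, by_low_rating, use_weights=True), 3))    # -0.333
```

Without weights the two partitions are indistinguishable, because both users are connected to both movies. With weights, grouping each user with the movie they rated 5 stars scores clearly higher.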

[Figure: comparison of unweighted and weighted community detection on the same graph.]

Weights for Different Algorithms

Different algorithms interpret weights differently:

  • Community Detection — stronger connections keep nodes together

  • Pathfinding — weights become distances/costs to minimise

  • Similarity — higher weights increase similarity contribution

Check documentation for how each algorithm uses weights.
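
For the pathfinding case, the following plain-Python sketch of Dijkstra's algorithm shows weights acting as costs to minimise. It is not the GDS implementation; the function name, the stops, and the travel times are hypothetical, and it assumes non-negative weights on a directed graph.

```python
import heapq


def cheapest_path(graph, start, goal):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical travel times in minutes between stops.
travel_time = {
    "A": {"B": 10.0, "C": 2.0},
    "C": {"D": 2.0},
    "D": {"B": 2.0},
}

# The same graph with every weight set to 1, which counts hops only.
hops = {node: {n: 1.0 for n in targets} for node, targets in travel_time.items()}

print(cheapest_path(hops, "A", "B"))         # (1.0, ['A', 'B'])
print(cheapest_path(travel_time, "A", "B"))  # (6.0, ['A', 'C', 'D', 'B'])
```

Counting hops, the direct A to B relationship wins. Once travel time is used as the weight, the three-hop route is cheaper, so the weighted and unweighted answers differ.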

Node Properties

Sometimes algorithms need node attributes, not just connections.

Examples:

  • Initial values for community detection

  • Features for similarity calculations

  • Attributes for machine learning pipelines

[Figure: examples of node properties, such as initial values and features.]

Projecting Node Properties

Include node properties using sourceNodeProperties and targetNodeProperties:

cypher
MATCH (source:Movie)<-[:RATED]-(:User)-[:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-with-properties',
  source,
  target,
  {
    sourceNodeProperties: source { .startYear, .imdbRating }, // (1)
    targetNodeProperties: target { .startYear, .imdbRating } // (2)
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Include properties from source nodes

  2. Include the same properties from target nodes

Handling Missing Properties

Use coalesce() to provide default values when properties might be missing:

cypher
MATCH (source:Movie)<-[:RATED]-(:User)-[:RATED]->(target:Movie)
WITH gds.graph.project(
  'movie-network-defaults',
  source,
  target,
  {
    sourceNodeProperties: source {
      imdbRating: coalesce(source.imdbRating, 5.0), // (1)
      startYear: coalesce(source.startYear, 1) // (2)
    },
    targetNodeProperties: target {
      imdbRating: coalesce(target.imdbRating, 5.0),
      startYear: coalesce(target.startYear, 1)
    }
  }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
  1. Default to 5.0 if imdbRating is null

  2. Default to 1 if startYear is null
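
The behaviour of coalesce() can be mimicked in a few lines of plain Python: it returns its first non-null argument. The sketch below is an illustration only, with `None` standing in for Cypher's null and hypothetical movie records as input.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


# Hypothetical movie records; some properties are missing.
movies = [
    {"title": "Movie A", "imdbRating": 7.9, "startYear": 1999},
    {"title": "Movie B", "startYear": 2005},  # no imdbRating
    {"title": "Movie C", "imdbRating": 0.0},  # no startYear
]

projected = [
    {
        "imdbRating": coalesce(movie.get("imdbRating"), 5.0),
        "startYear": coalesce(movie.get("startYear"), 1),
    }
    for movie in movies
]

for row in projected:
    print(row)
# {'imdbRating': 7.9, 'startYear': 1999}
# {'imdbRating': 5.0, 'startYear': 2005}
# {'imdbRating': 0.0, 'startYear': 1}
```

Note that the real rating of 0.0 for Movie C is kept: only missing values are replaced. The default you choose still becomes input to the algorithm, so pick a value that is reasonable for your data.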

Missing Defaults Lead to Null Values

Without coalesce(), nodes missing the property will have null values, which can cause algorithms to fail.

cypher
CALL gds.fastRP.stream('user-movie-with-properties', {
  featureProperties: ['startYear', 'imdbRating'], // (1)
  embeddingDimension: 64
})
YIELD nodeId, embedding
  1. Algorithms using node properties will fail if any values are null

Without defaults, this query will fail with a null property error.

Using Node Properties in Algorithms

With defaults in place, the same algorithm runs against the second projection:

cypher
CALL gds.fastRP.stream('movie-network-defaults', {
  featureProperties: ['startYear', 'imdbRating'], // (1)
  embeddingDimension: 64
})
YIELD nodeId, embedding
  1. featureProperties tells FastRP to incorporate node attributes into embeddings

The featureProperties parameter tells FastRP to incorporate those node attributes into the embeddings.

We’ll cover FastRP along with many other algorithms later in the workshop.

Configuration Checklist

Before running an algorithm, ask:

  1. Does this algorithm support my graph structure? (Check the header attributes)

  2. Does it need undirected relationships? (Leiden does)

  3. Should I use weights? (Do connection strengths matter?)

  4. Do I need node properties? (Does the algorithm use node attributes?)

  5. What direction makes sense? (What am I actually measuring?)

Quick Reference: Configuration

Setting                  | Syntax                                      | Location
-------------------------|---------------------------------------------|------------------
Undirected relationships | undirectedRelationshipTypes: ['*']          | Second config map
Relationship weights     | relationshipProperties: r { .propertyName } | First config map
Node properties          | sourceNodeProperties: source { .prop }      | First config map
                         | targetNodeProperties: target { .prop }      |
Use relationship weights | relationshipWeightProperty: 'propertyName'  | Algorithm config
Use node properties      | featureProperties: ['prop1', 'prop2']       | Algorithm config

Summary

Projection configuration affects what algorithms can run and how they behave:

Direction:

  • Some algorithms require undirected relationships (Leiden)

  • Direction determines what you’re measuring (outgoing vs incoming)

  • Use undirectedRelationshipTypes: ['*'] to make relationships bidirectional

Weights:

  • Capture connection strength with relationshipProperties

  • Tell algorithms to use weights with relationshipWeightProperty

  • Transforms analysis from "connected" to "how strongly connected"

Node Properties:

  • Include node attributes with sourceNodeProperties and targetNodeProperties

  • Use coalesce() to handle missing values

  • Some algorithms use properties as features (e.g., FastRP with featureProperties)

You’re now ready to apply these concepts in the hands-on use case exercises.
