The GDS Workflow

The Three-Step Workflow

Every GDS analysis follows the same basic pattern: Project → Run → Write

Diagram showing the three-step workflow

This workflow separates your source data from analysis, enabling fast, iterative experimentation on an in-memory projection of your graph without modifying your original graph.

You can choose to run an algorithm and write its results in a single operation instead of two. Or, you can chain multiple algorithm results to feed a final algorithm before writing back to the graph.

What You’ll Learn

By the end of this lesson, you’ll be able to:

  • Apply the Project → Run → Write workflow to any GDS analysis

  • Create graph projections using Cypher projection syntax

  • Run algorithms in different execution modes (stats, stream, mutate, write)

  • Manage in-memory projections by listing and dropping graphs

Why This Workflow?

  • Speed: Algorithms read from an in-memory projection optimized for graph traversal, which is faster than querying the database directly.

  • Safety: Your source graph remains unchanged until you explicitly write results back.

  • Flexibility: Run multiple algorithms on the same projection to compare and combine results.

Comparison of in-memory projections vs direct database queries.

Step 1: Project

The graph projected by GDS is an in-memory structure containing nodes and relationships — just like your main graph.

Diagram: the main graph with an arrow pointing to its projected variant. The main graph remains on disk.

However, the in-memory graph you create is optimized for topology and property lookup operations.

Projecting graphs

You can create a projection of your graph using Cypher or Native projection.

cypher
Cypher Projection
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actors-graph-cypher',
  source,
  target
) AS g
RETURN g.graphName AS graph,
      g.nodeCount AS nodes,
      g.relationshipCount AS rels
cypher
Native Projection
CALL gds.graph.project(
  'actors-graph-native',
  ['Actor', 'Movie'],
  'ACTED_IN'
)
YIELD graphName AS graph,
      nodeCount AS nodes,
      relationshipCount AS rels

When you run both of these queries, you will notice a slight difference in the results.

The reason for this is that native projection first projects the nodes and then finds the relevant relationships.

The Cypher projection matches only those nodes that have the required relationships. To include unconnected nodes as well, match every Actor and Movie node as a source first, and then use OPTIONAL MATCH to find its relationships. A row with a null target adds the source node without adding a relationship:

cypher
Cypher Projection
MATCH (source)
WHERE source:Actor OR source:Movie
OPTIONAL MATCH (source)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actors-graph-cypher-all-nodes',
  source,
  target
) AS g
RETURN g.graphName AS graph,
      g.nodeCount AS nodes,
      g.relationshipCount AS rels

This query does more work than the first Cypher projection, because it scans every Actor and Movie node before looking for relationships. Avoid matching both endpoints up front with MATCH (source:Actor), (target:Movie). That pattern produces one row for every actor and movie pair, takes an age to run, and projects a relationship for every pair whether or not an ACTED_IN relationship exists.

In practice, for GDS, you will rarely, if ever, need to project orphaned nodes.
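
One way to see the difference between the two projections is to list them side by side. This is a minimal sketch, assuming the 'actors-graph-cypher' and 'actors-graph-native' projections from the first two queries are still in memory:

```cypher
// Compare node and relationship counts across the projections created so far.
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
WHERE graphName STARTS WITH 'actors-graph'
RETURN graphName, nodeCount, relationshipCount
ORDER BY graphName
```

A higher node count for the native projection would point to Actor or Movie nodes that have no ACTED_IN relationship.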

Cypher vs Native context

Cypher projection is more flexible and more readily available across Neo4j products. It is primarily used in the Neo4j Browser but you can also use it via the Python driver.

Native projection is ostensibly simpler, and mimics the syntax of Python environments. It is primarily used in the Python driver, but you can also use it via the Neo4j Browser.

Generally, you will use Cypher projection in the Neo4j Browser and Native projection in the Python driver.

In this workshop, we will focus on Cypher projection. Native projection is covered in the GDS Python Client & Aura Graph Analytics workshop.

Cypher projection

A Cypher projection follows the same pattern as any other Cypher query you are used to running. However, instead of returning the matched rows, the gds.graph.project() aggregation function builds an in-memory graph from the rows that the MATCH clause produces.

cypher
Cypher projection example
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-graph-cypher-v2',
  source,
  target
) AS g
RETURN g.graphName AS graph, // (3)
  g.nodeCount AS nodes,
  g.relationshipCount AS rels

The Cypher projection above has three components:

  1. Cypher query - The MATCH clause defines which nodes and relationships to include.

  2. Projection call - The gds.graph.project() aggregation function creates the projection under a unique name, and takes the source and target node of each matched row.

  3. Return statement - The RETURN clause returns metadata about the created projection, read from the result that the WITH clause aliased as g.
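
Because the projection is driven by an ordinary Cypher query, you can pass extra data alongside each row. The sketch below assumes a recent GDS 2.x release, in which gds.graph.project() accepts an optional fourth argument describing labels and relationship types. The graph name 'actors-graph-cypher-labelled' is made up for illustration.

```cypher
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actors-graph-cypher-labelled',        // hypothetical graph name
  source,
  target,
  {
    sourceNodeLabels: labels(source),    // keep the Actor label in the projection
    targetNodeLabels: labels(target),    // keep the Movie label in the projection
    relationshipType: type(r)            // keep the ACTED_IN type in the projection
  }
) AS g
RETURN g.graphName AS graph,
  g.nodeCount AS nodes,
  g.relationshipCount AS rels
```

Without this map, the projected nodes and relationships carry no label or type information, which matters if you later want an algorithm to filter by label or relationship type.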

Using Native projections

The native projection below produces a similar graph and is more concise, but is less flexible. To create a native projection, use the CALL keyword to invoke the gds.graph.project() procedure and YIELD the metadata.

cypher
Native projection
CALL gds.graph.project(
  'actor-graph-native-v2',
  ['Actor', 'Movie'],   // (1)
  'ACTED_IN'            // (2)
)
YIELD graphName AS graph, // (3)
    nodeCount AS nodes,
    relationshipCount AS rels

The native projection:

  1. Loads all nodes with the provided labels into the projection.

  2. Connects those nodes using relationships of the provided type.

  3. Provides the same metadata, which is accessed using the YIELD clause.
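
The relationship argument also accepts a map form when you need more control. This is a minimal sketch, assuming you want ACTED_IN treated as undirected; the graph name is hypothetical:

```cypher
CALL gds.graph.project(
  'actors-graph-native-undirected',         // hypothetical graph name
  ['Actor', 'Movie'],
  {
    ACTED_IN: {orientation: 'UNDIRECTED'}   // project each relationship in both directions
  }
)
YIELD graphName AS graph,
    nodeCount AS nodes,
    relationshipCount AS rels
```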

Running algorithms

Once you have projected your graph into memory, you can run algorithms on it using the CALL gds.<algorithm>.<mode> command.

Ignore the mode for now. We will cover it in more detail in a moment.

cypher
Running an algorithm
CALL gds.degree.stream( // (1)
  'actors-graph-cypher', // (2)
  {} // (3)
)
YIELD nodeId, score // (4)
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
  1. The degree centrality algorithm is called in stream mode.

  2. The algorithm is run on the actors-graph-cypher projection.

  3. A map of configuration options can be provided as the second argument.

  4. Each algorithm yields its own set of result columns, which can be processed with Cypher.
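
As an illustration of the configuration map, the sketch below sets the orientation option so that degree counts incoming relationships, ranking movies by cast size instead of actors by number of roles. It assumes Movie nodes have a title property, which this lesson has not confirmed.

```cypher
CALL gds.degree.stream(
  'actors-graph-cypher',
  {orientation: 'REVERSE'}   // count incoming ACTED_IN relationships
)
YIELD nodeId, score
// title is an assumed property name on Movie nodes
RETURN gds.util.asNode(nodeId).title AS title, score
ORDER BY score DESC
LIMIT 10
```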

Execution modes

There are five execution modes for algorithm commands:

  1. Stats: Get summary statistics without viewing individual results

  2. Stream: View results directly without storing

  3. Mutate: Store results in the projection

  4. Write: Persist results to your database

  5. Estimate: Check memory requirements before running

You invoke stats, stream, mutate, and write by appending the mode to the algorithm name. Estimate is appended after one of those modes.

cypher
Invoking the stats mode
CALL gds.degree.stats('actors-graph-native')
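
To pick out individual statistics, you can add a YIELD clause. A minimal sketch, assuming the 'actors-graph-native' projection is still in memory:

```cypher
CALL gds.degree.stats('actors-graph-native')
YIELD centralityDistribution
// centralityDistribution is a map of summary statistics
RETURN centralityDistribution.min AS minimumScore,
    centralityDistribution.max AS maximumScore,
    centralityDistribution.mean AS meanScore
```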

Estimate mode

The estimate mode is used to estimate the memory requirements for an algorithm.

cypher
Estimate mode
CALL gds.degree.stats.estimate('actors-graph-cypher', {}) // (1)
YIELD nodeCount, relationshipCount, // (2)
      bytesMin, bytesMax, requiredMemory
  1. Append .estimate to any execution mode

  2. Returns graph size and memory requirements
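
The same suffix works for the other modes. As a sketch, this estimates a write-mode run before committing to it, assuming the 'actors-graph-cypher' projection exists:

```cypher
CALL gds.degree.write.estimate(
  'actors-graph-cypher',
  {writeProperty: 'degree'}   // same configuration you would pass to the real call
)
YIELD requiredMemory, bytesMin, bytesMax
RETURN requiredMemory, bytesMin, bytesMax
```

The estimate does not run the algorithm or write anything to the database.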

You will see how and when to use each of these modes as you work through the fraud detection use case.

Write from algorithm

Ultimately, you will end most GDS sessions by writing your results back to the graph. The CALL gds.<algorithm>.write command runs the algorithm and writes the results to the database in a single step.

cypher
Writing algorithm results directly to the database
CALL gds.degree.write(
  'actors-graph-cypher',
  {
    writeProperty: 'degree' // (1)
  }
)
YIELD centralityDistribution, nodePropertiesWritten // (2)
RETURN centralityDistribution.min AS minimumScore,
    centralityDistribution.mean AS meanScore,
    nodePropertiesWritten
  1. Specify the property name for storing results in the database

  2. Write mode yields summary statistics about what was written
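
After a write, the results are ordinary node properties, so plain Cypher is enough to check them. A minimal sketch, assuming the write above used the property name 'degree':

```cypher
MATCH (a:Actor)
WHERE a.degree IS NOT NULL          // only nodes that received a score
RETURN a.name AS name, a.degree AS degree
ORDER BY degree DESC
LIMIT 10
```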

Write from projection

You can also write results to your graph projection first, using mutate mode:

cypher
Storing results in the projection using mutate mode
CALL gds.degree.mutate(
  'actors-graph-cypher',
  {
    mutateProperty: 'degree' // (1)
  }
)
YIELD centralityDistribution, nodePropertiesWritten // (2)
  1. Store results in the projection under this property name

  2. Results stay in the projection only; they are not persisted to the database
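
Mutated properties are not visible to a normal MATCH query, but you can read them straight from the projection. This sketch assumes a GDS version that provides the gds.graph.nodeProperty.stream procedure and that the mutate call above has already run:

```cypher
CALL gds.graph.nodeProperty.stream('actors-graph-cypher', 'degree')
YIELD nodeId, propertyValue
RETURN gds.util.asNode(nodeId).name AS name, propertyValue AS degree
ORDER BY degree DESC
LIMIT 10
```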

Write from graph

Then write from the graph projection back to your main graph.

Flowchart of writing node properties from projection to main graph.

cypher
Writing node properties from projection to database
CALL gds.graph.nodeProperties.write(
  'actors-graph-cypher', // (1)
  ['degree'] // (2)
)
YIELD propertiesWritten
  1. Name of the projection to write from

  2. List of properties to write back to the database
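
Mutate mode is also what makes chaining possible: one algorithm stores a property in the projection, and a later algorithm reads it. The sketch below is one possible chain, not the workshop's own example. It uses weakly connected components to seed Louvain, and the property names 'componentId' and 'communityId' are hypothetical. Run the three statements in order.

```cypher
// 1. Store weakly connected component ids in the projection.
CALL gds.wcc.mutate(
  'actors-graph-cypher',
  {mutateProperty: 'componentId'}
)
YIELD componentCount;

// 2. Use those ids to seed Louvain, storing its result in the projection too.
CALL gds.louvain.mutate(
  'actors-graph-cypher',
  {seedProperty: 'componentId', mutateProperty: 'communityId'}
)
YIELD communityCount;

// 3. Write both properties back to the database in one call.
CALL gds.graph.nodeProperties.write(
  'actors-graph-cypher',
  ['componentId', 'communityId']
)
YIELD propertiesWritten;
```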

List graphs

Even when you’ve finished working on a projection, it will continue to hang around in memory until you either stop the server, or explicitly drop it.

You can see which graphs you have in memory with the gds.graph.list() command:

cypher
List graphs
CALL gds.graph.list() // (1)
YIELD graphName
RETURN graphName
  1. Returns all in-memory projections and their metadata
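
Passing a graph name narrows the listing to a single projection, and the extra YIELD fields show how much memory it is holding. A minimal sketch, assuming 'actors-graph-cypher' is still in memory:

```cypher
CALL gds.graph.list('actors-graph-cypher')
YIELD graphName, nodeCount, relationshipCount, memoryUsage, creationTime
RETURN graphName, nodeCount, relationshipCount, memoryUsage, creationTime
```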

Dropping graphs

Once you’ve finished working on your projection, you can drop it from memory.

cypher
Drop graphs
CALL gds.graph.drop('actors-graph-cypher')

The entire projection is removed from memory, including any results stored in it. Always write important results back to your main graph before dropping.

Illustration of dropping a graph from memory with warning.
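
In scripts that may run more than once, dropping a projection that no longer exists raises an error. As a sketch, the optional second argument (failIfMissing) can be set to false to skip that error:

```cypher
CALL gds.graph.drop(
  'actors-graph-cypher',
  false                    // failIfMissing: do not raise an error if the graph is absent
)
YIELD graphName
RETURN graphName AS droppedGraph
```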

Drop all

Sometimes, you might end up with a bunch of graphs in memory. You can drop them all at once with this pattern.

cypher
Drop all graphs
CALL gds.graph.list()
YIELD graphName // (1)
CALL gds.graph.drop(graphName) // (2)
YIELD graphName AS droppedGraphs
RETURN droppedGraphs
  1. List all projection names

  2. Drop each projection by name
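
If other people share the same database, dropping everything may be too broad. A variation on the pattern above, assuming your projections share the 'actors-graph' prefix used in this lesson:

```cypher
CALL gds.graph.list()
YIELD graphName
WHERE graphName STARTS WITH 'actors-graph'   // keep only this lesson's projections
CALL gds.graph.drop(graphName)
YIELD graphName AS droppedGraph
RETURN droppedGraph
```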

Lesson Summary

In this lesson, you got to grips with the general GDS workflow: Project → Run → Write.

In the next lesson, you’ll create your first graph projection.
