GDS Workflows in Python

Introduction

Whether you’re working in the Browser or in Python, GDS workflows follow the same fundamental pattern. The steps are identical. The logic is identical. Only the syntax changes.

This lesson walks through that workflow step by step, showing you how each piece translates from Cypher to Python.

What You’ll Learn

By the end of this lesson, you’ll be able to:

Execute the standard five-step GDS workflow in Python
Create graph projections and inspect them using the Graph object
Choose the right execution mode for different situations
Work with algorithm results as pandas DataFrames
Clean up projections properly to manage memory

The GDS Workflow

Every GDS analysis follows the same five steps:

Load data into Neo4j (if needed)
Project the graph into GDS memory
Run algorithms on the projection
Work with results
Drop the projection

You did this in Modules 1 and 2 using Cypher. Now you’ll do the same thing in Python.

From Cypher to Python

In Module 2, you wrote Cypher projections like this:

cypher

Cypher projection

MATCH (source:User)-[r:P2P]->(target:User)
WITH gds.graph.project('fraud-graph', source, target) AS g
RETURN g.graphName, g.nodeCount

The Python equivalent uses the same concepts, but with a different interface. Let’s work through each step.

Step 1: Loading Data

If your data isn’t already in Neo4j, you can load it using gds.run_cypher(). This method executes any Cypher query and returns results as a pandas DataFrame.

python

Run Cypher to load data

# Load Movie nodes from CSV
# CSV_URLS is assumed to be a dict of CSV locations defined earlier (e.g. in the companion notebook)
gds.run_cypher(  # (1)
    f"""
    LOAD CSV WITH HEADERS FROM '{CSV_URLS['movies']}' AS row
    MERGE (m:Movie {{tmdbId: row.tmdbId}})  // (2)
    SET m.title = row.title,
        m.year = toInteger(row.year),
        m.imdbRating = toFloat(row.imdbRating)
    """
)

(1) gds.run_cypher() sends any Cypher query to the server; here an f-string injects the CSV URL
(2) Double braces {{}} are required in f-strings to produce literal {} in the Cypher query
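As an alternative to f-string interpolation, the CSV location can be passed as a query parameter, which avoids the doubled braces. The following is a minimal sketch, assuming gds is an already connected client; the helper name load_movies and its csv_url argument are hypothetical and not part of the lesson.

```python
# Sketch only: `load_movies` and `csv_url` are hypothetical names for illustration.
LOAD_MOVIES_QUERY = """
LOAD CSV WITH HEADERS FROM $url AS row
MERGE (m:Movie {tmdbId: row.tmdbId})
SET m.title = row.title,
    m.year = toInteger(row.year),
    m.imdbRating = toFloat(row.imdbRating)
"""


def load_movies(gds, csv_url):
    """Run the load query, passing the CSV location as the $url parameter."""
    # This is a plain string, not an f-string, so single braces reach the server unchanged.
    return gds.run_cypher(LOAD_MOVIES_QUERY, params={"url": csv_url})
```

Calling load_movies(gds, CSV_URLS["movies"]) would then load the same data as the f-string version.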

For this workshop, the companion notebook handles data loading. In practice, you’d often connect to an existing database.

Step 2: Creating Projections

The gds.graph.project() method returns two values: a Graph object and metadata about the projection.

python

Project a graph

G, result = gds.graph.project( # (1)
    "movies-graph",
    {
        "Actor": {
            "properties": {
                "born": {"defaultValue": 1900} # (2)
            }
        },
        "Movie": {
            "properties": {
                "year": {"defaultValue": 1900},
                "imdbRating": {"defaultValue": 0.0}
            }
        }
    },
    "ACTED_IN"
)

(1) Returns a tuple: G (the Graph object used to inspect the projection and run algorithms) and result (projection metadata)
(2) defaultValue supplies a fallback for nodes missing a property, similar to coalesce() in Cypher projections
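The metadata returned as the second value can be read by key. The sketch below assumes it exposes the graphName, nodeCount, relationshipCount, and projectMillis fields; the helper name summarize_projection is illustrative only.

```python
def summarize_projection(result):
    """Build a one-line summary from projection metadata.

    `result` is the second value returned by gds.graph.project(). Any mapping
    with the same keys works, which keeps this sketch usable offline.
    """
    return (
        f"{result['graphName']}: "
        f"{result['nodeCount']} nodes, "
        f"{result['relationshipCount']} relationships, "
        f"projected in {result['projectMillis']} ms"
    )
```

For example, print(summarize_projection(result)) gives a quick sanity check right after projecting.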

This example uses native projection syntax. In the next lesson, you’ll learn how to translate your Cypher projection knowledge to native projection in Python.

The Graph Object

The Graph object (G) gives you methods to inspect your projection without querying the catalog directly.

python

Graph operations

G.name()                     # Returns the graph name
G.node_count()               # Number of nodes in projection
G.relationship_count()       # Number of relationships
G.node_labels()              # List of node labels
G.relationship_types()       # List of relationship types
G.node_properties("Movie")   # (1)
G.memory_usage()             # (2)
G.exists()                   # True if graph exists in catalog

(1) Returns the list of properties projected for a specific label; useful to verify before running algorithms
(2) Check memory consumption to ensure the projection fits within your server’s available heap

These methods are useful for verifying your projection before running algorithms.
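One way to turn these inspection methods into a reusable pre-flight check is sketched below. The function check_projection and its required_properties argument are hypothetical; it relies only on the Graph methods listed above.

```python
def check_projection(G, required_properties):
    """Return a list of problems found in a projection (empty list means OK).

    `required_properties` maps a node label to the property names an
    algorithm will need, e.g. {"Movie": ["year", "imdbRating"]}.
    """
    problems = []
    if G.node_count() == 0:
        problems.append("projection contains no nodes")
    if G.relationship_count() == 0:
        problems.append("projection contains no relationships")

    labels = set(G.node_labels())
    for label, wanted in required_properties.items():
        if label not in labels:
            problems.append(f"label {label!r} was not projected")
            continue
        # Compare the wanted properties with what was actually projected.
        missing = sorted(set(wanted) - set(G.node_properties(label)))
        if missing:
            problems.append(f"{label} is missing properties: {missing}")
    return problems
```

An empty return value suggests the projection is ready for the algorithm you plan to run.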

Step 3: Running Algorithms

Algorithm calls follow a consistent pattern:

python

gds.<algorithm>.<mode>(G, **config)

For example, to run degree centrality in mutate mode:

python

Mutate

result = gds.degree.mutate( # (1)
    G, mutateProperty="degree"
)

# Verify the property was added
print(G.node_properties("Actor")) # (2)

(1) .mutate() stores the result as a new property on the in-memory projection, not in the database
(2) After mutating, the property appears alongside any projected properties (e.g. ['born', 'degree'])

The mode you choose determines what happens with the results.

The Four Execution Modes

Each mode serves a different purpose:

.stream() — Returns results as a DataFrame. Use when you want to analyze or visualize results in Python.
.mutate() — Stores results in the projection only. Use when chaining multiple algorithms together.
.write() — Writes results back to Neo4j. Use when you need to persist results for later queries.
.stats() — Returns statistics only. Use for quick checks without storing anything.
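To compare the four modes side by side, the sketch below runs degree centrality once in each. It assumes gds is a connected client and G an existing projection; the function name and the "degree" property name are illustrative. Note that write mode changes the database, so this is only suitable where that is acceptable.

```python
def run_degree_in_each_mode(gds, G):
    """Run degree centrality once per execution mode and collect the outputs."""
    outputs = {}
    # stats: summary only, nothing is stored anywhere.
    outputs["stats"] = gds.degree.stats(G)
    # stream: one row per node, returned to Python.
    outputs["stream"] = gds.degree.stream(G)
    # mutate: adds a "degree" property to the in-memory projection.
    outputs["mutate"] = gds.degree.mutate(G, mutateProperty="degree")
    # write: persists a "degree" property on the nodes in the Neo4j database.
    outputs["write"] = gds.degree.write(G, writeProperty="degree")
    return outputs
```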

Stream Mode in Practice

Stream mode is the most common choice for analysis work. Results come back as a pandas DataFrame.

python

Stream

df = gds.degree.stream(G) # (1)

# Standard pandas operations work immediately
top_nodes = df.nlargest(10, "score") # (2)
print(top_nodes)

(1) .stream() returns a DataFrame with nodeId and score columns, with no side effects on the projection or database
(2) Since results are a standard pandas DataFrame, you can chain any pandas operation directly

Step 4: Working with Results

Since stream mode returns DataFrames, you can use the full pandas toolkit. Filter, sort, merge, visualize—whatever your analysis requires.

python

Stream to dataframes

# Get degree centrality scores
scores = gds.degree.stream(G)

# Find nodes above a threshold
high_degree = scores[scores["score"] > 50] # (1)

# Calculate summary statistics
print(scores["score"].describe()) # (2)

(1) Standard pandas boolean indexing works directly on the streamed results
(2) .describe() gives you count, mean, std, min/max: a quick way to understand the score distribution
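Merging is often the next step, since streamed rows identify nodes only by nodeId. The sketch below uses made-up data so it runs without a database: the scores table stands in for the output of gds.degree.stream(G), and the names table is a hypothetical lookup that could come from gds.run_cypher().

```python
import pandas as pd

# Made-up rows standing in for the output of gds.degree.stream(G).
scores = pd.DataFrame(
    {"nodeId": [0, 1, 2, 3, 4], "score": [120.0, 3.0, 57.0, 0.0, 51.0]}
)

# Hypothetical lookup table with placeholder names.
names = pd.DataFrame(
    {"nodeId": [0, 1, 2, 3, 4], "name": ["A", "B", "C", "D", "E"]}
)

# Filter on a threshold, attach names, then sort by score, highest first.
high_degree = scores[scores["score"] > 50]
ranked = (
    high_degree.merge(names, on="nodeId", how="left")
    .sort_values("score", ascending=False)
    .reset_index(drop=True)
)
print(ranked)
```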

Step 5: Cleanup

Projections consume memory. When you’re finished with a projection, drop it.

python

Cleanup

# Drop using the Graph object
G.drop()

# Or use the catalog
gds.graph.drop("movies-graph")

# Check what projections remain
print(gds.graph.list())

Forgetting to drop projections is a common source of memory issues, especially in notebooks where you might create multiple projections during exploration.
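One defensive habit is to pair every projection with a try/finally block so the drop runs even when an algorithm call fails. A minimal sketch, assuming gds is a connected client; the function name and its arguments are illustrative.

```python
def degree_scores_with_cleanup(gds, graph_name, node_spec, relationship_spec):
    """Project, stream degree centrality, and always drop the projection."""
    G, _ = gds.graph.project(graph_name, node_spec, relationship_spec)
    try:
        return gds.degree.stream(G)
    finally:
        # Runs on success and on error, so the projection never lingers.
        G.drop()
```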

The Context Manager Pattern

Python’s with statement provides automatic cleanup. When the block ends, the projection is dropped—even if an error occurs.

python

Using with to Project → Run → Drop in one go

with gds.graph.project( # (1)
    "temp", ["User", "Movie"], "RATED"
)[0] as G:
    result = gds.degree.stream(G)
    print(f"Ran on {G.node_count()} nodes")
    display(result.nlargest(5, "score"))  # display() is a notebook built-in; use print() in scripts

# G has been dropped automatically
print(gds.graph.exists("temp")["exists"]) # (2)

(1) [0] extracts the Graph object from the returned tuple; the with block ensures it is dropped when the block exits
(2) After the with block, the projection no longer exists in memory, even if an error occurred inside the block

This pattern is especially useful for exploratory work where you’re creating and discarding projections frequently.
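During exploration, re-running a cell that projects under a name already in the catalog will typically fail. A possible workaround is sketched below: drop any existing projection with that name first. The helper project_fresh is hypothetical, and it assumes gds.graph.get() returns a Graph object for an existing projection.

```python
def project_fresh(gds, graph_name, node_spec, relationship_spec):
    """Drop any existing projection with this name, then project again."""
    if gds.graph.exists(graph_name)["exists"]:
        # Look up the existing projection by name and remove it.
        gds.graph.get(graph_name).drop()
    G, _ = gds.graph.project(graph_name, node_spec, relationship_spec)
    return G
```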

Putting It Together

Here’s the complete workflow in one place:

python

Connect

from graphdatascience import GraphDataScience

gds = GraphDataScience(uri, auth=(username, password))

python

Project → Run → Analyze → Drop

G, _ = gds.graph.project( # (1)
    "movies-graph",
    ["Actor", "Movie"],
    {"ACTED_IN": {"orientation": "UNDIRECTED"}}
)

# Run algorithm
df = gds.degree.stream(G) # (2)

# Work with results
print(df.nlargest(10, "score"))

# Cleanup
G.drop() # (3)
gds.close()

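(1) Project with UNDIRECTED orientation so each relationship can be traversed in both directions; degree then counts a node's relationships regardless of direction, and some algorithms require undirected graphs
(2) Stream results into a DataFrame for immediate analysis
(3) Always drop the projection and close the connection when finished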
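The Connect snippet in this section assumes uri, username, and password are already defined. One possible way to supply them without hard-coding credentials is sketched below. The environment variable names and the default URI are assumptions chosen for illustration, not something the lesson prescribes.

```python
import os


def connection_settings(env=None):
    """Read the connection URI and credentials from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", "bolt://localhost:7687")  # assumed default
    auth = (env["NEO4J_USERNAME"], env["NEO4J_PASSWORD"])
    return uri, auth


def connect(env=None):
    """Create a client from the environment (needs the graphdatascience package)."""
    # Imported here so connection_settings() works without the package installed.
    from graphdatascience import GraphDataScience

    uri, auth = connection_settings(env)
    return GraphDataScience(uri, auth=auth)
```

With this in place, gds = connect() would replace the explicit constructor call.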

Summary

The GDS workflow in Python mirrors what you learned in Cypher:

gds.graph.project() returns a Graph object for inspecting projections
Four execution modes let you choose where results go: .stream(), .mutate(), .write(), .stats()
Always drop projections when finished—or use context managers for automatic cleanup
Include defaultValue when projecting properties to handle nulls

In the companion notebook, you’ll work through each step hands-on with the Movies dataset.

Next: Understanding projection syntax options—native vs. Cypher projection in Python.
