GDS Workflows in Python

Introduction

Whether you’re working in the Browser or in Python, GDS workflows follow the same fundamental pattern. The steps are identical. The logic is identical. Only the syntax changes.

This lesson walks through that workflow step by step, showing you how each piece translates from Cypher to Python.

What You’ll Learn

By the end of this lesson, you’ll be able to:

  • Execute the standard five-step GDS workflow in Python

  • Create graph projections and inspect them using the Graph object

  • Choose the right execution mode for different situations

  • Work with algorithm results as pandas DataFrames

  • Clean up projections properly to manage memory

The GDS Workflow

Every GDS analysis follows the same five steps:

  1. Load data into Neo4j (if needed)

  2. Project the graph into GDS memory

  3. Run algorithms on the projection

  4. Work with results

  5. Drop the projection

You did this in Modules 1 and 2 using Cypher. Now you’ll do the same thing in Python.

From Cypher to Python

In Module 2, you wrote Cypher projections like this:

cypher
Cypher projection
MATCH (source:User)-[r:P2P]->(target:User)
WITH gds.graph.project('fraud-graph', source, target) AS g
RETURN g.graphName, g.nodeCount

The Python equivalent uses the same concepts, but with a different interface. Let’s work through each step.

Step 1: Loading Data

If your data isn’t already in Neo4j, you can load it using gds.run_cypher(). This method executes any Cypher query and returns results as a pandas DataFrame.

python
Run Cypher to load data
# Load Movie nodes from CSV
gds.run_cypher(f""" // (1)
    LOAD CSV WITH HEADERS FROM '{CSV_URLS['movies']}' AS row
    MERGE (m:Movie {{tmdbId: row.tmdbId}}) // (2)
    SET m.title = row.title,
        m.year = toInteger(row.year),
        m.imdbRating = toFloat(row.imdbRating)
""")
  1. gds.run_cypher() sends any Cypher query to the server — here using an f-string to inject the CSV URL

  2. Double braces {{}} are required in f-strings to produce literal {} in the Cypher query
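The brace-escaping rule can be checked without a database. The following is a minimal sketch that only builds and prints the query string; `csv_url` is a placeholder value chosen for illustration, not a real dataset location.

```python
# Placeholder URL for illustration only; substitute your own CSV location.
csv_url = "https://example.com/movies.csv"

# In an f-string, {csv_url} is interpolated, while {{ and }} become literal braces.
query = f"""
    LOAD CSV WITH HEADERS FROM '{csv_url}' AS row
    MERGE (m:Movie {{tmdbId: row.tmdbId}})
    SET m.title = row.title
"""

# Inspect the final Cypher text before sending it to the server
print(query)
```

Printing the query before passing it to gds.run_cypher() is a cheap way to catch a missing brace. For values that vary between calls, gds.run_cypher() also accepts a params dictionary, which avoids string formatting altogether.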

For this workshop, the companion notebook handles data loading. In practice, you’d often connect to an existing database.

Step 2: Creating Projections

The gds.graph.project() method returns two values: a Graph object and metadata about the projection.

python
Project a graph
G, result = gds.graph.project( # (1)
    "movies-graph",
    {
        "Actor": {
            "properties": {
                "born": {"defaultValue": 1900} # (2)
            }
        },
        "Movie": {
            "properties": {
                "year": {"defaultValue": 1900},
                "imdbRating": {"defaultValue": 0.0}
            }
        }
    },
    "ACTED_IN"
)
  1. Returns a tuple: G (the Graph object for inspecting/running algorithms) and result (projection metadata)

  2. defaultValue handles nodes missing a property — equivalent to coalesce() in Cypher projections
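As a rough sketch of how the two return values might be used together, the helper below projects a graph and builds a one-line size report. `project_and_report` is a hypothetical name, and the `nodeCount` and `relationshipCount` fields are assumed to be present in the metadata returned by your GDS version.

```python
def project_and_report(gds, graph_name, node_spec, relationship_spec):
    """Project a graph and return (G, summary string).

    Hypothetical helper: `gds` is an already-connected GraphDataScience client.
    """
    # gds.graph.project() returns a tuple: the Graph object and projection metadata
    G, result = gds.graph.project(graph_name, node_spec, relationship_spec)

    # Assumption: the metadata exposes nodeCount and relationshipCount fields
    summary = (
        f"{graph_name}: {result['nodeCount']} nodes, "
        f"{result['relationshipCount']} relationships"
    )
    return G, summary
```

A call such as `G, summary = project_and_report(gds, "movies-graph", ["Actor", "Movie"], "ACTED_IN")` would then give you both the Graph object and a line you can print or log.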

This example uses native projection syntax. In the next lesson, you’ll learn how to translate your Cypher projection knowledge to native projection in Python.

The Graph Object

The Graph object (G) gives you methods to inspect your projection without querying the catalog directly.

python
Graph operations
G.name()                     # Returns the graph name
G.node_count()               # Number of nodes in projection
G.relationship_count()       # Number of relationships
G.node_labels()              # List of node labels
G.relationship_types()       # List of relationship types
G.node_properties("Movie")   # (1)
G.memory_usage()             # (2)
G.exists()                   # True if graph exists in catalog
  1. Returns the list of properties projected for a specific label — useful to verify before running algorithms

  2. Check memory consumption to ensure the projection fits within your server’s available heap

These methods are useful for verifying your projection before running algorithms.
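One way to make that verification repeatable is to wrap the inspection calls in small helpers. The sketch below uses only Graph methods listed above; the function names `summarize_graph` and `check_projection` and the `min_nodes` threshold are hypothetical.

```python
def summarize_graph(G):
    """Collect basic facts about a projection into a plain dict."""
    return {
        "name": G.name(),
        "nodes": G.node_count(),
        "relationships": G.relationship_count(),
        "labels": list(G.node_labels()),
        "relationship_types": list(G.relationship_types()),
    }


def check_projection(G, min_nodes=1):
    """Raise early if the projection is missing or unexpectedly small."""
    if not G.exists():
        raise RuntimeError(f"Projection {G.name()!r} is not in the graph catalog")
    if G.node_count() < min_nodes:
        raise ValueError(f"Projection {G.name()!r} has fewer than {min_nodes} nodes")
```

Calling `check_projection(G)` right after projecting turns an empty or dropped projection into an immediate error instead of a confusing algorithm result later.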

Step 3: Running Algorithms

Algorithm calls follow a consistent pattern:

python
gds.<algorithm>.<mode>(G, **config)

For example, to run degree centrality in mutate mode:

python
Mutate
result = gds.degree.mutate( # (1)
    G, mutateProperty="degree"
)

# Verify the property was added
print(G.node_properties("Actor")) # (2)
  1. .mutate() stores the result as a new property on the in-memory projection — not in the database

  2. After mutating, the property appears alongside any projected properties (e.g. ['born', 'degree'])

The mode you choose determines what happens with the results.

The Four Execution Modes

Each mode serves a different purpose:

  • .stream() — Returns results as a DataFrame. Use when you want to analyze or visualize results in Python.

  • .mutate() — Stores results in the projection only. Use when chaining multiple algorithms together.

  • .write() — Writes results back to Neo4j. Use when you need to persist results for later queries.

  • .stats() — Returns statistics only. Use for quick checks without storing anything.
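To make the difference between the modes concrete, here is a minimal sketch that runs degree centrality in whichever mode is requested. `run_degree` is a hypothetical helper, not part of the client; it assumes `gds` is a connected client and `G` is an existing projection.

```python
def run_degree(gds, G, mode, property_name="degree"):
    """Run degree centrality in the requested execution mode."""
    if mode == "stream":
        # Results come back to Python; nothing is stored
        return gds.degree.stream(G)
    if mode == "stats":
        # Summary statistics only; nothing is stored
        return gds.degree.stats(G)
    if mode == "mutate":
        # Result is added to the in-memory projection only
        return gds.degree.mutate(G, mutateProperty=property_name)
    if mode == "write":
        # Result is persisted to the Neo4j database
        return gds.degree.write(G, writeProperty=property_name)
    raise ValueError(f"Unknown execution mode: {mode!r}")
```

Note that mutate and write each need a property name (`mutateProperty` and `writeProperty` respectively), while stream and stats need none because they store nothing.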

Stream Mode in Practice

Stream mode is the most common choice for analysis work. Results come back as a pandas DataFrame.

python
Stream
df = gds.degree.stream(G) # (1)

# Standard pandas operations work immediately
top_nodes = df.nlargest(10, "score") # (2)
print(top_nodes)
  1. .stream() returns a DataFrame with nodeId and score columns — no side effects on the projection or database

  2. Since results are a standard pandas DataFrame, you can chain any pandas operation directly

Step 4: Working with Results

Since stream mode returns DataFrames, you can use the full pandas toolkit. Filter, sort, merge, visualize—whatever your analysis requires.

python
Stream to dataframes
# Get degree centrality scores
scores = gds.degree.stream(G)

# Find nodes above a threshold
high_degree = scores[scores["score"] > 50] # (1)

# Calculate summary statistics
print(scores["score"].describe()) # (2)
  1. Standard pandas boolean indexing works directly on the streamed results

  2. .describe() gives you count, mean, std, min/max — a quick way to understand the score distribution
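The same operations can be tried without a server. The sketch below uses a hand-built DataFrame as a stand-in for the streamed result, with the same `nodeId` and `score` columns; the values and the `names` lookup table are made up for illustration.

```python
import pandas as pd

# Stand-in for gds.degree.stream(G): same column names, made-up values
scores = pd.DataFrame({"nodeId": [0, 1, 2, 3], "score": [12.0, 75.0, 3.0, 51.0]})

# Hypothetical lookup table mapping node IDs to readable names
names = pd.DataFrame({"nodeId": [0, 1, 2, 3], "name": ["A", "B", "C", "D"]})

# Filter with boolean indexing, then attach names and sort by score
high_degree = scores[scores["score"] > 50]
labelled = high_degree.merge(names, on="nodeId").sort_values("score", ascending=False)

print(labelled)
```

In a real analysis the lookup table could come from a gds.run_cypher() query that returns node IDs alongside the properties you want to display.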

Step 5: Cleanup

Projections consume memory. When you’re finished with a projection, drop it.

python
Cleanup
# Drop using the Graph object
G.drop()

# Or drop by name through the catalog (an alternative to G.drop(), not an extra step)
gds.graph.drop("movies-graph")

# Check what projections remain
print(gds.graph.list())

Forgetting to drop projections is a common source of memory issues, especially in notebooks where you might create multiple projections during exploration.
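For a notebook session that has accumulated several projections, a small cleanup helper can be convenient. This is a sketch under assumptions: `drop_all_projections` is a hypothetical name, and it assumes the result of gds.graph.list() exposes a `graphName` column.

```python
def drop_all_projections(gds):
    """Drop every projection in the catalog and return the names that were dropped."""
    dropped = []
    # Copy the names first so the loop does not depend on the catalog while it changes
    for name in list(gds.graph.list()["graphName"]):
        gds.graph.drop(name)  # same call form as the catalog example above
        dropped.append(name)
    return dropped
```

This removes every projection visible to your user, so it is best kept for personal or workshop databases rather than shared servers.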

The Context Manager Pattern

Python’s with statement provides automatic cleanup. When the block ends, the projection is dropped—even if an error occurs.

python
Using with to Project → Run → Drop in one go
with gds.graph.project( # (1)
    "temp", ["User", "Movie"], "RATED"
)[0] as G:
    result = gds.degree.stream(G)
    print(f"Ran on {G.node_count()} nodes")
    display(result.nlargest(5, "score"))

# G has been dropped automatically
print(gds.graph.exists("temp")["exists"]) # (2)
  1. [0] extracts the Graph object from the returned tuple — the with block ensures it is dropped when the block exits

  2. After the with block, the projection no longer exists in memory — even if an error occurred inside the block

This pattern is especially useful for exploratory work where you’re creating and discarding projections frequently.
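To see why cleanup still happens when an error occurs, consider the sketch below. `FakeGraph` is a stand-in written for illustration, not the GDS client's Graph class; it only mimics the enter/exit behavior that the with statement relies on.

```python
class FakeGraph:
    """Stand-in for a projection; not the GDS client's Graph class."""

    def __init__(self, name):
        self.name = name
        self.dropped = False

    def drop(self):
        self.dropped = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and on error alike
        self.drop()
        return False  # do not swallow the exception


graph = FakeGraph("temp")
try:
    with graph as G:
        raise RuntimeError("algorithm failed")
except RuntimeError as err:
    print(f"Caught: {err}")  # prints "Caught: algorithm failed"

print(graph.dropped)  # prints True
```

The exception still propagates to the caller, but the drop happens first, which is the guarantee the with block gives you for real projections.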

Putting It Together

Here’s the complete workflow in one place:

python
Connect
from graphdatascience import GraphDataScience

# uri, username, and password are placeholders for your own connection details
gds = GraphDataScience(uri, auth=(username, password))

python
Project → Run → Analyze → Drop
G, _ = gds.graph.project( # (1)
    "movies-graph",
    ["Actor", "Movie"],
    {"ACTED_IN": {"orientation": "UNDIRECTED"}}
)

# Run algorithm
df = gds.degree.stream(G) # (2)

# Work with results
print(df.nlargest(10, "score"))

# Cleanup
G.drop() # (3)
gds.close()
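  1. Project with UNDIRECTED orientation so each relationship can be traversed in both directions. Some algorithms require an undirected graph, and degree centrality would otherwise count only outgoing relationships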

  2. Stream results into a DataFrame for immediate analysis

  3. Always drop the projection and close the connection when finished
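If you prefer an explicit alternative to the context manager, the workflow can also be wrapped in a function with try/finally so the drop runs even when the algorithm call fails. This is a sketch, and `top_degree_nodes` is a hypothetical helper name; it uses only calls shown earlier in this lesson.

```python
def top_degree_nodes(gds, graph_name, node_spec, relationship_spec, n=10):
    """Project, run degree centrality, return the top n rows, and always drop."""
    G, _ = gds.graph.project(graph_name, node_spec, relationship_spec)
    try:
        df = gds.degree.stream(G)
        return df.nlargest(n, "score")
    finally:
        # Runs whether the block above returned or raised
        G.drop()
```

Keeping the projection's lifetime inside one function means callers never hold a reference to a graph that has already been dropped.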

Summary

The GDS workflow in Python mirrors what you learned in Cypher:

  • gds.graph.project() returns a Graph object for inspecting projections, along with projection metadata

  • Four execution modes let you choose where results go: .stream(), .mutate(), .write(), .stats()

  • Always drop projections when finished—or use context managers for automatic cleanup

  • Include defaultValue when projecting properties to handle nulls

In the companion notebook, you’ll work through each step hands-on with the Movies dataset.

Next: Understanding projection syntax options—native vs. Cypher projection in Python.
