Introduction to the Python GDS Client

Introduction

You’ve spent the last two modules using GDS in the Neo4j Browser—projecting graphs, running algorithms, and interpreting results. That foundation is solid.

Moving to Python

Now it’s time to take those same skills into the environment where most real-world data science happens: Python.

Same algorithms, different syntax

The Python GDS client isn’t a different way of doing graph analytics. It’s the same algorithms, the same concepts, the same workflows—just wrapped in a Pythonic interface that plays nicely with pandas, scikit-learn, and the rest of the Python ecosystem.

What You’ll Learn

By the end of this lesson, you’ll be able to:

Set up a development environment for GDS work in Python
Connect to Neo4j using the Python GDS client
Execute Cypher queries and receive results as pandas DataFrames
Recognize when Python makes more sense than Browser (and vice versa)

Setting Up Your Environment

Before we write any code, let’s get your development environment ready.

Click the button below to open the workshop repository in a GitHub Codespace. This will clone the repository and set up a pre-configured Python environment automatically.

Open in GitHub Codespace

The Codespace takes approximately 10 minutes to configure. While it’s setting up, continue through the next few slides—we’ll walk through the concepts before you need to run any code.

How the Python Client Works

The GDS Python client acts as a bridge between your Python code and your Neo4j server.

Under the hood, it translates your Python method calls into Cypher procedure calls and sends them to the server. The server runs them against the GDS library, and the client hands the results back as pandas objects, most often DataFrames.

Flowchart showing the GDS Python client connecting Python code to the Neo4j server.

Everything still applies

This means everything you learned about GDS in the Browser still applies. The algorithms haven’t changed. The projections work the same way. You’re just using a different interface to access them.
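To make the "translation" idea concrete, here is a minimal sketch of how a Python call could be turned into a Cypher statement plus parameters. It is an illustration only: `build_gds_call` and the graph name `movies` are hypothetical, and the real client's internals are more involved.

```python
def build_gds_call(procedure: str, graph_name: str, **config) -> tuple[str, dict]:
    """Build a Cypher CALL statement and its parameters for a GDS procedure.

    Illustrative only: this mimics the idea of turning a Python call into
    Cypher. It is not the client's actual implementation.
    """
    # The procedure name goes into the query text; values travel as parameters.
    query = f"CALL gds.{procedure}($graph_name, $config)"
    params = {"graph_name": graph_name, "config": config}
    return query, params


query, params = build_gds_call("pageRank.stream", "movies", maxIterations=20)
print(query)   # CALL gds.pageRank.stream($graph_name, $config)
print(params)  # {'graph_name': 'movies', 'config': {'maxIterations': 20}}
```

The query string should look familiar: it is the same kind of procedure call you typed in the Browser.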

When to Use Python vs. Browser

Both tools have their place. The key is knowing which one fits your current task.

Reach for Python when you’re:

Building repeatable data pipelines
Automating workflows that run regularly
Integrating graph analytics with other Python libraries
Working with results that need further processing

Stick with Browser when you’re:

Exploring data interactively
Running quick, one-off queries
Debugging projection or algorithm issues
Visually inspecting graph structure

For this module, we’ll work primarily in Python—but you’ll likely switch between both in practice.

Installing the Client

The official package is graphdatascience. In the Codespace you’ll use, it’s already installed. Otherwise:

bash

pip install graphdatascience
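If you are not sure whether the package is present in your environment, a quick check like the one below can help. This is a small sketch that uses only the standard library; the helper name `installed_version` is our own.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "graphdatascience") -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


client_version = installed_version()
if client_version is None:
    print("graphdatascience is not installed; run: pip install graphdatascience")
else:
    print(f"graphdatascience {client_version} is installed")
```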

Connecting to Neo4j

With the package installed, connecting is straightforward:

python

from graphdatascience import GraphDataScience

gds = GraphDataScience( # (1)
    "bolt://localhost:7687",
    auth=("neo4j", "password")
)

# Verify the connection works
print(gds.server_version()) # (2)

(1) Create a GraphDataScience instance with your Neo4j connection URI and credentials
(2) Always verify the connection; this call returns the version of the GDS library running on the server

Connecting to a Specific Database

By default, the client uses the server's default database, which is usually named "neo4j". If you want to work against a different database, specify it explicitly:

python

gds = GraphDataScience(uri, auth=(user, password), database="my-db")
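Hardcoding credentials is fine for a workshop, but in shared code you will probably want to read them from the environment. The sketch below shows one way to do that. The environment variable names (`NEO4J_URI`, `NEO4J_USERNAME`, `NEO4J_PASSWORD`, `NEO4J_DATABASE`) and both helper functions are assumptions for illustration, not something the client requires.

```python
import os


def connection_settings(env=os.environ) -> dict:
    """Collect connection settings, falling back to the local defaults used above."""
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": env.get("NEO4J_USERNAME", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "password"),
        # None means "do not pick a database explicitly".
        "database": env.get("NEO4J_DATABASE"),
    }


def connect(settings: dict):
    """Open a GraphDataScience connection from a settings dictionary."""
    # Imported lazily so connection_settings() works even without the package.
    from graphdatascience import GraphDataScience

    return GraphDataScience(
        settings["uri"],
        auth=(settings["user"], settings["password"]),
        database=settings["database"],
    )
```

With a server running, `gds = connect(connection_settings())` would give you the same kind of object as the examples above.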

Running Cypher Queries

Once connected, you can run any Cypher query using gds.run_cypher(). The results come back as a pandas DataFrame—ready for analysis, visualization, or further processing.

python

result = gds.run_cypher(""" # (1)
    MATCH (m:Movie)
    RETURN m.title AS movie, m.year AS year
    ORDER BY m.year DESC
    LIMIT 10
""")

print(result.head()) # (2)

(1) gds.run_cypher() accepts any valid Cypher query and sends it to the server
(2) Results are returned as a pandas DataFrame; use .head(), .describe(), or any pandas method directly

This is useful for ad-hoc queries, but for GDS-specific operations (projections, algorithms), we’ll use dedicated methods in the next lesson.
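When part of a query changes between runs, passing values as Cypher parameters is usually safer than building the string yourself. The sketch below wraps the query above in a small function. It assumes `run_cypher` accepts a `params` dictionary, as it does in recent client releases; the function name `latest_movies` is our own.

```python
LATEST_MOVIES = """
    MATCH (m:Movie)
    RETURN m.title AS movie, m.year AS year
    ORDER BY m.year DESC
    LIMIT $limit
"""


def latest_movies(gds, limit: int = 10):
    """Return the most recent movies as whatever gds.run_cypher() returns."""
    # $limit in the query text is filled in from the params dictionary.
    return gds.run_cypher(LATEST_MOVIES, params={"limit": limit})
```

With an open connection, `latest_movies(gds, limit=5)` would return a five-row DataFrame with `movie` and `year` columns.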

Closing Connections

When you’re finished, close the connection to free up resources:

python

gds.close()

Python will call this automatically when the gds object is garbage collected, but it’s good practice to close connections explicitly—especially in notebooks where objects can persist longer than expected.
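One way to make sure `close()` always runs, even when something fails halfway through, is the standard library's `contextlib.closing`. The following is a minimal sketch; `run_with_cleanup` is a hypothetical helper, and it works with any object that has a `close()` method.

```python
from contextlib import closing


def run_with_cleanup(gds, work):
    """Run work(gds) and close the connection afterwards, even on errors."""
    with closing(gds):
        return work(gds)
```

For example, `run_with_cleanup(gds, lambda g: g.run_cypher("RETURN 1 AS ok"))` would run the query and then close the connection for you.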

Our Dataset: The Cora Citation Network

Throughout this module, we’ll analyze a real academic citation network called Cora.

What’s in the dataset:

2,708 papers spanning 7 research subjects
10,556 citations (directed edges: Paper A → Paper B means A cites B)
1,433-dimensional feature vectors (word frequencies from paper abstracts)

Diagram of the Cora citation network with nodes for papers and directed edges for citations.
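A little arithmetic gives a feel for how sparse this network is. The sketch below takes the counts listed above at face value and treats every citation as one directed edge.

```python
papers = 2_708
citations = 10_556

# Average number of outgoing citations per paper.
avg_out_degree = citations / papers

# Share of all possible directed paper-to-paper links that actually exist.
density = citations / (papers * (papers - 1))

print(f"average citations per paper: {avg_out_degree:.2f}")  # 3.90
print(f"directed density: {density:.5f}")                    # 0.00144
```

Roughly four citations per paper and a density well under 1% are typical of citation graphs: most papers cite only a handful of others.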

The seven research subjects

Cora is a classic benchmark for graph machine learning. Each of its 2,708 papers belongs to one of 7 research subjects:

Neural Networks
Reinforcement Learning
Theory
Genetic Algorithms
Case-Based Reasoning
Probabilistic Methods
Rule Learning

The dataset is small enough to iterate quickly, yet rich enough to demonstrate real patterns.

What’s Ahead

With Python as our interface, we’ll work through the complete GDS workflow:

Projecting graphs into memory
Running algorithms like PageRank, Betweenness Centrality, Louvain, and FastRP
Processing results as DataFrames
Cleaning up projections when we’re done

Each algorithm will follow the same pattern: deep-dive on the Movies dataset (which you know well), then hands-on practice with Cora.
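As a preview, the four workflow steps could look roughly like this in Python. Treat it as a sketch under assumptions: the node label `Paper`, the relationship type `CITES`, and the graph name `cora-preview` are hypothetical, and the next lesson covers the projection and algorithm methods properly.

```python
def pagerank_workflow(gds, graph_name="cora-preview",
                      node_label="Paper", rel_type="CITES"):
    """Project a graph, run PageRank, and always drop the projection afterwards."""
    # 1. Project the graph into memory (returns a graph object plus statistics).
    G, _ = gds.graph.project(graph_name, node_label, rel_type)
    try:
        # 2. Run the algorithm in stream mode.
        scores = gds.pageRank.stream(G)
        # 3. Process the result as a DataFrame: keep the ten highest scores.
        return scores.sort_values("score", ascending=False).head(10)
    finally:
        # 4. Clean up the projection, even if a step above failed.
        G.drop()
```

The `try`/`finally` block is a design choice rather than a requirement: it keeps unused projections from piling up in server memory when a notebook cell fails.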

Summary

The Python GDS client gives you programmatic access to everything you learned in the Browser:

Same algorithms, same projection logic, same workflows
Results returned as pandas DataFrames
Version compatibility between client, driver, and GDS library matters

Your Codespace should be ready by now. In the next lesson, we’ll connect to Neo4j and run our first GDS workflow entirely in Python.
