Introduction
In this first lesson, you will learn how Neo4j Graph Data Science (GDS) is packaged, how to install it, and some licensing considerations. It is not strictly necessary to install GDS to take the data science courses on GraphAcademy. The interactive portions of these courses integrate with a sandbox that is automatically prepared for you with GDS on the backend. Nevertheless, we wanted to start here so you understand GDS as a product.
GDS Plugin and Compatibility
GDS is delivered as a library and a plugin to the Neo4j Graph Database. This means that it needs to be installed as an extension, together with some configuration updates.
GDS also comes with either a free Community license or a paid Enterprise license, which differ in important ways in performance and enterprise capabilities. However, all analytics functionality, including graph algorithms and machine learning methods, is the same under both licenses.
The compatibility matrix for the GDS library vs Neo4j can be found here. In general, you can count on the latest version of GDS supporting the latest version of Neo4j and vice versa, and we recommend you always upgrade to that combination.
Below we will go over the installation process and licensing. Of course, if you are using AuraDS, GDS Enterprise comes prepackaged and ready to use out-of-the-box. You need not worry about installation, setup, and choosing between licenses.
Installation
Of all the on-prem installations, Neo4j Desktop has the simplest process for GDS installation. We will go over how to install GDS there first. Overall, if you plan on testing GDS locally on your desktop, Neo4j Desktop is usually the easiest place to start.
Once you install and open Neo4j Desktop, you will find GDS in the Plugins tab of a database.
The installer will download the GDS library and install it in the plugins/ directory of the database. It will also add the following entry to the settings file:
dbms.security.procedures.unrestricted=gds.*
This configuration entry is necessary because the GDS library accesses low-level components of Neo4j to maximize performance.
If the procedure allowlist is configured, make sure to also include procedures from the GDS library:
dbms.security.procedures.allowlist=gds.*
In recent versions of Neo4j Desktop, the allowlist should either be disabled or already include the GDS procedures by default.
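As a minimal sketch of the two rules above, the following Python snippet checks the text of a settings file for the entries GDS needs. The function names are hypothetical, and the check is deliberately simple: it looks only for a literal `gds.*` entry and does not interpret broader wildcards that Neo4j itself might accept.

```python
def parse_conf(text: str) -> dict:
    """Parse 'key=value' lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def gds_config_problems(text: str) -> list:
    """Return human-readable problems; an empty list means the check passed.

    Assumption: only a literal 'gds.*' entry counts as a match.
    """
    settings = parse_conf(text)
    problems = []

    unrestricted_key = "dbms.security.procedures.unrestricted"
    allowlist_key = "dbms.security.procedures.allowlist"

    unrestricted = [p.strip() for p in settings.get(unrestricted_key, "").split(",")]
    if "gds.*" not in unrestricted:
        problems.append(f"gds.* is missing from {unrestricted_key}")

    # The allowlist only matters when it is configured at all.
    if allowlist_key in settings:
        allowlist = [p.strip() for p in settings[allowlist_key].split(",")]
        if "gds.*" not in allowlist:
            problems.append(f"gds.* is missing from {allowlist_key}")

    return problems


if __name__ == "__main__":
    sample = "dbms.security.procedures.unrestricted=gds.*\n"
    print(gds_config_problems(sample))
```

Running the check against a copy of your settings file before restarting the database is one way to catch a missing entry early.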
For GDS installation on other Neo4j deployment types, including standalone server, Docker, and causal cluster, please see the Installation documentation. The steps are roughly the same as for Neo4j Desktop, though they include some other considerations and certain aspects may not be fully automated. For example, on Neo4j server, you need to get the plugin from the download center, put it in the correct directory location, and update the configuration manually.
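To make the manual server steps concrete, here is a rough Python sketch that copies an already downloaded plugin JAR into place and adds the configuration entry. It assumes a layout with `plugins/` and `conf/neo4j.conf` under the Neo4j home directory; the function name, the JAR file name, and the paths are hypothetical, so check the Installation documentation for your deployment before relying on it.

```python
import shutil
from pathlib import Path

KEY = "dbms.security.procedures.unrestricted"
UNRESTRICTED_LINE = KEY + "=gds.*"


def install_gds_plugin(jar_path: Path, neo4j_home: Path) -> Path:
    """Copy the plugin JAR into plugins/ and ensure the config entry exists.

    Assumed layout (hypothetical): <neo4j_home>/plugins and
    <neo4j_home>/conf/neo4j.conf.
    """
    plugins_dir = neo4j_home / "plugins"
    conf_file = neo4j_home / "conf" / "neo4j.conf"

    # Step 1: put the JAR in the plugins directory.
    plugins_dir.mkdir(parents=True, exist_ok=True)
    target = plugins_dir / jar_path.name
    shutil.copy2(jar_path, target)

    # Step 2: update the configuration, without duplicating the key.
    conf_file.parent.mkdir(parents=True, exist_ok=True)
    existing = conf_file.read_text() if conf_file.exists() else ""
    current = [l for l in existing.splitlines() if l.strip().startswith(KEY + "=")]

    if not current:
        with conf_file.open("a") as fh:
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(UNRESTRICTED_LINE + "\n")
    elif not any("gds.*" in l for l in current):
        # The key is already set for other procedures; merging is left to a human.
        raise ValueError(f"{KEY} is already set; add gds.* to it by hand")

    return target


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    install_gds_plugin(Path("gds-plugin.jar"), Path("/opt/neo4j"))
```

The sketch refuses to touch an existing unrestricted entry rather than guess how to merge it. Neo4j loads plugins at startup, so the database has to be restarted afterwards.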
Licensing
GDS has both a Community and an Enterprise license. Both give access to all the algorithms and machine learning methods, but the Enterprise version has additional features that enable production use cases:
- Enterprise features for increased performance: unlimited concurrency to speed up compute time and access to a low-memory analytics graph format enabling the application of data science to very large graphs
- Enterprise features for security and workflow in production: fine-grained security, the ability to persist and publish machine learning models, in-memory graph back-up and restore, and causal cluster compatibility via read replica
You can find more information on how to obtain and install an enterprise license in our Enterprise Edition Configuration documentation.
Check your understanding
1. GDS Installation
GDS is delivered as a ____:
- ❏ standalone application
- ✓ plugin to the Neo4j Database
- ❏ microservice
- ❏ web application
Hint
The Neo4j Graph Data Science Library is an extension for Neo4j that provides a set of graph algorithms and machine learning techniques for analyzing and extracting insights from graph data.
Solution
GDS (Graph Data Science) is a plugin to the Neo4j Database.
You can install the GDS plugin for Neo4j by downloading the plugin JAR file and placing it in the plugins directory of your Neo4j installation.
2. GDS Licensing
Which of the following statements are true with respect to the GDS Community and Enterprise licenses? (Select all that apply.)
- ✓ Community and Enterprise both have access to all the GDS algorithms and machine learning methods
- ❏ Community and Enterprise both have fine-grained security
- ❏ Community and Enterprise both have unlimited concurrency
- ✓ Enterprise-only features for increased performance include unlimited concurrency and a low-memory analytics graph format
Hint
Both the Community and Enterprise versions of the GDS Library offer comprehensive access to all algorithms and machine learning methods, while the Enterprise version provides additional performance-enhancing features such as unlimited concurrency and a low-memory analytics graph format.
Solution
The correct answers are:
Community and Enterprise both have access to all the GDS algorithms and machine learning methods
Enterprise-only features for increased performance include unlimited concurrency and a low-memory analytics graph format
Summary
In this lesson, we covered GDS installation and licensing.
In the next lesson, we will go over how GDS works at a high level and how to better configure Neo4j for graph data science.