Introduction
In this lesson, you will learn about the different algorithm tiers, the different execution modes for algorithms, and how to estimate the memory needed to run algorithms in GDS.
At the end of this lesson, you will:
-
Understand the implications of each algorithm tier
-
Understand when to use each algorithm execution mode
-
Know how to estimate the memory requirements for running GDS algorithms on your data
Tiers
GDS algorithms are classified into three tiers: alpha, beta, and production.
-
Production-quality: Indicates that the algorithm has been tested in regard to stability and scalability. Algorithms in this tier are prefixed with
gds.<algorithm>
. -
Beta: Indicates that the algorithm is a candidate for the production-quality tier. Algorithms in this tier are prefixed with
gds.beta.<algorithm>
. -
Alpha: Indicates that the algorithm is experimental and might be changed or removed at any time. Algorithms in this tier are prefixed with
gds.alpha.<algorithm>
.
Execution Modes
GDS algorithms have 4 executions modes which determine how the results of the algorithm are handled.
-
stream
: Returns the result of the algorithm as a stream of records. -
stats
: Returns a single record of summary statistics, but does not write to the Neo4j database or modify any data. -
mutate
: Writes the results of the algorithm to the in-memory graph projection and returns a single record of summary statistics. -
write
: Writes the results of the algorithm back the Neo4j database and returns a single record of summary statistics.
For more detail and best practices on execution modes, please refer to our Running algorithms documentation.
Memory Estimation
As the size of data grows, a ubiquitous challenge for Data Science practitioners is figuring out how much memory is required to support their analytics and machine learning workflows. This can often require a lot of experimentation and trial and error. To circumvent this, GDS offers an estimation procedure which allows you to estimate the memory needed for using an algorithm on your data BEFORE actually executing it. To use the estimation procedure for different algorithms and execution modes you can simply append the command with .estimate
.
More detailed information on usage and best practices with memory estimation can be found in our Memory estimation documentation.
Overall Algorithm Syntax
Putting the above together, all GDS algorithms follow the below syntax:
CALL gds[.<tier>].<algorithm>.<execution-mode>[.<estimate>](
graphName: STRING,
configuration: MAP
)
Check your understanding
1. Alpha Tier Algorithms
What kind of support can you expect from alpha tier algorithms?
-
❏ Alpha tier algorithms have been tested for stability and scalability
-
❏ Alpha tier algorithms are candidates for the production-quality tier
-
✓ Alpha tier algorithms are experimental and might be changed or removed at any time
-
❏ None of the above
Hint
Alpha tier algorithms are subject to change or removal at any time due to their experimental nature.
Solution
Alpha tier algorithms are experimental and might be changed or removed at any time.
2. Algorithm Execution
Which of the following execution modes writes algorithm results back to the Neo4j database?
-
❏ Stream
-
❏ Stats
-
❏ Mutate
-
✓ Write
Hint
You would use this mode to write data to the database.
Solution
The answer is Write.
Summary
In this lesson you learned about the general characteristics of GDS algorithms.
In upcoming lessons you will go through the different algorithm categories in more detail to understand what the different GDS algorithms can do and how you can apply them to different use cases.