Step 1: Create a new property
The database used for this course does not have the type of data that will benefit from a TEXT index. A TEXT index performs better than a RANGE index when the property being indexed contains a lot of duplicate values.
Let’s temporarily create some string properties in the graph that will contain many duplicates. You will add a ratingValue property to all RATED relationships where the value is the concatenation of the user name, rating, and year of the movie.
Run this code that adds ratingInfo property to all RATING relationships (Note: It will take ~20 seconds to execute.)
CALL apoc.periodic.iterate(
"MATCH (u:User)-[r:RATED]->(m:Movie) RETURN left(toString(m.year),2) AS rY, id(r) AS rId",
"MATCH ()-[r:RATED]->() WHERE id(r) = rId SET r.ratingY = rY",
{batchSize:1000})
Step 2: Query the graph
After having added this property to all RATING relationships in the graph, we have this query that you can run:
PROFILE MATCH ()-[r:RATED]->(m:Movie)
// movies in the 1900's
WHERE r.ratingY CONTAINS "9"
RETURN m.title,m.year, r.rating
Repeat this query to get the true elapsed time and total db hits. The second query is required because the first query adds the query to the query cache because it needs to parse the query and plan it.
Step 3: Add RANGE index
Add a regular RANGE index to the graph using:
-
index_name
: RATED_ratingY -
relationship_type
: RATED -
property_key
: ratingY
Step 4: Repeat the query
After having added the RANGE index. Execute this query twice:
PROFILE MATCH ()-[r:RATED]->(m:Movie)
// movies in the 1900's
WHERE r.ratingY CONTAINS "9"
RETURN m.title,m.year, r.rating
Do you see that the RANGE index is used and the total db hits and elapsed time is lower?
Step 5: Add TEXT index
Add a TEXT index to the graph using:
-
index_name
: RATED_ratingY_text -
relationship_type
: RATED -
property_key
: ratingY
Step 6: Repeat the query
After having added the TEXT index. Execute this query twice:
PROFILE MATCH ()-[r:RATED]->(m:Movie)
// movies in the 1900's
WHERE r.ratingY CONTAINS "9"
RETURN m.title,m.year, r.rating
Do you see that the TEXT index is used?
This is because the query planner will use a TEXT index if one exists when the predicate uses CONTAINS
or ENDS WITH
.
The performance of the query in this scenario does not benefit from the TEXT index.
Validate Results
Once you have completed the six steps of this Challenge, click the Check Indexes button and we will check the database for you.
Hint
Index names, relationship type names, and property key names are all case-sensitive.
Those properties should be used to create the composite index.
You can type SHOW INDEXES
after you have created the index to confirm that it created the RANGE index and TEXT indexes.
If you mess up, you can reload this Challenge page and you should be where you need to be at the beginning of this challenge.
Then you would need to perform the steps of this Challenge again.
Solution
Here are the statements to create the RANGE and TEXT indexes:
CALL apoc.periodic.iterate(
"MATCH (u:User)-[r:RATED]->(m:Movie) RETURN left(toString(m.year),2) AS rY, id(r) AS rId",
"MATCH ()-[r:RATED]->() WHERE id(r) = rId SET r.ratingY = rY",
{batchSize:1000});
CREATE INDEX RATED_ratingY IF NOT EXISTS FOR ()-[x:RATED]-() ON (x.ratingY);
CREATE TEXT INDEX RATED_ratingY_text IF NOT EXISTS FOR ()-[x:RATED]-() ON (x.ratingY)
Summary
In this Challenge, you demonstrated that you can create a TEXT index in the graph in an effort to improve the performance of a query. A TEXT index is not always the best solution. In the next module, you will learn about full-text indexes.