Creating Multi-property Indexes

Why multi-property indexes?

Recall that a Node key defines multiple properties of a node that make it unique and require the properties to exist. Previously, you created a Node key for the name and url properties of the Person nodes. This ensured that these two values for a Person node would be unique and must exist in the graph.

A multi-property index (composite RANGE index) provides quick lookups for two or more properties of a node or relationship.

Suppose we had this query that was an important use case for our movie application:

cypher
PROFILE MATCH (m:Movie)
WHERE m.year = 2000 AND m.runtime <= 60
RETURN m.title, m.year, m.runtime

If you execute this query on the graph, you will see that it requires 27,927 total db hits to complete the query. If you know that a query is important to your application, it would benefit from a composite index on the Movie nodes on the year and runtime properties.

Syntax for creating a Composite index for multiple properties of a node

Here is the syntax for creating a Composite index for multiple properties of a node:

cypher
CREATE INDEX <index_name> IF NOT EXISTS
FOR (x:<node_label>)
ON (x.<property_key1>,x.<property_key2>,...)

You specify the name of the index, the node label it will be associated with, and the name of the property.

  • If an index already exists in the graph with the same name, no index is created.

  • If an index does not exist in the graph with the same name:

    • No index is created if there already is an index for that node label and property keys.

    • Otherwise, the index is created.

Creating the Composite index for a multiple properties of a node

We want to improve the performance of queries that test the year and runtime properties of a Movie node.

Execute this code to create the Composite index for these properties:

cypher
CREATE INDEX Movie_year_runtime IF NOT EXISTS
FOR (x:Movie)
ON (x.year, x.runtime)

Repeat the above query with the PROFILE. You should see that with the index, only 10 total db hits occur. The query plan now starts with the NodeIndexSeek() operation and uses the RANGE index you just created.

Composite index not used

If the query tests the existence of a property, it will not use this type of index.

For example, this query will not use the composite index if there are Movie nodes in the graph that do not have a year or runtime property.

cypher
PROFILE MATCH (m:Movie)
WHERE m.year IS NULL AND m.runtime IS NULL
RETURN m.title, m.year, m.runtime

If the query tests only one property, the index will not be used:

cypher
PROFILE MATCH (m:Movie)
WHERE m.year = 2000
RETURN m.title, m.year, m.runtime

Syntax for creating a Composite index for a multiple properties of a relationship

Here is the syntax for creating a Composite index for a multiple properties of a relationship:

cypher
CREATE INDEX <index_name> IF NOT EXISTS
FOR ()-[x:<RELATIONSHIP_TYPE>]-()
ON (x.<property_key1>, x.<property_key2>,...)

Check your understanding

1. Creating a Composite index

Suppose we have a graph that contains Company nodes. We want to be able to optimize queries that test the size of the company and location, both of which are properties of Company nodes. What is the correct statement to create this Composite index?

  • CREATE INDEX Company_size_location IF NOT EXISTS FOR (x:Company) ON (x.size, x.location)

  • CREATE b-tree Company_size_location IF NOT EXISTS ON (Company.size, Company.location)

  • CREATE b-tree Company_size_location IF NOT EXISTS FOR (x:Company) ON (x.size, x.location)

  • CREATE INDEX Company_size_location IF NOT EXISTS ON Company.size, Company.location

Hint

You are creating an index. The index type by default is RANGE. You must specify the two properties used for the index.

Solution

The correct code for creating the Composite index is:

CREATE INDEX Company_size_location IF NOT EXISTS FOR (x:Company) ON (x.size, x.location)

2. RANGE index usage

When is the RANGE composite index not used? (Select all that apply.)

  • ❏ It is always used for a predicate that contains at least one property from the index.

  • ✓ When the predicate tests for property existence.

  • ❏ When the total db hits will be more when the index is used.

  • ✓ When the predicate only tests one property.

Hint

There are two situations when a composite index will not be used

Solution

The composite index will not be used if:

  1. The predicate tests for property existence.

  2. The predicate only tests one property.

Summary

In this lesson, you learned how to create a Composite index for multiple properties of a node or relationship. In the next Challenge, you will create another Composite index and test it.