Create a Vector Index

To query embeddings efficiently, you need to create a vector index. A vector index speeds up similarity searches by organizing the vectors in advance, so that the nearest neighbors of a query vector can be found without comparing it against every stored vector.

In this lesson, you will create a vector index on the embedding property of the Question nodes. You will create the matching index for the Answer nodes in the next lesson.

Create the Question Index

You will use the CREATE VECTOR INDEX Cypher statement to create the index:

cypher

CREATE VECTOR INDEX Syntax

CREATE VECTOR INDEX [index_name] [IF NOT EXISTS]
FOR (n:LabelName)
ON (n.propertyName)
OPTIONS "{" option: value[, ...] "}"

CREATE VECTOR INDEX expects the following parameters:

index_name - The name of the index
LabelName - The node label on which to index
propertyName - The property on which to index
OPTIONS - The options for the index, where you can specify:
- vector.dimensions - The number of dimensions in each embedding, e.g. 1536 for OpenAI's text-embedding-ada-002 model.
- vector.similarity_function - The similarity function to use when comparing values in this index - this can be euclidean or cosine.

Review the following Cypher statement, which creates the vector index:

cypher

Create the vector index

CREATE VECTOR INDEX questions IF NOT EXISTS
FOR (q:Question)
ON q.embedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}}

Note that the index is named questions, covers nodes with the Question label, and indexes the embedding property. The vector.dimensions is 1536 (as used by OpenAI) and the vector.similarity_function is cosine. The IF NOT EXISTS clause ensures that the statement only creates the index if it does not already exist.

Run the statement to create the index.
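Because of the IF NOT EXISTS clause, re-running the statement with different options leaves an existing index untouched. If you want to try other settings, one approach is to drop the index first and then recreate it. The following is a minimal sketch that assumes the questions index from this lesson; the switch to euclidean is only for illustration.

```cypher
// Remove the existing index (no error is raised if it is absent)
DROP INDEX questions IF EXISTS;

// Recreate it with a different similarity function
CREATE VECTOR INDEX questions IF NOT EXISTS
FOR (q:Question)
ON q.embedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'euclidean'
}};
```

If you try this, recreate the index with cosine afterwards so that it matches the rest of the lesson.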

Choosing a Similarity Function

Generally, cosine will perform best for text embeddings, but you may want to experiment with other functions.

Typically, you will choose the similarity function that is closest to the loss function used when training the embedding model. You should refer to the model’s documentation for more information.
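To get a feel for how the two functions differ, you can compare a pair of hand-written vectors directly. This is a minimal sketch, assuming a Neo4j 5 release recent enough to include the vector.similarity.cosine() and vector.similarity.euclidean() functions. The two short vectors are made up for illustration and are not real embeddings.

```cypher
// Two made-up vectors that point in the same direction but have different lengths
WITH [1.0, 2.0, 3.0] AS a, [2.0, 4.0, 6.0] AS b
RETURN
  vector.similarity.cosine(a, b)    AS cosineScore,    // depends only on the angle between a and b
  vector.similarity.euclidean(a, b) AS euclideanScore  // depends on the distance between a and b
```

Because the vectors point in the same direction, the cosine score should be at its maximum, while the euclidean score is lower because the two points are some distance apart.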

Check the index creation status

The index is populated asynchronously. You can check the status of the index population, and confirm that you created the index successfully, using the SHOW INDEXES statement:

cypher

Show Indexes

SHOW INDEXES WHERE type = "VECTOR"

You should see a result similar to the following:

Show Indexes Result
id	name	state	populationPercent	type
1	"questions"	"ONLINE"	100.0	"VECTOR"

Once the state is listed as ONLINE, the index is ready to query.
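Once the index is online, a similarity search might look like the following. This is a sketch rather than part of the lesson: it assumes the db.index.vector.queryNodes procedure available in Neo4j 5, and it reuses the embedding of an arbitrary existing Question node as the query vector so that no external embedding service is needed.

```cypher
// Pick any one question that has an embedding to act as the query vector
MATCH (q:Question)
WHERE q.embedding IS NOT NULL
WITH q LIMIT 1
// Ask the questions index for the 5 nearest neighbours of that vector
CALL db.index.vector.queryNodes('questions', 5, q.embedding)
YIELD node, score
RETURN node, score
```

The chosen question itself normally appears among the results with the highest score, followed by the questions whose embeddings are most similar to it.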

The populationPercent field shows how much of the index has been populated, as a percentage of the node and property pairs it covers.

When populationPercent is 100.0, all the question embeddings have been indexed.
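If you would rather wait for population to finish than poll SHOW INDEXES by hand, one option is the built-in db.awaitIndex procedure. The sketch below assumes the questions index from this lesson. The 300 second timeout is an arbitrary choice.

```cypher
// Block until the questions index is ONLINE, or fail after 300 seconds
CALL db.awaitIndex('questions', 300);

// Then list only the columns that matter for vector indexes
SHOW INDEXES
YIELD name, type, state, populationPercent, labelsOrTypes, properties
WHERE type = 'VECTOR';
```

The labelsOrTypes and properties columns are a quick way to confirm that the index covers the Question label and the embedding property.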

Check your understanding

Create Vector Index

Your task is to create a vector index on authors' biographies.

The database contains Author nodes that have name, biography, and biographyEmbedding properties.

The biographyEmbedding property is a vector representation of the biography.

Select the correct syntax to create the vector index.

cypher

CREATE VECTOR INDEX authors IF NOT EXISTS
/* select the correct FOR ... ON clause from the options below */
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}}

❏ FOR (a:Author) ON a.biography
❏ FOR (a:Author) ON a.embedding
✓ FOR (a:Author) ON a.biographyEmbedding

Hint

Embeddings are vectors that represent the data. You create the vector index on the embedding of the biography.

Solution

You create the vector index on the biographyEmbedding property of the Author nodes.

cypher

CREATE VECTOR INDEX authors IF NOT EXISTS
FOR (a:Author) ON a.biographyEmbedding
OPTIONS {indexConfig: {`vector.dimensions`: 1536, `vector.similarity_function`: 'cosine'}}

Lesson Summary

In this lesson, you learned how to create a vector index using the CREATE VECTOR INDEX Cypher statement.

In the next lesson, you will use what you have learned to create a vector index for the Answer nodes.
