Create a Vector Index

Your next task is to create a vector index using Cypher.

You previously used a vector index to find similar text; you can also use a vector index to find similar images.

Movie posters

GraphAcademy has loaded a sample of 1000 movie poster embeddings into the sandbox. Each movie has a URL to a poster image:

cypher
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.poster

Toy Story movie poster

The data also contains embeddings for each poster:

cypher
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.posterEmbedding

Create a vector index

To search the movie poster embeddings, you must create a vector index. Review the following Cypher to create the vector index before running it:

cypher
CREATE VECTOR INDEX moviePosters IF NOT EXISTS
FOR (m:Movie)
ON m.posterEmbedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 512,
 `vector.similarity_function`: 'cosine'
}}

You should note the following about the index:

  • It is named moviePosters

  • It indexes the posterEmbedding property of Movie nodes

  • The vector has 512 dimensions

  • The function used to compare vectors is cosine
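To make the cosine comparison concrete, the following sketch computes the raw cosine similarity between two poster embeddings by hand, without using the index. It assumes both movies have a posterEmbedding property stored as a list of numbers of equal length; the variable names x, y, acc, dot, normX and normY are chosen here for illustration only.

```cypher
// Illustrative sketch: raw cosine similarity between two poster embeddings
MATCH (a:Movie {title: "Toy Story"}), (b:Movie {title: "Babe"})
WITH a.posterEmbedding AS x, b.posterEmbedding AS y
WITH
  // dot product of the two vectors
  reduce(acc = 0.0, i IN range(0, size(x) - 1) | acc + x[i] * y[i]) AS dot,
  // length (Euclidean norm) of each vector
  sqrt(reduce(acc = 0.0, v IN x | acc + v * v)) AS normX,
  sqrt(reduce(acc = 0.0, v IN y | acc + v * v)) AS normY
RETURN dot / (normX * normY) AS cosine
```

A value close to 1 means the two vectors point in nearly the same direction. The score returned by an index query may be scaled differently from this raw value, so compare rankings rather than exact numbers.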

More about dimensions

The model used to create the embeddings determines the number of dimensions in the vector.

In this case, we used the OpenAI CLIP model, which produces embeddings with 512 dimensions.

We created the movie plot embeddings using OpenAI's text-embedding-ada-002 model, which produces embeddings with 1536 dimensions.
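If you want to confirm the dimension count before creating the index, a quick check along these lines may help. It is a sketch that assumes posterEmbedding is stored as a list, so size() returns its length; the aliases dimensions and movies are illustrative.

```cypher
// Group the poster embeddings by their length
MATCH (m:Movie)
WHERE m.posterEmbedding IS NOT NULL
RETURN size(m.posterEmbedding) AS dimensions, count(*) AS movies
```

If every embedding came from the same model, you would expect a single row whose dimensions value matches the vector.dimensions setting of the index.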

Run the Cypher to create the vector index.

Check that you created the index successfully using the SHOW VECTOR INDEXES command:

cypher
SHOW VECTOR INDEXES

You should see a result similar to the following:

Show Indexes Result

id   name            state     populationPercent   type
4    "moviePosters"  "ONLINE"  100.0               "VECTOR"

Once the state is listed as "ONLINE", the index will be ready to query.

The populationPercent field indicates the proportion of matching node and property pairs that have been indexed. When populationPercent is 100.0, all the movie poster embeddings have been indexed.
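To look at just this index rather than scanning the full list, you could filter the output by name. This is a minimal sketch; the column names are the ones SHOW INDEXES is documented to return, but the exact output may vary with your Neo4j version.

```cypher
// Show only the moviePosters index and the fields relevant to readiness
SHOW VECTOR INDEXES
YIELD name, state, populationPercent, labelsOrTypes, properties
WHERE name = 'moviePosters'
RETURN name, state, populationPercent, labelsOrTypes, properties
```

Re-run the query until state is "ONLINE" and populationPercent is 100.0.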

Similar posters

You can use the db.index.vector.queryNodes procedure to find similar movie posters.

cypher
MATCH (m:Movie {title: "Babe"})

CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score

RETURN node.title, node.poster, score;

3 movie posters
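Because the movie you search with is itself in the index, it will normally come back as its own closest match. One possible variation, shown here as a sketch rather than as part of the lesson's solution, filters the source movie out of the results:

```cypher
// Find posters similar to "Babe", excluding "Babe" itself
MATCH (m:Movie {title: "Babe"})
CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score
WHERE node <> m
RETURN node.title, node.poster, score
```

Asking for 6 neighbours and dropping the source movie would typically leave 5 other posters, ordered from most to least similar.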

Pick a different movie and write a similar Cypher query to find similar posters.

You can view the movies that have a poster embedding using this Cypher:

cypher
MATCH (m:Movie)
WHERE m.posterEmbedding IS NOT NULL
RETURN m.title, m.poster

Continue

When you are ready, you can move on to the next task.

Summary

You learned how to create a vector index in Neo4j.

Next, you will learn how to model unstructured data as a graph.