Create a Vector Index

Movie posters

Your next task is to create a vector index using Cypher.

You previously used a vector index to find similar text; you can also use a vector index to find similar images.

GraphAcademy has loaded a sample of 1000 movie poster embeddings into the sandbox.

Each movie has a URL to a poster image (the poster property) and an embedding of that image (the posterEmbedding property):

cypher
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.poster, m.posterEmbedding

(Image: Toy Story movie poster)

Create a vector index

To search the movie poster embeddings, you must create a vector index. Review the following Cypher to create the vector index before running it.

cypher
CREATE VECTOR INDEX moviePosters IF NOT EXISTS
FOR (m:Movie)
ON m.posterEmbedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 512,
 `vector.similarity_function`: 'cosine'
}}

You should note the following about the index:

  • It is named moviePosters

  • It indexes the posterEmbedding property on Movie nodes

  • The vector has 512 dimensions

  • The function used to compare vectors is cosine
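
To make the cosine setting more concrete, the following sketch compares two poster embeddings directly, without using the index. It assumes a Neo4j version that provides the vector.similarity.cosine function (it should be available in Neo4j 5.18 and later) and that both movies are in the sample and have a posterEmbedding.

```cypher
// Assumption: vector.similarity.cosine is available in your Neo4j version.
// Compare the poster embeddings of two movies used elsewhere in this lesson.
MATCH (a:Movie {title: "Toy Story"}), (b:Movie {title: "Babe"})
RETURN a.title, b.title,
       vector.similarity.cosine(a.posterEmbedding, b.posterEmbedding) AS score
```

A score closer to 1 means the two vectors point in a more similar direction. The index uses the same kind of score to rank results.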

More about dimensions

The model used to create the embeddings determines the number of dimensions in the vector.

In this case, we used the OpenAI CLIP model, which produces embeddings with 512 dimensions.

We created the movie plot embeddings using OpenAI's text-embedding-ada-002 model, which produces embeddings with 1536 dimensions.
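
If you want to confirm the dimensions before creating the index, a query along these lines should work. It is a minimal sketch that assumes posterEmbedding is stored as a list of floats, so that size() returns its length.

```cypher
// Group the movies by the length of their poster embedding.
// If every embedding came from the same model, expect a single row.
MATCH (m:Movie)
WHERE m.posterEmbedding IS NOT NULL
RETURN size(m.posterEmbedding) AS dimensions, count(*) AS movies
```

The dimensions value returned here should match the vector.dimensions setting of the index.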

Run the Cypher to create the vector index.

Check the index exists

Check that you created the index successfully using the SHOW VECTOR INDEXES command, which lists only the vector indexes in the database.

cypher
SHOW VECTOR INDEXES

You should see a result similar to the following:

id  name            state     populationPercent  type
4   "moviePosters"  "ONLINE"  100.0              "VECTOR"

Once the state is listed as "ONLINE", the index will be ready to query.

The populationPercent field indicates the percentage of matching node and property pairs that have been indexed so far. When populationPercent is 100.0, all the movie poster embeddings have been indexed.
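
When the database contains several indexes, you can narrow the output to this one index. The sketch below assumes the index is named moviePosters, as created earlier, and also yields the options column so you can check the configured dimensions and similarity function.

```cypher
// Show only the moviePosters index, including its configuration.
SHOW VECTOR INDEXES
YIELD name, state, populationPercent, options
WHERE name = 'moviePosters'
RETURN name, state, populationPercent, options
```

If the query returns no rows, the index has not been created.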

Similar posters

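You can use the db.index.vector.queryNodes procedure to find similar movie posters. The procedure takes the name of the index, the number of nearest neighbors to return, and the vector to compare against.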

cypher
// Find the 6 posters most similar to the poster for Babe
MATCH (m:Movie {title: "Babe"})
CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score
RETURN node.title, node.poster, score

(Image: 3 movie posters)
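
Because the query vector is the embedding of a movie that is itself in the index, that movie will usually come back as the closest match. One possible variation, shown here as a sketch rather than as part of the lesson's solution, filters the source movie out of the results.

```cypher
// Same search as above, but exclude the movie we started from.
MATCH (m:Movie {title: "Babe"})
CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score
WHERE node <> m
RETURN node.title, node.poster, score
```

The filter runs after the index lookup, so this version may return fewer than six rows.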

Find a similar poster

Pick a different movie and update the Cypher query to find similar posters.

You can view the movies that have a poster embedding using this Cypher:

cypher
MATCH (m:Movie)
WHERE m.posterEmbedding IS NOT NULL
RETURN m.title, m.poster

Summary

You learned how to create a vector index in Neo4j.

Next, you will learn how to model unstructured data as a graph.