Your next task is to create a vector index using Cypher.
You previously used a vector index to find similar text; you can also use a vector index to find similar images.
Movie posters
GraphAcademy has loaded a sample of 1000 movie poster embeddings into the sandbox. Each movie has a URL to a poster image:
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.poster
The data also contains embeddings for each poster:
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.posterEmbedding
Create a vector index
To search the movie poster embeddings, you must create a vector index. Review the following Cypher to create the vector index before running it:
CREATE VECTOR INDEX moviePosters IF NOT EXISTS
FOR (m:Movie)
ON m.posterEmbedding
OPTIONS {indexConfig: {
`vector.dimensions`: 512,
`vector.similarity_function`: 'cosine'
}}
You should note the following about the index:
- It is named moviePosters
- It indexes the posterEmbedding property on Movie nodes
- The vector has 512 dimensions
- The function used to compare vectors is cosine
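To get a feel for what the cosine function measures, you can compare two poster embeddings directly, without the index. This is a minimal sketch that assumes the sandbox runs a Neo4j version that provides the vector.similarity.cosine function, and that both movies have a poster embedding:

```cypher
// Compare two poster embeddings directly (no index involved).
// Assumption: vector.similarity.cosine is available in this Neo4j version.
MATCH (a:Movie {title: "Toy Story"})
MATCH (b:Movie {title: "Babe"})
RETURN a.title, b.title,
       vector.similarity.cosine(a.posterEmbedding, b.posterEmbedding) AS similarity
```

A value closer to 1 means the two vectors point in a more similar direction.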
More about dimensions
The model used to create the embeddings determines the number of dimensions in the vector.
In this case, we used the OpenAI CLIP model, which produces embeddings with 512 dimensions.
We created the movie plot embeddings using OpenAI's text-embedding-ada-002 model, which produces embeddings with 1536 dimensions.
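If you want to confirm the dimension count before creating the index, one option is to measure the length of a stored embedding. This sketch assumes the embedding is stored as a list property, as in the queries above:

```cypher
// Return the number of values in the stored poster embedding.
// The result should match the vector.dimensions setting of the index.
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, size(m.posterEmbedding) AS dimensions
```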
Run the Cypher to create the vector index.
Check that you created the index successfully using the SHOW VECTOR INDEXES command.
SHOW VECTOR INDEXES
You should see a result similar to the following:
id | name           | state    | populationPercent | type
4  | "moviePosters" | "ONLINE" |                   | "VECTOR"
Once the state is listed as "ONLINE", the index will be ready to query.
The populationPercent field indicates the proportion of matching node and property pairs that have been indexed so far.
When the populationPercent is 100.0, all the movie embeddings have been indexed.
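To watch only this index while it populates, you could narrow the output to the relevant columns. This is a sketch that uses the column names shown in the result above:

```cypher
// Show the state and population progress of the moviePosters index only.
SHOW VECTOR INDEXES
YIELD name, state, populationPercent
WHERE name = 'moviePosters'
```

Re-run the command until the state is "ONLINE".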
Similar posters
You can use the db.index.vector.queryNodes procedure to find similar movie posters.
MATCH (m:Movie {title: "Babe"})
CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score
RETURN node.title, node.poster, score;
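Because the query vector is the movie's own embedding, the movie itself will usually appear in the results as its own closest match. One possible variation, shown here as a sketch, filters it out:

```cypher
// Find posters similar to "Babe", excluding "Babe" itself.
MATCH (m:Movie {title: "Babe"})
CALL db.index.vector.queryNodes('moviePosters', 6, m.posterEmbedding)
YIELD node, score
WHERE node <> m
RETURN node.title, node.poster, score
```

The filter is applied after the index returns its 6 nearest neighbors, so this variation returns at most 5 rows when the movie matches itself.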
Pick a different movie and write a similar Cypher query to find similar posters.
You can view the movies that have a poster embedding using this Cypher:
MATCH (m:Movie)
WHERE m.posterEmbedding IS NOT NULL
RETURN m.title, m.poster
Continue
When you are ready, you can move on to the next task.
Summary
You learned how to create a vector index in Neo4j.
Next, you will learn how to model unstructured data as a graph.