In this lesson, you will learn how to query a vector index.
You will use the Question and Answer embeddings to find similar responses.
Querying with Embeddings
When querying a vector index, you have to query with an embedding.
For example, suppose you want to use the vector index to find questions similar to the text "What are examples of good open-source projects?". You would first generate an embedding of the text, then use that embedding to query the vector index.
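The two steps described above (embed the text, then compare the embedding with stored embeddings) can be sketched in a few lines of Python. This is a toy illustration only: toy_embed, VOCAB, and the stored question texts are hypothetical stand-ins, and word counting is not how a real embedding model works.

```python
import math

# Hypothetical fixed vocabulary; a real embedding model learns its own dimensions.
VOCAB = ["open", "source", "projects", "countries", "tourist", "good"]

def toy_embed(text):
    """Toy stand-in for an embedding model: count vocabulary words in the text."""
    words = text.lower().replace("?", "").replace("-", " ").split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Step 0: the "index" stores an embedding for every existing question.
stored = {
    text: toy_embed(text)
    for text in [
        "Which open source projects are good for beginners?",
        "Which countries attract the most tourist visits?",
    ]
}

# Step 1: embed the search text. Step 2: query with the embedding, not the text.
query_embedding = toy_embed("What are examples of good open-source projects?")
best_match = max(stored, key=lambda text: cosine(query_embedding, stored[text]))
print(best_match)
```

The comparison happens entirely between vectors, which is why the search text must be converted to an embedding before it can be used in a query.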
You are going to explore two scenarios:
- A user views an existing question and wants to see similar questions.
- A user submits a new question and receives answers to a similar question.
In the first scenario, you will use existing question embeddings to find similar questions. In the second scenario, you will generate a new embedding for the user’s question to find similar questions and answers.
These scenarios will help you understand how to query the vector index to find similar questions and answers.
Finding similar questions
You can use the questions and answers vector indexes to find questions that are similar to each other.
A user views an existing question and wants to see similar questions.
The following Cypher query finds questions similar to the question "What are the most touristic countries in the world?".
Review the query before running it and observing the results.
MATCH (q:Question {text: "What are the most touristic countries in the world?"})
CALL db.index.vector.queryNodes('questions', 6, q.embedding)
YIELD node, score
RETURN node.text, score
Breaking down the query, you can identify the following:
- The MATCH clause finds the specific Question node.
- The query uses the db.index.vector.queryNodes procedure to query the questions vector index with the Question node’s embedding, q.embedding. The procedure returns the 6 most similar nodes.
- YIELD obtains the node and similarity score returned by the procedure.
- The query returns the Question node’s text property and the similarity score.
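To make the breakdown above concrete, here is a minimal Python sketch of what a top-k vector query does conceptually. It is a brute-force stand-in, not how Neo4j implements the index (a real vector index uses an approximate nearest-neighbour search). The query_nodes function, the two-dimensional vectors, and the question labels are all hypothetical. The sketch also assumes the index is configured for cosine similarity and that the reported score is the cosine mapped into the range 0 to 1 as (1 + cosine) / 2.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def query_nodes(index, k, query_vector):
    """Return up to k (text, score) pairs, highest score first."""
    scored = [
        # Assumption: score is the cosine normalised into [0, 1].
        (text, (1 + cosine(query_vector, vector)) / 2)
        for text, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical stored embeddings, keyed by question text.
index = {
    "question A": [1.0, 0.0],
    "question B": [0.8, 0.6],
    "question C": [0.0, 1.0],
    "question D": [-1.0, 0.0],
}

# Query with question A's own embedding, as the Cypher query does with q.embedding.
results = query_nodes(index, 3, index["question A"])
for text, score in results:
    print(f"{score:.2f}  {text}")
```

Because the query vector is question A's own embedding, question A comes back first with the highest score. The same typically applies when you query with q.embedding, so expect the original question to appear among the results.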
You can extend this query to return the answers to the most similar questions:
MATCH (q:Question {text: "What are the most touristic countries in the world?"})
CALL db.index.vector.queryNodes('questions', 6, q.embedding)
YIELD node, score
MATCH (node)-[:ANSWERED_BY]->(a)
RETURN a.text, score
The second MATCH clause starts from each node returned by the vector index and follows the ANSWERED_BY relationship to find the answers.
Run the query and observe the results. You will notice that the top answers returned are relevant to the question. As you move down the list, the similarity score decreases, and so does the relevance of the answers.
Finding answers to a similar question
To improve the user’s experience when asking a new question, you could use the vector index to find similar questions and answers.
To achieve this, you need to generate an embedding for the user’s new question and use it to query the vector index.
You can generate a new embedding in Cypher using the genai.vector.encode function:
genai.vector.encode(
    resource :: STRING,
    provider :: STRING,
    configuration :: MAP = {}
) :: LIST<FLOAT>
You pass the text you want to encode as the resource parameter.
You can use embedding models from different providers, such as OpenAI, Vertex AI, and Amazon Bedrock.
Provider-specific details, such as API keys, are passed in the configuration map.
For example, you can use the OpenAI provider to generate an embedding, passing the API key as token in the configuration map:
WITH genai.vector.encode("Test", "OpenAI", { token: "sk-..." }) AS embedding
RETURN embedding
OpenAI API key
To run this query, you must replace the token value with your OpenAI API key.
You can incorporate the embedding into your query to find similar questions:
WITH genai.vector.encode(
    "What are good open source projects",
    "OpenAI",
    { token: "sk-..." }) AS userEmbedding
CALL db.index.vector.queryNodes('questions', 6, userEmbedding)
YIELD node, score
RETURN node.text, score
This query creates an embedding using genai.vector.encode and then uses that embedding to query the questions vector index.
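If you run this query from application code, one option is to pass the question text and the API key as Cypher parameters rather than writing them into the query string. The following Python sketch shows the idea under stated assumptions: the OPENAI_API_KEY environment variable name, the build_parameters helper, and the $question and $token parameter names are all hypothetical choices, not part of the lesson.

```python
import os

# The same query as above, with the text and API key replaced by Cypher parameters.
SIMILAR_QUESTIONS_QUERY = """
WITH genai.vector.encode($question, "OpenAI", { token: $token }) AS userEmbedding
CALL db.index.vector.queryNodes('questions', 6, userEmbedding)
YIELD node, score
RETURN node.text AS text, score
"""

def build_parameters(question, env=None):
    """Collect the query parameters; OPENAI_API_KEY is an assumed variable name."""
    env = os.environ if env is None else env
    token = env.get("OPENAI_API_KEY")
    if not token:
        raise RuntimeError("Set OPENAI_API_KEY before running the query")
    return {"question": question, "token": token}

# A placeholder environment is passed here so the example runs without a real key.
params = build_parameters(
    "What are good open source projects",
    env={"OPENAI_API_KEY": "sk-..."},
)
print(sorted(params))
```

With the official Neo4j Python driver, the query and parameters could then be sent with something like driver.execute_query(SIMILAR_QUESTIONS_QUERY, parameters_=params). Trying a different question then only means changing the parameter value, and the key never appears in the query text.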
Try changing the text and observe the results.
Can you modify this query to work the same as the previous query and return the answers to the most similar questions?
Check Your Understanding
Query Vector Index
True or False: you can pass the text you wish to search for directly to db.index.vector.queryNodes.
- ❏ True
- ✓ False
Hint
A vector index can only be queried with an embedding, not with raw text.
Solution
The statement is False.
db.index.vector.queryNodes requires an embedding to be passed to it, not text.
Lesson Summary
In this lesson, you learned how to query a vector index and generate embeddings using Cypher.
In the next module, you will learn how to import unstructured data into Neo4j using Python.