Understanding VectorRetriever
In previous tasks, you’ve built knowledge graphs by extracting entities and relationships from text. Now, we’ll enhance your retrieval capabilities using the VectorRetriever. This retriever leverages vector embeddings to perform semantic searches, enabling you to find nodes based on the meaning of the content rather than just keywords.
How It Works
- Query Embedding: Converts the user’s query into a vector using an embedding model.
- Similarity Search: Finds nodes with embeddings similar to the query vector.
- Retrieval: Returns the top relevant nodes to augment the language model’s response.
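The three steps above can be sketched in plain Python. This is a toy illustration, not how Neo4j's vector index works internally: the node vectors are made up, the query vector is supplied directly instead of coming from an embedding model, and the search is an exact cosine-similarity scan. The names cosine_similarity and top_k_nodes are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_nodes(query_vector, nodes, k):
    """Return the k (title, score) pairs most similar to the query vector."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in nodes.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up 3-dimensional "embeddings"; real models produce far larger vectors.
nodes = {
    "Quest Movie": [0.9, 0.1, 0.0],
    "Courtroom Drama": [0.0, 0.2, 0.9],
    "Road Trip Comedy": [0.7, 0.6, 0.1],
}
query_vector = [1.0, 0.2, 0.0]  # stands in for the embedded user query

results = top_k_nodes(query_vector, nodes, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

The node whose vector points in nearly the same direction as the query ranks first, which is the intuition behind "similar meaning" in an embedding space.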
When to Use VectorRetriever
- Semantic Search: When you need to find nodes based on the meaning of the content.
- Unstructured Data: Ideal for graphs containing text-rich nodes like products, reviews, or descriptions.
- Precomputed Embeddings: When nodes have embeddings stored as properties for efficient searching.
Setting Up VectorRetriever
Follow these steps to set up and use the VectorRetriever.
Open the 2-neo4j-graphrag\vector_retriever.py file in your code editor.
Step 1: Initialize the Embedder
Create the embedding function:
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")
Step 2: Initialize the VectorRetriever
Set up the VectorRetriever with your Neo4j database and embedding model:
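from neo4j_graphrag.retrievers import VectorRetriever
# Build the retriever
# (driver is assumed to be the Neo4j driver instance already created in this file)
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",  # name of the vector index to search
    embedder=embedder,  # embeds the query text at search time
    return_properties=["title", "plot"],  # node properties to return
)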
Step 3: Using the Retriever
Pass the VectorRetriever to a GraphRAG pipeline to answer questions using semantic search over your Neo4j database:
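from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
# Create the LLM; a temperature of 0 reduces randomness in the answers
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
# Combine the retriever and the LLM into a RAG pipeline
rag = GraphRAG(retriever=retriever, llm=llm)
# Retrieve the 5 most similar nodes and use them as context for the answer
query_text = "Give me 3 films where a hero goes on a journey"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)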
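To see what the retriever found before the LLM turns it into an answer, it can help to print the retrieved items. The sketch below assumes the retriever result exposes an items list whose entries have content and metadata attributes, with a score key in the metadata; verify this against your installed version of neo4j_graphrag. The helper name format_items is hypothetical, and the stand-in objects exist only so the example runs without a database.

```python
from types import SimpleNamespace


def format_items(result):
    """Build one 'score  content' line per retrieved item."""
    lines = []
    for item in result.items:
        # Metadata may be missing, so fall back to an empty dict.
        score = (item.metadata or {}).get("score")
        prefix = f"{score:.3f}" if score is not None else "n/a"
        lines.append(f"{prefix}  {item.content}")
    return lines


# Stand-in for a retriever result, so the sketch runs offline.
fake_result = SimpleNamespace(
    items=[
        SimpleNamespace(content="{'title': 'Movie A'}", metadata={"score": 0.93}),
        SimpleNamespace(content="{'title': 'Movie B'}", metadata=None),
    ]
)

for line in format_items(fake_result):
    print(line)
```

With the retriever from Step 2, the same helper should work on the value returned by retriever.search(query_text=query_text, top_k=5), provided the attribute names match the assumptions above.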
Tips for Effective Use
- Consistent Embeddings: Use the same model for both query and node embeddings to ensure compatibility.
- Property Retrieval: Use the return_properties argument and result_formatter to retrieve and transform the desired properties.
- Adjust top_k: Since this retriever uses an approximate nearest neighbor algorithm, consider setting top_k to a higher value to improve the chances of retrieving relevant results.
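One way to act on the top_k tip is to request more candidates than you need and then trim them by score. The sketch below is plain Python over (title, score) pairs; the function name keep_best, the min_score threshold, and the sample scores are illustrative assumptions, not part of neo4j_graphrag.

```python
def keep_best(candidates, limit, min_score=0.0):
    """Keep up to `limit` candidates scoring at least `min_score`, best first."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [pair for pair in ranked if pair[1] >= min_score][:limit]


# Pretend these came back from a search with top_k=5.
candidates = [
    ("Movie A", 0.91),
    ("Movie B", 0.62),
    ("Movie C", 0.88),
    ("Movie D", 0.79),
    ("Movie E", 0.55),
]

best = keep_best(candidates, limit=3, min_score=0.7)
print(best)
```

Over-fetching costs a slightly larger search, but it gives a post-filter room to drop weak matches without leaving you short of results.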
Continue
When you are ready, you can move on to the next task.
Summary
You’ve learned how to use the VectorRetriever for semantic searches in Neo4j, enhancing your RAG pipeline by providing relevant data to language models based on content meaning rather than just keywords.