Understanding HybridRetriever
The HybridRetriever combines vector similarity with full-text search to retrieve relevant nodes from your Neo4j recommendations database. Because it draws on both semantic similarity and exact keyword matching, it can surface results that either method alone would miss.
How It Works
- Vector Similarity: Uses vector embeddings to find semantically similar nodes.
- Full-text Search: Uses full-text indexes to match exact keywords or phrases.
- Aggregation: Merges and ranks the results from both retrieval methods to return the most relevant nodes.
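The aggregation step can be pictured with a small, self-contained sketch. It is not the library's internal code: it assumes each method returns (node id, raw score) pairs, rescales each list by its own best score so the two are comparable, and keeps the higher score when a node appears in both lists. The function name `merge_ranked` and the sample scores are illustrative only.

```python
def merge_ranked(vector_hits, fulltext_hits, top_k=5):
    """Merge two lists of (node_id, score) pairs into one ranked list."""

    def normalise(hits):
        # Divide by the best score so each list tops out at 1.0.
        best = max((score for _, score in hits), default=0.0)
        if best <= 0:
            return {}
        return {node_id: score / best for node_id, score in hits}

    merged = normalise(vector_hits)
    for node_id, score in normalise(fulltext_hits).items():
        # A node found by both methods keeps its higher normalised score.
        merged[node_id] = max(score, merged.get(node_id, 0.0))

    ranked = sorted(merged.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


vector_hits = [("m1", 0.92), ("m2", 0.80)]  # made-up similarity scores
fulltext_hits = [("m3", 4.0), ("m2", 3.0)]  # made-up full-text scores
merged = merge_ranked(vector_hits, fulltext_hits, top_k=3)
print(merged)
```

Rescaling matters because vector and full-text scores live on different scales; without it, one method would dominate the ranking simply by producing larger numbers.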
When to Use HybridRetriever
- Comprehensive Search: When you need both semantic and keyword-based retrieval.
- Diverse Data: When your graph contains a mix of structured and unstructured data.
- Enhanced Relevance: When you want to improve both precision and recall in search results.
Setting Up HybridRetriever
Follow these steps to set up and use the HybridRetriever.
Open the 2-neo4j-graphrag\hybrid_retriever.py file in your code editor.
1. Initialize the Embedder
Create the embedding function using OpenAI’s model:
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")
2. Initialize the HybridRetriever
Set up the HybridRetriever with your Neo4j database and embedding model:
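from neo4j_graphrag.retrievers import HybridRetriever
from neo4j_graphrag.llm import OpenAILLM  # used in step 3

# `driver` is assumed to be an existing neo4j.Driver connected to your database.
retriever = HybridRetriever(
    driver=driver,
    vector_index_name="moviePlots",      # vector index to search
    fulltext_index_name="plotFulltext",  # full-text index to search
    embedder=embedder,
    return_properties=["title", "plot"],
)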
3. Using the Retriever
Pass the HybridRetriever to a GraphRAG pipeline to answer questions using hybrid search over your Neo4j database:
from neo4j_graphrag.generation import GraphRAG
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "What is the name of the movie set in 1375 in Imperial China?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
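Before relying on the generated answer, it can help to look at what the retriever returns on its own. The sketch below assumes the retriever exposes `search(query_text=..., top_k=...)` and returns an object whose `items` each carry `content` and `metadata`; that matches the neo4j-graphrag retriever interface at the time of writing, but check your installed version. The helper `preview_hits` is hypothetical, and a stand-in retriever with made-up data is used so the snippet runs without a database.

```python
from types import SimpleNamespace


def preview_hits(retriever, query_text, top_k=5):
    """Return (content, metadata) pairs for the top_k retrieved items."""
    result = retriever.search(query_text=query_text, top_k=top_k)
    return [(item.content, item.metadata) for item in result.items]


class StubRetriever:
    """Stand-in for HybridRetriever so the example runs offline."""

    def search(self, query_text, top_k=5):
        # One made-up item shaped like a retrieved node.
        items = [
            SimpleNamespace(
                content="{'title': 'Example Movie'}",
                metadata={"score": 1.0},
            ),
        ]
        return SimpleNamespace(items=items[:top_k])


for content, metadata in preview_hits(StubRetriever(), "movie set in Imperial China"):
    print(metadata.get("score"), content)
```

With the retriever from step 2, the equivalent call would be `preview_hits(retriever, query_text)`.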
Tips for Effective Use
- Consistent Embeddings: Use the same model for both query and node embeddings to ensure compatibility.
- Build Effective Fulltext Indexes: Create full-text indexes on the properties your users are likely to search, so keyword matching has relevant text to work with.
- Leverage Fulltext Indexes: The HybridRetriever adds the most value when the full-text index returns useful keyword matches, because those matches then complement the semantic results.
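A full-text index such as the `plotFulltext` index used in step 2 can be created with Cypher's `CREATE FULLTEXT INDEX` statement. The helper below only builds the statement text. The `Movie` label and `plot` property are assumptions about the recommendations dataset, and `fulltext_index_cypher` is a hypothetical name.

```python
def fulltext_index_cypher(name, label, properties):
    """Build a CREATE FULLTEXT INDEX statement for one node label."""
    if not properties:
        raise ValueError("at least one property is required")
    # Names are interpolated directly, so pass only trusted identifiers.
    props = ", ".join(f"n.{prop}" for prop in properties)
    return (
        f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON EACH [{props}]"
    )


# Assumed label and property; adjust to match your graph.
statement = fulltext_index_cypher("plotFulltext", "Movie", ["plot"])
print(statement)
```

The resulting statement can be run once against the database, for example with the Python driver's `execute_query` method in recent driver versions.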
Summary
You’ve learned how to use the HybridRetriever for comprehensive semantic and keyword-based searches in Neo4j, enhancing your RAG pipeline by combining multiple retrieval strategies.