Hybrid Retrieval

Understanding HybridRetriever

The HybridRetriever combines vector similarity search with full-text search to retrieve relevant nodes from your Neo4j recommendations database. Vector search finds nodes that are semantically close to the query, while full-text search catches exact keywords and phrases that embeddings can miss, so the combined results tend to be both more accurate and more complete.

How It Works

  • Vector Similarity:

    • Uses vector embeddings to find semantically similar nodes.

  • Full-text Search:

    • Utilizes full-text indexes to match exact keywords or phrases.

  • Aggregation:

    • Merges and ranks results from both retrieval methods to provide the top relevant nodes.
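The aggregation step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not necessarily how the library implements it: each method is assumed to return (node_id, score) pairs, each list is rescaled by its own top score so the two score ranges become comparable, the best score per node is kept, and the top_k nodes are returned. The function name `merge_results` and the sample scores are hypothetical.

```python
def merge_results(vector_hits, fulltext_hits, top_k=5):
    """Merge two lists of (node_id, score) pairs into one ranked list.

    Illustrative only: scores in each list are divided by that list's
    maximum so both end up in (0, 1], then the higher score per node wins.
    """
    merged = {}
    for hits in (vector_hits, fulltext_hits):
        if not hits:
            continue
        max_score = max(score for _, score in hits)
        if max_score <= 0:
            continue  # cannot normalise a list with no positive score
        for node_id, score in hits:
            normalised = score / max_score
            merged[node_id] = max(merged.get(node_id, 0.0), normalised)
    # Highest normalised score first; ties keep their insertion order
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical scores: vector similarities and full-text relevance scores
vector_hits = [("movie-1", 0.92), ("movie-2", 0.88)]
fulltext_hits = [("movie-3", 4.0), ("movie-2", 2.0)]
top = merge_results(vector_hits, fulltext_hits, top_k=2)
print(top)
```

Normalising before merging matters because full-text relevance scores are unbounded, while vector similarity scores usually fall in a fixed range; without it, one method would dominate the ranking.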

When to Use HybridRetriever

  • Comprehensive Search:

    • When you need both semantic and keyword-based retrieval.

  • Diverse Data:

    • When your graph contains a mix of structured and unstructured data.

  • Enhanced Relevance:

    • When you want to improve both precision and recall in search results.

Setting Up HybridRetriever

Follow these steps to set up and use the HybridRetriever.

Open the 2-neo4j-graphrag\hybrid_retriever.py file in your code editor.

1. Initialize the Embedder

Create the embedder using an OpenAI embedding model:

python
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

2. Initialize the HybridRetriever

Set up the HybridRetriever with your Neo4j database and embedding model:

python
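from neo4j_graphrag.retrievers import HybridRetriever
from neo4j_graphrag.llm import OpenAILLM  # used in step 3

# `driver` is assumed to be the Neo4j driver instance set up earlier in the file
retriever = HybridRetriever(
    driver=driver,
    vector_index_name="moviePlots",  # vector index used for semantic search
    fulltext_index_name="plotFulltext",  # full-text index used for keyword search
    embedder=embedder,
    return_properties=["title", "plot"],  # node properties returned with each match
)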

3. Using the Retriever

Pass the retriever to a GraphRAG pipeline, which runs the hybrid search and hands the retrieved nodes to the LLM to generate an answer:

python
from neo4j_graphrag.generation import GraphRAG

# The LLM that generates the answer from the retrieved context
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "What is the name of the movie set in 1375 in Imperial China?"
# top_k limits how many nodes the retriever returns
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
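To check what the hybrid search returns before the LLM sees it, you can also call the retriever directly. The sketch below assumes the retriever exposes `search(query_text=..., top_k=...)` and returns a result whose `items` each carry a `content` attribute; verify this against your installed version of neo4j_graphrag. The helper name `retrieved_contents` is hypothetical.

```python
def retrieved_contents(retriever, query_text, top_k=5):
    """Return the content of each item the retriever finds for query_text.

    Assumes retriever.search(...) returns an object with an `items` list
    whose elements have a `content` attribute.
    """
    result = retriever.search(query_text=query_text, top_k=top_k)
    return [item.content for item in result.items]


# Usage with the retriever from step 2 (requires a running database):
# for content in retrieved_contents(retriever, query_text):
#     print(content)
```

Inspecting the raw results this way helps you tell a retrieval problem (the right movie never comes back) from a generation problem (it comes back, but the answer is still wrong).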

Tips for Effective Use

  • Consistent Embeddings:

    • Use the same model for both query and node embeddings to ensure compatibility.

  • Build Effective Full-text Indexes:

    • Create full-text indexes on relevant properties to enhance keyword search capabilities.

  • Leverage Full-text Indexes:

    • The HybridRetriever adds the most value over vector search alone when your queries contain keywords or phrases that the full-text index can match.
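A full-text index such as `plotFulltext` could be created with a Cypher statement along the following lines. This sketch only builds the statement string. The `Movie` label and the indexed `plot` property are assumptions inferred from the properties returned in step 2, and `fulltext_index_cypher` is a hypothetical helper. The syntax follows the `CREATE FULLTEXT INDEX` form used in Neo4j 5.

```python
def fulltext_index_cypher(index_name, label, properties):
    """Build a Cypher statement that creates a full-text index on a node label."""
    if not properties:
        raise ValueError("at least one property is required")
    # Each indexed property is referenced through the node variable `n`
    property_list = ", ".join(f"n.{prop}" for prop in properties)
    return (
        f"CREATE FULLTEXT INDEX {index_name} IF NOT EXISTS "
        f"FOR (n:{label}) ON EACH [{property_list}]"
    )


# Assumed label and property; adjust them to match your graph
statement = fulltext_index_cypher("plotFulltext", "Movie", ["plot"])
print(statement)
```

The printed statement can then be run against your database, for example in Neo4j Browser. Index only the properties you expect users to search by keyword, since every extra property adds to index size and can dilute relevance scores.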

Continue

When you are ready, you can move on to the next task.

Summary

You’ve learned how to use the HybridRetriever to combine semantic (vector) and keyword-based (full-text) search in Neo4j, and how to plug it into a GraphRAG pipeline to answer questions from the retrieved nodes.