Vector Retrieval

Understanding VectorRetriever

In previous tasks, you built knowledge graphs by extracting entities and relationships from text. In this task, you'll add semantic search with the VectorRetriever. It uses vector embeddings to find nodes by the meaning of their content rather than by keyword match.

How It Works

  • Query Embedding:

    • Converts the user’s query into a vector using an embedding model.

  • Similarity Search:

    • Finds nodes with embeddings similar to the query vector.

  • Retrieval:

    • Returns the top relevant nodes to augment the language model’s response.
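
The three steps above can be sketched in plain Python. This is only an illustration: the 3-dimensional vectors and film titles are made up, the helper names are hypothetical, and the search here is an exact cosine-similarity scan rather than the index-backed search the retriever performs against Neo4j.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_nodes(query_vector, node_embeddings, k):
    """Rank nodes by similarity to the query and keep the best k."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in node_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up toy embeddings; real embedding models produce far more dimensions.
node_embeddings = {
    "Space Quest": [0.9, 0.1, 0.0],
    "Kitchen Drama": [0.0, 0.2, 0.9],
    "Desert Trek": [0.8, 0.3, 0.1],
}
query_vector = [1.0, 0.2, 0.0]  # stand-in for the embedded user query

results = top_k_nodes(query_vector, node_embeddings, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

The two titles whose vectors point in roughly the same direction as the query rank first, which is the behavior the retriever relies on at much larger scale.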

When to Use VectorRetriever

  • Semantic Search:

    • When you need to find nodes based on the meaning of the content.

  • Unstructured Data:

    • Ideal for graphs containing text-rich nodes like products, reviews, or descriptions.

  • Precomputed Embeddings:

    • When nodes already store embeddings in a property covered by a vector index, so only the query has to be embedded at search time.

Setting Up VectorRetriever

Follow these steps to set up and use the VectorRetriever.

Step 1: Initialize the Embedder

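Create the embedder that converts text into vectors. The OpenAI client reads your API key from the OPENAI_API_KEY environment variable: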

python
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

Step 2: Initialize the VectorRetriever

Set up the VectorRetriever with your Neo4j database and embedding model:

python
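from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import VectorRetriever

# Placeholder connection details; replace with your own URI and credentials
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Build the retriever on top of an existing vector index
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",  # name of the vector index to query
    embedder=embedder,  # embedder created in Step 1
    return_properties=["title", "plot"],  # node properties to return
)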

Step 3: Using the Retriever

Pass the retriever to a GraphRAG pipeline. The pipeline runs the semantic search against your Neo4j database and gives the retrieved nodes to the LLM as context for its answer:

python
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM

llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "Give me 3 films where a hero goes on a journey"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
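
To see what the retriever returns before it reaches the LLM, you can call its search method directly. The sketch below assumes the result exposes an items list whose entries carry content and metadata, and that metadata includes a score key. The show_matches helper and StubRetriever class are hypothetical and exist only so the example runs without a database.

```python
from types import SimpleNamespace

def show_matches(retriever, query_text, top_k=5):
    """Return (score, content) pairs for the nodes the retriever found."""
    result = retriever.search(query_text=query_text, top_k=top_k)
    rows = []
    for item in result.items:
        # Assumption: metadata is a dict (or None) that may hold a "score" key
        score = (item.metadata or {}).get("score")
        rows.append((score, item.content))
    return rows

class StubRetriever:
    """Hypothetical stand-in that mimics the shape of a retriever result."""

    def search(self, query_text, top_k=5):
        items = [
            SimpleNamespace(
                content="{'title': 'Example Film'}",
                metadata={"score": 0.91},  # made-up score
            )
        ]
        return SimpleNamespace(items=items[:top_k])

rows = show_matches(StubRetriever(), "a hero goes on a journey", top_k=1)
for score, content in rows:
    print(score, content)
```

With a real setup, you would pass the retriever from Step 2 in place of StubRetriever().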

Tips for Effective Use

  • Consistent Embeddings:

    • Embed queries with the same model that produced the node embeddings. Vectors from different models are not comparable and often differ in dimension.

  • Property Retrieval:

    • Use the return_properties argument to choose which node properties come back, and result_formatter to reshape each result before it is passed on.

  • Adjust top_k:

    • Because this retriever uses an approximate nearest neighbor algorithm, it can occasionally miss a close match. Setting top_k to a higher value improves the chances of retrieving relevant results.
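
One way to apply the top_k tip is to request more results than you need and then trim them by score. The sketch below is an assumption about how you might post-process results, not part of the library: keep_confident is a hypothetical helper, and the titles and scores are made up.

```python
def keep_confident(matches, min_score, limit):
    """Keep at most `limit` matches scoring at least `min_score`, best first."""
    ranked = sorted(matches, key=lambda m: m["score"], reverse=True)
    return [m for m in ranked if m["score"] >= min_score][:limit]

# Made-up results, as if top_k=5 had been requested from the retriever.
matches = [
    {"title": "Film A", "score": 0.62},
    {"title": "Film B", "score": 0.91},
    {"title": "Film C", "score": 0.88},
    {"title": "Film D", "score": 0.70},
    {"title": "Film E", "score": 0.85},
]

kept = keep_confident(matches, min_score=0.8, limit=2)
print([m["title"] for m in kept])
```

The threshold and limit are tuning choices. Suitable values depend on your embedding model and data.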

Summary

You’ve learned how to use the VectorRetriever for semantic searches in Neo4j, enhancing your RAG pipeline by providing relevant data to language models based on content meaning rather than just keywords.