Vector Retrieval with Graph Traversal

Understanding VectorCypherRetriever

The VectorCypherRetriever combines vector similarity with Cypher queries to retrieve relevant nodes from your Neo4j recommendations database. This integration allows for more refined searches by applying graph-specific filters alongside semantic similarity.

Try this Cypher command in sandbox:

cypher
MATCH (m:Movie {title: "Jumanji"})
MATCH (actor:Actor)-[:ACTED_IN]->(m)
RETURN m, collect(actor) AS actors;

How It Works

  • Cypher Filtering:

    • Applies a Cypher query to filter nodes based on specific criteria.

  • Vector Similarity:

    • Utilizes vector embeddings to find semantically similar nodes within the filtered results.

  • Retrieval:

    • Returns the top relevant nodes that match both the Cypher filters and vector similarity.

When to Use VectorCypherRetriever

  • Combined Filtering and Similarity:

    • When you need to filter nodes based on properties or relationships before performing a similarity search.

  • Structured and Unstructured Data:

    • Your graph contains both structured properties and unstructured text data.

  • Advanced Querying:

    • Require complex retrieval logic that leverages Cypher’s capabilities alongside vector similarity.

Setting Up VectorCypherRetriever

Follow these steps to set up and use the VectorCypherRetriever.

1. Initialize the Embedder

Create the embedding function:

python
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

2. Initialize the VectorCypherRetriever

Set up the VectorCypherRetriever with your Neo4j database and embedding model:

python
from neo4j_graphrag.retrievers import VectorCypherRetriever

retrieval_query = """
MATCH
(actor:Actor)-[:ACTED_IN]->(node)
RETURN
node.title AS movie_title,
node.plot AS movie_plot,
collect(actor.name) AS actors;
"""

retriever = VectorCypherRetriever(
    driver,
    index_name="moviePlots",
    embedder=embedder,
    retrieval_query=retrieval_query,
)

3. Using the Retriever

Utilize the VectorCypherRetriever to perform semantic searches within your Neo4j database:

python
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM

llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
rag = GraphRAG(retriever=retriever, llm=llm)
query_text = "Who were the actors in the movie about the magic jungle board game?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)

Tips for Effective Use

  • Consistent Embeddings:

    • Use the same model for both query and node embeddings to ensure compatibility.

  • Property Retrieval:

    • Utilize result_formatter to retrieve and transform the desired properties effectively.

  • Leverage Cypher Proficiency:

    • The node variable is provided in the Cypher query, so leveraging your Cypher skills can maximize the effectiveness of this retriever by crafting more precise and efficient queries.

Continue

When you are ready, you can move on to the next task.

Summary

You’ve learned how to use VectorCypherRetriever to perform filtered semantic searches in Neo4j, enhancing your RAG pipeline by combining Cypher queries with vector similarity.