Text to Cypher retriever

The Text2CypherRetriever allows you to create GraphRAG pipelines that answer natural language questions by generating and executing Cypher queries against the knowledge graph.

Text to Cypher retrieval helps you get precise information from the knowledge graph in response to user questions. For example, how many lessons are in a course, which concepts are covered in a module, or how technologies relate to each other.

In this lesson, you will create a Text to Cypher retriever and use it to answer questions about the data in the knowledge graph.

Create a Text2CypherRetriever GraphRAG pipeline

Open genai-graphrag-python/text2cypher_rag.py and review the code.

python

import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.retrievers import Text2CypherRetriever

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"), 
    auth=(
        os.getenv("NEO4J_USERNAME"), 
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create the LLM (temperature 0 for more consistent output)
llm = OpenAILLM(
    model_name="gpt-4o",
    model_params={"temperature": 0}
)

# Cypher examples as input/query pairs
examples = [
    "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
]

# Build the retriever
retriever = Text2CypherRetriever(
    driver=driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    llm=llm,
    examples=examples,
)

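# Build the GraphRAG pipeline from the retriever and the LLM
rag = GraphRAG(
    retriever=retriever,
    llm=llm
)

query_text = "How many technologies are mentioned in the knowledge graph?"

# Ask the question and return the retrieved context with the answer
response = rag.search(
    query_text=query_text,
    return_context=True
)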

print(response.answer)
print("CYPHER :", response.retriever_result.metadata["cypher"])
print("CONTEXT:", response.retriever_result.items)

driver.close()

The retriever is configured with your database connection and the LLM, and is given an example of how to query nodes by name:

python

# Cypher examples as input/query pairs
examples = [
    "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
]

# Build the retriever
retriever = Text2CypherRetriever(
    driver=driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    llm=llm,
    examples=examples,
)
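
If you add more examples, a small helper can keep them in a consistent format. The following is a minimal sketch: `format_example` is a hypothetical helper, not part of neo4j_graphrag, and the second example assumes your graph contains `Technology` nodes.

```python
def format_example(user_input: str, query: str) -> str:
    """Format one input/query pair in the same style as the lesson's example."""
    # Collapse whitespace so a multi-line Cypher query fits on one line.
    # Note: this would also collapse repeated spaces inside string literals.
    one_line_query = " ".join(query.split())
    return f"USER INPUT: '{user_input}' QUERY: {one_line_query}"


examples = [
    format_example(
        "Find a node with the name $name?",
        """
        MATCH (node)
        WHERE toLower(node.name) CONTAINS toLower($name)
        RETURN node.name AS name, labels(node) AS labels
        """,
    ),
    # Assumption: the graph has nodes with the label Technology
    format_example(
        "How many technologies are there?",
        "MATCH (t:Technology) RETURN count(t) AS technologyCount",
    ),
]

for example in examples:
    print(example)
```

The resulting list can be passed to the retriever through the same `examples` parameter used above.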

The response includes the Cypher statement that the LLM generated and the results from executing the query:

python

print(response.answer)
print("CYPHER :", response.retriever_result.metadata["cypher"])
print("CONTEXT:", response.retriever_result.items)

Running the code for the query, "How many technologies are mentioned in the knowledge graph?", will produce a response similar to:

114 technologies are mentioned in the knowledge graph.

The output shows that the LLM generated the following Cypher query, which the retriever then executed:

cypher

MATCH (t:Technology) RETURN count(t) AS technologyCount

The following data was returned from the knowledge graph and passed as context to the LLM:

python

[RetrieverResultItem(content='<Record technologyCount=114>', metadata=None)]

Experiment with Different Questions

The Text2CypherRetriever passes the graph schema to the LLM to help it generate accurate Cypher queries.

Try asking different questions about the knowledge graph such as:

How does Neo4j relate to other technologies?
What entities exist in the knowledge graph?
Which lessons cover Generative AI concepts?

Review the responses, the generated Cypher queries, and the results passed to the LLM.
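
To compare several questions side by side, you could loop over them and collect the answer, the generated Cypher, and the context for each. The sketch below is one possible approach, not part of the lesson code: `ask_all` is a hypothetical helper, and `_StubRag` is a stand-in that returns canned data so the example runs without a database. In your own script, pass the `rag` object from text2cypher_rag.py instead.

```python
from types import SimpleNamespace


def ask_all(rag, questions):
    """Run each question through rag.search and collect the parts to review."""
    results = []
    for question in questions:
        response = rag.search(query_text=question, return_context=True)
        retriever_result = response.retriever_result
        # Read metadata defensively in case it is None or has no "cypher" key
        metadata = retriever_result.metadata or {}
        results.append({
            "question": question,
            "answer": response.answer,
            "cypher": metadata.get("cypher"),
            "context": [item.content for item in retriever_result.items],
        })
    return results


class _StubRag:
    """Stand-in for GraphRAG (assumption: mimics only the attributes used here)."""

    def search(self, query_text, return_context=False):
        item = SimpleNamespace(content="<Record technologyCount=114>", metadata=None)
        retriever_result = SimpleNamespace(
            items=[item],
            metadata={"cypher": "MATCH (t:Technology) RETURN count(t) AS technologyCount"},
        )
        return SimpleNamespace(
            answer=f"stub answer for: {query_text}",
            retriever_result=retriever_result,
        )


questions = [
    "How does Neo4j relate to other technologies?",
    "What entities exist in the knowledge graph?",
    "Which lessons cover Generative AI concepts?",
]

results = ask_all(_StubRag(), questions)
for result in results:
    print("QUESTION:", result["question"])
    print("ANSWER  :", result["answer"])
    print("CYPHER  :", result["cypher"])
    print("CONTEXT :", result["context"])
```

Collecting the results in a list rather than printing inside the loop makes it easier to compare the generated queries afterwards.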

Check your understanding

When would you use the Text2CypherRetriever in a GraphRAG pipeline?

❏ When you want to combine vector similarity with additional context from related graph nodes
✓ When you want to get precise information from the knowledge graph using natural language questions
❏ When you need to validate and correct automatically generated Cypher queries before execution
❏ When you want to find semantically similar content across multiple document chunks

Hint

Think about what makes text-to-cypher different from vector retrieval. What type of questions is it best suited for?

Solution

Text2CypherRetriever is used when you want to get precise information from the knowledge graph using natural language questions. It generates and executes Cypher queries to answer specific questions like "How many lessons are in a course?", "What concepts are covered in a module?", or "How do technologies relate to each other?". This is different from vector retrieval, which finds similar content, or entity extraction, which processes raw documents.

Lesson Summary

In this lesson, you:

Created a GraphRAG pipeline using the Text2CypherRetriever.
Used natural language questions to generate and execute Cypher queries against the knowledge graph.

In the next module, you will explore how to customize the SimpleKGPipeline to create knowledge graphs for different types of data, scenarios, and use cases.
