The Text2CypherRetriever allows you to create GraphRAG pipelines that answer natural language questions by generating and executing Cypher queries against the knowledge graph.
Text-to-Cypher retrieval helps you get precise information from the knowledge graph in response to user questions. For example, how many lessons are in a course, what concepts are covered in a module, or how technologies relate to each other.
Create a Text2CypherRetriever GraphRAG pipeline
Open genai-graphrag-python/text2cypher_rag.py and review the code.
import os
from dotenv import load_dotenv
load_dotenv()
from neo4j import GraphDatabase
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.retrievers import Text2CypherRetriever
# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"),
    auth=(
        os.getenv("NEO4J_USERNAME"),
        os.getenv("NEO4J_PASSWORD")
    )
)
llm = OpenAILLM(
    model_name="gpt-4o",
    model_params={"temperature": 0}
)
# Cypher examples as input/query pairs
examples = [
    "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
]
# Build the retriever
retriever = Text2CypherRetriever(
    driver=driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    llm=llm,
    examples=examples,
)
rag = GraphRAG(
    retriever=retriever,
    llm=llm
)
query_text = "How many technologies are mentioned in the knowledge graph?"
response = rag.search(
    query_text=query_text,
    return_context=True
)
print(response.answer)
print("CYPHER :", response.retriever_result.metadata["cypher"])
print("CONTEXT:", response.retriever_result.items)
driver.close()
The retriever is configured to use your database connection and given an example of how to query nodes by name:
# Cypher examples as input/query pairs
examples = [
    "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
]
# Build the retriever
retriever = Text2CypherRetriever(
    driver=driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    llm=llm,
    examples=examples,
)
The response includes the Cypher statement that the LLM generated and the results from executing the query:
print(response.answer)
print("CYPHER :", response.retriever_result.metadata["cypher"])
print("CONTEXT:", response.retriever_result.items)
Running the code for the query, "How many technologies are mentioned in the knowledge graph?", will produce a response similar to:
114 technologies are mentioned in the knowledge graph.
The context shows that the LLM generated and executed the following Cypher query:
MATCH (t:Technology) RETURN count(t) AS technologyCount
The following data was returned from the knowledge graph and passed as context to the LLM:
[RetrieverResultItem(content='<Record technologyCount=114>', metadata=None)]
Experiment with Different Questions
The Text2CypherRetriever passed the graph schema to the LLM to help it generate accurate Cypher queries.
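Alongside the schema, the `examples` list shown earlier also steers query generation. The following is a minimal sketch, assuming you prefer to keep input/query pairs as tuples and render them into the same string format used in this lesson. `format_example` is a hypothetical helper, not part of neo4j_graphrag, and the first pair is illustrative: only the `Technology` label is confirmed by this lesson's output.

```python
def format_example(user_input: str, cypher: str) -> str:
    """Render one pair in the 'USER INPUT: ... QUERY: ...' format used in this lesson."""
    # Collapse newlines and repeated spaces so multi-line Cypher fits on one line
    query = " ".join(cypher.split())
    return f"USER INPUT: '{user_input}' QUERY: {query}"


# Hypothetical input/query pairs (assumption: adapt labels to your own graph)
pairs = [
    (
        "How many technologies are there?",
        "MATCH (t:Technology) RETURN count(t) AS technologyCount",
    ),
    (
        "Find a node with the name $name?",
        """
        MATCH (node)
        WHERE toLower(node.name) CONTAINS toLower($name)
        RETURN node.name AS name, labels(node) AS labels
        """,
    ),
]

# A list of strings that could be passed as examples=... to the retriever
examples = [format_example(question, cypher) for question, cypher in pairs]
```

Keeping the pairs as data makes it easier to add or remove examples while you experiment with how they affect the generated Cypher.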
Try asking different questions about the knowledge graph such as:
- How does Neo4j relate to other technologies?
- What entities exist in the knowledge graph?
- Which lessons cover Generative AI concepts?
Review the responses, the generated Cypher queries, and the results passed to the LLM.
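To compare several questions side by side, a small loop can help. This is a sketch, assuming the `rag` object from the script above; `run_questions` is a hypothetical helper that relies only on the `search` call and response attributes already used in this lesson.

```python
def run_questions(rag, questions):
    """Ask each question and collect the answer, generated Cypher, and context items."""
    results = []
    for question in questions:
        response = rag.search(query_text=question, return_context=True)
        retrieved = response.retriever_result
        results.append({
            "question": question,
            "answer": response.answer,
            # Read the Cypher defensively in case metadata is None or lacks the key
            "cypher": (retrieved.metadata or {}).get("cypher"),
            "context": retrieved.items,
        })
    return results


# Usage with the rag pipeline from the lesson (run before driver.close()):
# for result in run_questions(rag, [
#     "How does Neo4j relate to other technologies?",
#     "What entities exist in the knowledge graph?",
# ]):
#     print(result["question"])
#     print("CYPHER :", result["cypher"])
#     print("ANSWER :", result["answer"])
```

Collecting the results in a list makes it easier to spot questions where the generated Cypher does not match the intent of the question.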
Check your understanding
When would you use the Text2CypherRetriever in a GraphRAG pipeline?
- ❏ When you want to combine vector similarity with additional context from related graph nodes
- ✓ When you want to get precise information from the knowledge graph using natural language questions
- ❏ When you need to validate and correct automatically generated Cypher queries before execution
- ❏ When you want to find semantically similar content across multiple document chunks
Hint
Think about what makes text-to-cypher different from vector retrieval - what type of questions is it best suited for?
Solution
Text2CypherRetriever is used when you want to get precise information from the knowledge graph using natural language questions. It generates and executes Cypher queries to answer specific questions like "How many lessons are in a course?", "What concepts are covered in a module?", or "How do technologies relate to each other?". This is different from vector retrieval, which finds similar content, or entity extraction, which processes raw documents.
Lesson Summary
In this lesson, you:
- Created a GraphRAG pipeline using the Text2CypherRetriever.
- Used natural language questions to generate and execute Cypher queries against the knowledge graph.
In the next module you will explore how to customize the SimpleKGPipeline to create knowledge graphs for different types of data, scenarios, and use cases.