To improve the accuracy of the generated Cypher queries, you can customize the generation prompt for your data requirements.
In this lesson, you will learn how to provide specific instructions and example queries to improve Cypher query generation.
Prompt
You can provide a custom prompt to the GraphCypherQAChain.
You can tailor the prompt to your use case to generate more accurate Cypher queries.
Update the cypher_qa.py program to include a custom prompt:
from langchain_core.prompts.prompt import PromptTemplate
# Cypher template
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
The question is:
{question}"""
cypher_prompt = PromptTemplate(
    input_variables=["schema", "question"],
    template=cypher_template
)
The prompt includes instructions for generating Cypher queries, including parameters for the schema and question.
When invoked, the GraphCypherQAChain will insert the schema and question parameters into the prompt.
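As a rough illustration of that substitution, the sketch below fills a shortened template using Python's built-in str.format, which treats simple {name} placeholders much like the default f-string format of PromptTemplate. The shortened template and the schema text are made-up stand-ins, not output from the chain.

```python
# Shortened stand-in for the lesson's cypher_template.
template = """Task:Generate Cypher statement to query a graph database.
Schema:
{schema}
The question is:
{question}"""

# Hypothetical schema text; the real chain supplies the schema from the graph.
example_schema = "(:Person)-[:ACTED_IN]->(:Movie)"
example_question = "Who acted in the movie The Matrix?"

# Plain str.format stands in for the substitution the chain performs.
filled_prompt = template.format(schema=example_schema, question=example_question)
print(filled_prompt)
```

Printing the filled prompt is a quick way to see the text the Cypher-generating LLM receives.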
Add the custom prompt to the GraphCypherQAChain:
cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph,
    llm=model,
    cypher_llm=cypher_model,
    cypher_prompt=cypher_prompt,
    allow_dangerous_requests=True,
    verbose=True,
)
Specific instructions
To manage specific data or business rules, you can provide specific instructions to the LLM when generating the Cypher.
For example, movie titles that start with "The" are stored in the graph as "Matrix, The" instead of "The Matrix".
Asking the LLM to generate Cypher queries without this information will result in no data being returned.
[user] Who acted in the movie The Matrix?
[assistant] I don't know.
Update the cypher_template to include a specific instruction to the LLM to handle this case:
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".
The complete prompt:
# Cypher template with additional instructions
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
The question is:
{question}"""
Update the code to ask the question, "Who acted in the movie The Matrix?", and review the results.
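When reviewing the results, it can help to know which title the generated Cypher should contain. The sketch below expresses the rewrite rule from the instruction as a plain Python function. The helper name to_stored_title is hypothetical and is not part of the lesson's code or the chain; the LLM applies the rule from the prompt, not from this function.

```python
def to_stored_title(title: str) -> str:
    """Move a leading 'The' to the end, e.g. 'The Matrix' -> 'Matrix, The'."""
    prefix = "The "
    if title.startswith(prefix) and len(title) > len(prefix):
        return f"{title[len(prefix):]}, The"
    # Titles without a leading "The " are left unchanged.
    return title


print(to_stored_title("The Matrix"))    # Matrix, The
print(to_stored_title("The 39 Steps"))  # 39 Steps, The
print(to_stored_title("Toy Story"))     # Toy Story
```

With verbose=True, the generated Cypher is printed, so you can compare the title it uses against the expected stored form.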
Examples
You can provide examples of questions and relevant Cypher queries to help the LLM generate more accurate Cypher queries.
Questions that relate to movie ratings often generate ambiguous or incorrect Cypher.
This is because the user rating is a property of the RATED relationship, and the Movie node also includes an imdbRating property.
Each example should pair a question with the expected Cypher query, for example:
Question: Get user ratings?
Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
Update the cypher_template to include the examples relating to movie ratings:
# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".
Schema:
{schema}
Examples:
1. Question: Get user ratings?
Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
The question is:
{question}"""
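One thing to watch when adding examples: if an example Cypher statement contains curly braces, such as an inline property map, the braces are likely to be read as template placeholders. The sketch below uses plain str.format as a stand-in for the template's formatting. It assumes that the default f-string format of PromptTemplate follows the same escaping rule, where doubled braces produce literal ones. The example Cypher is illustrative only.

```python
# An example with an inline property map; the single braces look like a placeholder.
unescaped = "Cypher: MATCH (m:Movie {title: 'Movie Title'}) RETURN m.title\n{question}"

# Doubling the braces marks them as literal text.
escaped = "Cypher: MATCH (m:Movie {{title: 'Movie Title'}}) RETURN m.title\n{question}"

try:
    unescaped.format(question="Does this movie exist?")
    unescaped_ok = True
except (KeyError, ValueError, IndexError):
    # str.format tries to look up a field named "title" and fails.
    unescaped_ok = False

filled = escaped.format(question="Does this movie exist?")
print(unescaped_ok)
print(filled)
```

The examples in this lesson avoid the problem by filtering with WHERE clauses, so they contain no braces.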
The complete code:
import os
from dotenv import load_dotenv
load_dotenv()
from langchain_neo4j import Neo4jGraph
from langchain_neo4j import GraphCypherQAChain
from langchain.chat_models import init_chat_model
from langchain_core.prompts.prompt import PromptTemplate
model = init_chat_model(
    "gpt-4o",
    model_provider="openai"
)
cypher_model = init_chat_model(
    "gpt-4o-mini",
    model_provider="openai",
    temperature=0.0
)
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)
# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".
Schema:
{schema}
Examples:
1. Question: Get user ratings?
Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
The question is:
{question}"""
cypher_prompt = PromptTemplate(
    input_variables=["schema", "question"],
    template=cypher_template
)
cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph,
    llm=model,
    cypher_llm=cypher_model,
    cypher_prompt=cypher_prompt,
    allow_dangerous_requests=True,
    verbose=True,
)
question = "What was the release date of the movie The 39 Steps?"
response = cypher_qa.invoke({"query": question})
print(response["result"])
Genres
The database contains data about movie genres.
When generating more complex Cypher queries, such as those that involve genres, the LLM may not generate the correct Cypher query.
These queries may require a specific example on how to retrieve genres from the graph:
- What is the highest user rated movie in the Horror genre?
- How many Sci-Fi movies has Tom Hanks acted in?
Your challenge is to provide an example Cypher query that demonstrates how to retrieve genres from the graph.
There is no right or wrong solution. Here is an example solution that provides a Cypher query to retrieve genres:
# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".
Schema:
{schema}
Examples:
1. Question: Get user ratings?
Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating
3. Question: Get movies for a genre?
Cypher: MATCH (m:Movie)-[:IN_GENRE]->(g:Genre) WHERE g.name = 'Genre Name' RETURN m.title AS movieTitle
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
The question is:
{question}"""
The example is generic enough to be used for any query that involves genres.
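As the number of examples grows, one option is to keep them as data and render the Examples section from that list. The sketch below is one possible way to do this and is not part of the lesson's code. The helper name format_examples is hypothetical, and the question and Cypher pairs follow the lesson's examples.

```python
# Question/Cypher pairs following the lesson's examples.
examples = [
    ("Get user ratings?",
     'MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating'),
    ("Get average rating for a movie?",
     "MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating"),
    ("Get movies for a genre?",
     "MATCH (m:Movie)-[:IN_GENRE]->(g:Genre) WHERE g.name = 'Genre Name' RETURN m.title AS movieTitle"),
]


def format_examples(pairs):
    """Render numbered 'Question / Cypher' lines for the Examples section."""
    lines = ["Examples:"]
    for number, (question, cypher) in enumerate(pairs, start=1):
        lines.append(f"{number}. Question: {question}")
        lines.append(f"Cypher: {cypher}")
    return "\n".join(lines)


examples_section = format_examples(examples)
print(examples_section)
```

The rendered text could then be placed between the Schema and Note parts of the template, in the position the hand-written examples occupy.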
Continue
When you are ready, continue to the next lesson.
Lesson Summary
In this lesson, you learned how to improve the quality of the generated Cypher queries by customizing the prompt with specific instructions and example queries for the LLM.
In the next lesson, you will learn how to restrict the schema used to generate Cypher queries.