Cypher Generation

To improve the accuracy of the generated Cypher queries, you can customize the generation prompt for your data requirements.

In this lesson, you will learn how to provide specific instructions and example queries to improve Cypher query generation.

Prompt

You can provide a custom prompt to the GraphCypherQAChain and tailor it to your use case to generate more accurate Cypher queries.

Update the cypher_qa.py program to include a custom prompt:

python

from langchain_core.prompts.prompt import PromptTemplate

# Cypher template
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.

Schema:
{schema}

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{question}"""

cypher_prompt = PromptTemplate(
    input_variables=["schema", "question"], 
    template=cypher_template
)

The prompt contains the instructions for generating Cypher queries and two parameters, schema and question. When invoked, the GraphCypherQAChain inserts the graph schema and the user's question into the prompt.
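As a rough illustration of that substitution, the sketch below fills the two placeholders using Python's built-in str.format. This is an approximation: PromptTemplate uses f-string style formatting by default, so the result should look similar, but the chain itself is not involved here. The schema text and question are hypothetical values chosen for the example.

```python
# A shortened copy of the lesson's template, kept brief for illustration.
cypher_template = """Task:Generate Cypher statement to query a graph database.

Schema:
{schema}

The question is:
{question}"""

# Hypothetical schema text (assumption); the real chain obtains the schema
# from the Neo4jGraph object.
example_schema = "(:Person)-[:ACTED_IN]->(:Movie {title: STRING})"
example_question = "Who acted in the movie Toy Story?"

# str.format stands in for PromptTemplate.format in this sketch.
filled_prompt = cypher_template.format(
    schema=example_schema,
    question=example_question,
)
print(filled_prompt)
```

Printing the filled prompt is a simple way to check what the LLM would see before it generates any Cypher.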

Add the custom prompt to the GraphCypherQAChain:

python

cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph, 
    llm=model, 
    cypher_llm=cypher_model,
    cypher_prompt=cypher_prompt,
    allow_dangerous_requests=True,
    verbose=True,
)

Specific instructions

To manage specific data or business rules, you can provide specific instructions to the LLM when generating the Cypher.

For example, movie titles that start with "The" are stored in the graph as "Matrix, The" instead of "The Matrix".

Without this information, the LLM is likely to generate a query that matches the title "The Matrix", which returns no data.

[user]
Who acted in the movie The Matrix?

[assistant]
I don't know.

Update the cypher_template to include a specific instruction to the LLM to handle this case:

For movie titles that begin with "The", move "the" to the end,
for example "The 39 Steps" becomes "39 Steps, The".
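To make the rule concrete, here is a minimal sketch of the same transformation in Python. The function name move_leading_article is hypothetical and is not part of the lesson's code. It only shows what the instruction asks the LLM to do with a title, and it could be handy for checking titles in generated queries.

```python
def move_leading_article(title: str) -> str:
    """Return the title with a leading "The" moved to the end.

    Hypothetical helper (assumption): mirrors the storage convention
    described in the lesson, e.g. "The Matrix" -> "Matrix, The".
    """
    prefix = "The "
    # Only rewrite titles that start with "The " and have text after it.
    if title.startswith(prefix) and len(title) > len(prefix):
        return f"{title[len(prefix):]}, The"
    return title


print(move_leading_article("The Matrix"))    # Matrix, The
print(move_leading_article("The 39 Steps"))  # 39 Steps, The
print(move_leading_article("Toy Story"))     # Toy Story
```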

The complete prompt:

python

# Cypher template with additional instructions
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".

Schema:
{schema}

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{question}"""

Update the code to ask the question, "Who acted in the movie The Matrix?", and review the results.

Examples

You can provide examples of questions and relevant Cypher queries to help the LLM generate more accurate Cypher queries.

Questions that relate to movie ratings often generate ambiguous or incorrect Cypher. This is because the rating is a property of the RATED relationship, and the Movie node also includes an imdbRating property.

Each example should pair a question with the expected Cypher query, for example:

Question: Get user ratings?
Cypher: MATCH (u:User)-[r:RATED]->(m:Movie)
        WHERE u.name = "User name"
        RETURN r.rating AS userRating
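If you keep several examples, it can help to build the examples section from a list. The sketch below is one possible approach, not the lesson's method. The helper names are hypothetical, and the third example is illustrative only. It also handles one detail worth knowing: because the template uses curly braces for its placeholders, any literal braces in an example query (such as a Cypher property map) need to be doubled so they are not treated as parameters. Here str.format stands in for PromptTemplate, which uses the same style of formatting by default.

```python
# Question and Cypher pairs. The first two come from the lesson; the third
# is illustrative only and shows a property map with literal braces.
examples = [
    ("Get user ratings?",
     'MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" '
     "RETURN r.rating AS userRating"),
    ("Get average rating for a movie?",
     "MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' "
     "RETURN avg(r.rating) AS userRating"),
    ("Find a movie by title?",
     "MATCH (m:Movie {title: 'Movie Title'}) RETURN m.title AS movieTitle"),
]


def escape_braces(text: str) -> str:
    """Double literal braces so the formatter does not read them as placeholders."""
    return text.replace("{", "{{").replace("}", "}}")


def build_examples_section(pairs) -> str:
    """Format the pairs as a numbered Examples block (hypothetical helper)."""
    lines = ["Examples:"]
    for number, (question, cypher) in enumerate(pairs, start=1):
        lines.append(f"{number}. Question: {question}")
        lines.append(f"   Cypher: {escape_braces(cypher)}")
    return "\n".join(lines)


examples_section = build_examples_section(examples)

# Shortened template for illustration; the schema value is a placeholder string.
template = (
    "Schema:\n{schema}\n"
    + examples_section
    + "\n\nThe question is:\n{question}"
)
rendered = template.format(
    schema="(hypothetical schema)",
    question="Get user ratings?",
)
print(rendered)
```

After formatting, the doubled braces collapse back to single braces, so the LLM sees the example query exactly as written.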

Update the cypher_template to include the examples relating to movie ratings:

python

# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".

Schema:
{schema}
Examples:
1. Question: Get user ratings?
   Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
   Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{question}"""

The complete code:

python

import os
from dotenv import load_dotenv
load_dotenv()

from langchain_neo4j import Neo4jGraph
from langchain_neo4j import GraphCypherQAChain
from langchain.chat_models import init_chat_model
from langchain_core.prompts.prompt import PromptTemplate

model = init_chat_model(
    "gpt-4o", 
    model_provider="openai"
)

cypher_model = init_chat_model(
    "gpt-4o-mini", 
    model_provider="openai",
    temperature=0.0
)

graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"), 
    password=os.getenv("NEO4J_PASSWORD"),
)



# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".

Schema:
{schema}
Examples:
1. Question: Get user ratings?
   Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
   Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{question}"""


cypher_prompt = PromptTemplate(
    input_variables=["schema", "question"], 
    template=cypher_template
)

cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph, 
    llm=model, 
    cypher_llm=cypher_model,
    cypher_prompt=cypher_prompt,
    allow_dangerous_requests=True,
    verbose=True,
)

question = "What was the release date of the movie The 39 Steps?"
response = cypher_qa.invoke({"query": question})
print(response["result"])

Genres

The database contains data about movie genres.

When generating more complex Cypher queries, such as those that involve genres, the LLM may not generate the correct Cypher query.

These queries may require a specific example on how to retrieve genres from the graph:

What is the highest user rated movie in the Horror genre?
How many Sci-Fi movies has Tom Hanks acted in?

Your challenge is to provide an example Cypher query that demonstrates how to retrieve genres from the graph.

There is no right or wrong solution. Here is an example solution that provides a Cypher query to retrieve genres:

python

# Cypher template with examples
cypher_template = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, for example "The 39 Steps" becomes "39 Steps, The".

Schema:
{schema}
Examples:
1. Question: Get user ratings?
   Cypher: MATCH (u:User)-[r:RATED]->(m:Movie) WHERE u.name = "User name" RETURN r.rating AS userRating
2. Question: Get average rating for a movie?
   Cypher: MATCH (m:Movie)<-[r:RATED]-(u:User) WHERE m.title = 'Movie Title' RETURN avg(r.rating) AS userRating
3. Question: Get movies for a genre?
   Cypher: MATCH (m:Movie)-[:IN_GENRE]->(g:Genre) WHERE g.name = 'Genre Name' RETURN m.title AS movieTitle

Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.

The question is:
{question}"""

The example is generic enough to be used for any query that involves genres.

Continue

When you are ready, continue to the next lesson.

Lesson Summary

In this lesson, you learned how to improve the quality of the generated Cypher queries by customizing the prompt and providing specific instructions and examples to the LLM.

In the next lesson, you will learn how to restrict the schema used to generate Cypher queries.
