Cypher QA Chain

In this lesson, you will use the Cypher QA (question-answering) chain to query the graph using natural language queries.

Cypher QA Chain

The LangChain GraphCypherQAChain:

Accepts a question.
Converts the question into a Cypher query using the graph schema.
Executes the query.
Uses the result to generate an answer.

If asked the question "What year was the movie Babe released?", the chain will generate messages like:

[human]
What year was the movie Babe released?
[system]
Generate a Cypher query based on this question and this graph schema.
[assistant]
MATCH (m:Movie)
WHERE m.title = 'Babe'
RETURN m.released

The Cypher query is then executed and the result returned.

[system]
Generate an answer based on these results [{m.released: 1995}].
[assistant]
The movie Babe was released in 1995.
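
The flow above can be sketched in plain Python. This is only an illustration of the four steps, not the GraphCypherQAChain source: the function names, the schema string, and the three stand-in callables are hypothetical, and the stand-ins return fixed values so the sketch runs without an LLM or a database.

```python
from typing import Callable


def answer_question(
    question: str,
    schema: str,
    generate_cypher: Callable[[str, str], str],
    run_query: Callable[[str], list],
    generate_answer: Callable[[str, list], str],
) -> str:
    """Illustrative text-to-Cypher flow (step 1 is accepting the question)."""
    cypher = generate_cypher(question, schema)  # step 2: question -> Cypher
    rows = run_query(cypher)                    # step 3: execute the query
    return generate_answer(question, rows)      # step 4: results -> answer


# Hypothetical stand-ins that return the values shown in the messages above.
def fake_generate_cypher(question: str, schema: str) -> str:
    return "MATCH (m:Movie) WHERE m.title = 'Babe' RETURN m.released"


def fake_run_query(cypher: str) -> list:
    return [{"m.released": 1995}]


def fake_generate_answer(question: str, rows: list) -> str:
    return f"The movie Babe was released in {rows[0]['m.released']}."


answer = answer_question(
    "What year was the movie Babe released?",
    "(:Movie {title, released})",  # assumed schema text, for illustration only
    fake_generate_cypher,
    fake_run_query,
    fake_generate_answer,
)
print(answer)
```

In the real chain, the two generation steps are LLM calls and the query step runs against Neo4j.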

Open the genai_integration_langchain/cypher_qa.py file and review the code:

python

cypher_qa.py

import os
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import init_chat_model
from langchain_neo4j import Neo4jGraph

# Initialize the LLM
model = init_chat_model(
    "gpt-4o", 
    model_provider="openai"
)

# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"), 
    password=os.getenv("NEO4J_PASSWORD"),
)

# Create the Cypher QA chain

# Invoke the chain

You will need to:

Create a GraphCypherQAChain instance.
Invoke the chain with a question.
Parse the results.

Update the code to create a GraphCypherQAChain instance:

python

from langchain_neo4j import GraphCypherQAChain

# Create the Cypher QA chain
cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph, 
    llm=model, 
    allow_dangerous_requests=True,
    verbose=True, 
)

The GraphCypherQAChain requires a graph connection and an LLM, which it uses to generate both the Cypher query and the response.

Allow Dangerous Requests

You are trusting the generation of Cypher to the LLM. It may generate Cypher queries that modify or corrupt data in the graph, or that expose sensitive information.

You have to opt in to this risk by setting the allow_dangerous_requests flag to True.

In a production environment, you should ensure that access to data is limited and that sufficient security is in place to prevent malicious queries. This could include the use of a read-only user or role-based access control.
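
As a complement to database-level permissions, an application could also screen a query before running it. The sketch below is an assumption, not part of LangChain or Neo4j: `looks_read_only` is a hypothetical helper and its keyword list is not exhaustive (for example, it does not catch procedures that write). It is deliberately conservative, so a query that merely mentions one of these words inside a string literal is also rejected. It does not replace a read-only user or role-based access control.

```python
import re

# Clauses that modify data or schema. Assumed list for this sketch; not exhaustive.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)


def looks_read_only(cypher: str) -> bool:
    """Return True when no write clause keyword appears in the query text."""
    return WRITE_CLAUSES.search(cypher) is None


print(looks_read_only("MATCH (m:Movie) WHERE m.title = 'Babe' RETURN m.released"))
print(looks_read_only("MATCH (m:Movie) DETACH DELETE m"))
```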

Invoke the chain with a question and print the result:

python

# Invoke the chain
question = "How many movies are in the Sci-Fi genre?"
response = cypher_qa.invoke({"query": question})
print(response["result"])
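
The response is a dictionary, and the answer text is under the "result" key. A minimal sketch of handling it defensively follows. `extract_answer` is a hypothetical helper, and `sample_response` is a hand-written dictionary shaped like the chain's response rather than real output.

```python
def extract_answer(response: dict) -> str:
    """Return the answer text, or a fallback message when none was produced."""
    answer = response.get("result")
    if not answer:
        return "No answer was generated."
    return str(answer).strip()


# Hand-written sample with the same keys the lesson code uses.
sample_response = {
    "query": "What year was the movie Babe released?",
    "result": "The movie Babe was released in 1995.",
}
print(extract_answer(sample_response))
print(extract_answer({"query": "An unanswered question"}))
```

In the lesson code, you could call `extract_answer(response)` in place of indexing `response["result"]` directly.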

Complete code:

python

import os
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import init_chat_model
from langchain_neo4j import Neo4jGraph
from langchain_neo4j import GraphCypherQAChain

# Initialize the LLM
model = init_chat_model(
    "gpt-4o", 
    model_provider="openai"
)

# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"), 
    password=os.getenv("NEO4J_PASSWORD"),
)

# Create the Cypher QA chain
cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph, 
    llm=model, 
    allow_dangerous_requests=True,
    verbose=True, 
)

# Invoke the chain
question = "How many movies are in the Sci-Fi genre?"
response = cypher_qa.invoke({"query": question})
print(response["result"])

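Run the code and review the results. The verbose output shows the generated Cypher, the query results, and the final answer. For example, for the question "What year was the movie Babe released?", you should see output similar to: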

> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (m:Movie {title: "Babe"})
RETURN m.year AS releaseYear

Full Context:
[{'releaseYear': 1995}]

> Finished chain.
The movie Babe was released in 1995.

Verbose

Setting the GraphCypherQAChain verbose parameter to True will print the generated Cypher query and the full context used to generate the answer.

Experiment with different questions to see how the Cypher QA chain generates different Cypher queries and answers, for example:

Who acted in the movie Aliens?
Who directed the movie Superman?
What is the plot of the movie Toy Story?
How many movies are in the Sci-Fi genre?
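
To try several questions in one run, you could loop over them. The sketch below assumes only that the chain has an `invoke` method that takes `{"query": ...}` and returns a dictionary with a "result" key, as in the code above. `ask_all` and `StubChain` are hypothetical names; the stub stands in for `cypher_qa` so the example runs offline, and it raises an error for one question to show how a failed query can be reported without stopping the loop.

```python
def ask_all(chain, questions):
    """Invoke the chain once per question and collect the answers."""
    answers = {}
    for question in questions:
        try:
            response = chain.invoke({"query": question})
            answers[question] = response["result"]
        except Exception as error:  # for example, the generated Cypher failed to run
            answers[question] = f"Failed: {error}"
    return answers


class StubChain:
    """Hypothetical stand-in for cypher_qa; returns canned responses."""

    def invoke(self, inputs):
        if "Superman" in inputs["query"]:
            raise ValueError("generated Cypher was invalid")
        return {"query": inputs["query"], "result": "stub answer"}


questions = [
    "Who acted in the movie Aliens?",
    "Who directed the movie Superman?",
]
for question, answer in ask_all(StubChain(), questions).items():
    print(f"{question} -> {answer}")
```

With the real chain, replace `StubChain()` with `cypher_qa`.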

Generated Cypher

The LLM may not always understand the graph schema or the question correctly. This can lead to the generated Cypher queries being incorrect or inefficient.

You will explore different ways to improve the quality of the generated Cypher queries in the next lesson.

Cypher LLM

You can use different LLMs to generate the Cypher query and the answer.

This is useful because the requirements for generating a Cypher query may be different from those for generating an answer.

Modify the program to create a separate LLM for Cypher query generation:

python

cypher_model = init_chat_model(
    "gpt-4o", 
    model_provider="openai",
    temperature=0.0
)

Temperature

The temperature is set to 0. When generating Cypher queries, you want the output to be as deterministic and precise as possible.

Update the GraphCypherQAChain to use the new LLM:

python

cypher_qa = GraphCypherQAChain.from_llm(
    graph=graph, 
    llm=model, 
    cypher_llm=cypher_model,
    allow_dangerous_requests=True,
    verbose=True,
)

Choosing the right LLM for Cypher generation and answer generation can improve the quality of the results.

Check your understanding

Why Use Different LLMs?

Why might you choose to use different LLMs for generating the Cypher query and generating the answer in the Cypher QA chain?

❏ To reduce the number of requests sent to the LLMs
✓ Because generating Cypher queries and generating answers may have different requirements
❏ To make the generation run faster
❏ Because only some LLMs can connect to Neo4j

Hint

Think about the differences between writing a database query and writing a natural language answer. What kinds of skills or capabilities might be needed for each task?

Solution

The answer is "Because generating Cypher queries and generating answers may have different requirements".

Generating a Cypher query and generating a natural language answer are two different tasks that may require different strengths from an LLM. Some models may be better at producing precise, structured queries, while others may excel at generating clear, conversational answers.

Using different LLMs allows you to choose the best model for each part of the process, improving the overall quality of the results.

Lesson Summary

In this lesson, you used the Cypher QA chain to query the graph using natural language queries.

In the next lesson, you will learn how to enhance the Cypher generation prompt to improve the quality of the generated Cypher queries.
