Have you ever heard of Cunningham’s Law? Cunningham’s Law states:
The best way to get the correct answer on the internet is not to ask a question; it’s to post the wrong answer.
It also seems that this is true when it comes to LLMs!
While LLMs are good at generating Cypher statements, they really excel at correcting the statements they have written.
To complete this challenge, you must modify the initCypherEvaluationChain() function in modules/agent/tools/cypher/cypher-evaluation.chain.ts to return a chain that validates the provided Cypher statement for accuracy and corrects it where necessary.
- Create a prompt instructing the LLM to analyze a Cypher statement and return a list of errors.
- Create a chain that replaces placeholders in the prompt for schema, question, cypher, and errors.
- Pass the formatted prompt to the LLM.
- Parse the output as a JSON object.
This chain will recursively correct the Cypher statement generated by the LLM.
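To make the idea of repeated correction concrete, here is a minimal sketch of a retry loop around an evaluator. It is not the course's implementation: the names EvaluationInput, EvaluationOutput, correctCypher, fakeEvaluate and the maxTries limit are all hypothetical. The evaluator is also kept synchronous and written as a loop rather than true recursion so the sketch runs without a model; a real chain call would be awaited.

```typescript
// Hypothetical shapes, inferred from the prompt's placeholders and JSON keys.
type EvaluationInput = {
  question: string;
  cypher: string;
  schema: string;
  errors: string[];
};
type EvaluationOutput = { cypher: string; errors: string[] };

// Stand-in for the evaluation chain (synchronous here for simplicity).
type Evaluator = (input: EvaluationInput) => EvaluationOutput;

// Re-run the evaluator until it reports no errors or the retry budget runs out.
function correctCypher(
  evaluate: Evaluator,
  input: EvaluationInput,
  maxTries = 5 // assumed limit, not taken from the course
): EvaluationOutput {
  let current: EvaluationOutput = { cypher: input.cypher, errors: input.errors };
  for (let attempt = 0; attempt < maxTries; attempt++) {
    // Feed the previous statement and its errors back into the evaluator.
    current = evaluate({ ...input, cypher: current.cypher, errors: current.errors });
    if (current.errors.length === 0) break;
  }
  return current;
}

// A fake evaluator that fixes one misspelled label and reports it.
const fakeEvaluate: Evaluator = ({ cypher }) => {
  if (cypher.includes(":Muvee")) {
    return {
      cypher: cypher.replace(":Muvee", ":Movie"),
      errors: ["Label Muvee does not exist"],
    };
  }
  return { cypher, errors: [] };
};

const fixed = correctCypher(fakeEvaluate, {
  question: "How many movies are in the database?",
  cypher: "MATCH (m:Muvee) RETURN count(m) AS count",
  schema: "",
  errors: [],
});
console.log(fixed.cypher);
```

The first pass repairs the label and reports the error; the second pass finds nothing left to fix, so the loop stops well before the retry limit.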
Prompt Template
In the initCypherEvaluationChain() function, use the PromptTemplate.fromTemplate() method to create a new prompt template with the following prompt.
You are an expert Neo4j Developer evaluating a Cypher statement written by an AI.
Check that the cypher statement provided below against the database schema to check that
the statement will answer the user's question.
Fix any errors where possible.
The query must:
* Only use the nodes, relationships and properties mentioned in the schema.
* Assign a variable to nodes or relationships when intending to access their properties.
* Use `IS NOT NULL` to check for property existence.
* Use the `elementId()` function to return the unique identifier for a node or relationship as `_id`.
* For movies, use the tmdbId property to return a source URL.
For example: `'https://www.themoviedb.org/movie/'+ m.tmdbId AS source`.
* For movie titles that begin with "The", move "the" to the end.
For example "The 39 Steps" becomes "39 Steps, The" or "the matrix" becomes "Matrix, The".
* For the role a person played in a movie, use the role property on the ACTED_IN relationship.
* Limit the maximum number of results to 10.
* Respond with only a Cypher statement. No preamble.
Respond with a JSON object with "cypher" and "errors" keys.
* "cypher" - the corrected cypher statement
* "corrected" - a boolean
* "errors" - A list of uncorrectable errors. For example, if a label,
relationship type or property does not exist in the schema.
Provide a hint to the correct element where possible.
Fixable Example #1:
* cypher:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
* errors: ["Variable `r` not defined (line 1, column 172 (offset: 171))"]
* response:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[r:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
Schema:
{schema}
Question:
{question}
Cypher Statement:
{cypher}
{errors}
Output Instructions
This prompt instructs the LLM to output a JSON object containing keys for cypher and errors.
This differs from the chains you have built before because they all return a string. To parse the output as a JSON object, you will use the JsonOutputParser class to interpret the response and coerce it into an object.
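To illustrate what coercing a text response into an object means, here is a minimal sketch using plain JSON.parse. It is not how JsonOutputParser is implemented; the parseEvaluation function and the EvaluationOutput type are hypothetical and simply mirror the keys named in the prompt.

```typescript
// Hypothetical type mirroring the keys the prompt asks the LLM to return.
type EvaluationOutput = { cypher: string; corrected: boolean; errors: string[] };

// Turn a raw text response into a typed object.
// Any text before the first "{" or after the last "}" is ignored.
function parseEvaluation(text: string): EvaluationOutput {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end < start) {
    throw new Error("No JSON object found in response");
  }

  const parsed = JSON.parse(text.slice(start, end + 1)) as Partial<EvaluationOutput>;
  if (typeof parsed.cypher !== "string") {
    throw new Error('Response is missing the "cypher" key');
  }

  return {
    cypher: parsed.cypher,
    corrected: parsed.corrected === true,
    // Default to an empty list when the model omits the key.
    errors: Array.isArray(parsed.errors) ? parsed.errors : [],
  };
}

const sample = parseEvaluation(
  'Sure: {"cypher": "MATCH (m:Movie) RETURN count(m) AS count", "corrected": true, "errors": []}'
);
console.log(sample.cypher, sample.errors.length);
```

Once the response is an object, the rest of the application can read cypher and errors directly instead of searching a string.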
Your code should resemble the following:
// Prompt template
const prompt = PromptTemplate.fromTemplate(`
You are an expert Neo4j Developer evaluating a Cypher statement written by an AI.
Check that the cypher statement provided below against the database schema to check that
the statement will answer the user's question.
Fix any errors where possible.
The query must:
* Only use the nodes, relationships and properties mentioned in the schema.
* Assign a variable to nodes or relationships when intending to access their properties.
* Use \`IS NOT NULL\` to check for property existence.
* Use the \`elementId()\` function to return the unique identifier for a node or relationship as \`_id\`.
* For movies, use the tmdbId property to return a source URL.
For example: \`'https://www.themoviedb.org/movie/'+ m.tmdbId AS source\`.
* For movie titles that begin with "The", move "the" to the end.
For example "The 39 Steps" becomes "39 Steps, The" or "the matrix" becomes "Matrix, The".
* For the role a person played in a movie, use the role property on the ACTED_IN relationship.
* Limit the maximum number of results to 10.
* Respond with only a Cypher statement. No preamble.
Respond with a JSON object with "cypher" and "errors" keys.
* "cypher" - the corrected cypher statement
* "corrected" - a boolean
* "errors" - A list of uncorrectable errors. For example, if a label,
relationship type or property does not exist in the schema.
Provide a hint to the correct element where possible.
Fixable Example #1:
* cypher:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
* errors: ["Variable \`r\` not defined (line 1, column 172 (offset: 171))"]
* response:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[r:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
Schema:
{schema}
Question:
{question}
Cypher Statement:
{cypher}
{errors}
`);
Braces in prompts
Use double braces ({{ and }}) to escape braces that are not text placeholders.
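The escaping rule is easier to see with a toy formatter. The sketch below is a hypothetical formatTemplate function, not the PromptTemplate implementation: it treats {{ and }} as literal braces and {name} as a placeholder to fill in.

```typescript
// Toy formatter: "{{" and "}}" become literal braces, "{name}" is replaced.
function formatTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template.replace(/\{\{|\}\}|\{(\w+)\}/g, (match: string, name?: string) => {
    if (match === "{{") return "{";
    if (match === "}}") return "}";

    // Anything else is a placeholder and must have a value.
    const value = name === undefined ? undefined : values[name];
    if (value === undefined) {
      throw new Error(`Missing value for placeholder ${match}`);
    }
    return value;
  });
}

const formatted = formatTemplate(
  "Example: MATCH (a:Actor {{name: 'Emil Eifrem'}}) Cypher Statement: {cypher}",
  { cypher: "MATCH (m:Movie) RETURN m.title" }
);
console.log(formatted);
```

Without the doubled braces, the formatter would read {name: 'Emil Eifrem'} as an attempt at a placeholder rather than as part of the Cypher example.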
Return a Runnable Sequence
Use the RunnableSequence.from() method to create a new chain.
return RunnableSequence.from([
// ...
])
Initial Inputs
The chain will be called recursively with the output described in the prompt, which includes an array of errors.
The prompt needs these errors as a string, so as the first step, use the RunnablePassthrough.assign() method to convert the array of errors into a single string.
return RunnableSequence.from<
CypherEvaluationChainInput,
CypherEvaluationChainOutput
>([
RunnablePassthrough.assign({
// Convert errors into an LLM-friendly list
errors: ({ errors }) => {
if (
errors === undefined ||
(Array.isArray(errors) && errors.length === 0)
) {
return "";
}
return `Errors: * ${
Array.isArray(errors) ? errors?.join("\n* ") : errors
}`;
},
}),
// ...
]);
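To see what this first step produces, the same conversion can be pulled out into a standalone function and run on sample inputs. This is only a sketch: formatErrors, withFormattedErrors and the ChainInput and PromptInput types are hypothetical names. The second function imitates the effect of the assign step, keeping every input key and overwriting errors with the formatted string.

```typescript
// Hypothetical input shape, based on the placeholders used in the prompt.
type ChainInput = {
  question: string;
  cypher: string;
  schema: string;
  errors?: string | string[];
};
type PromptInput = Omit<ChainInput, "errors"> & { errors: string };

// Same logic as the errors callback above, extracted for illustration.
function formatErrors(errors: string | string[] | undefined): string {
  if (errors === undefined || (Array.isArray(errors) && errors.length === 0)) {
    return "";
  }
  return `Errors: * ${Array.isArray(errors) ? errors.join("\n* ") : errors}`;
}

// Keep all original keys and replace `errors` with its string form.
function withFormattedErrors(input: ChainInput): PromptInput {
  return { ...input, errors: formatErrors(input.errors) };
}

const promptInput = withFormattedErrors({
  question: "Who acted in the matrix?",
  cypher: "MATCH (m:Muvee)-[:ACTS_IN]->(a:Person) RETURN a.name AS actor",
  schema: "",
  errors: ["Label Muvee does not exist", "Relationship type ACTS_IN does not exist"],
});
console.log(promptInput.errors);
// Errors: * Label Muvee does not exist
// * Relationship type ACTS_IN does not exist
```

An empty or missing list becomes an empty string, so the {errors} placeholder adds nothing to the prompt on the first evaluation.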
Format Prompt and Process
Now that you have the inputs that the prompt expects, update the chain to format the prompt, pass it to the LLM, and parse the output.
return RunnableSequence.from<
CypherEvaluationChainInput,
CypherEvaluationChainOutput
>([
RunnablePassthrough.assign({
// Convert errors into an LLM-friendly list
errors: ({ errors }) => {
if (
errors === undefined ||
(Array.isArray(errors) && errors.length === 0)
) {
return "";
}
return `Errors: * ${
Array.isArray(errors) ? errors?.join("\n* ") : errors
}`;
},
}),
prompt,
llm,
new JsonOutputParser<CypherEvaluationChainOutput>(),
]);
Completed Sequence
If you have followed the steps correctly, your code should resemble the following:
export default async function initCypherEvaluationChain(
llm: BaseLanguageModel
) {
// Prompt template
const prompt = PromptTemplate.fromTemplate(`
You are an expert Neo4j Developer evaluating a Cypher statement written by an AI.
Check that the cypher statement provided below against the database schema to check that
the statement will answer the user's question.
Fix any errors where possible.
The query must:
* Only use the nodes, relationships and properties mentioned in the schema.
* Assign a variable to nodes or relationships when intending to access their properties.
* Use \`IS NOT NULL\` to check for property existence.
* Use the \`elementId()\` function to return the unique identifier for a node or relationship as \`_id\`.
* For movies, use the tmdbId property to return a source URL.
For example: \`'https://www.themoviedb.org/movie/'+ m.tmdbId AS source\`.
* For movie titles that begin with "The", move "the" to the end.
For example "The 39 Steps" becomes "39 Steps, The" or "the matrix" becomes "Matrix, The".
* For the role a person played in a movie, use the role property on the ACTED_IN relationship.
* Limit the maximum number of results to 10.
* Respond with only a Cypher statement. No preamble.
Respond with a JSON object with "cypher" and "errors" keys.
* "cypher" - the corrected cypher statement
* "corrected" - a boolean
* "errors" - A list of uncorrectable errors. For example, if a label,
relationship type or property does not exist in the schema.
Provide a hint to the correct element where possible.
Fixable Example #1:
* cypher:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
* errors: ["Variable \`r\` not defined (line 1, column 172 (offset: 171))"]
* response:
MATCH (a:Actor {{name: 'Emil Eifrem'}})-[r:ACTED_IN]->(m:Movie)
RETURN a.name AS Actor, m.title AS Movie, m.tmdbId AS source,
elementId(m) AS _id, m.released AS ReleaseDate, r.role AS Role LIMIT 10
Schema:
{schema}
Question:
{question}
Cypher Statement:
{cypher}
{errors}
`);
return RunnableSequence.from<
CypherEvaluationChainInput,
CypherEvaluationChainOutput
>([
RunnablePassthrough.assign({
// Convert errors into an LLM-friendly list
errors: ({ errors }) => {
if (
errors === undefined ||
(Array.isArray(errors) && errors.length === 0)
) {
return "";
}
return `Errors: * ${
Array.isArray(errors) ? errors?.join("\n* ") : errors
}`;
},
}),
prompt,
llm,
new JsonOutputParser<CypherEvaluationChainOutput>(),
]);
}
Testing your changes
If you have followed the instructions, you should be able to run the following unit test to verify the response using the npm run test command.
npm run test cypher-evaluation.chain.test.ts
View Unit Test
import { ChatOpenAI } from "@langchain/openai";
import { config } from "dotenv";
import { BaseChatModel } from "langchain/chat_models/base";
import { RunnableSequence } from "@langchain/core/runnables";
import { Neo4jGraph } from "@langchain/community/graphs/neo4j_graph";
import initCypherEvaluationChain from "./cypher-evaluation.chain";
describe("Cypher Evaluation Chain", () => {
let graph: Neo4jGraph;
let llm: BaseChatModel;
let chain: RunnableSequence;
beforeAll(async () => {
config({ path: ".env.local" });
graph = await Neo4jGraph.initialize({
url: process.env.NEO4J_URI as string,
username: process.env.NEO4J_USERNAME as string,
password: process.env.NEO4J_PASSWORD as string,
database: process.env.NEO4J_DATABASE as string | undefined,
});
llm = new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: "gpt-3.5-turbo",
temperature: 0,
configuration: {
baseURL: process.env.OPENAI_API_BASE,
},
});
chain = await initCypherEvaluationChain(llm);
});
afterAll(async () => {
await graph.close();
});
it("should fix a non-existent label", async () => {
const input = {
question: "How many movies are in the database?",
cypher: "MATCH (m:Muvee) RETURN count(m) AS count",
schema: graph.getSchema(),
errors: ["Label Muvee does not exist"],
};
const { cypher, errors } = await chain.invoke(input);
expect(cypher).toContain("MATCH (m:Movie) RETURN count(m) AS count");
expect(errors.length).toBe(1);
let found = false;
for (const error of errors) {
if (error.includes("label Muvee does not exist")) {
found = true;
}
}
expect(found).toBe(true);
});
it("should fix a non-existent relationship", async () => {
const input = {
question: "Who acted in the matrix?",
cypher:
'MATCH (m:Muvee)-[:ACTS_IN]->(a:Person) WHERE m.name = "The Matrix" RETURN a.name AS actor',
schema: graph.getSchema(),
errors: [
"Label Muvee does not exist",
"Relationship type ACTS_IN does not exist",
],
};
const { cypher, errors } = await chain.invoke(input);
expect(cypher).toContain("MATCH (m:Movie");
expect(cypher).toContain(":ACTED_IN");
expect(errors.length).toBeGreaterThanOrEqual(2);
let found = false;
for (const error of errors) {
if (error.includes("ACTS_IN")) {
found = true;
}
}
expect(found).toBe(true);
});
it("should return no errors if the query is fine", async () => {
const cypher = "MATCH (m:Movie) RETURN count(m) AS count";
const input = {
question: "How many movies are in the database?",
cypher,
schema: graph.getSchema(),
errors: ["Label Muvee does not exist"],
};
const { cypher: updatedCypher, errors } = await chain.invoke(input);
expect(updatedCypher).toContain(cypher);
expect(errors.length).toBe(0);
});
it("should keep variables in relationship", async () => {
const cypher =
"MATCH (a:Actor {name: 'Emil Eifrem'})-[r:ACTED_IN]->" +
"(m:Movie {title: 'Neo4j - Into the Graph'}) RETURN r.role AS Role";
const input = {
question: "What role did Emil Eifrem play in Neo4j - Into the Graph",
cypher,
schema: graph.getSchema(),
errors: [],
};
const { cypher: updatedCypher, errors } = await chain.invoke(input);
expect(updatedCypher).toContain(cypher);
expect(errors.length).toBe(0);
});
});
It works!
If all the tests have passed, you will have a chain that evaluates a Cypher statement and provides hints if any errors are detected.
Summary
In this lesson, you built a chain that evaluates the Cypher statement generated in the Cypher Generation chain.
In the next lesson, you will create a chain that will generate an authoritative answer to a question based on the context provided.