The LLM generates Cypher queries based on the schema of the graph. When a query is submitted, the schema is automatically read from the database and added to the prompt.
In this lesson, you will learn how to restrict the schema to only include certain node labels or relationship types.
Restricting the schema can help generate better Cypher by:
-
Reducing the complexity of the generated Cypher queries.
-
Helping the LLM focus on the relevant parts of the graph.
-
Excluding irrelevant or unwanted parts of the graph that may confuse the LLM.
More generally, the more focused the schema, the better the LLM can generate Cypher queries.
Restricting the schema
You can restrict the schema by either providing the GraphCypherQAChain
with a list of node labels and relationship types to include or exclude from the schema.
If you wanted to just include data about movies and their directors, you could provide the following list of node labels and relationship types to include:
-
Movie
-
DIRECTED
-
Director
You provide the types as a list to the include_types
parameter of the GraphCypherQAChain
:
Unresolved directive in lesson.adoc - include::{repository-raw}/new-course/genai-integration-langchain/solutions/cypher_qa_schema.py[tag=cypher_qa_include]
When prompted with a question about movies, the LLM will only be able to respond with answers related to movies and directors.
Alternatively, if you wanted to exclude ratings data, you could provide User
and RATED
as the types to the exclude_types
parameter:
Unresolved directive in lesson.adoc - include::{repository-raw}/new-course/genai-integration-langchain/solutions/cypher_qa_schema.py[tag=cypher_qa_exclude]
How you restrict the schema will depend on the graph structure and the types of questions you want to answer.
Check Your Understanding
Why Restrict the Schema?
Why might you want to restrict the schema when generating Cypher queries with an LLM? (Select all that apply)
-
✓ To reduce the complexity of the generated Cypher queries
-
❏ To improve query performance
-
✓ To help the LLM focus on the relevant parts of the graph
-
✓ To exclude irrelevant or unwanted parts of the graph that may confuse the LLM
Hint
How would limiting the available node labels and relationship types might affect the LLM’s ability to generate accurate and relevant Cypher queries?
Solution
The correct answers are:
-
✓ To reduce the complexity of the generated Cypher queries
-
✓ To help the LLM focus on the relevant parts of the graph
-
✓ To exclude irrelevant or unwanted parts of the graph that may confuse the LLM
While restricting the schema may indirectly impact query performance, this is not guaranteed.
Lesson Summary
In this lesson, you learned how to restrict the schema used to generate Cypher queries by including or excluding specific node labels and relationship types.
In the next lesson, you will create a text to Cypher retriever and add it to the LangChain agent.