You can query Neo4j from a LangChain application using the Neo4jGraph class.
The Neo4jGraph class acts as the connection to the database when using other LangChain components, such as retrievers and agents.
In this lesson, you will modify the simple LangChain agent to be able to answer questions about a graph database schema.
The database contains information about movies, actors, and user ratings.
Query Neo4j
To query Neo4j you need to:
- Create a Neo4jGraph instance and connect to a database
- Run a Cypher statement to get data from the database
Open the genai-integration-langchain/neo4j_query.py file:
import os
from dotenv import load_dotenv
load_dotenv()
# Create Neo4jGraph instance
# Run a query and print the result
Update the code to:
- Connect to the Neo4j database using your connection details:

from langchain_neo4j import Neo4jGraph

# Create Neo4jGraph instance
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)

- Run a Cypher query to retrieve data from the database:

# Run a query and print the result
result = graph.query("""
MATCH (m:Movie {title: "Mission: Impossible"})<-[a:ACTED_IN]-(p:Person)
RETURN p.name AS actor, a.role AS role
""")
print(result)

The results are returned as a list of dictionaries, where each dictionary represents a row in the results.
[
    {'actor': 'Henry Czerny', 'role': 'Eugene Kittridge'},
    {'actor': 'Emmanuelle Béart', 'role': 'Claire Phelps'},
    {'actor': 'Tom Cruise', 'role': 'Ethan Hunt'},
    {'actor': 'Jon Voight', 'role': 'Jim Phelps'}
]
You can use data returned from queries as context for an agent.
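One way to use query rows as context is to flatten each dictionary into a line of text before placing it in a prompt. The sketch below assumes rows shaped like the output shown in this lesson. `rows_to_context` is a hypothetical helper for illustration, not part of LangChain or the Neo4j driver.

```python
from typing import Any


def rows_to_context(rows: list[dict[str, Any]]) -> str:
    """Render each result row as one 'key: value' line (hypothetical helper)."""
    lines = []
    for row in rows:
        lines.append(", ".join(f"{key}: {value}" for key, value in row.items()))
    return "\n".join(lines)


# Sample rows copied from the query output shown in this lesson
rows = [
    {"actor": "Tom Cruise", "role": "Ethan Hunt"},
    {"actor": "Jon Voight", "role": "Jim Phelps"},
]

context = rows_to_context(rows)
print(context)
```

Passing the list of dictionaries to the prompt unchanged also works. Flattening is only a readability choice.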
Schema
You are going to modify the agent to retrieve the database schema and add it to the context.
You can view the database schema using the Cypher query:
CALL db.schema.visualization()
Open the genai-integration-langchain/scheme_agent.py file:
import os
from dotenv import load_dotenv
load_dotenv()
from langchain.chat_models import init_chat_model
from langgraph.graph import START, StateGraph
from langchain_core.prompts import PromptTemplate
from typing_extensions import List, TypedDict
# Connect to Neo4j
# graph =
# Initialize the LLM
model = init_chat_model("gpt-4o", model_provider="openai")
# Create a prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Answer:"""
prompt = PromptTemplate.from_template(template)
# Define state for application
class State(TypedDict):
    question: str
    context: List[dict]
    answer: str

# Define functions for each step in the application
# Retrieve context
def retrieve(state: State):
    context = [
        {"data": "None"}
    ]
    return {"context": context}

# Generate the answer based on the question and context
def generate(state: State):
    messages = prompt.invoke({"question": state["question"], "context": state["context"]})
    response = model.invoke(messages)
    return {"answer": response.content}

# Define application steps
workflow = StateGraph(State).add_sequence([retrieve, generate])
workflow.add_edge(START, "retrieve")
app = workflow.compile()
# Run the application
question = "What data is in the context?"
response = app.invoke({"question": question})
print("Answer:", response["answer"])
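Each step in the workflow receives the current state and returns only the keys it wants to update. The sketch below imitates that flow in plain Python so it can run without LangGraph or an LLM. `run_sequence` and the stubbed `generate` are simplified stand-ins introduced for illustration, not the library's implementation.

```python
def retrieve(state):
    # Stand-in for the real retrieve step: returns only the keys it updates
    return {"context": [{"data": "None"}]}


def generate(state):
    # Stub instead of an LLM call: reports how many context items it received
    return {"answer": f"{len(state['context'])} context item(s) for: {state['question']}"}


def run_sequence(steps, state):
    """Call each step in order and merge its partial update into the state."""
    state = dict(state)
    for step in steps:
        state.update(step(state))
    return state


final_state = run_sequence([retrieve, generate], {"question": "What data is in the context?"})
print(final_state["answer"])
```

The point of the sketch is that `generate` can read `context` only because `retrieve` ran first and its update was merged into the shared state.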
Add the code to create a connection to the Neo4j database using the Neo4jGraph class:
from langchain_neo4j import Neo4jGraph
# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)
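`os.getenv` returns `None` when a variable is not set, so it can help to check the connection settings before creating the Neo4jGraph instance. A minimal sketch, assuming the three variable names used above. `missing_settings` is a hypothetical helper and the URI in the example is a placeholder value.

```python
import os

REQUIRED_SETTINGS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def missing_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Example with an explicit mapping; call with no argument to check the real environment
example_env = {"NEO4J_URI": "neo4j://localhost:7687", "NEO4J_USERNAME": "neo4j"}
print(missing_settings(example_env))  # ['NEO4J_PASSWORD']
```

Calling `missing_settings()` after `load_dotenv()` would list any entries still missing from the `.env` file.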
Modify the retrieve function to use the Neo4jGraph instance to query the database and retrieve the schema information:
# Retrieve context
def retrieve(state: State):
    context = graph.query("CALL db.schema.visualization()")
    return {"context": context}
The agent will add the database schema to the context and use it to answer questions about the database.
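To see roughly what the model receives, the prompt can be filled in by hand. The sketch below uses Python's built-in `str.format` as a stand-in for `PromptTemplate`. The `context` value is a made-up placeholder, not real output from `db.schema.visualization()`.

```python
# Same template text as in the agent
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Answer:"""

# Placeholder context; the real value is whatever graph.query() returns
context = [{"nodes": ["..."], "relationships": ["..."]}]

filled = template.format(context=context, question="How is the graph structured?")
print(filled)
```

The list is converted to its string form when it is inserted, so the schema reaches the model as raw text inside the prompt.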
Update the question and run the application:
question = "How is the graph structured?"
Complete code:
import os
from dotenv import load_dotenv
load_dotenv()
from langchain.chat_models import init_chat_model
from langgraph.graph import START, StateGraph
from langchain_core.prompts import PromptTemplate
from typing_extensions import List, TypedDict
from langchain_neo4j import Neo4jGraph
# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)
# Initialize the LLM
model = init_chat_model("gpt-4o", model_provider="openai")
# Create a prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Answer:"""
prompt = PromptTemplate.from_template(template)
# Define state for application
class State(TypedDict):
    question: str
    context: List[dict]
    answer: str

# Define functions for each step in the application
# Retrieve context
def retrieve(state: State):
    context = graph.query("CALL db.schema.visualization()")
    return {"context": context}

# Generate the answer based on the question and context
def generate(state: State):
    messages = prompt.invoke({"question": state["question"], "context": state["context"]})
    response = model.invoke(messages)
    return {"answer": response.content}

# Define application steps
workflow = StateGraph(State).add_sequence([retrieve, generate])
workflow.add_edge(START, "retrieve")
app = workflow.compile()
# Run the application
question = "How is the graph structured?"
response = app.invoke({"question": question})
print("Answer:", response["answer"])
When asked "How is the graph structured?", the agent will respond with a summary of the graph, typically including details about nodes, relationships, and their properties.
Example response:
The graph is structured using nodes and relationships.
- Nodes:
  - Movie: Has indexes on year, imdbRating, released, imdbId, title, and tagline. It has unique constraints on movieId and tmdbId.
  - User: Has an index on name and a unique constraint on userId.
  - Actor, Director, and Genre: Do not have indexes listed. The Genre node has a unique constraint on name.
  - Person: Has an index on name and a unique constraint on tmdbId.
- Relationships:
  - ACTED_IN: Connects Actor, Director, or Person nodes with Movie nodes.
  - RATED: Connects User nodes to Movie nodes.
  - IN_GENRE: Connects Movie nodes to Genre nodes.
  - DIRECTED: Connects Person, Actor, or Director nodes to Movie nodes.
In summary, the graph represents a network where movies are connected to various entities like users, genres, actors, directors, and general persons through specific relationships indicating roles like acting, directing, rating, and categorization into genres.
Experiment by asking the agent other questions about the database schema. Here are some examples:
- How is the graph structured?
- How are Movie nodes connected to Person nodes?
- What relationships are in the graph?
- What properties do Movie nodes have?
- How would I find user ratings for a movie?
When you are ready, continue to the next module.
Lesson Summary
In this lesson, you learned how to query Neo4j from LangChain and updated the agent to retrieve the database schema.
In the next module, you will learn how to use vectors to create RAG and GraphRAG retrievers.