Integrate a Python Chatbot

In this optional challenge, you will complete a Python chatbot that can answer questions using the knowledge graph.

Chatbot

The llm-knowledge-graph repository includes a version of the Python chatbot from the course Build a Neo4j-backed Chatbot using Python. You may find it useful to complete that course before attempting this challenge.

You can view the code in the llm-knowledge-graph/chatbot directory.

Your task is to update the vector and cypher tools to interact with the knowledge graph by generating Cypher queries and by querying the Neo4j vector index with a retriever.

You can run the chatbot using the streamlit run command:

bash
streamlit run llm-knowledge-graph/chatbot/bot.py
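
The chatbot loads its configuration with load_dotenv, so make sure OPENAI_API_KEY and the NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD connection variables are set in your .env file before you start Streamlit.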

Vector tool

The vector tool uses the Neo4j vector index to retrieve the chunks most similar to the user’s input.

The code is in the llm-knowledge-graph/chatbot/tools/vector.py file:

python
import os
from dotenv import load_dotenv
load_dotenv()

from llm import llm, embeddings
from graph import graph

# Your task is to update this tool to query the Neo4j vector index and return the most relevant documents

# Create the chunk_vector
# chunk_vector = Neo4jVector.from_existing_index

# Create the instructions and prompt
# instructions = ""
# prompt = ChatPromptTemplate.from_messages

# Create the chunk_retriever and chain
# chunk_retriever = chunk_vector.as_retriever()
# chunk_chain = create_stuff_documents_chain(llm, prompt)
# chunk_retriever = create_retrieval_chain(chunk_retriever, chunk_chain)

def find_chunk(q):
    # Invoke the chunk retriever
    # return chunk_retriever.invoke({"input": q})

    # Currently this function returns an empty context
    return {
        "input": q,
        "context": []
    }

Your task is to update it to use the Neo4j vector retriever you created in the Integrate with a Retriever lesson.
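
As a reminder, the sketch below shows one way to wire those pieces together. It reuses the llm and embeddings objects imported at the top of the file, and assumes the same index name and node properties as the solution at the end of this page:

python
from langchain_community.vectorstores import Neo4jVector
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

# Connect to the existing "vector" index over the Chunk text and embedding properties
chunk_vector = Neo4jVector.from_existing_index(
    embeddings,
    graph=graph,
    index_name="vector",
    embedding_node_property="embedding",
    text_node_property="text",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Use the given context to answer the question. Context: {context}"),
    ("human", "{input}"),
])

# Retrieve similar chunks, then pass them to the LLM as context
chunk_retriever = create_retrieval_chain(
    chunk_vector.as_retriever(),
    create_stuff_documents_chain(llm, prompt),
)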

Cypher tool

The cypher tool generates Cypher queries based on the user’s input.

The code is in the llm-knowledge-graph/chatbot/tools/cypher.py file:

python
import os
from dotenv import load_dotenv
load_dotenv()

from llm import llm
from graph import graph

from langchain.chains import GraphCypherQAChain
from langchain.prompts import PromptTemplate

# Your task is to update this tool to generate and run a Cypher statement, and return the results.
    
# Create cypher_generation prompt
# CYPHER_GENERATION_TEMPLATE = ""

# Create the cypher_chain
# cypher_chain = GraphCypherQAChain.from_llm

def run_cypher(q):
    # Invoke the cypher_chain
    # return cypher_chain.invoke({"query": q})

    # Currently this function returns an empty result
    return {}

Your task is to update this code to use the Cypher generation chain you created in the Query the Knowledge Graph with an LLM lesson.
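
The shape of that chain is sketched below. The generation template is abbreviated here; a full example, including the {schema} and {question} placeholders it must expose, appears in the solution at the end of this page:

python
# Build a prompt that asks the LLM to generate Cypher from the graph schema
cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["schema", "question"],
)

# Generate a Cypher statement, run it against the graph, and summarise the result
cypher_chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True,
)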

Agent prompts and tools

The agent will decide what action to take based on the user’s input, the prompt, and the tools available.

You may need to change these to improve the chatbot’s performance for your use case and the data in your knowledge graph.

The agent code is in the llm-knowledge-graph/chatbot/agent.py file.

View agent.py
python
from llm import llm
from graph import graph
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain.schema import StrOutputParser
from langchain.tools import Tool
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain import hub
from utils import get_session_id

from tools.vector import find_chunk
from tools.cypher import run_cypher

chat_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an AI expert providing information about Neo4j and Knowledge Graphs."),
        ("human", "{input}"),
    ]
)

kg_chat = chat_prompt | llm | StrOutputParser()

tools = [
    Tool.from_function(
        name="General Chat",
        description="For general knowledge graph chat not covered by other tools",
        func=kg_chat.invoke,
    ), 
    Tool.from_function(
        name="Lesson content search",
        description="For when you need to find information in the lesson content",
        func=find_chunk, 
    ),
    Tool.from_function(
        name="Knowledge Graph information",
        description="For when you need to find information about the entities and relationship in the knowledge graph",
        func = run_cypher,
    )
]

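# Conversation history is stored in Neo4j, keyed by the session id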
def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

agent_prompt = PromptTemplate.from_template("""
You are a Neo4j, Knowledge graph, and generative AI expert.
Be as helpful as possible and return as much information as possible.
Only answer questions that relate to Neo4j, graphs, cypher, generative AI, or associated subjects.
        
Always use a tool and only use the information provided in the context.

TOOLS:
------

You have access to the following tools:

{tools}

To use a tool, please use the following format:

```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

```
Thought: Do I need to use a tool? No
Final Answer: [your response here]
```

Begin!

Previous conversation history:
{chat_history}

New input: {input}
{agent_scratchpad}
""")

agent = create_react_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    handle_parsing_errors=True,
    verbose=True
    )

chat_agent = RunnableWithMessageHistory(
    agent_executor,
    get_memory,
    input_messages_key="input",
    history_messages_key="chat_history",
)

def generate_response(user_input):
    """
    Create a handler that calls the Conversational agent
    and returns a response to be rendered in the UI
    """

    response = chat_agent.invoke(
        {"input": user_input},
        {"configurable": {"session_id": get_session_id()}},)

    return response['output']

The prompt sets the instructions for the agent and how it should decide what tool to use:

Agent prompt
python
agent_prompt = PromptTemplate.from_template("""
You are a Neo4j, Knowledge graph, and generative AI expert.
Be as helpful as possible and return as much information as possible.
Only answer questions that relate to Neo4j, graphs, cypher, generative AI, or associated subjects.
        
Always use a tool and only use the information provided in the context.

TOOLS:
------

You have access to the following tools:

{tools}

To use a tool, please use the following format:

```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

```
Thought: Do I need to use a tool? No
Final Answer: [your response here]
```

Begin!

Previous conversation history:
{chat_history}

New input: {input}
{agent_scratchpad}
""")

The tools are the functions the agent can use to interact with the knowledge graph:

Tools
python
tools = [
    Tool.from_function(
        name="General Chat",
        description="For general knowledge graph chat not covered by other tools",
        func=kg_chat.invoke,
    ), 
    Tool.from_function(
        name="Lesson content search",
        description="For when you need to find information in the lesson content",
        func=find_chunk, 
    ),
    Tool.from_function(
        name="Knowledge Graph information",
        description="For when you need to find information about the entities and relationship in the knowledge graph",
        func = run_cypher,
    )
]

Test the chatbot

Test the chatbot by asking the agent questions about data held within the knowledge graph.

You can tune the agent’s prompt and tool descriptions to improve the chatbot’s performance.
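
For example, a more specific tool description helps the agent pick the right tool for your data. The wording below is illustrative; adapt it to the documents in your knowledge graph:

python
from langchain.tools import Tool
from tools.vector import find_chunk

# A hypothetical rewording that spells out when the tool applies
lesson_search = Tool.from_function(
    name="Lesson content search",
    description=(
        "Use when the question is about the lesson text itself, "
        "for example 'What does the lesson say about vector indexes?'"
    ),
    func=find_chunk,
)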

Solution

Here is one possible solution for the vector and cypher tools. Remember, there is no single correct answer; you should customize the chatbot to suit your needs.

The performance of the chatbot depends on the data, tools, and prompts, which may still need tuning.

View the vector.py solution
python
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_community.graphs import Neo4jGraph
from langchain_community.vectorstores import Neo4jVector
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    openai_api_key=os.getenv('OPENAI_API_KEY'), 
    temperature=0
)

embedding_provider = OpenAIEmbeddings(
    openai_api_key=os.getenv('OPENAI_API_KEY')
    )

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD')
)

chunk_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="vector",
    embedding_node_property="embedding",
    text_node_property="text",
    retrieval_query="""
// get the document
MATCH (node)-[:PART_OF]->(d:Document)
WITH node, score, d

// get the entities and relationships for the document
MATCH (node)-[:HAS_ENTITY]->(e)
MATCH p = (e)-[r]-(e2)
WHERE (node)-[:HAS_ENTITY]->(e2)

// unwind the path, create a string of the entities and relationships
UNWIND relationships(p) as rels
WITH 
    node, 
    score, 
    d, 
    collect(apoc.text.join(
        [labels(startNode(rels))[0], startNode(rels).id, type(rels), labels(endNode(rels))[0], endNode(rels).id]
        ," ")) as kg
RETURN
    node.text as text, score,
    { 
        document: d.id,
        entities: kg
    } AS metadata
"""
)

instructions = (
    "Use the given context to answer the question. "
    "Reply with an answer that includes the id of the document and other relevant information from the text. "
    "If you don't know the answer, say you don't know. "
    "Context: {context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", instructions),
        ("human", "{input}"),
    ]
)

chunk_retriever = chunk_vector.as_retriever()
chunk_chain = create_stuff_documents_chain(llm, prompt)
chunk_retriever = create_retrieval_chain(
    chunk_retriever, 
    chunk_chain
)

def find_chunk(q):
    return chunk_retriever.invoke({"input": q})
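
You can test the tool in isolation before wiring it into the agent. The retrieval chain returns a dictionary containing the input, the retrieved context documents, and the generated answer; the question below is only an example:

python
result = find_chunk("What does the lesson say about vector indexes?")
print(result["answer"])
for doc in result["context"]:
    print(doc.metadata)
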
View the cypher.py solution
python
import os
from dotenv import load_dotenv
load_dotenv()

from llm import llm
from graph import graph

from langchain.chains import GraphCypherQAChain
from langchain.prompts import PromptTemplate

CYPHER_GENERATION_TEMPLATE = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Only include the generated Cypher statement in your response.

Always use case insensitive search when matching strings.

Schema:
{schema}

Examples: 
# Use case insensitive matching for entity ids
MATCH (c:Chunk)-[:HAS_ENTITY]->(e)
WHERE e.id =~ '(?i)entityName'
RETURN e.id

# Find documents that reference entities
MATCH (d:Document)<-[:PART_OF]-(c:Chunk)-[:HAS_ENTITY]->(e)
WHERE e.id =~ '(?i)entityName'
RETURN d.id, c.id, c.text, e.id

The question is:
{question}"""

cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["schema", "question"],
)

cypher_chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True,
)

def run_cypher(q):
    return cypher_chain.invoke({"query": q})
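
Again, you can test the tool directly. The GraphCypherQAChain returns a dictionary containing the query and the generated result; the question below is only an example:

python
response = run_cypher("Which documents mention Neo4j?")
print(response["result"])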

When you are ready, move on to finish the course.

Lesson Summary

In this optional challenge, you integrated a Python chatbot with the knowledge graph.

Congratulations on completing the course!