Until now, you have been using a language model, which predicts the next word in a sequence of words. Chat models, in contrast, are designed to have conversations; they accept a list of messages and respond in turn.
Chat model
Open the 2-llm-rag-python-langchain\chat_model.py file.
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        (
            "human",
            "{question}"
        ),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

response = chat_chain.invoke({"question": "What is the weather like?"})

print(response)
Review this program and identify the following:
- The prompt is a series of messages ("system" and "human")
- The chain consists of the prompt, a ChatOpenAI object, and an output parser
- The question is passed to the chat model as a parameter of the invoke method
Run the program and note how the LLM responds to the question.
Giving context
Currently, the chat model is not grounded; it is unaware of the surf conditions on the beach. It responds based only on the question and the LLM's training data (which could be months or years out of date).
You can ground the chat model by providing additional information in the prompt.
Open the 2-llm-rag-python-langchain\chat_model_context.py file.
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        ("human", "{question}"),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "What is the weather like on Watergate Bay?"
    }
)

print(response)
The prompt contains an additional context system message to pass the surf conditions to the LLM.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        ("human", "{question}"),
    ]
)
The current_weather variable contains the surf conditions for three beaches.
current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""
The program invokes the chat model using the current_weather as the context.
response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "What is the weather like on Watergate Bay?"
    }
)
Run the program and predict what the response will be.
Below is a typical response. The LLM has used the context passed in the prompt to provide a more accurate response.
Dude, the surf at Watergate Bay is pumping! We got some sick 3ft waves rolling in, but unfortunately, we got some onshore winds messing with the lineup. But hey, it's all good, still plenty of stoke to be had out there!
Investigate what happens when you change the context by adding additional beach conditions.
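For example, a minimal sketch of that experiment could add a fourth entry to the JSON and ask about it. The extra beach and its conditions below are invented purely for illustration.

# The Perranporth entry below is a made-up example beach and forecast
current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"},
            {"beach": "Perranporth", "conditions": "2ft waves and light winds"}
        ]
    }"""

response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "How is the surf at Perranporth?"
    }
)

print(response)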
Providing context is one aspect of Retrieval Augmented Generation (RAG). In this program, you manually gave the model context; however, you could have retrieved real-time information from an API or database.
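As a rough sketch of what that retrieval step might look like (the endpoint URL and response format below are hypothetical, and the requests library is an extra dependency, not part of this course), the context could be fetched just before invoking the chain:

import requests  # hypothetical extra dependency for this sketch

def get_surf_report():
    # Hypothetical endpoint - swap in a real surf or weather API
    response = requests.get("https://example.com/api/surf-report")
    response.raise_for_status()
    # Return the raw JSON string so it can be passed to the prompt as context
    return response.text

response = chat_chain.invoke(
    {
        "context": get_surf_report(),
        "question": "What is the weather like on Watergate Bay?"
    }
)

print(response)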
Memory
For a chat model to be helpful, it must remember what messages have been sent and received.
Without a memory, the conversation may go in circles:
[user] Hi, my name is Martin
[chat model] Hi, nice to meet you Martin
[user] Do you have a name?
[chat model] I am the chat model. Nice to meet you. What is your name?
You are going to add a memory to the chat model code.
You will modify the program to store the chat history in Neo4j and pass it to the LLM in the prompt.
You will need to:
- Connect to the Neo4j database
- Create a function that returns a Neo4jChatMessageHistory component that will store the chat history
- Modify the prompt to include the chat history
- Wrap the chat chain in a Runnable that will store and retrieve the chat history
Add History to the Prompt
Start by importing the required components.
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_neo4j import Neo4jChatMessageHistory, Neo4jGraph
from uuid import uuid4
As each call to the LLM is stateless, you need to include the chat history in every call to the LLM.
You can modify the prompt template to include the chat history as a list of messages using a MessagesPlaceholder object.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)
Session ID
You must create and assign a session ID to identify each conversation. You can generate a random UUID using the Python uuid.uuid4 function.
Create a new SESSION_ID constant in your chat model program.
SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")
Neo4j Chat Message History
Create a Neo4jGraph object to connect to your Neo4j sandbox.
graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD'),
)
The chain will require a callback function to return a memory component.
def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)
The get_memory function will return an instance of Neo4jChatMessageHistory. You should pass the session_id and the graph connection you created as parameters.
Chat Message History
You can now create a new chain using the RunnableWithMessageHistory, passing the chat_chain and the get_memory function.
chat_chain = prompt | chat_llm | StrOutputParser()

chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)
Invoke the Chat Model
When you call the chat_with_message_history chain, the user's question and the response will be stored in the Neo4jChatMessageHistory memory component. Every subsequent call to the chat_with_message_history chain will include the chat history in the prompt.
Put the call to the chat_with_message_history chain in a loop.
while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,
        },
        config={
            "configurable": {"session_id": SESSION_ID}
        }
    )

    print(response)
SESSION_ID is passed to the chain in the invoke configuration.
The complete code is shown below.
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_neo4j import Neo4jChatMessageHistory, Neo4jGraph
from uuid import uuid4

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD'),
)

def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)

current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Bells", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,
        },
        config={
            "configurable": {"session_id": SESSION_ID}
        }
    )

    print(response)
Run the program and ask the LLM a few questions. Note how the LLM can now respond based on the conversation history.
[user] Hi, I'm down at Watergate Bay.
[chat model] Hey dude, stoked to hear you're at Watergate Bay! How's the surf looking over there?
[user] It's good, do you know the forecast here?
[chat model] Right on, dude! The surf at Watergate Bay is firing with 3ft waves and some onshore winds.
Conversation History Graph
The conversation history is stored as a graph: each Session node has a LAST_MESSAGE relationship to the most recent Message node, and Message nodes are linked together in order with NEXT relationships.
You can return the graph of the conversation history using the following Cypher query:
MATCH (s:Session)-[:LAST_MESSAGE]->(last:Message)<-[:NEXT*]-(msg:Message)
RETURN s, last, msg
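You could also read the stored history back from Python. The sketch below reuses the get_memory function and SESSION_ID from the program above and relies on the messages property that LangChain chat message histories expose; treat it as an optional check rather than part of the lesson code.

# Inspect the messages stored in Neo4j for the current session
history = get_memory(SESSION_ID)

for message in history.messages:
    # Each message records whether it was sent by the human or the AI
    print(f"[{message.type}] {message.content}")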
Lesson Summary
You learned how to use an LLM chat model, give it context, and store the conversation memory in Neo4j.
In the next lesson, you will learn how to create agents.