Chat models

Until now, you have been using a language model to communicate with the LLM. A language model predicts the next word in a sequence of words. A chat model is designed for conversations: it accepts a list of messages and returns a message in response.

Chat model

Open the 2-llm-rag-python-langchain\chat_model.py file.

python
chat_model.py
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        (
            "human", 
            "{question}"
        ),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

response = chat_chain.invoke({"question": "What is the weather like?"})

print(response)

Review this program and identify the following:

  • The prompt is a series of messages ("system" and "human").

  • The chain consists of the prompt, a ChatOpenAI object, and an output parser.

  • The question is passed to the chat model as a parameter of the invoke method.
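
To make the first point concrete, here is a minimal plain-Python sketch of how a list of ("role", "text") pairs can be filled in with the question before anything is sent to a model. It is an illustration only, not LangChain's implementation; the render_messages helper and the dictionary message format are assumptions made for this example.

```python
# Plain-Python stand-in for a chat prompt template; not LangChain's implementation.
TEMPLATE = [
    (
        "system",
        "You are a surfer dude, having a conversation about the surf "
        "conditions on the beach. Respond using surfer slang.",
    ),
    ("human", "{question}"),
]


def render_messages(template, **variables):
    """Fill each (role, text) pair and return a list of message dicts."""
    return [
        {"role": role, "content": text.format(**variables)}
        for role, text in template
    ]


messages = render_messages(TEMPLATE, question="What is the weather like?")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The system message stays the same on every call; only the human message changes with the question.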

Run the program and note how the LLM responds to the question.

Giving context

Currently, the chat model is not grounded; it is unaware of surf conditions on the beach. It responds based on the question and the LLM's training data (which could be months or years out of date).

You can ground the chat model by providing additional information in the prompt.

Open the 2-llm-rag-python-langchain\chat_model_context.py file.

python
chat_model_context.py
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ( "system", "{context}" ),
        ( "human", "{question}" ),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "What is the weather like on Watergate Bay?"
    }
)

print(response)

The prompt contains an additional context system message to pass the surf conditions to the LLM.

python
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ( "system", "{context}" ),
        ( "human", "{question}" ),
    ]
)

The current_weather variable contains the surf conditions for three beaches.

python
current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

The program invokes the chat model using the current_weather as the context.

python
response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "What is the weather like on Watergate Bay?"
    }
)

Predict what the response will be, then run the program.

Below is a typical response. The LLM has used the context passed in the prompt to provide a more accurate response.

Dude, the surf at Watergate Bay is pumping! We got some sick 3ft waves rolling in, but unfortunately, we got some onshore winds messing with the lineup. But hey, it's all good, still plenty of stoke to be had out there!

Investigate what happens when you change the context by adding additional beach conditions.

Providing context is one aspect of Retrieval Augmented Generation (RAG). In this program, you manually gave the model context; however, you could have retrieved real-time information from an API or database.
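
As a rough sketch of that idea, the context string could be built from retrieved records rather than typed by hand. The fetch_surf_conditions function below is a hypothetical stand-in for an API or database call and simply returns fixed data; build_context is also an assumed helper name.

```python
import json


def fetch_surf_conditions():
    """Hypothetical stand-in for an API or database call; returns fixed data."""
    return [
        {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
        {"beach": "Polzeath", "conditions": "Flat and calm"},
        {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"},
    ]


def build_context(records):
    """Serialize the retrieved records into the JSON string used as context."""
    return json.dumps({"surf": records}, indent=4)


current_weather = build_context(fetch_surf_conditions())
print(current_weather)
```

The resulting string can be passed as the "context" value in the same way as the hand-written current_weather variable.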

Memory

For a chat model to be helpful, it must remember what messages have been sent and received.

Without memory, the conversation may go in circles:

[user] Hi, my name is Martin
[chat model] Hi, nice to meet you Martin
[user] Do you have a name?
[chat model] I am the chat model. Nice to meet you. What is your name?

You are going to add a memory to the chat model code.

You will modify the program to store the chat history in Neo4j and pass it to the LLM in the prompt.

You will need to:

  1. Connect to the Neo4j database

  2. Create a function that returns a Neo4jChatMessageHistory component that will store the chat history

  3. Modify the prompt to include the chat history

  4. Wrap the chat chain in a Runnable that will store and retrieve the chat history

Add History to the Prompt

Start by importing the required components.

python
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.graphs import Neo4jGraph
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from uuid import uuid4

Because each call to the LLM is stateless, you need to include the chat history in every call. You can modify the prompt template to include the chat history as a list of messages using a MessagesPlaceholder object.

python
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)
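
The effect of the placeholder can be pictured with a plain-Python sketch: the stored messages are spliced in between the system messages and the new question. This is a simplified model, not how LangChain implements MessagesPlaceholder; build_prompt_messages and the tuple message format are assumptions for illustration.

```python
SYSTEM_PROMPT = (
    "You are a surfer dude, having a conversation about the surf "
    "conditions on the beach. Respond using surfer slang."
)


def build_prompt_messages(context, chat_history, question):
    """Assemble messages in the same order as the prompt template."""
    return [
        ("system", SYSTEM_PROMPT),
        ("system", context),
        *chat_history,  # the placeholder expands to zero or more messages
        ("human", question),
    ]


history = [
    ("human", "Hi, my name is Martin"),
    ("ai", "Hi, nice to meet you Martin"),
]
prompt_messages = build_prompt_messages('{"surf": []}', history, "Do you have a name?")
print([role for role, _ in prompt_messages])
```

With an empty history, the result contains only the two system messages and the question, which matches the prompt used before memory was added.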

Session ID

You must assign a session ID to each conversation so that its messages can be identified. You can generate a random UUID using the Python uuid.uuid4 function.

Create a new SESSION_ID constant in your chat model program.

python
SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")
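
A new UUID starts a new conversation each time the program runs. If you wanted to resume an earlier conversation, one possible approach is to reuse a previously printed ID. The sketch below assumes a hypothetical CHAT_SESSION_ID environment variable and a resolve_session_id helper; neither is part of the course code.

```python
import os
from uuid import UUID, uuid4


def resolve_session_id(environ=os.environ):
    """Return the ID in CHAT_SESSION_ID (hypothetical variable) or a new UUID."""
    return environ.get("CHAT_SESSION_ID") or str(uuid4())


# Dictionaries stand in for the environment so the example is repeatable.
new_id = resolve_session_id({})
resumed_id = resolve_session_id({"CHAT_SESSION_ID": new_id})
print(resumed_id == new_id)
```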

Neo4j Chat Message History

Create a Neo4jGraph object to connect to your Neo4j sandbox.

python
graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD'),
)

The chain requires a callback function that returns a memory component for a given session ID.

python
def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

The get_memory function will return an instance of Neo4jChatMessageHistory. You should pass the session_id and the graph connection you created as parameters.
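
The contract of such a callback can be sketched without a database: the same session ID must always return the same history, and different IDs must not share messages. The in-memory version below is only an illustration of that contract (its contents are lost when the program exits); get_memory_in_memory is a hypothetical name and plain lists stand in for Neo4jChatMessageHistory.

```python
from collections import defaultdict

# In-memory stand-in for the Neo4j-backed store; contents are lost on exit.
_histories = defaultdict(list)


def get_memory_in_memory(session_id):
    """Return the message list for this session, creating it on first use."""
    return _histories[session_id]


get_memory_in_memory("session-a").append(("human", "Hi, my name is Martin"))
get_memory_in_memory("session-b").append(("human", "Hi, I'm down at Watergate Bay."))

print(get_memory_in_memory("session-a"))
```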

Chat Message History

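You can now create a new chain using RunnableWithMessageHistory, passing the chat_chain and the get_memory function. The input_messages_key names the input that holds the user's question, and the history_messages_key names the prompt variable that receives the chat history.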

python
chat_chain = prompt | chat_llm | StrOutputParser()

chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)
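
Conceptually, the wrapper loads the history for the session, adds it to the chain input, and stores the new question and response afterwards. The sketch below models that behavior in plain Python under simplifying assumptions: HistoryWrapper and echo_chain are hypothetical names, messages are tuples, and the session ID is passed as an argument rather than through the invoke configuration.

```python
class HistoryWrapper:
    """Simplified, plain-Python model of a history-aware wrapper around a chain."""

    def __init__(self, chain, get_history):
        self.chain = chain              # callable: dict -> str
        self.get_history = get_history  # callable: session_id -> list

    def invoke(self, inputs, session_id):
        history = self.get_history(session_id)
        # 1. Add a copy of the stored messages to the chain input.
        response = self.chain({**inputs, "chat_history": list(history)})
        # 2. Store the new question and response for later calls.
        history.append(("human", inputs["question"]))
        history.append(("ai", response))
        return response


def echo_chain(inputs):
    """Stand-in for prompt | chat_llm | StrOutputParser(); calls no LLM."""
    return f"Seen {len(inputs['chat_history'])} earlier messages"


store = {}
wrapped = HistoryWrapper(echo_chain, lambda sid: store.setdefault(sid, []))

print(wrapped.invoke({"question": "Hi, my name is Martin"}, "s1"))
print(wrapped.invoke({"question": "Do you have a name?"}, "s1"))
print(wrapped.invoke({"question": "Hello"}, "s2"))
```

The second call in session "s1" sees the two messages stored by the first call, while session "s2" starts with an empty history.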

Invoke the Chat Model

When you call the chat_with_message_history chain, the user’s question and the response will be stored in the Neo4jChatMessageHistory memory component. Every subsequent call to the chat_with_message_history chain will include the chat history in the prompt.

Put the call to the chat_with_message_history chain in a loop.

python
while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,
        },
        config={
            "configurable": {"session_id": SESSION_ID}
        }
    )
    
    print(response)
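
The loop above runs until the process is interrupted. If you prefer an explicit way to stop, a variation along the following lines could be used. It is a sketch, not part of the course code: chat_loop, the "exit" and "quit" keywords, and the ask callable (which would wrap the call to chat_with_message_history.invoke) are assumptions. The read and write parameters exist only so the example can run with scripted input.

```python
def chat_loop(ask, read=input, write=print):
    """Read questions until the user types 'exit' or 'quit', or input ends."""
    while True:
        try:
            question = read("> ").strip()
        except EOFError:
            break
        if question.lower() in {"exit", "quit"}:
            break
        if question:  # skip empty lines instead of sending them to the model
            write(ask(question))


# Scripted demonstration; no LLM is called.
scripted = iter(["Hi, I'm down at Watergate Bay.", "", "exit", "never read"])
answers = []
chat_loop(
    ask=lambda q: f"(reply to: {q})",
    read=lambda _prompt: next(scripted),
    write=answers.append,
)
print(answers)
```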

The SESSION_ID is passed to the chain in the invoke configuration.

The complete code:

python
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.graphs import Neo4jGraph
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from uuid import uuid4

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

chat_llm = ChatOpenAI(openai_api_key=os.getenv('OPENAI_API_KEY'))

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD'),
)

def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)


chat_chain = prompt | chat_llm | StrOutputParser()

chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)

current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,
        },
        config={
            "configurable": {"session_id": SESSION_ID}
        }
    )
    
    print(response)

Run the program, ask the LLM a few questions, and note how it can now respond based on the conversation history.

[user] Hi, I'm down at Watergate Bay.
[chat model] Hey dude, stoked to hear you're at Watergate Bay! How's the surf looking over there?
[user] It's good, do you know the forecast here?
[chat model] Right on, dude! The surf at Watergate Bay is firing with 3ft waves and some onshore winds.

Conversation History Graph

The conversation history is stored using the following data model:

A graph data model showing two nodes, Session and Message, connected by a LAST_MESSAGE relationship. The Message node has a circular NEXT relationship.

You can return the graph of the conversation history using the following Cypher query:

cypher
MATCH (s:Session)-[:LAST_MESSAGE]->(last:Message)<-[:NEXT*]-(msg:Message)
RETURN s, last, msg

A graph showing a Session node connected to a Message node by a LAST_MESSAGE relationship. Message nodes are connected to each other by NEXT relationships.
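
To see why this model is enough to rebuild a conversation, the sketch below mirrors it in plain Python: a pointer to the last message plus NEXT links from each earlier message to the one after it. The data and the history_from_last helper are made up for illustration, and no database is involved.

```python
# Plain-Python model of the Session/Message graph; the data is made up.
messages = {
    1: ("human", "Hi, I'm down at Watergate Bay."),
    2: ("ai", "Hey dude, stoked to hear you're at Watergate Bay!"),
    3: ("human", "It's good, do you know the forecast here?"),
}
next_rel = {1: 2, 2: 3}  # (earlier)-[:NEXT]->(later)
last_message = 3         # (Session)-[:LAST_MESSAGE]->(Message)


def history_from_last(last_id, next_edges, nodes):
    """Follow NEXT relationships backwards from the last message."""
    previous = {later: earlier for earlier, later in next_edges.items()}
    ordered = []
    current = last_id
    while current is not None:
        ordered.append(nodes[current])
        current = previous.get(current)
    return list(reversed(ordered))


history = history_from_last(last_message, next_rel, messages)
for role, content in history:
    print(f"[{role}] {content}")
```

Starting from the last message and walking the NEXT links in reverse yields the messages in the order they were sent.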

Lesson Summary

You learned how to use an LLM chat model, give it context, and store the conversation memory in Neo4j.

In the next lesson, you will learn how to create agents.