You can use a retriever as part of a RAG (Retrieval-Augmented Generation) pipeline to provide context to an LLM.
In this lesson, you will use the vector retriever you created to pass additional context to an LLM, allowing it to generate more accurate and relevant responses.
Open the genai-fundamentals/vector_rag.py file and review the program:
import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorRetriever

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"),
    auth=(
        os.getenv("NEO4J_USERNAME"),
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

# Create retriever
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",
    embedder=embedder,
    return_properties=["title", "plot"],
)

# Create the LLM

# Create GraphRAG pipeline

# Search

# Close the database connection
driver.close()
The program includes the code to connect to Neo4j and create the vector retriever.
You will add the code to:
- Create and configure the LLM
- Create the GraphRAG pipeline to use the vector retriever
- Submit a query to the RAG pipeline
- Parse the results
LLM
You will need an LLM to generate the response based on the user’s query and the context provided by the vector retriever.
Create the LLM using the OpenAILLM class from the neo4j_graphrag package:
from neo4j_graphrag.llm import OpenAILLM
# Create the LLM
llm = OpenAILLM(model_name="gpt-4o")
The LLM is configured to use gpt-4o. You can change the configuration to another OpenAI model by changing the model_name parameter.
You can also change the temperature to control the randomness of the generated response by providing a value in model_params. A value of 0 will make responses more deterministic, while a value closer to 1 will make responses more random.
# Modify the LLM configuration if needed
llm = OpenAILLM(
    model_name="gpt-3.5-turbo",
    model_params={"temperature": 1}
)
The neo4j-graphrag package supports multiple LLM providers and also gives you the ability to create your own LLM interface.
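As an illustration, here is a minimal sketch of a custom interface. It assumes your installed version of neo4j-graphrag exposes an LLMInterface base class and an LLMResponse type with a content field, and that subclasses implement invoke and ainvoke; check the package documentation, as exact signatures vary between versions. The EchoLLM class and its behavior are purely illustrative.

from neo4j_graphrag.llm import LLMInterface, LLMResponse

class EchoLLM(LLMInterface):
    """Toy LLM for illustration: echoes the prompt back as the answer."""

    def __init__(self):
        # Assumption: the base class accepts a model_name argument
        super().__init__(model_name="echo")

    def invoke(self, input: str, **kwargs) -> LLMResponse:
        # Wrap the prompt in the package's response type
        return LLMResponse(content=f"You asked: {input}")

    async def ainvoke(self, input: str, **kwargs) -> LLMResponse:
        # Reuse the synchronous implementation
        return self.invoke(input)

An instance of a class like this could then be passed anywhere the pipeline expects an LLM.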
GraphRAG pipeline
The GraphRAG class allows you to create a RAG pipeline including a retriever and an LLM.
Create the GraphRAG pipeline using the retriever and the llm you created:
from neo4j_graphrag.generation import GraphRAG
# Create GraphRAG pipeline
rag = GraphRAG(retriever=retriever, llm=llm)
The GraphRAG pipeline will:
- Use the retriever to find relevant context based on the user’s query.
- Pass the user’s query and the retrieved context to the LLM.
Search
You can use the search method to submit a query.
# Search
query_text = "Find me movies about toys coming alive"
response = rag.search(
    query_text=query_text,
    retriever_config={"top_k": 5}
)
print(response.answer)
The search method takes the user’s query and returns the generated response from the LLM.
You can also pass additional retriever_config options, such as top_k, the number of results the retriever should return.
The complete code:
import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorRetriever
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.generation import GraphRAG

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"),
    auth=(
        os.getenv("NEO4J_USERNAME"),
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

# Create retriever
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",
    embedder=embedder,
    return_properties=["title", "plot"],
)

# Create the LLM
llm = OpenAILLM(model_name="gpt-4o")

# Create GraphRAG pipeline
rag = GraphRAG(retriever=retriever, llm=llm)

# Search
query_text = "Find me movies about toys coming alive"
response = rag.search(
    query_text=query_text,
    retriever_config={"top_k": 5}
)
print(response.answer)

# Close the database connection
driver.close()
Run the program to search for similar movies based on a plot.
Return context
You can also return the context that was used to generate the response. This can be useful in understanding how the LLM generated the response.
Add the return_context=True parameter to the search method:
# Search
query_text = "Find me movies about toys coming alive"
response = rag.search(
    query_text=query_text,
    retriever_config={"top_k": 5},
    return_context=True
)
print(response.answer)
print("CONTEXT:", response.retriever_result.items)
The retriever_result is returned along with the generated response.
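If you only want the text of each context item, you can loop over the result. This is a minimal sketch that assumes each item exposes a content attribute holding the retrieved node properties; check the documentation for your installed neo4j-graphrag version.

# Print each retrieved context item on its own line
# Assumption: each item exposes a `content` attribute
for item in response.retriever_result.items:
    print(item.content)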
Experiment with different queries and see how the LLM generates responses based on the context provided by the vector retriever.
Randomness
Responses from the LLM are not deterministic and may vary each time you run the program.
Experiment with the temperature parameter in the LLM configuration to see how it affects the randomness of the generated response.
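For example, a temperature of 0 should make the answer more repeatable from run to run. This sketch reuses the configuration pattern shown earlier; the exact behavior depends on the model.

# Lower the temperature for more deterministic responses
llm = OpenAILLM(
    model_name="gpt-4o",
    model_params={"temperature": 0}
)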
Check Your Understanding
Context
True or False: Using the GraphRAG pipeline, the LLM generates a response without using any context from the retriever.
- ❏ True
- ✓ False
Hint
The GraphRAG pipeline relies on the retriever to provide relevant context, which is then used by the LLM to generate a more accurate response.
Solution
The statement is False. The GraphRAG pipeline always passes the retrieved context to the LLM to generate a response.
Lesson Summary
In this lesson, you learned how to create a GraphRAG pipeline using a vector retriever and an LLM.
In the next lesson, you will build a more advanced retriever that combines vector search with graph traversal and relationships to enhance information retrieval.