Vector Retriever

Vector search can be used in Retrieval Augmented Generation (RAG) applications to find relevant documents based on their content.
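
As a rough illustration of the idea, vector search ranks items by how close their embedding vectors are to the query vector. The sketch below uses hypothetical three-dimensional vectors and made-up plot labels; real embedding models produce much larger vectors, and a vector index does not scan every item like this.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, invented for this example only.
plots = {
    "space adventure": [0.9, 0.1, 0.0],
    "courtroom drama": [0.1, 0.9, 0.1],
    "alien invasion": [0.8, 0.2, 0.1],
}


def search(query_vector: list[float], k: int = 2) -> list[str]:
    """Return the k plot labels whose vectors are most similar to the query."""
    ranked = sorted(
        plots,
        key=lambda name: cosine_similarity(query_vector, plots[name]),
        reverse=True,
    )
    return ranked[:k]


print(search([1.0, 0.0, 0.0]))
```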

In this lesson, you will update the LangChain agent to use a vector retriever that will allow you to search for movies based on plot.

Open the genai-integration-langchain/vector_retriever.py file.

[Python listing vector_retriever.py not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/vector_retriever.py]

The code is similar to the agent you explored in the previous module.

To add the vector RAG retriever, you will need to:

  1. Connect to Neo4j using Neo4jGraph

  2. Create an embedding model

  3. Create a vector store using Neo4jVector

  4. Update the retrieve function to use the vector store
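
The first three steps above could be wired together roughly as follows. This is a sketch under assumptions, not the lesson's solution code: the environment variable names, the index name moviePlots, the property names plotEmbedding and plot, the embedding model name, and the langchain_neo4j and langchain_openai package names are all assumed here and should be checked against your own project. OpenAIEmbeddings also expects an OpenAI API key in the environment.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    username: str
    password: str


def load_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Read connection details; the variable names are assumptions."""
    return Neo4jSettings(
        uri=env["NEO4J_URI"],
        username=env["NEO4J_USERNAME"],
        password=env["NEO4J_PASSWORD"],
    )


def build_plot_vector(settings: Neo4jSettings):
    """Steps 1 to 3: connect, create the embedding model, create the vector store."""
    # Imported here so load_settings works without the LangChain packages installed.
    from langchain_neo4j import Neo4jGraph, Neo4jVector
    from langchain_openai import OpenAIEmbeddings

    # 1. Connect to Neo4j
    graph = Neo4jGraph(
        url=settings.uri,
        username=settings.username,
        password=settings.password,
    )

    # 2. Create an embedding model (assumed to match the model used to build the index)
    embedding_model = OpenAIEmbeddings(model="text-embedding-ada-002")

    # 3. Create a vector store over an existing index (names are assumptions)
    return Neo4jVector.from_existing_index(
        embedding_model,
        graph=graph,
        index_name="moviePlots",
        embedding_node_property="plotEmbedding",
        text_node_property="plot",
    )
```

Calling build_plot_vector(load_settings()) needs a reachable Neo4j instance with a matching vector index, so treat it as a starting point to adapt.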

Neo4j connection

Connect to Neo4j using Neo4jGraph:

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tags import_neo4j and graph]

Embedding model

Create the embedding model using OpenAIEmbeddings:

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tags import_embedding_model and embedding_model]

Vector store

Create the vector store using Neo4jVector:

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tags import_neo4jvector and plot_vector]

Retriever

Finally, you need to update the retrieve function to use the vector store.

The retrieve function will need to:

  1. Use the similarity_search method of the Neo4jVector class to find similar nodes based on the question.

  2. Store the results as the context for the agent.

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tag retrieve]
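
The two steps of the retrieve function can be sketched as below. To keep the example runnable without Neo4j or an API key, it uses a hypothetical KeywordStore stub and a stand-in Document class; only the similarity_search(query, k=...) call shape is meant to mirror the vector store, and the sample plots and titles are illustrative data.

```python
from dataclasses import dataclass, field
from typing import TypedDict


@dataclass
class Document:
    """Stand-in for LangChain's Document: text content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class State(TypedDict):
    question: str
    context: list[Document]
    answer: str


class KeywordStore:
    """Hypothetical stub; a real vector store compares embeddings, not words."""

    def __init__(self, docs: list[Document]):
        self.docs = docs

    def similarity_search(self, query: str, k: int = 4) -> list[Document]:
        words = set(query.lower().split())
        # Rank documents by how many query words appear in their text.
        ranked = sorted(
            self.docs,
            key=lambda d: len(words & set(d.page_content.lower().split())),
            reverse=True,
        )
        return ranked[:k]


plot_vector = KeywordStore([
    Document("A cowboy doll feels threatened when a space ranger toy arrives.",
             {"title": "Toy Story"}),
    Document("Two children discover a magical board game.",
             {"title": "Jumanji"}),
])


def retrieve(state: State) -> dict:
    # 1. Find documents similar to the question.
    context = plot_vector.similarity_search(state["question"], k=2)
    # 2. Return them as the context entry of the state update.
    return {"context": context}


result = retrieve({"question": "toy cowboy and space ranger"})
print([doc.metadata["title"] for doc in result["context"]])
```

Swapping the stub for the real vector store should leave retrieve unchanged, provided the store exposes the same method.
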
Complete code: [Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, all tags except examples and print_context]

Run the application, review the results, and experiment with different questions:

[Example questions not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tag examples]

Context

The context is a list of Document objects returned by the similarity_search method. Each Document’s page_content contains the movie plot, and the node’s other properties are returned as metadata.

The context is defined in the State class:

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tag state]

You can define the context as any data type you want, as long as the retrieve function returns it in the expected format.
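
For instance, the context could be a plain list of plot strings rather than Document objects. The sketch below is one hypothetical variant: PlotState, the hard-coded plots, and build_prompt are invented for illustration, and the point is only that whatever retrieve returns must match what the rest of the agent expects.

```python
from typing import TypedDict


class PlotState(TypedDict):
    question: str
    context: list[str]  # plain plot strings instead of Document objects
    answer: str


def retrieve(state: PlotState) -> dict:
    # Hypothetical hard-coded results; a real retriever would query the vector store.
    plots = [
        "A cowboy doll feels threatened when a space ranger toy arrives.",
        "Two children discover a magical board game.",
    ]
    return {"context": plots}


def build_prompt(state: PlotState) -> str:
    """Downstream code relies on context being a list of strings."""
    joined = "\n".join(f"- {plot}" for plot in state["context"])
    return f"Context:\n{joined}\n\nQuestion: {state['question']}"


state: PlotState = {"question": "Which movie has toys?", "context": [], "answer": ""}
state.update(retrieve(state))
print(build_prompt(state))
```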

It is useful in development to output the context to see what data was used to generate the answer.

Update the code to print the context after the answer is generated:

[Python listing not included in this copy. Source: https://raw.githubusercontent.com/neo4j-graphacademy/genai-integration-langchain/refs/heads/new-course/genai-integration-langchain/solutions/vector_retriever.py, tag print_context]
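
One way to print the context is to show each document's metadata next to its content. This is a sketch with a stand-in Document class and a hypothetical response dictionary; the title metadata key and the sample answer are assumptions, and the response your application produces may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for LangChain's Document: text content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(context: list[Document]) -> list[str]:
    """Build one readable line per document in the context."""
    lines = []
    for doc in context:
        # "title" is an assumed metadata key; fall back if it is missing.
        title = doc.metadata.get("title", "<untitled>")
        lines.append(f"{title}: {doc.page_content}")
    return lines


# Hypothetical response, shaped like a state holding an answer and its context.
response = {
    "answer": "Toy Story matches that description.",
    "context": [
        Document("A cowboy doll feels threatened when a space ranger toy arrives.",
                 {"title": "Toy Story"}),
    ],
}

print("Answer:", response["answer"])
for line in format_context(response["context"]):
    print("Context:", line)
```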

Try running the application again and see how the context relates to the question and generated answer.

When you are ready, continue to the next lesson.

Lesson Summary

In this lesson, you learned how to integrate a Neo4j vector store with a LangChain agent.

In the next lesson, you will extend the retriever to use data from a graph.