Vector Retriever

In this lesson, you will create a vector retriever to retrieve relevant data from Neo4j.

A retriever is a component that takes unstructured data (typically a user's query) and retrieves relevant data.

You will create a vector retriever that finds similar movies based on a movie plot. The retriever will use the moviePlots vector index that you previously used to search for similar movies with Cypher.

To find similar movies using a retriever, you need to:

Connect to a Neo4j database
Create an embedder to convert user queries into vectors
Create a retriever that uses the moviePlots vector index
Use the retriever to search for similar movies using the user's query
Parse the results

Open the genai-fundamentals/vector_retriever.py file and review the program:

python

vector_retriever.py

import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"), 
    auth=(
        os.getenv("NEO4J_USERNAME"), 
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create embedder

# Create retriever

# Search for similar items

# Parse results

# Close the database connection
driver.close()

The program includes the code to connect to the Neo4j database using the Neo4j Python driver.
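The connection code reads its settings with os.getenv, which returns None when a variable is not set. As a minimal sketch (the helper name read_neo4j_settings is hypothetical and not part of the course repository), you could check the three variables up front so that a missing .env entry produces a clear message:

```python
import os

# The three variables the program above reads from the environment.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def read_neo4j_settings(env=os.environ):
    """Return (uri, (username, password)) or raise if a variable is missing.

    Hypothetical helper for illustration; `env` can be any mapping,
    which makes the check easy to try without touching real settings.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["NEO4J_URI"], (env["NEO4J_USERNAME"], env["NEO4J_PASSWORD"])
```

The returned values could then be passed on as GraphDatabase.driver(uri, auth=auth), in the same way the program passes the os.getenv results.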

You can learn more about the Neo4j Python driver in the GraphAcademy course Using Neo4j with Python.

Embedder

Create the embedder that will convert the user's query into a vector:

python

from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

You must use the same embedding model that was used to create the movie plot embeddings, text-embedding-ada-002, to ensure the vectors are compatible.

The neo4j-graphrag package supports multiple embedding models and lets you create your own embedder by implementing its interface.
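To give a feel for what a custom embedder involves, here is a minimal sketch. It assumes the interface comes down to a single embed_query(text) method that returns a list of floats; in a real program you would subclass the base class the package provides, which this sketch deliberately does not import so that it runs on its own. The HashEmbedder class is hypothetical, and its vectors carry no semantic meaning, so it would not be compatible with the moviePlots index:

```python
import hashlib


class HashEmbedder:
    """Toy embedder for illustration only (hypothetical, not from the package).

    It is deterministic but NOT semantically meaningful: similar texts
    do not get similar vectors.
    """

    def __init__(self, dimensions=8):
        # A SHA-256 digest has 32 bytes, so that is the upper limit here.
        if not 1 <= dimensions <= 32:
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def embed_query(self, text):
        """Return a fixed-length list of floats for the given text."""
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale the first `dimensions` bytes to floats in the range [0, 1].
        return [byte / 255 for byte in digest[: self.dimensions]]


vector = HashEmbedder(dimensions=8).embed_query("Toys coming alive")
```

The point of the sketch is the shape of the contract: text goes in, a fixed-length vector comes out, and the same text always produces the same vector.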

Retriever

Create the retriever that will use the moviePlots vector index:

python

from neo4j_graphrag.retrievers import VectorRetriever

# Create retriever
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",
    embedder=embedder,
    return_properties=["title", "plot"],
)

The return_properties argument specifies which properties to return from the nodes that match the query.

Search

You can use the retriever to search the vector index by passing a query and the number of results to return. The retriever will use the embedder to convert the query into a vector to use in the search.
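Conceptually, the search compares the query vector with the stored vectors and keeps the closest matches. The following sketch illustrates that idea with a brute-force cosine similarity over made-up 3-dimensional vectors. It is not how Neo4j implements its vector index, the function names are hypothetical, and the scores will not match those returned by the retriever:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, indexed_vectors, top_k):
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in indexed_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up embeddings; real plot embeddings have far more dimensions.
toy_index = {
    "Toy Story": [0.9, 0.1, 0.0],
    "Jumanji": [0.6, 0.4, 0.1],
    "Heat": [0.0, 0.2, 0.9],
}

# Stand-in for the vector an embedder would produce from the query text.
query_vector = [1.0, 0.0, 0.0]

matches = top_k_similar(query_vector, toy_index, top_k=2)
```

With this toy data, matches holds Toy Story first and Jumanji second, because their vectors point in nearly the same direction as the query vector.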

Search for similar movies:

python

# Search for similar items
result = retriever.search(query_text="Toys coming alive", top_k=5)

The search method returns a result object whose items attribute contains the items that match the query.

Iterate over the items and print the results:

python

# Parse results
for item in result.items:
    print(item.content, item.metadata["score"])
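If you want individual fields rather than the whole printed item, one option is to parse item.content. This sketch assumes that content is the string form of a Python dictionary, which is what the typical output in this lesson suggests; check type(item.content) in your own environment before relying on it. SampleItem and to_movie are hypothetical names used only so that the example runs without a database:

```python
import ast
from dataclasses import dataclass, field


@dataclass
class SampleItem:
    """Hypothetical stand-in for a retriever result item."""
    content: str
    metadata: dict = field(default_factory=dict)


def to_movie(item):
    """Turn a result item into a plain dict with title, plot and score."""
    # literal_eval only evaluates Python literals, so it is a safer
    # choice than eval for text that comes from outside the program.
    properties = ast.literal_eval(item.content)
    return {
        "title": properties["title"],
        "plot": properties["plot"],
        "score": item.metadata["score"],
    }


sample = SampleItem(
    content=(
        "{'title': 'Toy Story', "
        "'plot': \"A cowboy doll is profoundly threatened and jealous when a new "
        "spaceman figure supplants him as top toy in a boy's room.\"}"
    ),
    metadata={"score": 0.9099578857421875},
)

movie = to_movie(sample)
print(movie["title"], movie["score"])
```

In the lesson's program, the same function could be applied to each entry of result.items, provided the assumption about content holds.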

Complete code:

python

import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorRetriever

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"), 
    auth=(
        os.getenv("NEO4J_USERNAME"), 
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

# Create retriever
retriever = VectorRetriever(
    driver,
    index_name="moviePlots",
    embedder=embedder,
    return_properties=["title", "plot"],
)

# Search for similar items
result = retriever.search(query_text="Toys coming alive", top_k=5)

# Parse results
for item in result.items:
    print(item.content, item.metadata["score"])

# Close the database connection
driver.close()

Run the program to search for similar movies based on a query.

You should see the movie titles, plots, and similarity scores for the items found.

Typical output:

{'title': 'Toy Story', 'plot': "A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room."} 0.9099578857421875
{'title': 'Pinocchio', 'plot': 'A living puppet, with the help of a cricket as his conscience, must prove himself worthy to become a real boy.'} 0.9085540771484375
{'title': 'Adventures of Pinocchio, The', 'plot': "One of puppet-maker Geppetto's creations comes magically to life. This puppet, Pinocchio, has one major desire and that is to become a real boy someday. In order to accomplish this goal he ..."} 0.9070587158203125
{'title': 'Jumanji', 'plot': 'When two kids find and play a magical board game, they release a man trapped for decades in it and a host of dangers that can only be stopped by finishing the game.'} 0.9043426513671875
{'title': 'Secret Adventures of Tom Thumb, The', 'plot': 'A boy born the size of a small doll is kidnapped by a genetic lab and must find a way back to his father in this inventive adventure filmed using stop motion animation techniques. Tom meets...'} 0.903472900390625

Experiment with different queries to find different movies.

Check Your Understanding

Embedder's Role

Why do you need an embedder when searching a vector index?

✓ To convert the user’s query into a vector.
❏ To store the results of the search.
❏ To display the search results to the user.
❏ To create the vector index in the database.

Hint

The embedder is used to convert the unstructured text input into a vector.

Solution

The correct answer is "To convert the user’s query into a vector."

The embedder is responsible for transforming the user’s query into a vector format that can be compared against the vectors stored in the vector index.

Lesson Summary

In this lesson, you learned how to create a vector retriever using the neo4j-graphrag package.

In the next module, you will build this retriever into a simple RAG pipeline that will use an LLM to answer questions using the retrieved data.
