In the Answer Generation Chain lesson, you built a chain that answers a question based on the context provided in the prompt.
As we covered in the Retrievers lesson of Neo4j & LLM Fundamentals, semantic search in LangChain is performed using an object called a Retriever.
A Retriever is an abstraction that uses a Vector Store to identify documents similar to an input. It converts the input into a vector embedding and performs a similarity search against the vectors stored in an index.
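For a rough sketch of how this fits together in code, the example below queries a vector store directly and through the Retriever interface. It assumes the initVectorStore() function you will complete later in this challenge, an OPENAI_API_KEY environment variable, and an illustrative question; method names such as invoke() may differ slightly between LangChain versions.
import { OpenAIEmbeddings } from "@langchain/openai";
import initVectorStore from "./vector.store";

// A minimal sketch, not part of the challenge code.
export async function searchExample(): Promise<void> {
  const embeddings = new OpenAIEmbeddings();
  const vectorStore = await initVectorStore(embeddings);

  // The input is converted to an embedding and compared against the vectors
  // in the index, returning the 3 most similar documents.
  const docs = await vectorStore.similaritySearch(
    "A hobbit carries a ring to a volcano",
    3
  );
  console.log(docs.map((doc) => doc.metadata.title));

  // The same store exposed through the Retriever interface.
  // (Older LangChain versions use retriever.getRelevantDocuments() instead.)
  const retriever = vectorStore.asRetriever();
  const results = await retriever.invoke("A hobbit carries a ring to a volcano");
  console.log(results.length);

  await vectorStore.close();
}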
To pass this challenge, you must modify the initVectorStore() function in modules/agent/vector.store.ts to create a new Neo4jVectorStore instance.
Set up the Vector Index
To use a Vector Store, you must first create a vector index in your Sandbox instance.
Run the CREATE VECTOR INDEX command below to create a vector index called moviePlots if it does not already exist.
CREATE VECTOR INDEX `moviePlots` IF NOT EXISTS
FOR (n: Movie) ON (n.embedding)
OPTIONS {indexConfig: {
`vector.dimensions`: 1536,
`vector.similarity_function`: 'cosine'
}};
The statement creates a new index called moviePlots, indexing the vectors stored in the embedding property. These vectors were created using the text-embedding-ada-002 model and therefore have 1536 dimensions. The index uses cosine similarity to identify similar documents.
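As an aside, you can confirm the dimension count yourself. The sketch below (illustrative only, assuming an OPENAI_API_KEY environment variable) embeds a sample string with the same model and logs the length of the resulting vector, which must match the `vector.dimensions` value configured on the index.
import { OpenAIEmbeddings } from "@langchain/openai";

// A minimal sketch: the embedding model and the index dimensions must agree.
export async function checkDimensions(): Promise<void> {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY as string,
    modelName: "text-embedding-ada-002",
  });

  const vector = await embeddings.embedQuery("A movie about a shark");
  console.log(vector.length); // 1536
}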
To learn more about how Vector Retrievers work, see the Retrievers lesson in Neo4j & LLM Fundamentals.
Next, run the following statement to load a CSV file containing embeddings of movie plots.
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/llm-fundamentals/openai-embeddings.csv'
AS row
MATCH (m:Movie {movieId: row.movieId})
CALL db.create.setNodeVectorProperty(m, 'embedding', apoc.convert.fromJsonList(row.embedding))
RETURN count(*);
Creating a Store
Inside modules/agent/vector.store.ts, you will find an initVectorStore() function.
export default async function initVectorStore(
  embeddings: EmbeddingsInterface
): Promise<Neo4jVectorStore> {
  // TODO: Create vector store
  // const vectorStore = await Neo4jVectorStore.fromExistingIndex(embeddings, { ... })
  // return vectorStore
}
Inside this function, use the Neo4jVectorStore.fromExistingIndex() method to create a new vector store instance.
const vectorStore = await Neo4jVectorStore.fromExistingIndex(embeddings, {
  url: process.env.NEO4J_URI as string,
  username: process.env.NEO4J_USERNAME as string,
  password: process.env.NEO4J_PASSWORD as string,
  indexName: "moviePlots",
  textNodeProperty: "plot",
  embeddingNodeProperty: "embedding",
  retrievalQuery: `
    RETURN
      node.plot AS text,
      score,
      {
        _id: elementId(node),
        title: node.title,
        directors: [ (person)-[:DIRECTED]->(node) | person.name ],
        actors: [ (person)-[r:ACTED_IN]->(node) | [person.name, r.role] ],
        tmdbId: node.tmdbId,
        source: 'https://www.themoviedb.org/movie/' + node.tmdbId
      } AS metadata
  `,
});
Document Metadata
You may have noticed the retrievalQuery argument defined when creating the vectorStore variable. The metadata object allows you to return additional information that could help improve the LLM response. In this case, the title is returned along with the names of the actors and directors and a canonical link to the movie on The Movie Database (TMDB). The _id property will contain the element ID of each source document in the database. You will use these IDs to create relationships that provide transparency about the context used to help the LLM generate its response.
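To illustrate how this surfaces in your code, the sketch below (assuming the vectorStore created above; the query text and example values are purely illustrative) shows where the text and metadata returned by the retrievalQuery appear on each Document.
// A rough sketch, not part of the challenge code.
const docs = await vectorStore.similaritySearch(
  "Toys come to life when the humans are away",
  3
);

for (const doc of docs) {
  console.log(doc.pageContent);     // the plot, from `node.plot AS text`
  console.log(doc.metadata.title);  // e.g. "Toy Story"
  console.log(doc.metadata._id);    // element ID of the :Movie node
  console.log(doc.metadata.source); // e.g. "https://www.themoviedb.org/movie/<tmdbId>"
}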
Finally, return the vectorStore from the function.
return vectorStore;
If you have followed the steps correctly, your code should resemble the following:
export default async function initVectorStore(
  embeddings: EmbeddingsInterface
): Promise<Neo4jVectorStore> {
  const vectorStore = await Neo4jVectorStore.fromExistingIndex(embeddings, {
    url: process.env.NEO4J_URI as string,
    username: process.env.NEO4J_USERNAME as string,
    password: process.env.NEO4J_PASSWORD as string,
    indexName: "moviePlots",
    textNodeProperty: "plot",
    embeddingNodeProperty: "embedding",
    retrievalQuery: `
      RETURN
        node.plot AS text,
        score,
        {
          _id: elementId(node),
          title: node.title,
          directors: [ (person)-[:DIRECTED]->(node) | person.name ],
          actors: [ (person)-[r:ACTED_IN]->(node) | [person.name, r.role] ],
          tmdbId: node.tmdbId,
          source: 'https://www.themoviedb.org/movie/' + node.tmdbId
        } AS metadata
    `,
  });

  return vectorStore;
}
Testing your changes
If you have followed the instructions, you should be able to run the following unit test to verify your changes using the npm run test command.
npm run test vector.store.test.ts
import { OpenAIEmbeddings } from "@langchain/openai";
import initVectorStore from "./vector.store";
import { Neo4jVectorStore } from "@langchain/community/vectorstores/neo4j_vector";
import { close } from "../graph";

describe("Vector Store", () => {
  afterAll(() => close());

  it("should instantiate a new vector store", async () => {
    const embeddings = new OpenAIEmbeddings({
      openAIApiKey: process.env.OPENAI_API_KEY as string,
      configuration: {
        baseURL: process.env.OPENAI_API_BASE,
      },
    });

    const vectorStore = await initVectorStore(embeddings);

    expect(vectorStore).toBeInstanceOf(Neo4jVectorStore);

    await vectorStore.close();
  });

  it("should create a test index", async () => {
    const indexName = "test-index";

    const embeddings = new OpenAIEmbeddings({
      openAIApiKey: process.env.OPENAI_API_KEY as string,
      configuration: {
        baseURL: process.env.OPENAI_API_BASE,
      },
    });

    const index = await Neo4jVectorStore.fromTexts(
      ["Neo4j GraphAcademy offers free, self-paced online training"],
      [],
      embeddings,
      {
        url: process.env.NEO4J_URI as string,
        username: process.env.NEO4J_USERNAME as string,
        password: process.env.NEO4J_PASSWORD as string,
        nodeLabel: "Test",
        embeddingNodeProperty: "embedding",
        textNodeProperty: "text",
        indexName,
      }
    );

    expect(index).toBeInstanceOf(Neo4jVectorStore);
    expect(index["indexName"]).toBe(indexName);

    await index.close();
  });
});
Verifying the Test
If every test in the test suite has passed, a new test-index vector index will be created in your database.
Click the Check Database button below to verify the tests have succeeded.
Solution
You can compare your code with the solution in src/solutions/modules/agent/vector.store.ts and double-check that the conditions have been met in the test suite.
You can also run the following Cypher statement to double-check that the index has been created in your database.
SHOW INDEXES WHERE type = 'VECTOR'
Once you have verified your code and re-run the tests, click Try again… to complete the challenge.
Summary
In this lesson, you wrote the code to create a Neo4jVectorStore instance backed by the moviePlots vector index.
In the next lesson, you will use this vector store to retrieve relevant documents and provide them as context to the LLM.