In the last lesson, embeddings were automatically created for you by the Neo4jVector class.
You are going to learn how to create embeddings directly and query Neo4j using Python.
Publicly available Large Language Models (LLMs) typically provide an API that you can use to create embeddings for text.
OpenAI, for example, provides an embeddings API.
The llm-vectors-unstructured/create_embeddings.py program uses the OpenAI API and Python library to create embeddings for text.
import os
from dotenv import load_dotenv
load_dotenv()

from openai import OpenAI

llm = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

response = llm.embeddings.create(
    input="Text to create embeddings for",
    model="text-embedding-ada-002"
)

print(response.data[0].embedding)

You should be able to identify:
- The OpenAI class requires an API key to be passed to it.
- The llm.embeddings.create method is used to create an embedding for a piece of text.
- The text-embedding-ada-002 model is used to create the embedding.
- The response.data[0].embedding attribute is used to access the embedding.
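
Because the OpenAI class needs an API key, a missing OPENAI_API_KEY entry in your .env file is a likely cause of errors when you run the program. The following is a minimal sketch of a check you could run before creating the client. The require_env helper is a hypothetical name used for illustration and is not part of the course repository.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration; not part of the course code.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file."
        )
    return value
```

You could then create the client with OpenAI(api_key=require_env('OPENAI_API_KEY')) so that a missing key is reported before any request is made.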
Run the program and you should see the embedding returned:
[-0.02844466269016266, 0.009961248375475407, 0.0017426918493583798, -0.01016482524573803, 0.019080106168985367, 0.02178979106247425, -0.01836407743394375, -0.005099962465465069, -0.014285510405898094, ... ]

Experiment by changing the input text - you will see the embeddings change.
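
To get a feel for how two embeddings can be compared, here is a small sketch of cosine similarity in plain Python. It assumes cosine is the similarity measure of interest; whether your vector index uses cosine depends on how the index was created. The short vectors are toy values standing in for real embeddings, and cosine_similarity is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (assumed values, not real embeddings)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

Vectors that point in the same direction score close to 1, and unrelated (orthogonal) vectors score 0. You could pass two response.data[0].embedding lists to the same function to compare two pieces of text.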
Query Neo4j with an embedding
Next, you are going to use the embedding to query the Neo4j chunkVector vector index you created in the last lesson.
Open the llm-vectors-unstructured/query_neo4j.py program:
import os
from dotenv import load_dotenv
load_dotenv()

from openai import OpenAI

llm = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

response = llm.embeddings.create(
    input="What does Hallucination mean?",
    model="text-embedding-ada-002"
)

embedding = response.data[0].embedding

# Connect to Neo4j
# graph =

# Run query
# result =

# Display results
# for row ...

The program includes the code to create an embedding.
You will need to add the code to query Neo4j using the embedding.
First, import the LangChain Neo4jGraph class and create an object which will connect to the Neo4j sandbox:
from langchain_neo4j import Neo4jGraph

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD')
)

The Neo4jGraph class provides a simple way to interact with Neo4j from LangChain. It is not a full-featured Neo4j client.

Use the query method to run the Cypher that queries the chunkVector index using the embedding:
result = graph.query("""
CALL db.index.vector.queryNodes('chunkVector', 6, $embedding)
YIELD node, score
RETURN node.text, score
""", {"embedding": embedding})

The embedding is passed to the query method as a key/value pair in a dictionary.
Finally, iterate through the result and print the node.text and score values.
for row in result:
    print(row['node.text'], row['score'])

The complete program:
import os
from dotenv import load_dotenv
load_dotenv()

from openai import OpenAI
from langchain_neo4j import Neo4jGraph

llm = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

response = llm.embeddings.create(
    input="What does Hallucination mean?",
    model="text-embedding-ada-002"
)

embedding = response.data[0].embedding

graph = Neo4jGraph(
    url=os.getenv('NEO4J_URI'),
    username=os.getenv('NEO4J_USERNAME'),
    password=os.getenv('NEO4J_PASSWORD')
)

result = graph.query("""
CALL db.index.vector.queryNodes('chunkVector', 6, $embedding)
YIELD node, score
RETURN node.text, score
""", {"embedding": embedding})

for row in result:
    print(row['node.text'], row['score'])

When running the program, you should see the chunk text printed followed by the score.
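
If you want to experiment with how many results come back, one option is to wrap the query in a function and pass the number of neighbours as a second query parameter. The sketch below is an illustration under stated assumptions: find_similar_chunks and VECTOR_QUERY are hypothetical names, and it assumes the procedure accepts a $k parameter in place of the literal 6. The graph argument can be any object with a query(cypher, params) method, such as the Neo4jGraph object created earlier.

```python
# Same Cypher as in the lesson, with the literal 6 replaced by a $k parameter
# (assumption: the procedure accepts a parameter for this argument).
VECTOR_QUERY = """
CALL db.index.vector.queryNodes('chunkVector', $k, $embedding)
YIELD node, score
RETURN node.text, score
"""


def find_similar_chunks(graph, embedding, k=6):
    """Return the k chunks whose embeddings are closest to `embedding`.

    Hypothetical helper: `graph` is any object exposing
    query(cypher, params), for example a Neo4jGraph instance.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return graph.query(VECTOR_QUERY, {"k": k, "embedding": embedding})
```

With the objects from the program above, find_similar_chunks(graph, embedding, k=3) would request the three nearest chunks instead of six.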
Try modifying the input text and see how the results change.
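
When you compare results for different inputs, long chunk text can make the output hard to scan. The following sketch shows one way to filter rows by a minimum score and shorten the text before printing. The format_results helper, the threshold, and the sample rows are all hypothetical; the rows only mimic the node.text and score keys returned by the query.

```python
def format_results(rows, min_score=0.0, max_chars=80):
    """Turn query rows into short display lines, skipping low-scoring rows."""
    lines = []
    for row in rows:
        if row["score"] < min_score:
            continue
        # Collapse newlines and repeated spaces so each result fits on one line
        text = " ".join(row["node.text"].split())
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        lines.append(f"{row['score']:.3f}  {text}")
    return lines


# Made-up rows with the same keys as the query result (not real course data)
sample_rows = [
    {"node.text": "First  example\nchunk", "score": 0.93},
    {"node.text": "Second example chunk", "score": 0.71},
]

for line in format_results(sample_rows, min_score=0.8):
    print(line)
```

In the program above, you could call format_results(result) in place of the final loop and print each returned line.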
When you have successfully queried Neo4j using the embedding, you can move on to the next lesson.
Lesson Summary
In this lesson, you used the OpenAI API to create an embedding and queried Neo4j using Python.
In the next lesson, you will create a graph of the course content.