Creating Embeddings

Find a movie plot

In this task, you will use Cypher and Python to create embeddings.

To find a movie with a plot you define, you need to create an embedding for your text before you can query the vector index.

To find a movie about "A mysterious spaceship lands on Earth", you need to:

Create an embedding for the text "A mysterious spaceship lands on Earth".
Pass the embedding to the db.index.vector.queryNodes procedure.
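
To make the two steps concrete, here is a minimal sketch in plain Python of what a vector search does conceptually: it compares a query embedding against stored embeddings and returns the closest ones. The three-dimensional vectors, the movie titles, and the most_similar helper are made up for illustration, and cosine similarity is assumed as the comparison. Real embeddings have far more dimensions, and the vector index does this work for you.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, stored, top_k=2):
    """Return the top_k (title, score) pairs, best match first."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical toy "plot embeddings" (not real model output)
toy_plots = {
    "Alien Visitors": [0.9, 0.1, 0.0],
    "Courtroom Drama": [0.0, 0.2, 0.9],
    "Space Rescue": [0.7, 0.6, 0.1],
}

# Stands in for the embedding of the search text
query_embedding = [1.0, 0.2, 0.0]

for title, score in most_similar(query_embedding, toy_plots):
    print(f"{title}: {score:.3f}")
```

The toy query points mostly along the first dimension, so the two space-themed titles rank above the courtroom one.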

Setting browser parameters

To use the genai.* functions and procedures, you will need an API key. You can set the key as a parameter for the duration of the browser session by running the following command:

cypher

Setting parameters

:param openAiApiKey: 'sk-...'

You can use the RETURN clause to verify that the parameter has been set successfully.

cypher

Using parameters

RETURN $openAiApiKey

This mirrors the process you would follow to use a parameter in your application.
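
As a rough illustration of that application-side pattern, the sketch below passes the same parameter from Python. It assumes the official neo4j Python driver (version 5 or later, for execute_query) and an OPENAI_API_KEY environment variable. The connection details in the usage comment are placeholders, and key_is_set and build_parameters are hypothetical helper names.

```python
import os

# Same idea as `RETURN $openAiApiKey`, but returns only a boolean
# so the key itself is never printed.
QUERY = "RETURN $openAiApiKey IS NOT NULL AS keyIsSet"

def build_parameters(env=None):
    """Read the API key from the environment rather than hard-coding it."""
    env = os.environ if env is None else env
    return {"openAiApiKey": env.get("OPENAI_API_KEY")}

def key_is_set(driver, parameters):
    """Run the check query; `driver` is assumed to be a neo4j.Driver."""
    records, _summary, _keys = driver.execute_query(QUERY, parameters_=parameters)
    return records[0]["keyIsSet"]

# Usage (assumes a reachable database; URI and credentials are placeholders):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password")) as driver:
#       print(key_is_set(driver, build_parameters()))
```

Keeping the key in a parameter, rather than concatenating it into the query text, keeps it out of the Cypher string itself.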

Generate embedding

You can generate a new embedding in Cypher using the genai.vector.encode function:

cypher

WITH genai.vector.encode(
    "Text to create embeddings for",
    "OpenAI",
    { token: $openAiApiKey }) AS embedding
RETURN embedding

Query the vector index

You can use the embedding to query the vector index to find similar movies.

cypher

WITH genai.vector.encode(
    "A mysterious spaceship lands on Earth",
    "OpenAI",
    { token: $openAiApiKey }) AS myMoviePlot
CALL db.index.vector.queryNodes('moviePlots', 6, myMoviePlot)
YIELD node, score
RETURN node.title, node.plot, score

Experiment with different movie plots and observe the results.
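
One way to experiment is to wrap the query in a small Python function and call it with different plots. This is a sketch under assumptions, not part of the workshop code: it assumes the neo4j Python driver (version 5 or later) and the moviePlots index used above, and find_similar_movies is a hypothetical helper name. The plot text, API key, and result limit are passed as query parameters.

```python
# The query from above, with the plot text and result count as parameters.
QUERY = """
WITH genai.vector.encode($plot, "OpenAI", { token: $openAiApiKey }) AS myMoviePlot
CALL db.index.vector.queryNodes('moviePlots', $limit, myMoviePlot)
YIELD node, score
RETURN node.title AS title, node.plot AS plot, score
"""

def find_similar_movies(driver, plot, api_key, limit=6):
    """Return (title, score) pairs; `driver` is assumed to be a neo4j.Driver."""
    records, _summary, _keys = driver.execute_query(
        QUERY,
        parameters_={"plot": plot, "openAiApiKey": api_key, "limit": limit},
    )
    return [(record["title"], record["score"]) for record in records]

# Usage (assumes an open driver and an API key in the environment):
#   for plot in ["A mysterious spaceship lands on Earth", "A heist goes wrong"]:
#       print(plot, find_similar_movies(driver, plot, os.getenv("OPENAI_API_KEY")))
```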

Generate embeddings using Python

You can also use LangChain and the OpenAI API to create embeddings using Python.

Open the 1-knowledge-graphs-vectors\create_embeddings.py file in the code editor.

python

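import os

from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings

# Load environment variables (including OPENAI_API_KEY) from the .env file
load_dotenv()

# Create the embedding provider for the text-embedding-ada-002 model
embedding_provider = OpenAIEmbeddings(
    openai_api_key=os.getenv('OPENAI_API_KEY'),
    model="text-embedding-ada-002"
)

# Create an embedding (a list of floats) for the input text
embedding = embedding_provider.embed_query(
    "Text to create embeddings for"
)

print(embedding)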

Review the code before running it and note that:

load_dotenv() loads the environment variables from the .env file.
OpenAIEmbeddings() creates an instance of the OpenAI embedding class using the text-embedding-ada-002 model.
embedding_provider.embed_query() creates an embedding for the input text.
The embedding is printed to the console.

Run the code. You should see a list of numbers representing the embedding:

[-0.028445715084671974, 0.009996716864407063, 0.0017208183417096734, -0.010130099952220917, ...]
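
The raw list is hard to read at a glance. As a small optional sketch, the hypothetical helper below summarizes any embedding by its number of dimensions and its Euclidean length. It takes a plain list of floats, so it should work on the embedding variable from the script as well as on the toy vector used here.

```python
import math

def describe_embedding(embedding, preview=3):
    """Summarize a vector: dimension count, Euclidean norm, first few values."""
    return {
        "dimensions": len(embedding),
        "norm": math.sqrt(sum(x * x for x in embedding)),
        "preview": embedding[:preview],
    }

# Toy vector for illustration (not real model output)
toy = [3.0, 4.0]
print(describe_embedding(toy))
# → {'dimensions': 2, 'norm': 5.0, 'preview': [3.0, 4.0]}
```

OpenAI documents text-embedding-ada-002 as producing 1536-dimensional vectors, so that is the dimension count to expect for the embedding created by the script.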

Continue

When you are ready, you can move on to the next task.

Summary

You learned how to create embeddings using Cypher and Python.

In the next task, you will learn how to create a vector index on an embedding.

Gen-AI - Hands-on Workshop

Knowledge Graphs, Unstructured Data, and Vectors

LLMs, RAG, Python, and LangChain
