Finding Movie Plots

In the previous lesson, you learned about the theory of semantic search.

In this lesson, you will use Neo4j to explore a simple example of semantic search.

GraphAcademy has loaded the sandbox with 1000 movies and their plots. Run the following Cypher query to return the titles and plots for the movies in the database:

cypher
MATCH (m:Movie)
RETURN m.title, m.plot

Review the movies and find a plot that you think looks interesting.

You can adapt the query to only return a named movie by adding a filter:

cypher
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.plot

Finding similar movies

Semantic search works by comparing numerical representations of the text (known as embeddings).

You can view the embedding for a movie plot by running the following query:

cypher
MATCH (m:Movie {title: "Toy Story"})
RETURN m.title, m.plotEmbedding

You can find similar movies by using the embedding for the movie plot and a vector index.

You will learn more about vectors and embeddings and how to create them in the rest of this course.

You can query the vector index to find similar movies by running the following query:

cypher
MATCH (m:Movie {title: 'Toy Story'})

CALL db.index.vector.queryNodes('moviePlots', 6, m.plotEmbedding)
YIELD node, score

RETURN node.title, node.plot, score

In this example, the query returns plots similar to that of the Toy Story movie.

Understand and search unstructured data using vector indexesSimilar Plots Results

title

plot

score

"Toy Story"

"A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy’s room."

1.0

"Little Rascals, The"

"Alfalfa is wooing Darla and his He-Man-Woman-Hating friends attempt to sabotage the relationship."

0.9214372634887695

"NeverEnding Story III, The"

"A young boy must restore order when a group of bullies steal the magical book that acts as a portal between Earth and the imaginary world of Fantasia."

0.9206198453903198

"Drop Dead Fred"

"A young woman finds her already unstable life rocked by the presence of a rambunctious imaginary friend from childhood."

0.9199690818786621

"E.T. the Extra-Terrestrial"

"A troubled child summons the courage to help a friendly alien escape Earth and return to his home-world."

0.919100284576416

"Gumby: The Movie"

"In this offshoot of the 1950s claymation cartoon series, the crazy Blockheads threaten to ruin Gumby’s benefit concert by replacing the entire city of Clokeytown with robots."

0.9180967211723328

The similarity score is between 0.0 and 1.0, with 1.0 being the most similar. Note how the most similar plot is that of the Toy Story movie itself!

Run the same query for different movies and explore the results. Identify how the plots are similar and how the similarity score changes.

Lesson Summary

In this lesson, you explored a simple example of semantic search using Neo4j.

In the next lesson, you will learn about how semantic search uses embeddings and vectors to compare text.