During this course, you have created queries which complete the following tasks:
-
Create
Person
andMovie
constraints -
Import data from
persons.csv
and createPerson
nodes -
Import data from
movies.csv
and createMovie
nodes -
Create
ACTED_IN
andDIRECTED
relationships betweenPerson
andMovie
nodes -
Create additional
ACTOR
andDIRECTOR
labels onPerson
nodes
All the queries are independent of each other and do not form a single process.
View all the import queries
CREATE CONSTRAINT Person_tmdbId IF NOT EXISTS
FOR (x:Person)
REQUIRE x.tmdbId IS UNIQUE;
CREATE CONSTRAINT Movie_movieId IF NOT EXISTS
FOR (x:Movie)
REQUIRE x.movieId IS UNIQUE;
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = date(row.born),
p.died = date(row.died);
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
MERGE (m:Movie {movieId: toInteger(row.movieId)})
SET
m.tmdbId = toInteger(row.movie_tmdbId),
m.imdbId = toInteger(row.movie_imdbId),
m.released = date(row.released),
m.title = row.title,
m.year = toInteger(row.year),
m.plot = row.plot,
m.budget = toInteger(row.budget),
m.imdbRating = toFloat(row.imdbRating),
m.poster = row.poster,
m.runtime = toInteger(row.runtime),
m.imdbVotes = toInteger(row.imdbVotes),
m.revenue = toInteger(row.revenue),
m.url = row.url,
m.countries = split(row.countries, '|'),
m.languages = split(row.languages, '|');
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/acted_in.csv' AS row
MATCH (p:Person {tmdbId: toInteger(row.person_tmdbId)})
MATCH (m:Movie {movieId: toInteger(row.movieId)})
MERGE (p)-[r:ACTED_IN]->(m)
SET r.role = row.role;
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/directed.csv' AS row
MATCH (p:Person {tmdbId: toInteger(row.person_tmdbId)})
MATCH (m:Movie {movieId: toInteger(row.movieId)})
MERGE (p)-[r:DIRECTED]->(m);
MATCH (p:Person)-[:ACTED_IN]->()
WITH DISTINCT p SET p:Actor;
MATCH (p:Person)-[:DIRECTED]->()
WITH DISTINCT p SET p:Director;
In this lesson, you will build a single import process to rebuild the graph by:
-
Deleting any existing data
-
Dropping any existing constraints
-
Run the queries to create the constraints, nodes, and relationships
The benefit of this approach is that you can easily re-run and test a single repeatable process.
Multiple Queries
To run multiple queries together, you must put a semi-colon (;
) at the end of each query.
For example, this Cypher code contains two separate queries which will run one after the other:
MATCH (p:Person) RETURN p;
MATCH (m:Movie) RETURN m;
Resetting the data
Before you can re-run the import process, you must delete any existing data and drop any constraints.
The nodes and relationships within the graph hold all of the data. You will need to delete the relationships before deleting the nodes.
The following Cypher will delete the ACTED_IN
and DIRECTED
relationships:
MATCH (Person)-[r:ACTED_IN]->(Movie) DELETE r;
MATCH (Person)-[r:DIRECTED]->(Movie) DELETE r;
Once the relationships are deleted, you can delete the Person
and Movie
nodes:
MATCH (p:Person) DELETE p;
MATCH (m:Movie) DELETE m;
Alternatively, you can use DETACH DELETE
to delete the nodes and relationships at the same time:
MATCH (p:Person) DETACH DELETE p;
MATCH (m:Movie) DETACH DELETE m;
You could also drop the constraints on the Person
and Movie
nodes if they exist:
DROP CONSTRAINT Person_tmdbId IF EXISTS;
DROP CONSTRAINT Movie_movieId IF EXISTS;
These queries reset the database and allow you to re-run the import process.
Importing the data
Combining the queries above with those from the previous lessons will create a single import process.
MATCH (p:Person) DETACH DELETE p;
MATCH (m:Movie) DETACH DELETE m;
DROP CONSTRAINT Person_tmdbId IF EXISTS;
DROP CONSTRAINT Movie_movieId IF EXISTS;
CREATE CONSTRAINT Person_tmdbId IF NOT EXISTS
FOR (x:Person)
REQUIRE x.tmdbId IS UNIQUE;
CREATE CONSTRAINT Movie_movieId IF NOT EXISTS
FOR (x:Movie)
REQUIRE x.movieId IS UNIQUE;
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = date(row.born),
p.died = date(row.died);
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
MERGE (m:Movie {movieId: toInteger(row.movieId)})
SET
m.tmdbId = toInteger(row.movie_tmdbId),
m.imdbId = toInteger(row.movie_imdbId),
m.released = date(row.released),
m.title = row.title,
m.year = toInteger(row.year),
m.plot = row.plot,
m.budget = toInteger(row.budget),
m.imdbRating = toFloat(row.imdbRating),
m.poster = row.poster,
m.runtime = toInteger(row.runtime),
m.imdbVotes = toInteger(row.imdbVotes),
m.revenue = toInteger(row.revenue),
m.url = row.url,
m.countries = split(row.countries, '|'),
m.languages = split(row.languages, '|');
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/acted_in.csv' AS row
MATCH (p:Person {tmdbId: toInteger(row.person_tmdbId)})
MATCH (m:Movie {movieId: toInteger(row.movieId)})
MERGE (p)-[r:ACTED_IN]->(m)
SET r.role = row.role;
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/directed.csv' AS row
MATCH (p:Person {tmdbId: toInteger(row.person_tmdbId)})
MATCH (m:Movie {movieId: toInteger(row.movieId)})
MERGE (p)-[r:DIRECTED]->(m);
MATCH (p:Person)-[:ACTED_IN]->()
WITH DISTINCT p SET p:Actor;
MATCH (p:Person)-[:DIRECTED]->()
WITH DISTINCT p SET p:Director;
You can run this query at any point to refresh the database with the latest data.
A single process to build your graph provides a consistent mechanism to test your import.
Check Your Understanding
Running multiple queries
What character must you put at the end of a query to run multiple queries together?
-
❏
,
-
✓
;
-
❏
|
-
❏
/
Hint
Cypher uses the same pattern as many programming languages to determine the end of an instruction.
Solution
To run multiple queries together, you must put a semi-colon (;
) at the end of each query.
Summary
In this lesson, you learned:
-
How to run multiple queries together.
-
The benefits of creating a single import process.
-
How to delete nodes and relationships and drop constraints.
In the next lesson, you will learn to split your import into multiple transactions.