Import Using Cypher

In this Challenge, you will import CSV files that are much larger than those you have used previously.

This challenge has 7 steps:

  1. Delete all nodes and relationships in the graph.

  2. Ensure that all constraints exist in the graph.

  3. Import Movie and Genre data.

  4. Import Person data.

  5. Import the ACTED_IN relationships.

  6. Import the DIRECTED relationships.

  7. Import User data.

Step 1: Delete all nodes and relationships in the graph

As a first step, execute this code in the sandbox to the right to remove all data in the graph.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/shared/detach-delete-all-nodes.cypher[]
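
The included file is not shown in this copy of the lesson. As a minimal sketch, a statement that removes every node together with its relationships could look like this:

```cypher
// Match every node and delete it along with any attached relationships.
MATCH (n)
DETACH DELETE n
```

DETACH is needed because a node that still has relationships cannot be deleted with a plain DELETE.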

Step 2: Ensure all constraints exist in the graph

Execute this code in the sandbox to the right to show the constraints in the graph.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/shared/show-constraints.cypher[]
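
The included file is not shown in this copy of the lesson. A minimal sketch of a statement that lists the constraints currently defined in the graph:

```cypher
// Lists each constraint with its name, type, label, and properties.
SHOW CONSTRAINTS
```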

You must have four uniqueness constraints defined for:

  • Person.tmdbId

  • Movie.movieId

  • User.userId

  • Genre.name

[Screenshot: Constraints created]

If the Person, Movie, and User constraints were previously created by the Data Importer, their names will differ from those used here. That is fine as long as the constraints exist in the graph.

For example, here is the code to create the Genre constraint:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/create-genre-name-constraint.cypher[]

You may need to create additional constraints so that you have a total of four constraints defined.
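
As a sketch of how the four uniqueness constraints listed above could be created, the statements below use the Neo4j 5 constraint syntax. The constraint names (Person_tmdbId and so on) are hypothetical; only the labels and properties come from this lesson.

```cypher
// Constraint names are illustrative assumptions; any unique name works.
// IF NOT EXISTS makes each statement a no-op when an equivalent
// constraint is already present, even under a different name.
CREATE CONSTRAINT Person_tmdbId IF NOT EXISTS
FOR (p:Person) REQUIRE p.tmdbId IS UNIQUE;

CREATE CONSTRAINT Movie_movieId IF NOT EXISTS
FOR (m:Movie) REQUIRE m.movieId IS UNIQUE;

CREATE CONSTRAINT User_userId IF NOT EXISTS
FOR (u:User) REQUIRE u.userId IS UNIQUE;

CREATE CONSTRAINT Genre_name IF NOT EXISTS
FOR (g:Genre) REQUIRE g.name IS UNIQUE;
```

After running these, SHOW CONSTRAINTS should list four uniqueness constraints.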

Step 3: Import Movie and Genre data

As a first step, execute this code in the sandbox to the right so that you can verify that the movie data is being properly transformed from the CSV file:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/list-movie-entities.cypher[]

This is the Cypher code for the first pass through the 2-movieData.csv file, which creates the Movie and Genre nodes. Notice that this code performs all of the necessary type transformations when it sets the properties for the Movie node. We use MERGE to create the Movie and Genre nodes only if they do not already exist, and we also create the IN_GENRE relationships.

Execute this code in the sandbox to the right to read the CSV data and create the Movie and Genre nodes:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-start.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/load-movie-entities.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-end.cypher[]
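
The included files are not shown in this copy of the lesson. The sketch below illustrates the pattern just described (type conversion, MERGE for Movie and Genre nodes, IN_GENRE relationships), batched with CALL { ... } IN TRANSACTIONS. It is not the course's own code: the file location, the Entity and genres columns, the '|' separator, the properties other than movieId, and the batch size of 500 are all assumptions. The :auto prefix is what Neo4j Browser needs in order to run this kind of statement.

```cypher
:auto LOAD CSV WITH HEADERS FROM 'file:///2-movieData.csv' AS row
// Assumed: the file location above and a column named Entity marking the row type.
WITH row WHERE row.Entity = 'Movie'
CALL {
  WITH row
  // MERGE on the constrained key so reruns do not create duplicates.
  MERGE (m:Movie {movieId: toInteger(row.movieId)})
  ON CREATE SET m.title = row.title,                  // assumed column
                m.year = toInteger(row.year),         // assumed column
                m.imdbRating = toFloat(row.imdbRating) // assumed column
  // Assumed: genres is a '|'-separated list; a null value yields no rows.
  WITH m, split(row.genres, '|') AS genres
  UNWIND genres AS genre
  MERGE (g:Genre {name: genre})
  MERGE (m)-[:IN_GENRE]->(g)
} IN TRANSACTIONS OF 500 ROWS
```

Because this sketch sets only a few properties, the counts it reports will not match the figures quoted in this lesson.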

When you execute this code you should see:

Added 9145 labels, created 9145 nodes, set 146020 properties, created 20340 relationships.

You may encounter a Neo.ClientError.Transaction.TransactionTimedOut error. This means that only part of the import was committed to the graph. You can simply rerun the code, but the reported numbers of nodes, labels, properties, and relationships created may differ from those shown above.
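
To see how much of an import was committed after a timeout, one option is to count the nodes per label combination. This is an illustrative check, not part of the course's code:

```cypher
// Count nodes grouped by their label sets, largest groups first.
MATCH (n)
RETURN labels(n) AS labels, count(*) AS nodes
ORDER BY nodes DESC
```

Comparing these counts before and after a rerun shows whether the second run added the nodes that were missing.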

Step 4: Import Person data

As a first step, execute this code in the sandbox to the right so that you can verify that the person data is being properly transformed from the CSV file:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/list-person-entities.cypher[]

This is the Cypher code for the second pass through the 2-movieData.csv file, which creates the Person nodes. Notice that this code performs all the necessary type transformations when it sets the properties for the Person node. We use MERGE to create the Person nodes only if they do not already exist. The Actor label, the ACTED_IN relationships, and their role property are added in Step 5.

Execute this code in the sandbox on the right.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-start.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/load-person-entities.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-end.cypher[]
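
The included files are not shown in this copy of the lesson. As a hedged sketch of a Person pass, the statement below merges one Person node per row on the constrained tmdbId key. The file location, the Entity column, and the name column are assumptions, and the real import sets more properties than this.

```cypher
:auto LOAD CSV WITH HEADERS FROM 'file:///2-movieData.csv' AS row
// Assumed: Person rows are marked by an Entity column.
WITH row WHERE row.Entity = 'Person'
CALL {
  WITH row
  // Convert the CSV string to an integer before matching on the key.
  MERGE (p:Person {tmdbId: toInteger(row.tmdbId)})
  ON CREATE SET p.name = row.name   // assumed column
} IN TRANSACTIONS OF 500 ROWS
```

The counts reported by this sketch will differ from the lesson's figures because it sets fewer properties.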

When you execute this code, you should see:

Added 19047 labels, created 19047 nodes, set 152376 properties

You may encounter a Neo.ClientError.Transaction.TransactionTimedOut error. This means that only part of the import was committed to the graph. You can simply rerun the code, but the reported numbers of nodes, labels, properties, and relationships created may differ from those shown above.

Step 5: Import the ACTED_IN relationships

As a first step, execute this code in the sandbox to the right to see what data is being read from the CSV file:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/list-acting-entities.cypher[]

This is the Cypher code for the third pass through the 2-movieData.csv file, which creates the ACTED_IN relationships in the graph. We also add the Actor label to the Person node. Execute this code in the sandbox on the right.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-start.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/load-acting-entities.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-end.cypher[]
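
The included files are not shown in this copy of the lesson. A relationship pass of this kind could be sketched as follows: find the Person and Movie nodes created earlier, merge the ACTED_IN relationship, and add the Actor label. The file location, the Entity value 'Join', and the Work value 'Acting' are assumptions made by analogy with the "Directing" value mentioned in Step 6.

```cypher
:auto LOAD CSV WITH HEADERS FROM 'file:///2-movieData.csv' AS row
// Assumed values for selecting the acting rows.
WITH row WHERE row.Entity = 'Join' AND row.Work = 'Acting'
CALL {
  WITH row
  // MATCH rather than MERGE: both nodes should exist from Steps 3 and 4.
  MATCH (p:Person {tmdbId: toInteger(row.tmdbId)})
  MATCH (m:Movie {movieId: toInteger(row.movieId)})
  MERGE (p)-[r:ACTED_IN]->(m)
  ON CREATE SET r.role = row.role   // a null role sets no property
  SET p:Actor
} IN TRANSACTIONS OF 500 ROWS
```

Rows whose Person or Movie node is missing are skipped, since MATCH produces no result for them.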

When you execute this code, you should see:

Added 15443 labels, set 34274 properties, created 35910 relationships

You may encounter a Neo.ClientError.Transaction.TransactionTimedOut error. This means that only part of the import was committed to the graph. You can simply rerun the code, but the reported numbers of nodes, labels, properties, and relationships created may differ from those shown above.

Step 6: Import the DIRECTED relationships

As a first step, execute this code in the sandbox to the right to see what data is being read from the CSV file:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/list-directing-entities.cypher[]

Some rows in the CSV file with a Work value of "Directing" also have an associated role value. Modify the above query to show such rows.

Hint: Add AND row.role IS NOT NULL to the WHERE clause.
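
Since the base query is not shown in this copy of the lesson, here is a sketch of what the modified query might look like. The file location, the Entity column, and its 'Join' value are assumptions; row.Work and row.role come from the text above.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///2-movieData.csv' AS row
WITH row
// Keep only directing rows that also carry a role value.
WHERE row.Entity = 'Join'
  AND row.Work = 'Directing'
  AND row.role IS NOT NULL
RETURN row.tmdbId, row.movieId, row.role
LIMIT 10
```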

This is the Cypher code for the fourth pass through the 2-movieData.csv file, which creates the DIRECTED relationships in the graph. We also add the Director label to the Person node. Execute this code in the sandbox on the right.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-start.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/load-directing-entities.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-end.cypher[]

When you execute this code, you should see:

Added 4091 labels, set 1152 properties, created 10007 relationships

Step 7: Import the User data

The 2-ratingData.csv file contains data for users who rated movies.

As a first step, execute this code in the sandbox to the right to see what data is being read from the CSV file:

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/list-ratings.cypher[]

Here is the code to create the User nodes and the RATED relationships.

Execute this code in the sandbox on the right.

Cypher
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-start.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/load-ratings.cypher[]
Unresolved directive in lesson.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/call-end.cypher[]
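
The included files are not shown in this copy of the lesson. As a sketch of a ratings import, the statement below merges each User on the constrained userId key and connects it to an existing Movie. The file location, the column names (userId, movieId, name, rating, timestamp), and the integer conversions are assumptions.

```cypher
:auto LOAD CSV WITH HEADERS FROM 'file:///2-ratingData.csv' AS row
CALL {
  WITH row
  // The Movie nodes should already exist from Step 3.
  MATCH (m:Movie {movieId: toInteger(row.movieId)})
  MERGE (u:User {userId: toInteger(row.userId)})
  ON CREATE SET u.name = row.name
  MERGE (u)-[r:RATED]->(m)
  ON CREATE SET r.rating = toInteger(row.rating),
                r.timestamp = toInteger(row.timestamp)
} IN TRANSACTIONS OF 500 ROWS
```

The counts this sketch reports may not match the lesson's figures if the real import sets different properties.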

When you execute this code, you should see:

Added 671 labels, created 671 nodes, set 201350 properties, created 100004 relationships

You may encounter a Neo.ClientError.Transaction.TransactionTimedOut error. This means that only part of the import was committed to the graph. You can simply rerun the code, but the reported numbers of nodes, labels, properties, and relationships created may differ from those shown above.

Validate Results

Once you have completed the steps of this Challenge, click the Check Database button and we will check the database for you.

Hint

Did you execute all seven steps of the Challenge?

Solution

Here is all the code that you should have executed.

You may want to execute each section (separated by "//" comments) separately so that each code block executes without timing out.

cypher
Unresolved directive in questions/verify.adoc - include::{repository-raw}/main/modules/4-importing-data-cypher/lessons/2-c-importing-with-cypher/solution.cypher[]

If your graph does not verify, you may need to redo ALL the steps.

Summary

In this challenge, you imported a large dataset using Cypher.

This concludes your introduction to importing CSV data into Neo4j.