In this lesson, you will learn how to use the MERGE clause to create nodes using data from a CSV file.
Load the CSV file
You will load a CSV file of "person" data into Person nodes in Neo4j. The CSV file contains the following fields:
- person_tmdbId
- bio
- born
- bornIn
- died
- person_imdbId
- name
- person_poster
- person_url
Follow these instructions to open the file and inspect the contents:
- Download the persons.csv file.
- Open the file using a text editor to look at the contents. You should see that the file contains headers, and the field delimiter is a comma (,).
- Run the following Cypher statement to load the CSV file and return the contents:
  LOAD CSV WITH HEADERS
  FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
  RETURN row
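Returning every row can be a lot to read. As a minimal sketch, the statement below previews only a few rows and columns; the column aliases (tmdbId, name, born) and the limit of 5 are illustrative choices, not part of the lesson.

```cypher
// Preview a handful of rows instead of the whole file.
// The field names come from the CSV header row listed earlier.
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
RETURN row.person_tmdbId AS tmdbId, row.name AS name, row.born AS born
LIMIT 5
```

Because the file is loaded WITH HEADERS, each row is a map keyed by header name, which is why the fields can be read as row.name, row.born, and so on.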
The next step is to use the data in the CSV file to create Person nodes.
Before running it, review the following Cypher statement:
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.person_poster,
p.url = row.person_url,
p.born = row.born,
p.died = row.died
Try to answer the following questions:
- Where does the CSV data come from?
- What does the MERGE clause do?
- What variable holds the data from the CSV file?
- Where are the properties set?
- Why is the toInteger function used?
Review the answers
- The LOAD CSV clause loads the CSV file from the specified URL.
- The MERGE clause creates a new Person if one does not already exist with the same tmdbId value.
- The row variable holds the data from the CSV file.
- The SET clause sets the properties of the Person node to the values of the corresponding fields in the CSV file.
- The toInteger function converts the person_tmdbId and person_imdbId values from strings to integers.
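To see why the conversion matters, it may help to compare a raw field with its converted value. The statement below is a small sketch that creates no nodes; the aliases raw and converted are illustrative names.

```cypher
// LOAD CSV reads every field as a string, so the raw value is text
// and the converted value is an integer.
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
RETURN
  row.person_tmdbId AS raw,
  toInteger(row.person_tmdbId) AS converted
LIMIT 5
```

Without the conversion, tmdbId would be stored as a string, and a later lookup using an integer value would not match it.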
Create Person nodes
- Run the Cypher statement to create the Person nodes:
  LOAD CSV WITH HEADERS
  FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
  MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
  SET
  p.imdbId = toInteger(row.person_imdbId),
  p.bornIn = row.bornIn,
  p.name = row.name,
  p.bio = row.bio,
  p.poster = row.person_poster,
  p.url = row.person_url,
  p.born = row.born,
  p.died = row.died
  The import should create 444 Person nodes.
- Confirm the data is in the graph by returning the first 25 Person nodes:
  MATCH (p:Person) RETURN p LIMIT 25
- Check the results. Do the nodes have the correct properties?
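A quick way to check the import is to count the nodes. This is a minimal sketch that assumes the database held no Person nodes before the import; personCount is an illustrative alias.

```cypher
// Count the imported nodes; the lesson states the import creates 444.
MATCH (p:Person)
RETURN count(p) AS personCount
```

If you run the import statement a second time and count again, the number should stay the same, because MERGE matches the existing nodes by tmdbId instead of creating duplicates.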
Check Your Understanding
Creating nodes
Select the correct Cypher statement to create a node from data in a CSV file.
LOAD CSV WITH HEADERS FROM 'file:///games.csv' AS record
/* select the correct MERGE clause */
- ❏ MERGE (g:Game {title: row.title})
- ❏ MERGE (g:Game {title: g.title})
- ✓ MERGE (g:Game {title: record.title})
Hint
You get data from the CSV by using the file’s alias.
Solution
The alias of the CSV file is record. You would use record when setting property values.
LOAD CSV WITH HEADERS FROM 'file:///games.csv' AS record
MERGE (g:Game {title: record.title})
Summary
In this lesson, you reviewed a Cypher statement that loads a CSV file and creates nodes from the data in the file.