In this lesson, you will learn how to use the MERGE clause to create nodes using data from a CSV file.
Load the CSV file
You will load a CSV file of "person" data into Person nodes in Neo4j. The CSV file contains the following fields:
- person_tmdbId
- bio
- born
- bornIn
- died
- person_imdbId
- name
- person_poster
- person_url
Follow these instructions to open the file and inspect the contents:
- Download the persons.csv file.
- Open the file using a text editor to look at the contents. You should see that the file contains headers, and the field delimiter is a comma (,).
- Run the following Cypher statement to load the CSV file and return the contents:
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
RETURN row
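Returning every row can be a lot to scan. As a quick sanity check, a sketch like the following previews only a few rows and a few named fields. The three fields chosen and the limit of 5 are arbitrary; the field names come from the list of CSV fields earlier in this lesson.

```cypher
// Preview a handful of rows; LOAD CSV reads every field as a string
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
RETURN row.person_tmdbId AS tmdbId, row.name AS name, row.born AS born
LIMIT 5
```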
The next step is to use the data in the CSV file to create Person nodes.
Before running it, review the following Cypher statement:
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
  p.imdbId = toInteger(row.person_imdbId),
  p.bornIn = row.bornIn,
  p.name = row.name,
  p.bio = row.bio,
  p.poster = row.person_poster,
  p.url = row.person_url,
  p.born = row.born,
  p.died = row.died
Try to answer the following questions:
- Where does the CSV data come from?
- What does the MERGE clause do?
- What variable holds the data from the CSV file?
- Where are the properties set?
- Why is the toInteger function used?
Review the answers
- The LOAD CSV clause loads the CSV file from the specified URL.
- The MERGE clause creates a new Person node if one does not already exist with the same tmdbId value.
- The row variable holds the data from the CSV file.
- The SET clause sets the properties of the Person node to the values of the corresponding fields in the CSV file.
- The toInteger function converts the person_tmdbId and person_imdbId values from strings to integers.
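The conversion matters because LOAD CSV reads every field as a string. The following sketch shows how toInteger behaves on a few literal values; the literals are made up for illustration and do not come from the CSV file.

```cypher
// toInteger parses a numeric string and returns null when parsing fails
RETURN
  toInteger('1234') AS parsed,      // 1234
  toInteger('abc')  AS unparsable,  // null
  toInteger(null)   AS missing      // null
```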
Create Person nodes
- Run the Cypher statement to create the Person nodes:
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
  p.imdbId = toInteger(row.person_imdbId),
  p.bornIn = row.bornIn,
  p.name = row.name,
  p.bio = row.bio,
  p.poster = row.person_poster,
  p.url = row.person_url,
  p.born = row.born,
  p.died = row.died
The import should create 444 Person nodes.
- Confirm the data is in the graph by returning the first 25 Person nodes:
MATCH (p:Person) RETURN p LIMIT 25
- Check the results. Do the nodes have the correct properties?
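One way to check the import against the expected total is to count the Person nodes. This is a minimal sketch that assumes the graph contained no Person nodes before the import; personCount is just an illustrative alias.

```cypher
// Count all Person nodes; expect 444 if the graph held none before the import
MATCH (p:Person)
RETURN count(p) AS personCount
```

Because MERGE only creates a node when no Person with the same tmdbId exists, running the import a second time should leave this count unchanged.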
Check Your Understanding
Creating nodes
Select the correct Cypher statement to create a node from data in a CSV file.
LOAD CSV WITH HEADERS FROM 'file:///games.csv' AS record
- MERGE (g:Game {title: row.title})
- MERGE (g:Game {title: g.title})
- MERGE (g:Game {title: record.title})
Hint
You get data from the CSV by using the file’s alias.
Solution
The alias of the CSV file is record. You would use record when setting property values.
LOAD CSV WITH HEADERS FROM 'file:///games.csv' AS record
MERGE (g:Game {title: record.title})
Summary
In this lesson, you reviewed a Cypher statement that loads a CSV file and creates nodes from the data in the file.