In this lesson, you will explore how to cast data from a CSV file to different data types in Neo4j.
Casting
All data loaded using LOAD CSV is returned as strings. You need to cast the data to an appropriate data type before writing it to a property.
The types of data that you can store as properties in Neo4j include:
- String
- Integer
- Float (decimal values)
- Boolean
- Date/Datetime
- Point (spatial)
- Lists of values
There are Cypher functions to cast data to appropriate types. For example, when creating the Person nodes, you used the toInteger() function to cast IDs to integers.
...
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId)
Cypher functions to cast data include:
Function | Description |
---|---|
toBoolean() | Converts a string to a boolean value |
toFloat() | Converts a string to a float value |
toInteger() | Converts a string to an integer value |
toString() | Converts a value to a string |
date() | Converts a string to a date value |
datetime() | Converts a string to a date and time value |
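As a rough illustration of the casting functions described above, the following query casts string literals rather than CSV fields. The literal values and the column aliases are made up for this sketch and do not come from the course data.

```cypher
// Sketch only: each input is a string literal standing in for a CSV field.
RETURN
  toBoolean('true')                AS booleanValue,   // string -> Boolean
  toFloat('3.14')                  AS floatValue,     // string -> Float
  toInteger('42')                  AS integerValue,   // string -> Integer
  toString(42)                     AS stringValue,    // any value -> String
  date('2020-01-31')               AS dateValue,      // ISO 8601 string -> Date
  datetime('2020-01-31T12:00:00Z') AS datetimeValue   // ISO 8601 string -> DateTime
```

Note that toInteger(), toFloat(), and toBoolean() return null when the string cannot be parsed, whereas date() and datetime() raise an error for a string that is not in a supported format.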
You can use the apoc.meta.nodeTypeProperties() procedure to show the data types used in the graph:
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
Review the results and note that, except for the IDs, the properties of Person nodes are all strings.
Node Type | Property | Data Type |
---|---|---|
":`Person`" | "tmdbId" | ["Long"] |
":`Person`" | "imdbId" | ["Long"] |
":`Person`" | "bornIn" | ["String"] |
":`Person`" | "born" | ["String"] |
":`Person`" | "name" | ["String"] |
":`Person`" | "bio" | ["String"] |
":`Person`" | "died" | ["String"] |
The data type Long is reported for integer values.
Person node dates
The born and died properties of the Person nodes represent dates, but they are currently stored as strings.
You used this Cypher statement to create the Person nodes:
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = row.born,
p.died = row.died
It should be modified to use the date() function to convert the born and died properties to Date values.
Correct the Person nodes
Run this updated query to convert the born and died properties to Date values:
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = date(row.born),
p.died = date(row.died)
Using MERGE not CREATE?
Because MERGE was used in this Cypher statement, you can run it multiple times without creating duplicate nodes. It will update the existing nodes with the new date values. If you used CREATE instead, you would create new nodes each time you ran the statement.
Use the apoc.meta.nodeTypeProperties() procedure again to check that the born and died properties are now Date values:
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
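If the full output is hard to scan, one option is to filter it down to the two properties of interest. This is a minimal sketch that assumes the nodeType value appears exactly as shown in the earlier results (":`Person`"); after the update, both rows would be expected to list Date.

```cypher
// Sketch: narrow the APOC output to the born and died properties of Person.
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
WHERE nodeType = ':`Person`'
  AND propertyName IN ['born', 'died']
RETURN propertyName, propertyTypes
```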
Advantages of using Date
The Date data type allows you to extract the year, month, and day from the date. For example:
MATCH (p:Person)
RETURN p.born.year as YearOfBirth
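Date values can also be compared and used in date arithmetic, which is not reliable with plain strings. The query below is a sketch that assumes born and died are already stored as Date values; the 1970 cutoff and the column aliases are arbitrary choices for illustration.

```cypher
// Sketch only: assumes p.born and p.died are Date values, not strings.
MATCH (p:Person)
WHERE p.born >= date('1970-01-01')   // compare against a Date, not a string
  AND p.died IS NOT NULL             // only people with a recorded death date
RETURN p.name AS name,
       p.born.year AS yearOfBirth,
       p.born.month AS monthOfBirth,
       p.born.day AS dayOfBirth,
       duration.between(p.born, p.died).years AS ageAtDeath
ORDER BY yearOfBirth
```

Here duration.between() returns a Duration, and its years component gives the number of whole years between the two dates.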
The remaining properties are all string values, so casting them to a different data type is unnecessary.
Check Your Understanding
Casting strings
True or False - You must cast text data from a CSV file to a string.
-
❏ True
-
✓ False
Hint
All data loaded using LOAD CSV is returned as strings.
Solution
The statement is False. All fields loaded from CSV files are already strings, so text fields do not need to be cast to a string.
Summary
In this lesson, you learned how to cast data to different data types in Neo4j.
In the next lesson, you will update the Movie nodes to use the relevant data type for each property.