In this lesson, you will explore how to cast data from a CSV file to different data types in Neo4j.
Casting
LOAD CSV returns every field as a string, so you need to cast the data to an appropriate data type before writing it to a property.
The types of data that you can store as properties in Neo4j include:
- String
- Integer
- Float (decimal values)
- Boolean
- Date/Datetime
- Point (spatial)
- Lists of values
There are Cypher functions to cast data to appropriate types. For example, when creating the Person nodes, you used the toInteger() function to cast IDs to integers.
...
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId)
Cypher functions to cast data include:
| Function | Description |
|---|---|
| toBoolean() | Converts a string to a boolean value |
| toFloat() | Converts a string to a float value |
| toInteger() | Converts a string to an integer value |
| toString() | Converts a value to a string |
| date() | Converts a string to a date value |
| datetime() | Converts a string to a date and time value |
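To get a feel for these functions before applying them to CSV data, you can call them on literal strings. This is a minimal sketch: the input values are made up for illustration and are not taken from the course data.

```cypher
// Cast literal strings to see what each function returns
RETURN
  toBoolean('true')                AS booleanValue,  // true
  toFloat('3.14')                  AS floatValue,    // 3.14
  toInteger('42')                  AS integerValue,  // 42
  toString(42)                     AS stringValue,   // '42'
  date('2024-01-15')               AS dateValue,     // 2024-01-15
  datetime('2024-01-15T10:30:00Z') AS datetimeValue, // a DateTime in UTC
  toInteger('not a number')        AS unparsable     // null
```

toInteger(), toFloat(), and toBoolean() return null when the string cannot be parsed, so a malformed value would typically leave the property unset rather than raise an error.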
You can use the apoc.meta.nodeTypeProperties() procedure to show the data types used in the graph:
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
Review the results and note that, except for the IDs, the properties of Person are all stored as strings.
| Node Type | Property | Data Type |
|---|---|---|
| ":`Person`" | "tmdbId" | ["Long"] |
| ":`Person`" | "imdbId" | ["Long"] |
| ":`Person`" | "bornIn" | ["String"] |
| ":`Person`" | "born" | ["String"] |
| ":`Person`" | "name" | ["String"] |
| ":`Person`" | "bio" | ["String"] |
| ":`Person`" | "died" | ["String"] |
Integer values are reported as Long.
Person node dates
The born and died properties of the Person nodes represent dates, but they are currently stored as strings.
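One way to confirm how these two properties are currently stored is to filter the procedure output by property name. This is a sketch that reuses the apoc.meta.nodeTypeProperties() call from earlier in the lesson; the filter on born and died is an addition for illustration.

```cypher
// Show the stored type(s) of the born and died properties only
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
WHERE propertyName IN ['born', 'died']
RETURN nodeType, propertyName, propertyTypes
```

While the values are still strings, this should report String for both properties. Once the import has been corrected, the same query should report Date.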
You used this Cypher statement to create the Person nodes:
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = row.born,
p.died = row.died
It should be modified to use the date() function to convert the born and died properties to Date values.
Correct the Person nodes
Run this updated query to modify the born and died properties to be Date values.
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
SET
p.imdbId = toInteger(row.person_imdbId),
p.bornIn = row.bornIn,
p.name = row.name,
p.bio = row.bio,
p.poster = row.poster,
p.url = row.url,
p.born = date(row.born),
p.died = date(row.died)
Using MERGE not CREATE?
Because MERGE was used in this Cypher statement, you can run it multiple times without creating duplicate nodes. It will update the existing nodes with the new date values. If you used CREATE instead, you would create new nodes each time you ran the statement.
Use the apoc.meta.nodeTypeProperties() procedure again to check that the born and died properties are now Date values:
CALL apoc.meta.nodeTypeProperties()
YIELD nodeType, propertyName, propertyTypes
Advantages of using Date
The Date data type allows you to extract the year, month, and day from the date. For example, this query returns each person's year of birth:
MATCH (p:Person)
RETURN p.born.year AS YearOfBirth
The remaining properties are all string values, so casting them to a different data type is unnecessary.
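As a further illustration of working with Date values, the sketch below first pulls the components out of a literal date, then filters Person nodes by date. The literal dates and the cutoff are arbitrary values chosen for the example, and the second query assumes the born property has already been converted with date(). Run the two statements separately if your client does not support multiple statements.

```cypher
// Extract components from a literal date (no graph data needed)
WITH date('1956-07-09') AS born
RETURN born.year AS year, born.month AS month, born.day AS day;
// year = 1956, month = 7, day = 9

// Date values compare chronologically, so range filters and sorting work
MATCH (p:Person)
WHERE p.born >= date('1970-01-01')
RETURN p.name AS name, p.born AS born
ORDER BY p.born
LIMIT 10;
```

Comparisons like this are only meaningful once the values are Date values, because strings are compared character by character rather than chronologically.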
Check Your Understanding
Casting strings
True or False - You must cast text data from a CSV file to a string.
- ❏ True
- ✓ False
Hint
All data loaded using LOAD CSV will be returned as strings
Solution
The statement is False. LOAD CSV already returns every field as a string, so text fields do not need to be cast to a string.
Summary
In this lesson, you learned how to cast data to different data types in Neo4j.
In the next lesson, you will update the Movie nodes to use the relevant data type for each property.