The Cypher queries you have written run within a single transaction. As a result, if a failure occurs, the changes are rolled back and the graph is unchanged.
Importing significant volumes of data in a single transaction can result in large write operations, which can cause performance issues and potential failure.
You can split a query into multiple transactions using the CALL clause with IN TRANSACTIONS.
CALL {
// query
} IN TRANSACTIONS [OF X ROWS]
For example, the following query creates the Person nodes in separate transactions:
LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv'
AS row
CALL {
  WITH row
  MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
  SET
  p.imdbId = toInteger(row.person_imdbId),
  p.bornIn = row.bornIn,
  p.name = row.name,
  p.bio = row.bio,
  p.poster = row.poster,
  p.url = row.url,
  p.born = date(row.born),
  p.died = date(row.died)
} IN TRANSACTIONS

Note that LOAD CSV sits outside the CALL subquery so that rows stream into it one at a time; the WITH row clause imports the row variable into the subquery.
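One practical point: CALL { … } IN TRANSACTIONS can only run in an implicit (auto-commit) transaction. In Neo4j Browser, you can achieve this by prefixing the query with the :auto command — a minimal sketch of the shape:

:auto CALL {
  // query
} IN TRANSACTIONS

If you run the query inside an explicit transaction (for example, from a driver transaction function), it will fail with an error.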
You can batch the transactions by specifying the number of rows to process in each transaction.
For example, modifying the query above to process 100 rows in each transaction:
} IN TRANSACTIONS OF 100 ROWS
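Putting this together, a trimmed sketch of the batched import might look as follows (only a couple of the properties are set here, for brevity):

:auto LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/importing-cypher/persons.csv'
AS row
CALL {
  WITH row
  MERGE (p:Person {tmdbId: toInteger(row.person_tmdbId)})
  SET p.name = row.name, p.born = date(row.born)
} IN TRANSACTIONS OF 100 ROWS

Choosing a batch size is a trade-off: smaller batches use less memory per commit, while larger batches reduce per-transaction overhead.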
Check Your Understanding
Batching transactions
Complete the following Cypher statement to run a query in batches of 1000 rows.
CALL {
// query
/* complete this line */

❏ } BY TRANSACTIONS OF 1000
❏ } BY TRANSACTIONS OF 1000 ROWS
❏ } IN TRANSACTIONS OF 1000
✓ } IN TRANSACTIONS OF 1000 ROWS
Hint
You need to specify the number of rows in each transaction.
Solution
The correct syntax is:
CALL {
// query
} IN TRANSACTIONS OF 1000 ROWS
Summary
In this lesson, you learned how to split a query into multiple transactions using the CALL clause with IN TRANSACTIONS.
In the next lesson, you will learn the importance of splitting your import into multiple steps and how to avoid the Eager problem.