How can I import data into Neo4j?

Approaches for Importing Data

When I say importing data, I mean the process of moving data from one system into Neo4j.

The import could be a one-time operation, or it could be a regular process that you need to automate.

Typically, the source system has a different data model than Neo4j, so importing also means transforming the data into a graph model.
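To make the transformation concrete, here is a minimal sketch in Python. It assumes a hypothetical relational source with `customers` and `orders` tables, where `orders.customer_id` is a foreign key. The table names, column names, and the `PLACED` relationship type are illustrative assumptions and do not come from any Neo4j API. Each row becomes a node, and each foreign key becomes a relationship.

```python
# Hypothetical relational rows; in practice these would come from the source system.
customers = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]
orders = [
    {"id": 100, "customer_id": 1, "total": 25},
    {"id": 101, "customer_id": 1, "total": 40},
    {"id": 102, "customer_id": 2, "total": 15},
]


def to_graph(customers, orders):
    """Turn rows into node and relationship records (plain dicts)."""
    nodes = []
    relationships = []
    for c in customers:
        nodes.append({"label": "Customer", "key": ("Customer", c["id"]),
                      "properties": {"name": c["name"]}})
    for o in orders:
        nodes.append({"label": "Order", "key": ("Order", o["id"]),
                      "properties": {"total": o["total"]}})
        # The foreign key column is not kept as a property; it becomes
        # a (Customer)-[:PLACED]->(Order) relationship instead.
        relationships.append({"type": "PLACED",
                              "start": ("Customer", o["customer_id"]),
                              "end": ("Order", o["id"])})
    return nodes, relationships


nodes, relationships = to_graph(customers, orders)
print(len(nodes), "nodes,", len(relationships), "relationships")
```

The records are plain dictionaries so that the shape of the result is easy to inspect. Any of the tools described in this lesson could then load data of this shape into Neo4j.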

The source may expose data in different ways, for example:

  • Relational Database Management Systems (RDBMS)

  • Web APIs

  • Public data directories

  • BI tools

  • Excel

  • Flat files (CSV, JSON, XML)

Methods

The method by which you import data into Neo4j will depend on several factors, including:

  • The source of the data

  • The volume of data

  • The frequency of the import

  • The complexity of the data model

  • The transformation required

Options

The options available to you are numerous, and include:

  • One-off batch import of all data

  • One-off load with a regular update

  • Continuous import of data

  • Real-time application updates

  • ETL (Extract, Transform, Load) pipelines

It depends!

Ultimately, how you import data into Neo4j will depend on your specific requirements.

Tools

There are many tools available to import data into Neo4j, including:

  • Data Importer

  • Cypher and LOAD CSV

  • neo4j-admin

  • ETL tools

  • Custom application

Data Importer

The Neo4j Data Importer is a "no-code" tool for importing CSV data into Neo4j. Its graphical user interface lets you map the data to nodes and relationships.

Data Importer allows you to:

  • Visually define the graph data model, including nodes, relationships, and properties

  • Upload source data files

  • Map CSV columns to properties

  • Define unique ID constraints and indexes

Data Importer is an excellent tool for quickly importing data into Neo4j without writing any code.

A screenshot of the Neo4j Data Importer user interface

Cypher and LOAD CSV

Cypher has built-in support for importing data from CSV files using the LOAD CSV clause.

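```cypher
// Read the CSV file row by row; WITH HEADERS exposes each column by name
LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row

// MERGE on the id so re-running the import does not create duplicates
MERGE (t:Transactions {id: row.id})
// CSV values arrive as strings, so convert them to the intended types
SET
    t.reference = row.reference,
    t.amount = toInteger(row.amount),
    t.timestamp = datetime(row.timestamp)
```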

You can control the import process by writing Cypher queries to:

  • Load data from CSV files

  • Create the data model

  • Transform and aggregate data

  • Control transactions

You can learn more about using Cypher and LOAD CSV in the Importing CSV data into Neo4j GraphAcademy course.

neo4j-admin

The neo4j-admin import command-line tool supports importing large data sets. It converts CSV files into the internal binary format of Neo4j and can import millions of rows within minutes.

The neo4j-admin import command expects you to format the data in a specific way and requires the database to be offline during the import process.
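The required format is largely a matter of CSV header conventions: `:ID` marks a node identifier, `:LABEL` holds the node label, and relationship files use `:START_ID`, `:END_ID`, and `:TYPE`. The sketch below writes such files with Python's standard `csv` module. The `Account` nodes, the `MADE` relationship type, and all sample values are hypothetical and only illustrate the layout.

```python
import csv
import io


def write_csv(header, rows):
    """Return CSV text with the given header row followed by the data rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Node file: ":ID" marks the unique identifier, ":int" types a property,
# and ":LABEL" holds the node label. The ID space in parentheses keeps
# identifiers from different node files from clashing.
transactions_csv = write_csv(
    ["id:ID(Transaction)", "reference", "amount:int", ":LABEL"],
    [["t1", "INV-001", 250, "Transaction"],
     ["t2", "INV-002", 90, "Transaction"]],
)

# A second, hypothetical node file for accounts.
accounts_csv = write_csv(
    ["id:ID(Account)", "name", ":LABEL"],
    [["a1", "Main", "Account"]],
)

# Relationship file: start node, end node, and relationship type.
made_csv = write_csv(
    [":START_ID(Account)", ":END_ID(Transaction)", ":TYPE"],
    [["a1", "t1", "MADE"], ["a1", "t2", "MADE"]],
)

print(transactions_csv)
```

Files like these are passed to the import command through its `--nodes` and `--relationships` options. The exact invocation differs between Neo4j versions, so check the documentation for the version you run.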

Data should preferably be clean and transformed before importing, as faults in the data will reduce the performance of the import process.

The neo4j-admin import command is a highly configurable, high-performance tool, suited to cases where you need to import large data sets very quickly.

You can learn more about using neo4j-admin import in the Neo4j-admin import tutorial.

ETL (Extract, Transform, Load) Tool

An ETL tool, for example Apache Hop, is a good choice for importing data from multiple sources. ETL tools generally support various data sources, can transform data into the desired format, and have visualization tools.

Many organizations use ETL tools to import data into Neo4j because they can handle complex data transformations and integrations.

Custom integration using Neo4j drivers

Building a custom application to load data into the graph database is a good option if you have complex business rules or need to integrate with other systems. A custom application gives you complete control over the import process and over how it connects to other data sources.
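As a rough illustration of what such an application might look like, the sketch below batches rows and sends each batch to Neo4j with a parameterized `UNWIND` query. It assumes the official `neo4j` Python driver (version 5 or later) is installed. The `batched` and `import_rows` helpers, the batch size, and the row fields are hypothetical choices for this example, not a prescribed design.

```python
from itertools import islice

# Parameterized Cypher: each batch is sent as a list and unpacked with UNWIND.
IMPORT_QUERY = """
UNWIND $rows AS row
MERGE (t:Transactions {id: row.id})
SET t.reference = row.reference,
    t.amount = row.amount
"""


def batched(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def import_rows(uri, user, password, rows, batch_size=1000):
    """Send rows to Neo4j in batches; requires the `neo4j` package."""
    # Imported here so that batched() can be used without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        for batch in batched(rows, batch_size):
            # Keyword arguments are passed to the query as parameters.
            driver.execute_query(IMPORT_QUERY, rows=batch)


# Sample rows; a real application would read these from its source system.
rows = [{"id": str(i), "reference": f"REF-{i}", "amount": i * 10} for i in range(5)]
print([len(b) for b in batched(rows, 2)])
```

Calling `import_rows("neo4j://localhost:7687", "neo4j", "password", rows)` would run the import against a local database. The URI and credentials here are placeholders. Sending rows in batches keeps each transaction small, and this is the natural place to apply business rules before the data reaches the graph.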

There are several GraphAcademy courses for developers, where you can learn how to build applications using Neo4j drivers.

Mixed approaches

In practice, you may use a combination of these tools to import data into Neo4j. For example, you may use Data Importer for quick prototyping, Cypher for small data sets, and ETL tools for complex data transformations.

You may also choose to do one-off batch imports using one tool and real-time data ingestion using another tool.

Options for Importing Data

The Neo4j field team shared this flowchart of potential options and tools.

This flowchart is not an exhaustive list or a decision-making tool. It is a conversation starter, a way to understand some options, decision points, and tools available.

A flowchart showing numerous paths and options for importing data into Neo4j. The flowchart lists import tools such as neosemantics, neo4j-admin import, custom apps, ETL tools, and more.

Summary

In this lesson, you explored some of the approaches for importing data.