Developing a data model

Your import process defines the graph data model. Neo4j is schema-optional—you create nodes and relationships as you import, and that becomes your model. Design the model for your query objectives, not for the source structure.

Design for your objectives, not for the source structure

You should not let the source data structure dictate the graph data model. Instead, build a data model that works for your project’s objectives.

Create an import process that transforms the source data into a graph data model; do not create a model that fits the source data.

Common Import Misconceptions

When extracting and preparing data for import, avoid these common mistakes:

Misconception 1: "Export all columns from every table"

Believing that a complete export of every column from the source tables is necessary or advisable for the graph import.

Select only needed columns

Wrong: Extracting every column from the relational database.

Reality: Only extract columns that will become:

  • Node properties you actually need

  • Relationship properties

  • Foreign keys needed to create relationships

Columns like created_at, updated_at, or internal flags may not be needed in the graph.

Exporting orders with internal_version, row_checksum, and legacy_system_id adds noise. Extract only order_id, order_date, customer_id, employee_id, ship_via, and any columns that become useful properties or relationship keys.
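As a minimal sketch of this kind of extraction, the following Python trims an exported CSV down to a chosen set of columns before import. The helper name select_columns, the ORDER_COLUMNS list, and the sample row are illustrative assumptions, not part of any Neo4j tool.

```python
import csv
import io

# Hypothetical set of columns worth keeping for Order nodes and their relationships.
ORDER_COLUMNS = ["order_id", "order_date", "customer_id", "employee_id", "ship_via"]


def select_columns(source, columns):
    """Return CSV text that holds only the requested columns of a CSV file object."""
    reader = csv.DictReader(source)
    out = io.StringIO()
    # extrasaction="ignore" silently drops every column not listed in `columns`.
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample export that includes two noise columns.
raw = io.StringIO(
    "order_id,order_date,customer_id,employee_id,ship_via,internal_version,row_checksum\n"
    "1,2024-01-15,C1,5,3,7,ab12\n"
)
trimmed = select_columns(raw, ORDER_COLUMNS)
print(trimmed)
```

The same filtering can usually be done in the SQL query that produces the export; the point is only that the column list is an explicit decision.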

Misconception 2: "Keep the same column names as properties"

Assuming that SQL column names (snake_case, prefixed) should be carried over unchanged as Neo4j property names.

Transform names for the graph

Wrong: Using SQL column names directly as Neo4j property names.

Reality: Transform column names to follow Neo4j conventions:

  • Use camelCase: company_name becomes companyName

  • Remove prefixes: customer_id becomes customerID, or is used only as the node's identifier

  • Make names meaningful: ship_via becomes a relationship to a Shipper node, not a property

Export contact_name and contact_title as contactName and contactTitle on Customer. Do not export ship_via as a property; create a SHIPPED_BY relationship to a Shipper node instead.
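The renaming described above can be sketched in a few lines of Python. The function names, the OVERRIDES table, and the RELATIONSHIP_COLUMNS set are hypothetical helpers for illustration; adjust them to your own naming decisions.

```python
def to_camel_case(column_name):
    """Convert a snake_case column name to a camelCase property name."""
    first, *rest = column_name.split("_")
    return first + "".join(part.capitalize() for part in rest)


# Hypothetical overrides where a mechanical conversion is not the name you want.
OVERRIDES = {"customer_id": "customerID"}

# Columns that are modelled as relationships and so never become properties.
RELATIONSHIP_COLUMNS = {"ship_via"}


def property_names(columns):
    """Map source columns to property names, leaving out relationship columns."""
    return {
        col: OVERRIDES.get(col, to_camel_case(col))
        for col in columns
        if col not in RELATIONSHIP_COLUMNS
    }


mapping = property_names(
    ["customer_id", "company_name", "contact_name", "contact_title", "ship_via"]
)
print(mapping)
```

Keeping the mapping in one place makes it easy to review the property names before any data is loaded.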

Misconception 3: "Import everything in one step"

Assuming that the entire graph—all nodes and relationships—can or should be imported in a single operation.

Break the import into steps

Wrong: Trying to import all data, nodes, and relationships in a single operation.

Reality: Import in phases:

  1. Create constraints first

  2. Import nodes (one label at a time)

  3. Create relationships (after all referenced nodes exist)

This ensures referential integrity and better error handling.

Import Categories and Suppliers first, then Products (which reference both). Import Customers and Employees, then Orders (which reference Customers, Employees, Shippers). Finally create PLACED, CONTAINS, IN_CATEGORY, and other relationships. Do not try to create all nodes and relationships in one Cypher statement.
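One way to keep the phases separate is to hold them as an ordered plan and run one statement at a time. The sketch below is an assumption-laden illustration: the file names, property names, and constraint names are made up, the Cypher uses the REQUIRE constraint syntax of recent Neo4j versions, and execute stands in for whatever runs a statement in your environment (for example, a driver session).

```python
# Each phase is a (name, statements) pair. The Cypher is illustrative only;
# file names, labels, and property names are assumptions.
IMPORT_PHASES = [
    ("constraints", [
        "CREATE CONSTRAINT customer_id IF NOT EXISTS "
        "FOR (c:Customer) REQUIRE c.customerID IS UNIQUE",
        "CREATE CONSTRAINT order_id IF NOT EXISTS "
        "FOR (o:Order) REQUIRE o.orderID IS UNIQUE",
    ]),
    ("nodes", [
        "LOAD CSV WITH HEADERS FROM 'file:///customers.csv' AS row "
        "MERGE (c:Customer {customerID: row.customer_id})",
        "LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row "
        "MERGE (o:Order {orderID: row.order_id})",
    ]),
    ("relationships", [
        "LOAD CSV WITH HEADERS FROM 'file:///orders.csv' AS row "
        "MATCH (c:Customer {customerID: row.customer_id}) "
        "MATCH (o:Order {orderID: row.order_id}) "
        "MERGE (c)-[:PLACED]->(o)",
    ]),
]


def run_import(execute):
    """Run every statement phase by phase; `execute` runs one Cypher string."""
    completed = []
    for phase, statements in IMPORT_PHASES:
        for statement in statements:
            execute(statement)
        completed.append(phase)
    return completed


# Dry run: collect the statements instead of sending them to a database.
sent = []
phases_done = run_import(sent.append)
print(phases_done)
```

Because each statement runs on its own, a failure points at one phase and one statement rather than at a single large query.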

Misconception 4: "CSV structure must match the final graph structure"

Believing that CSV files must mirror the final graph structure (one file per node type, matching property names) with no transformation.

Transform CSVs to fit your graph model

Wrong: Assuming you need one CSV file per node type with exact property names.

Reality: You can:

  • Transform data during import using Cypher

  • Use a single CSV to create multiple node types

  • Derive properties from combinations of columns

  • Skip columns you do not need

A joined CSV with order_id, customer_id, customer_name, order_date can create both Order nodes and PLACED relationships in Data Importer. You do not need separate orders.csv and customers.csv with pre-joined keys; map the same file to multiple elements and derive what you need.
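To make the idea of mapping one file to several elements concrete, here is a small Python sketch that derives Customer nodes, Order nodes, and PLACED relationships from a single joined CSV. It only illustrates the concept and is not how Data Importer works internally; the function name, property names, and sample rows are assumptions.

```python
import csv
import io


def split_joined_rows(source):
    """Derive Customer nodes, Order nodes, and PLACED pairs from one joined CSV."""
    customers = {}  # keyed by customer_id, so repeated customers collapse to one node
    orders = {}     # keyed by order_id
    placed = []     # (customer_id, order_id) pairs, one per row
    for row in csv.DictReader(source):
        customers[row["customer_id"]] = {
            "customerID": row["customer_id"],
            "name": row["customer_name"],
        }
        orders[row["order_id"]] = {
            "orderID": row["order_id"],
            "orderDate": row["order_date"],
        }
        placed.append((row["customer_id"], row["order_id"]))
    return list(customers.values()), list(orders.values()), placed


# Made-up sample rows: customer C1 appears twice because they placed two orders.
joined = io.StringIO(
    "order_id,customer_id,customer_name,order_date\n"
    "1,C1,Acme,2024-01-15\n"
    "2,C1,Acme,2024-02-01\n"
    "3,C2,Globex,2024-01-20\n"
)
customers, orders, placed = split_joined_rows(joined)
print(len(customers), len(orders), len(placed))
```

In Cypher, MERGE on the identifying property plays the role of the dictionaries here: it prevents the repeated customer rows from creating duplicate nodes.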

Modelling

You have two options for developing your graph data model visually:

Option A: Neo4j Data Importer (Aura)

If you are using Aura, the Data Importer provides built-in modelling:

  1. Open your AuraDB instance and click Import

  2. Add node labels by clicking Add node label on the canvas

  3. Drag between nodes to create relationships

  4. The model is automatically saved and can be exported

The Data Importer combines modelling and import in one tool - you design the model and import data in the same interface.

Option B: Arrows.app

Arrows is a standalone tool for creating graph data models.

A screenshot of the Arrows user interface

Arrows allows you to create a visual representation of the data model. Arrows supports:

  • Creation of nodes, relationships, properties, and labels

  • Styling including colors, sizes, and layouts

  • Export as an image or Cypher

When to use which tool

  • Data Importer - Best when you want to model and import in one step

  • Arrows.app - Best for creating documentation, sharing models, or planning before import

Optional Arrows activity

Use Arrows to create a simple data model.

The data model should include the following nodes, properties, and relationships:

  • Node labels - Customer, Product, Order

  • Relationships

    • Customer - PURCHASED -> Product

    • Customer - PLACED -> Order

    • Order - CONTAINS -> Product

  • Properties

    • Customer - id, name, email, address

    • Product - id, name, price

    • Order - id, date

    • CONTAINS - quantity

The above data model created in Arrows

The data model in Neo4j is flexible and can evolve as you import data. Neo4j is schema-optional, so you can create data without a predefined schema.

Data types

As part of your data modeling and import process, you should consider data types and how you will represent them in Neo4j.

Neo4j supports a range of data types, including BOOLEAN, DATE, DURATION, FLOAT, INTEGER, LIST, LOCAL DATETIME, LOCAL TIME, POINT, STRING, ZONED DATETIME, and ZONED TIME.

You can learn more about Neo4j data types in the Neo4j documentation.
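Values read from a CSV file arrive as strings, so it helps to decide on conversions early. The following Python sketch converts selected columns before loading; the column names and the CONVERTERS table are hypothetical. The resulting Python types (float, int, bool, datetime.date) are ones the official Neo4j Python driver maps to FLOAT, INTEGER, BOOLEAN, and DATE.

```python
from datetime import date

# Hypothetical converters per column; any column not listed stays a string.
CONVERTERS = {
    "unit_price": float,
    "units_in_stock": int,
    "discontinued": lambda value: value.strip().lower() in ("1", "true"),
    "order_date": date.fromisoformat,
}


def convert_row(row):
    """Return a copy of a CSV row (all strings) with typed values where configured."""
    return {key: CONVERTERS.get(key, str)(value) for key, value in row.items()}


# Made-up sample row, as csv.DictReader would produce it.
typed = convert_row({
    "product_name": "Chai",
    "unit_price": "18.5",
    "units_in_stock": "39",
    "discontinued": "0",
    "order_date": "2024-01-15",
})
print(typed)
```

If you import with Cypher instead, functions such as toInteger(), toFloat(), and date() do the same job inside the LOAD CSV statement.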

Check Your Understanding

Data Model

True or False - A data model has to exist before you can import data into Neo4j.

  • ❏ True

  • ✓ False

Hint

Neo4j is schema-optional and allows you to create data without a predefined schema.

Solution

The statement is False - You create the data model as you import data into Neo4j.

Summary

In this lesson, you explored how the data model influences how you import data into Neo4j.

In the next optional challenge, you will import your own data into Neo4j.
