In this lesson, you will explore the differences between relational and graph data models and how they affect migration.
What you will learn
In this lesson, you will:
-
Understand the differences between relational and graph data models
-
Learn about the Northwind dataset used in this course
-
See example questions that are difficult in SQL but straightforward in a graph
Business problems graphs solve
These business questions are easier to answer with graphs:
-
Management hierarchy: Who reports to whom? SQL needs recursive queries; Cypher follows the path directly.
-
Supply chain: Trace customer → orders → products → supplier. SQL needs many JOINs; Cypher traverses in one pattern.
-
Frequently bought together: Which products share an order? SQL needs self-joins; Cypher matches by pattern.
-
Category overlap: Which customers bought from two or more categories? SQL needs complex aggregates; Cypher traverses and checks.
-
Recommendations: What might a customer like based on similar customers? SQL needs nested queries; Cypher follows paths.
-
Supplier diversity: Which customers use the most suppliers? SQL needs subqueries; Cypher counts along the path.
If your queries follow paths between entities, a graph database may fit better.
Understanding relational and graph data models
Use this comparison to choose the right model and plan your migration.
Relational data model
Relational databases store data in tables: rows (records), columns (attributes), and foreign keys linking tables. You connect data at query time with JOINs. Schema is fixed upfront; normalization spreads data across tables. Relationship-heavy queries can get expensive.
Example relational schema: Northwind
In this course, you will work with the Northwind database, a classic sample that represents a company selling food products. It includes tables such as:
-
Customers:CustomerID,CompanyName,ContactName,City,Country -
Orders:OrderID,CustomerID,EmployeeID,OrderDate,ShipCity -
OrderDetails:OrderID,ProductID,Quantity,UnitPrice,Discount -
Products:ProductID,ProductName,CategoryID,SupplierID,UnitPrice -
Categories:CategoryID,CategoryName,Description -
Suppliers:SupplierID,CompanyName,ContactName,City,Country -
Employees:EmployeeID,FirstName,LastName,Title,ReportsTo
Implicit relationships in Northwind
The relationships are implicit through foreign keys:
-
Orders.CustomerIDreferencesCustomers.CustomerID -
Orders.EmployeeIDreferencesEmployees.EmployeeID -
OrderDetails.OrderIDreferencesOrders.OrderID -
OrderDetails.ProductIDreferencesProducts.ProductID -
Products.CategoryIDreferencesCategories.CategoryID -
Products.SupplierIDreferencesSuppliers.SupplierID
Northwind schema links
The Northwind schema originates from Microsoft. For the canonical relational schema and official diagram, see Microsoft sql-server-samples northwind-pubs and Microsoft Northwind database diagram.
Course dataset: Northwind
Throughout this course, you will work with the Northwind database, a classic relational database sample that represents a company selling food products.
What the Northwind database contains
The Northwind database includes data about:
-
Customers - Companies that purchase products
-
Orders - Purchase transactions with dates and shipping information
-
Products - Items available for sale with pricing
-
Categories: Product groupings such as Beverages, Condiments, Seafood
-
Suppliers - Companies that provide products
-
Employees - Staff members who process orders
-
Shippers - Delivery companies
Northwind data sources
You can download the SQL script for the Northwind dataset from: Northwind SQL script
If you are ready to import the data into Neo4j, this repository contains the CSV files: Northwind CSV files
Graph data model
Graph databases use nodes (entities), relationships (links between them), and properties (key-value pairs). Relationships are stored directly—you follow them instead of JOINing. Schema is flexible; traversal cost stays roughly constant as the graph grows.
Example graph model
The equivalent Northwind data in a graph would have:
-
Customernodes with properties:companyName,contactName,city,country -
Ordernodes with properties:orderDate,shipCity -
Productnodes with properties:productName,unitPrice -
Categorynodes with properties:categoryName,description -
Suppliernodes with properties:companyName,contactName,city,country -
Employeenodes with properties:firstName,lastName,title -
PLACEDrelationships fromCustomertoOrder -
CONTAINSrelationships fromOrdertoProductwith properties:quantity,unitPrice,discount -
BELONGS_TOrelationships fromProducttoCategory -
SUPPLIESrelationships fromSuppliertoProduct
Northwind graph model (continued)
-
SOLDrelationships fromEmployeetoOrder -
REPORTS_TOrelationships fromEmployeetoEmployeefor the management hierarchy
Main differences
Summary of how the two models differ:
Relational model |
Graph model |
Data is stored in separate tables |
Data is stored as connected nodes |
Relationships are implicit as foreign keys |
Relationships are explicit and stored directly |
Requires JOIN operations to traverse |
Relationships are traversed directly, no JOINs |
Schema must be defined upfront |
Schema is optional and flexible |
Normalization spreads data out, which can complicate queries |
Structure matches how you think about connections |
Query comparison: same result, different approach
Compare these queries side by side. Both should return a customer’s orders and products.
SELECT c.CompanyName, o.OrderDate,
p.ProductName, od.Quantity,
od.UnitPrice
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
JOIN OrderDetails od ON o.OrderID = od.OrderID
JOIN Products p ON od.ProductID = p.ProductID
WHERE c.CustomerID = 'ALFKI';MATCH (c:Customer)-[:PLACED]->
(o:Order)-[r:CONTAINS]->
(p:Product)
WHERE c.customerID = 'ALFKI'
RETURN c.companyName, o.orderDate,
p.productName, r.quantity,
r.unitPriceSQL joins by keys; Cypher follows the path. More hops = heavier SQL; Cypher stays consistent.
Performance
JOIN cost grows with tables and data size. Graph traversal stays roughly constant per hop—path length matters, not graph size.
When to use each model
Use these criteria to decide when to use relational vs graph:
Use relational when:
-
Data is tabular; relationships are simple
-
Queries are mostly aggregations
-
Schema is stable
Use graph when:
-
Relationships matter and are complex
-
You traverse paths often
-
You find patterns (recommendations, fraud, networks)
-
Schema may evolve
Migration considerations
-
Not everything belongs in a graph — Keep some data in relational storage.
-
Transform, don’t copy — Design a graph model for your use case; don’t mirror tables 1:1.
-
Foreign keys → relationships — They become first-class links with optional properties.
-
Some duplication is OK — Graphs can duplicate data to speed up queries.
-
Schema can change — Evolve your model as needs change.
Common Misconceptions
Avoid these common misconceptions. Each of these is a big consideration and is covered in depth in later lessons (Module 2: Identifying Nodes, Mapping Relationships, and related topics).
Misconception 1: "Every table becomes a node"
Not every table becomes a node
Wrong: Automatically converting every SQL table to a graph node.
Reality: * Entity tables (customers, products, orders) → Nodes * Junction tables (order_details) → Relationships with properties * Lookup tables (status codes) → Often become properties, not nodes * Audit tables → May not belong in the graph at all
Northwind’s order_details links orders to products with quantity and price. It becomes a CONTAINS relationship between Order and Product, not an OrderDetail node.
Misconception 2: "Rows become nodes, columns become relationships"
Rows, columns, and foreign keys map to graph elements
Wrong: Thinking columns map to relationships.
Reality: * Rows → Individual node instances * Columns → Properties on nodes * Foreign key columns → Relationships to other nodes * Junction tables → Relationships with properties
Example: customer_id in orders becomes a PLACED relationship to Customer, not an Order property. company_name becomes a companyName property on Customer.
Misconception 3: "The graph should mirror the relational schema"
Design for query patterns, not schema replication
Wrong: Copying the relational schema 1:1.
Reality: Design the graph for your queries. A good graph model often looks different from the source schema.
Example: For "products bought by customers in the same region," add a SAME_REGION relationship between Customer nodes. It doesn’t exist in the relational schema but makes the query simpler.
Misconception 4: "All data must be migrated"
Migrate only what benefits from graph representation
Wrong: Import everything into the graph.
Reality: Migrate only what benefits from being a graph. Logs, archives, and simple lookups often stay in relational storage or get omitted.
Example: An audit_log or us_states lookup table rarely benefits. Keep them in the database or omit them.
Design for your use case, not for schema replication
Design a graph model that serves your project—don’t replicate the relational schema.
Check your understanding
Understanding relational and graph models
Which of the following statements best describes a difference between relational and graph data models?
-
❏ Relational databases store data in tables, while graph databases store data in documents
-
✓ Relational databases use foreign keys to represent relationships implicitly, while graph databases store relationships explicitly as first-class citizens
-
❏ Graph databases require JOIN operations to traverse relationships, while relational databases traverse relationships directly
-
❏ Relational databases have flexible schemas, while graph databases require rigid schema definitions
Hint
Relationships are represented and accessed differently in each model. Consider whether relationships are stored directly or need to be computed.
Solution
Relational databases use foreign keys to represent relationships implicitly, while graph databases store relationships explicitly as first-class citizens.
In relational databases, relationships between tables are represented through foreign key columns that reference primary keys in other tables. These relationships are implicit and require JOIN operations to traverse.
In graph databases, relationships are stored directly as edges between nodes. They are first-class citizens with their own properties and can be traversed directly without JOIN operations, making relationship queries much faster.
Summary
-
Relational: tables, rows, columns, foreign keys; connect with JOINs.
-
Graph: nodes, relationships, properties; follow links directly.
-
Choose graph when your queries follow paths between entities.
-
Migrate only what benefits; design for your use case, not schema replication.
Next: prerequisites and tools for importing relational data into Neo4j.