Introduction
In this lesson, you will learn how to approach relational-to-graph data modeling.
Imagine a simple relational model for an e-commerce system with Customers, Orders, Products, and Reviews:
- Customers can place Orders.
- Orders can contain Products.
- Customers can give reviews for Products.
erDiagram
Customer {
int customer_id PK
string name
string email
string address
}
Order {
int order_id PK
int customer_id FK
date order_date
decimal total_amount
}
Product {
int product_id PK
string name
string description
decimal price
}
Review {
int review_id PK
int customer_id FK
int product_id FK
int rating
string comment
date review_date
}
OrderProduct {
int order_id FK
int product_id FK
int quantity
decimal unit_price
}
Customer ||--o{ Order : "places"
Order ||--o{ OrderProduct : "contains"
Product ||--o{ OrderProduct : "included_in"
Customer ||--o{ Review : "writes"
Product ||--o{ Review : "reviewed_in"
Golden Rule
A useful guide when creating graph data models is to use:
(Noun)-[:VERB]-(Noun)
Nodes represent entities or things. Relationships describe how they are connected.
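The rule can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the `to_pattern` helper are hypothetical names invented for this example, not part of any Neo4j API. Each triple is rendered as a Cypher-style pattern string, using the two examples from this lesson.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (Noun)-[:VERB]->(Noun) statement (hypothetical helper type)."""

    start: str  # label of the node the relationship starts from
    verb: str   # relationship type
    end: str    # label of the node the relationship points to


def to_pattern(triple: Triple) -> str:
    """Render a triple as a Cypher-style pattern string."""
    return f"(:{triple.start})-[:{triple.verb}]->(:{triple.end})"


triples = [
    Triple("Customer", "PLACED", "Order"),
    Triple("Person", "DIRECTED", "Movie"),
]
patterns = [to_pattern(t) for t in triples]
print("\n".join(patterns))
```

If a sentence about your domain does not fit this noun-verb-noun shape, that is often a hint that a node label or relationship type needs rethinking.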
graph TD
Noun -->|Verb| Noun_
Customer -->|PLACED| Order
Person -->|DIRECTED| Movie
Noun(("<b>Noun</b>"))
Noun_(("<b>Noun</b>"))
Order(("<b>Order</b>"))
Customer(("<b>Customer</b>"))
Person(("<b>Person</b>"))
Movie(("<b>Movie</b>"))
Tables to Node labels
A simple starting point is to map tables to node labels and primary key/foreign key pairs to relationships.
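As a rough sketch of this mapping, the Python below turns rows from the Customer and Order tables into nodes and uses the `customer_id` foreign key to create `PLACED` relationships. The sample rows are made up, and `rows_to_nodes` is a hypothetical helper written for this example. Dropping the foreign key column from the Order node is an assumption of this sketch, made because the relationship already carries that information.

```python
# Made-up sample rows standing in for the Customer and Order tables.
customer_rows = [{"customer_id": 1, "name": "Ada", "email": "ada@example.com"}]
order_rows = [{"order_id": 100, "customer_id": 1, "total_amount": 31.98}]


def rows_to_nodes(label, key_column, rows, drop=()):
    """Create one node per row, keyed by (label, primary key value)."""
    return {
        (label, row[key_column]): {k: v for k, v in row.items() if k not in drop}
        for row in rows
    }


nodes = {}
nodes.update(rows_to_nodes("Customer", "customer_id", customer_rows))
# The foreign key is dropped from the Order node because the relationship
# built below carries the same information.
nodes.update(rows_to_nodes("Order", "order_id", order_rows, drop=("customer_id",)))

# Order.customer_id references Customer.customer_id, so each Order row
# yields one (Customer)-[:PLACED]->(Order) relationship.
relationships = [
    (("Customer", row["customer_id"]), "PLACED", ("Order", row["order_id"]))
    for row in order_rows
]
```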
graph LR
subgraph "Relational Tables"
CT["Customer"]
OT["Order"]
PT["Product"]
OPT["OrderProduct<br/>(Many-to-Many)"]
end
subgraph "Graph Model"
CN(("Customer"))
ON(("Order"))
PN(("Product"))
REL["(Order)-[<b>CONTAINS</b>]->(Product)"]
end
CT --> CN
OT --> ON
PT --> PN
OPT --> REL
Many-to-many relationships
The relational model uses a separate table, OrderProduct, to store the many-to-many relationship between Orders and Products. Many-to-many tables exist only to support the relational database structure. When modeling in a graph, you do not need these tables, because a relationship connects the two nodes directly.
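To illustrate, the sketch below converts made-up OrderProduct rows into `CONTAINS` relationships. No OrderProduct node is created, and the many-to-many shape is still there: an order can reach several products, and a product can be reached from several orders. The tuple layout and variable names are assumptions chosen for this example.

```python
from collections import defaultdict

# Made-up rows from the OrderProduct join table (key columns only).
order_product_rows = [
    {"order_id": 100, "product_id": 7},
    {"order_id": 100, "product_id": 9},
    {"order_id": 101, "product_id": 7},
]

# Each join row becomes one relationship; the join table itself disappears.
contains = [
    (("Order", row["order_id"]), "CONTAINS", ("Product", row["product_id"]))
    for row in order_product_rows
]

# The only labels involved are Order and Product.
labels = {node[0] for start, _, end in contains for node in (start, end)}

# Follow the relationships in both directions.
products_in_order = defaultdict(list)
orders_with_product = defaultdict(list)
for start, _, end in contains:
    order_id, product_id = start[1], end[1]
    products_in_order[order_id].append(product_id)
    orders_with_product[product_id].append(order_id)
```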
Properties
Table fields (columns) can be mapped to properties on nodes.
graph LR
subgraph "Table"
CT["<b>Customer</b><br/>customer_id (PK)<br/>name<br/>email<br/>address"]
end
subgraph "Node"
CN(("<b>Customer</b>"))
PROPS["customerId<br/>name<br/>email<br/>address"]
end
CT --> CN
Relationship properties
Fields can also be mapped to properties on relationships.
graph LR
subgraph "Table"
OPT["<b>OrderProduct Table</b><br/>order_id (FK)<br/>product_id (FK)<br/>quantity<br/>unit_price"]
end
subgraph "Graph"
O((Order))
P((Product))
REL[CONTAINS<br/>quantity: 2<br/>unitPrice: 15.99]
O --> REL --> P
end
OPT --> REL
Neo4j’s schema-optional approach means that null values in tables can be excluded and do not need to be mapped to a property.
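A minimal sketch of both ideas, assuming a hypothetical `to_properties` helper and made-up sample rows: the function builds a property map from a table row, renames columns where asked, skips key columns that the relationship endpoints already express, and leaves out any null (`None`) value.

```python
def to_properties(row, rename=None, skip=()):
    """Build a property map from a row, leaving out skipped columns and nulls."""
    rename = rename or {}
    return {
        rename.get(column, column): value
        for column, value in row.items()
        if column not in skip and value is not None
    }


# Node properties: the null address is simply not stored.
customer_row = {
    "customer_id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "address": None,
}
customer_props = to_properties(customer_row, rename={"customer_id": "customerId"})

# Relationship properties: the two foreign keys become the relationship's
# start and end nodes, so only quantity and unit_price remain as properties.
order_product_row = {"order_id": 100, "product_id": 7, "quantity": 2, "unit_price": 15.99}
contains_props = to_properties(
    order_product_row,
    rename={"unit_price": "unitPrice"},
    skip=("order_id", "product_id"),
)
```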
Naming conventions
Neo4j recommends the following naming conventions:
- Node labels - PascalCase - Customer, SalesRepresentative
- Relationships - ALL_CAPS_UNDERSCORE - PLACED, LIVES_AT
- Properties - camelCase - firstName, address
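When the relational schema uses snake_case names, these conventions can be applied mechanically. The helpers below are a hypothetical sketch that assumes snake_case input; they are not part of any Neo4j library.

```python
def to_pascal_case(name: str) -> str:
    """snake_case -> PascalCase, for node labels."""
    return "".join(part.capitalize() for part in name.split("_") if part)


def to_camel_case(name: str) -> str:
    """snake_case -> camelCase, for property names."""
    pascal = to_pascal_case(name)
    return pascal[:1].lower() + pascal[1:]


def to_relationship_type(name: str) -> str:
    """snake_case -> ALL_CAPS_UNDERSCORE, for relationship types."""
    return name.upper()


label = to_pascal_case("sales_representative")
rel_type = to_relationship_type("lives_at")
prop = to_camel_case("first_name")
```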
Graph Model
The final graph data model for this simple e-commerce system would be:
graph TD
Customer -->|PLACED| Order
Order -->|CONTAINS| Product
Customer -->|REVIEWED| Product
Customer(("<b>Customer</b>"))
Order(("<b>Order</b>"))
Product(("<b>Product</b>"))
Simpler to understand
The graph data model is simpler and easier to understand than the relational model: three node labels and three relationship types replace five tables.
The many-to-many table has been removed and the relationships are more intuitive.
The review data is now stored as properties on the relationship between Customer and Product.
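The sketch below shows one way a Review row could collapse into a `REVIEWED` relationship. The rows are made up, `review_to_relationship` is a hypothetical helper, and dropping `review_id` is an assumption of this example (you may prefer to keep it as a property).

```python
from datetime import date

# Made-up rows from the Review table.
review_rows = [
    {"review_id": 1, "customer_id": 1, "product_id": 7, "rating": 5,
     "comment": "Works well", "review_date": date(2024, 1, 15)},
    {"review_id": 2, "customer_id": 2, "product_id": 7, "rating": 3,
     "comment": "Average", "review_date": date(2024, 2, 2)},
]


def review_to_relationship(row):
    """Turn a Review row into a (Customer)-[:REVIEWED]->(Product) relationship."""
    return {
        "start": ("Customer", row["customer_id"]),
        "type": "REVIEWED",
        "end": ("Product", row["product_id"]),
        # The remaining review columns become relationship properties.
        "properties": {
            "rating": row["rating"],
            "comment": row["comment"],
            "reviewDate": row["review_date"],
        },
    }


reviewed = [review_to_relationship(row) for row in review_rows]

# Reading the ratings for a product means following its REVIEWED relationships.
ratings_for_product_7 = [
    rel["properties"]["rating"] for rel in reviewed if rel["end"] == ("Product", 7)
]
```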
Lesson Summary
In this lesson, you learned about relational to graph data modeling. You learned the golden rule of (Noun)-[:VERB]-(Noun), how to map tables to node labels and relationships, and how to map fields to properties.
In the next lesson, you will use the Aura Import tool to import data from a relational data set into a graph.