Introduction
In the previous module, you imported nodes into your graph. Nodes represent entities; relationships connect them and let you ask complex questions about how those entities relate.
In this lesson, you will learn about relationships in Neo4j and how they enable the recommendation system.
Understanding relationships
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
C((Customer)):::primary
O((Order)):::highlight
C -- "PLACED" --> O
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
linkStyle default stroke:#94a3b8,stroke-width:1.25px
Relationships connect nodes and represent associations between entities.
Every relationship has:
-
Type - A name that describes the connection (PLACED, CONTAINS, IN_CATEGORY)
-
Direction - Points from one node to another (Customer placed an Order)
-
Properties - Optional attributes that describe the relationship (quantity, discount)
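To see these three parts on a stored relationship, Cypher's built-in functions can be used to inspect one. The sketch below is read-only and assumes PLACED relationships already exist between Customer and Order nodes, which is not yet the case at this point in the course.

```cypher
// Inspect one relationship: its type, its direction, and its properties.
// Assumption: PLACED relationships have already been created.
MATCH (c:Customer)-[r:PLACED]->(o:Order)
RETURN type(r)          AS type,          // the relationship type as a string
       startNode(r) = c AS fromCustomer,  // true when the relationship starts at the Customer
       endNode(r) = o   AS toOrder,       // true when the relationship ends at the Order
       properties(r)    AS props          // map of relationship properties, possibly empty
LIMIT 1
```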
Deciding what should be a relationship
Relationships represent connections - associations between entities that already exist as nodes.
Think of relationships as verbs - they describe actions or connections between nouns (Customer placed an Order, Order contains a Product).
The problem with relational databases
The irony is that relational databases do not handle relationships well. Foreign keys are backed by indexes, and joins are computed at read time. As the tables grow, the indexes grow with them, and queries that join across those tables slow down.
Modeling the same data
Question: "What product categories has this customer purchased from?"
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
erDiagram
%% Customer places Orders (one-to-many)
%% Customer ||--o{ Order : places
%% Order contains OrderDetails (one-to-many)
%% Order ||--o{ OrderDetail : contains
%% OrderDetail references Product (many-to-one)
%% OrderDetail }o--|| Product : for
%% Product in Category (many-to-one)
Product }o--|| Category : in
%% Entity definitions
%% Customer entity
%% Customer {
%% string customerId PK
%% string companyName
%% string country
%% }
%% Order entity
%% Order {
%% int orderId PK
%% string customerId FK
%% date orderDate
%% }
%% OrderDetail entity
%% OrderDetail {
%% int orderId FK
%% int productId FK
%% int quantity
%% decimal unitPrice
%% }
Product {
int productId PK
string productName
int categoryId FK
}
Category {
int categoryId PK
string categoryName
}
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
%% C((Customer<br/>ALFKI))
%% O((Order))
P((Product)):::forest
Cat((Category)):::earth
%% C -->|PLACED| O
%% O -->|CONTAINS| P
P -->|IN_CATEGORY| Cat
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
classDef earth fill:#f4ebe3,stroke:#dcc4a2,stroke-width:1.25px,color:#5c3a1e
linkStyle default stroke:#94a3b8,stroke-width:1.25px
Why graph traversals scale well
Graph databases optimize for relationship-heavy workloads by storing adjacency directly. That reduces the need for global index lookups and join materialization, making deep traversals more predictable and often significantly faster than equivalent multi-join relational queries, especially as relationship depth increases.
SQL JOINs compound
Each JOIN adds another index scan. With large tables:
-
1 JOIN: Scan 2 indexes, match keys
-
2 JOINs: Scan 3 indexes, match keys twice
-
3 JOINs: Scan 4 indexes, match keys three times
-
Result: Cost grows with each hop as the database repeatedly scans indexes and materializes joins
Graph traversals follow pointers
Each relationship is a direct pointer stored with the node. Following it is pointer chasing in memory:
-
1 hop: Follow one pointer from the current node
-
2 hops: Follow two pointers in sequence
-
3 hops: Follow three pointers in sequence
-
Result: Each hop is a direct memory follow; the cost of a hop depends on the relationships attached to the current node, not on the total size of the database
The performance gap
The performance gap widens with scale. With millions of records, the difference becomes dramatic. The more relationships you traverse, the bigger the advantage:
-
1-2 hops: SQL is acceptable
-
3+ hops: SQL often becomes prohibitively slow on large tables
-
Many-to-many relationships: SQL requires junction tables and additional joins; graphs use direct relationships
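As an illustration, the question posed earlier, "What product categories has this customer purchased from?", can be written as a single three-hop pattern. This is a sketch, assuming the PLACED, CONTAINS, and IN_CATEGORY relationships have all been imported and that Category nodes carry the categoryName property shown in the ER diagram.

```cypher
// Three hops in one pattern: Customer -> Order -> Product -> Category.
// Assumption: property names match the ER diagram (customerId, categoryName).
MATCH (c:Customer {customerId: 'ALFKI'})
      -[:PLACED]->(:Order)
      -[:CONTAINS]->(:Product)
      -[:IN_CATEGORY]->(cat:Category)
RETURN DISTINCT cat.categoryName AS category
```

Prefixing the query with PROFILE runs it and shows the execution plan along with the database hits for each step, which is one way to check how the traversal behaves on your own data.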
Identifying Northwind relationships
Based on the decision criteria, these Northwind relationships connect your entities:
-
PLACED - Customer to Order (who placed which orders)
-
IN_CATEGORY - Product to Category (how products are organized)
-
CONTAINS - Order to Product (what products are in each order)
Each of these relationships represents a connection between entities, has a natural direction, and will be traversed in queries.
Analyzing PLACED as a relationship
PLACED is a relationship in Northwind because:
-
Connects two entities - Links Customer (who) to Order (what), both of which exist independently
-
Clear direction - Customer PLACED Order makes natural sense; "Order PLACED Customer" does not
-
Describes the connection - Has properties like orderDate and shipCountry that belong to the act of placing
-
Enables traversal - You can follow PLACED to find "What orders did this customer place?" or "Who placed this order?"
Querying relationship direction
Every relationship has a direction, but you can query in either direction, or omit the direction entirely.
MATCH (c:Customer {customerId: 'ALFKI'})-[:PLACED]->(o:Order)
RETURN o
This query is written with the Customer node on the left-hand side, so the arrow points right, toward the Order.
MATCH (o:Order {orderId: 10248})<-[:PLACED]-(c:Customer)
RETURN c
This query starts with the Order node. The arrow still points from the Customer to the Order, because that is the direction of the relationship.
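The third option, omitting the direction, uses a pattern with no arrowhead. A minimal sketch using the same customer:

```cypher
// No arrowhead: matches PLACED relationships regardless of stored direction.
MATCH (c:Customer {customerId: 'ALFKI'})-[:PLACED]-(o:Order)
RETURN o
```

Because PLACED only ever runs from a Customer to an Order, this returns the same orders as the directed version.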
Undirected relationships
Some relationships don’t have an inherent direction. You can omit the arrow to match in either direction.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
Dan((Dan)):::primary ---|MARRIED_TO| Ann((Ann)):::primary
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
linkStyle default stroke:#94a3b8,stroke-width:1.25px
MARRIED_TO is symmetric - if Dan is married to Ann, Ann is married to Dan.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
Dan((Dan)):::highlight -->|LOVES| Ann((Ann)):::primary
Ann2((Ann)):::primary -->|INDIFFERENT_TO| Dan2((Dan)):::muted
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
classDef muted fill:#ffffff,stroke:#e5e7eb,stroke-width:1px,color:#334155
linkStyle default stroke:#94a3b8,stroke-width:1.25px
LOVES has a direction - if Dan loves Ann, it doesn’t mean Ann loves Dan.
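In Cypher, the difference shows up in whether the pattern has an arrowhead. The sketch below assumes hypothetical Person nodes with a name property; they are not part of the Northwind dataset.

```cypher
// Symmetric: match MARRIED_TO without an arrowhead, so it does not matter
// which way the relationship was stored.
MATCH (:Person {name: 'Dan'})-[:MARRIED_TO]-(spouse:Person)
RETURN spouse.name;

// Directional: find people Dan loves who do not love him back.
MATCH (dan:Person {name: 'Dan'})-[:LOVES]->(other:Person)
WHERE NOT (other)-[:LOVES]->(dan)
RETURN other.name;
```

Neo4j still stores a direction for every relationship, so a symmetric relationship such as MARRIED_TO is usually created once in an arbitrary direction and queried without the arrow.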
Order dates as node properties
Relationships can have properties, but not every value belongs on the connection. This query reads the order date and ship country from the Order node:
MATCH (c:Customer)-[:PLACED]->(o:Order) // (1)
WHERE datetime('1996-07-04') <= o.date <= datetime('1997-07-01') // (2)
RETURN c.name, o.id, o.shipCountry // (3)
LIMIT 5
-
Match pattern - Find customers and their orders
-
Filter by date - Filter orders by date range using the datetime() function
-
Return properties - Show customer name, order ID, and ship country
Order dates are properties of the Order node, representing when that specific order was placed.
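By contrast, a property that belongs to the connection is read from the relationship variable. The following sketch assumes CONTAINS relationships with a quantity property, which are not yet in the graph at this point in the course, and a productName property on Product nodes as in the ER diagram. The threshold of 10 is arbitrary.

```cypher
// Assumptions: CONTAINS carries a quantity property; Product has productName.
MATCH (o:Order)-[r:CONTAINS]->(p:Product)  // bind the relationship to r
WHERE r.quantity >= 10                      // filter on a relationship property
RETURN o.id, p.productName, r.quantity
LIMIT 5
```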
Connecting to the goal
To recommend products, you need to know:
-
Who placed orders? (Customer→Order via PLACED)
-
What products were ordered? (Order→Product via CONTAINS - next module)
-
Who bought similar products? (traverse backward from products to find other customers)
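Put together, these three questions could be answered with a query along the following lines. It is only a sketch of one possible approach, not the query this course builds, and it assumes the CONTAINS relationships are already in place and that Product nodes have a productName property as in the ER diagram.

```cypher
// Step 1: find other customers who ordered at least one of the same products.
MATCH (c:Customer {customerId: 'ALFKI'})-[:PLACED]->(:Order)-[:CONTAINS]->(p:Product)
      <-[:CONTAINS]-(:Order)<-[:PLACED]-(other:Customer)
WHERE other <> c
// Step 2: collect products those customers ordered that this customer has not.
MATCH (other)-[:PLACED]->(:Order)-[:CONTAINS]->(rec:Product)
WHERE NOT (c)-[:PLACED]->(:Order)-[:CONTAINS]->(rec)
// Step 3: rank candidates by how many similar customers ordered them.
RETURN rec.productName AS recommendation, count(DISTINCT other) AS customers
ORDER BY customers DESC
LIMIT 5
```

The query walks forward from the customer to products, backward to other customers, and forward again to their products, which is the path described in the list above.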
Summary
In this lesson, you learned about relationships in Neo4j:
-
Relationships connect nodes - They have a type, direction, and optional properties
-
Performance advantage - Graph databases store adjacency directly; traversals follow pointers in memory instead of repeated index scans and join materialization, so deep traversals are more predictable and often much faster than equivalent multi-join SQL
-
Multi-hop queries - Graphs excel when traversing 3+ relationships (Customer→Order→Product→Category)
-
PLACED relationship - Connects Customer to Order (Customer→Order) for the recommendation path
-
Direction flexibility - Query in either direction, or ignore direction, regardless of how the relationship is stored
-
Node properties - Data like dates belong to the entity itself (Order.date)
-
Relationship properties - Data that belongs to the connection (quantity on CONTAINS - coming in Module 4)
-
Naming conventions - Use verbs in UPPER_SNAKE_CASE (PLACED, CONTAINS, IN_CATEGORY)
-
Connection to goal - Customer→Order→Product path enables recommendations
In the next lesson, you will import Customer and Order nodes and create PLACED relationships between them.