Introduction
Before importing data, you need to understand how graph databases store information differently from relational databases.
In this lesson, you will learn the components of a Neo4j graph and basic Cypher query syntax.
Understanding graph building blocks
Neo4j graphs have four components that work together to model your domain:
- Nodes - Entities that exist independently
- Labels - Categories for nodes
- Relationships - Connections between nodes
- Properties - Attributes that describe nodes or relationships
Let’s explore each one.
Representing nodes
Nodes represent entities that exist independently.
- The "things" in your business domain
- Typically nouns: Customer, Product, Order
- Each node is a distinct entity
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph TB
Customer((Customer)):::primary
Product((Product)):::forest
Order((Order)):::highlight
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
Categorizing with labels
Labels categorize nodes into types.
- Written with a colon prefix: :Customer, :Product, :Order
- Most nodes have at least one label (Neo4j allows unlabeled nodes, but labeling every node is good practice)
- Multiple labels are possible: :Product and :DiscontinuedItem
- Labels make queries faster by narrowing what to search
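As a rough sketch of how labels look in practice, the statement below creates one node that carries two labels and returns them. It uses the Cypher syntax introduced later in this lesson. The product name is made-up sample data, so run it only in a scratch database.

```cypher
// Illustrative data only: 'Sample Tea' is not part of the course dataset.
// The node gets two labels, so it can be matched as :Product or :DiscontinuedItem.
CREATE (p:Product:DiscontinuedItem {name: 'Sample Tea'})
RETURN p.name AS name, labels(p) AS labels
```

A later `MATCH (d:DiscontinuedItem)` would only consider nodes with that label, which is how labels narrow the search.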
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph TB
C((":Customer")):::primary
P((":Product")):::forest
O((":Order")):::highlight
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
Connecting with relationships
Relationships connect nodes together.
- Always have a type and a direction
- Typically verbs: PLACED, CONTAINS, IN_CATEGORY
- Direction shows meaning: Customer placed Order
- Can have properties like quantity
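The points above can be sketched in Cypher as follows. This is a minimal example, not the course's import process: it assumes orders are identified by an `id` property and products by `name`, and it writes data, so try it in a scratch database.

```cypher
// Assumption: Order nodes use an `id` property; the real dataset may differ.
MERGE (o:Order {id: 10248})
MERGE (p:Product {name: 'Chai'})
// The relationship has a type (CONTAINS) and a direction (Order -> Product)
MERGE (o)-[r:CONTAINS]->(p)
// A property stored on the relationship itself
SET r.quantity = 12
RETURN type(r) AS type, r.quantity AS quantity
```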
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
O((Order 10248)):::highlight
P((Chai)):::forest
O -->|"CONTAINS<br/>quantity: 12"| P
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
linkStyle default stroke:#94a3b8,stroke-width:1.25px
Storing data in properties
Properties are key-value pairs that describe nodes or relationships.
- Node examples: name, price, orderDate
- Relationship examples: quantity, discount
Tip: Start with properties on nodes. Put properties on relationships when they describe the connection itself.
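To see which properties a node actually holds, a query along these lines may help. It assumes a Product node named 'Chai' already exists with `name` and `unitPrice` properties.

```cypher
// Read individual properties with dot notation, or list all keys with keys()
MATCH (p:Product {name: 'Chai'})
RETURN p.name AS name,
       p.unitPrice AS unitPrice,
       keys(p) AS propertyKeys
```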
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
P(("Product<br/>name: 'Chai'<br/>unitPrice: 18.00")):::forest
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
Thinking in patterns
Nodes are nouns, relationships are verbs.
Read these patterns like sentences:
(Customer)-[:PLACED]->(Order) → "Customer PLACED Order"
(Order)-[:CONTAINS]->(Product) → "Order CONTAINS Product"
(Product)-[:IN_CATEGORY]->(Category) → "Product IN_CATEGORY Category"
This natural language mapping is why graph models are intuitive to design and query.
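The first sentence pattern translates almost word for word into a query. The sketch below assumes Customer nodes have a `name` property and uses the Cypher syntax introduced later in this lesson.

```cypher
// "Customer PLACED Order": count the orders placed by each customer
MATCH (c:Customer)-[:PLACED]->(o:Order)
RETURN c.name AS customer, count(o) AS orders
ORDER BY orders DESC
LIMIT 5
```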
Combining the elements: Relational approach
In SQL, many-to-many relationships require a join table (ProductOrder).
The join table stores the foreign keys from both tables, plus any properties of the connection (like quantity).
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
erDiagram
Order ||--o{ ProductOrder : contains
Product ||--o{ ProductOrder : "ordered in"
Order {
int orderId
date orderDate
}
ProductOrder {
int orderId
int productId
int quantity
}
Product {
int productId
string productName
float unitPrice
}
Combining the elements: Graph approach
In Neo4j, a single relationship connects two nodes directly. The properties of the relationship provide context on how the two nodes are connected.
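As an illustration, the query below reads order lines straight from the relationship. In SQL the same result would need two joins through ProductOrder. The `id` property on Order is an assumption.

```cypher
// The CONTAINS relationship plays the role of the join-table row
MATCH (o:Order)-[r:CONTAINS]->(p:Product)
RETURN o.id AS orderId,        // assumed property name
       p.name AS product,
       r.quantity AS quantity  // stored on the relationship
LIMIT 5
```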
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
O((Order)):::highlight
P((Product)):::forest
O -->|"CONTAINS<br/>{quantity: 10}"| P
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
linkStyle default stroke:#94a3b8,stroke-width:1.25px
There is no concept of a many-to-many relationship in graphs - just direct connections between nodes.
Nodes and labels work together
A node is an individual instance. A label is the category it belongs to.
Think of it like objects and classes: the node is the object, the label is its class.
Each circle below is a node. They all share the label :Customer.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
subgraph ":Customer label"
C1(("ALFKI")):::primary
C2(("BERGS")):::primary
C3(("CACTU")):::primary
end
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
Relationship types work the same way
A relationship is an individual connection. A type categorizes it.
Each arrow is a relationship instance. They all share the type :PLACED.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph TB
C1((ALFKI)):::primary -->|PLACED| O1((10643)):::highlight
C1 -->|PLACED| O2((10692)):::highlight
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
linkStyle default stroke:#94a3b8,stroke-width:1.25px
Decision guide: Node or Property?
Category as a NODE:
Multiple products share the same category. You can ask "What products are in Beverages?"
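The question "What products are in Beverages?" could be asked like this. The sketch assumes Category nodes store their name in a `name` property.

```cypher
// Start from the shared Category node and follow IN_CATEGORY back to its products
MATCH (p:Product)-[:IN_CATEGORY]->(c:Category {name: 'Beverages'})
RETURN p.name AS product
ORDER BY product
```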
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
P1((Chai)):::forest -->|IN_CATEGORY| Cat((Beverages)):::primary
P2((Chang)):::forest -->|IN_CATEGORY| Cat
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
Decision guide: Node or Property? (continued)
Country as a PROPERTY:
Country is a property - you filter BY it, but don’t navigate FROM countries to customers.
Rule of thumb: If multiple nodes share it and you’d query FROM it, make it a node.
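By contrast, filtering by a property needs no extra node or relationship. A minimal sketch, assuming Customer nodes have `id`, `name`, and `country` properties:

```cypher
// country is only used as a filter, so a property is enough
MATCH (c:Customer)
WHERE c.country = 'Germany'
RETURN c.id AS id, c.name AS company
```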
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
C1(("ALFKI<br/>country: Germany")):::primary
C2(("BERGS<br/>country: Sweden")):::primary
classDef primary fill:#eef6f9,stroke:#c7e0ec,stroke-width:1.25px,color:#0b5c7a
Decision guide: Node Property or Relationship Property?
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
P(("Product<br/>name: Chai<br/>unitPrice: 18.00")):::forest
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
Node property: The product’s list price is intrinsic to the product - it doesn’t depend on who buys it.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#eef6f9",
"primaryBorderColor": "#c7e0ec",
"lineColor": "#94a3b8",
"fontFamily": "Public Sans, Arial, Helvetica, sans-serif"
}
}}%%
graph LR
O((Order<br/>10248)):::highlight -->|"CONTAINS<br/>quantity: 12<br/>unitPrice: 14.00"| P((Chai)):::forest
classDef highlight fill:#f4f5ff,stroke:#c7d2fe,stroke-width:1.25px,color:#3730a3
classDef forest fill:#edf6e8,stroke:#b7df9c,stroke-width:1.25px,color:#2f5d1e
linkStyle default stroke:#94a3b8,stroke-width:1.25px
Relationship property: Quantity and purchase price exist because of THIS order-product connection.
Rule of thumb: If the value varies per connection, put it on the relationship.
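The difference shows up when both prices are read in one query. This sketch assumes the property names from the diagrams above (`unitPrice` on both the Product node and the CONTAINS relationship) plus an assumed `id` on Order.

```cypher
MATCH (o:Order)-[r:CONTAINS]->(p:Product {name: 'Chai'})
RETURN o.id AS orderId,                      // assumed property name
       p.unitPrice AS listPrice,             // node property: same for every order
       r.unitPrice AS purchasePrice,         // relationship property: varies per order
       r.quantity AS quantity,
       r.unitPrice * r.quantity AS lineTotal
LIMIT 5
```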
Quick decision guide
When in doubt, ask yourself:
| Question | Answer | Graph Element |
|---|---|---|
| Is this a distinct "thing" I can connect to? | Yes → Customer "ALFKI", Order #10248 | Node |
| What type/category is this node? | | Label |
| Does this describe how two things are connected? | PLACED, CONTAINS, IN_CATEGORY | Relationship |
| Is this a simple attribute? | name, price, quantity | Property |
Introducing Cypher
Cypher is Neo4j’s query language. It uses ASCII-art patterns to represent graph structures.
You will use Cypher to:
- Query the data you import
- Verify your imports worked
- Build the recommendation query
The syntax is designed to be readable and match how you think about graph patterns.
Writing node patterns
Nodes are represented with parentheses ().
MATCH (p:Product) // (1)
WHERE p.name = 'Chai' // (2)
RETURN p.name, p.unitPrice // (3)
LIMIT 1
- Pattern - p is a variable to reference the node, :Product is a label to filter node types
- Filter - WHERE clause filters by property value
- Return - Returns the product name and price properties
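For simple equality checks, Cypher also allows the filter to sit inside the node pattern as a property map. The sketch below should behave the same as the query above.

```cypher
// {name: 'Chai'} inside the pattern replaces WHERE p.name = 'Chai'
MATCH (p:Product {name: 'Chai'})
RETURN p.name, p.unitPrice
LIMIT 1
```

The inline form is compact. The `WHERE` form is more flexible when you need ranges or other comparisons.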
Writing relationship patterns
Relationships are represented with arrows and square brackets. Here is a pattern that spans customers, the orders they placed, and the products they purchased.
MATCH path = (c:Customer)-[:PLACED]->(o:Order)-[:CONTAINS]->(p:Product) // (1)
WHERE c.id = 'ALFKI' // (2)
RETURN c.name AS company, p.name AS product, count(*) AS count // (3)
LIMIT 5
- Multi-hop pattern - The path = syntax captures the entire pattern. Find a customer who has placed an order, which contains a product.
- Filter - Find a specific customer by their ID.
- Return - The AS keyword gives each returned column an alias, and count(*) counts the matching rows for each company and product pair.
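The path variable becomes useful once it is returned, for example to see the matched nodes and relationships as a graph in a visual query tool. A minimal variation on the query above:

```cypher
// Return whole paths instead of individual properties
MATCH path = (c:Customer)-[:PLACED]->(:Order)-[:CONTAINS]->(:Product)
WHERE c.id = 'ALFKI'
RETURN path
LIMIT 5
```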
How graph queries work
Every graph query follows two steps:
- Find the anchor node - Use an index to locate the starting point (e.g., WHERE c.id = 'ALFKI')
- Traverse relationships - Follow pointers directly from node to node
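One way to observe the two steps is to prefix a query with `PROFILE`, which runs it and shows the execution plan. If an index exists on the customer `id` property, the plan will typically begin with an index seek for the anchor node, followed by expand operators for each relationship hop. Exact operator names vary by Neo4j version.

```cypher
// PROFILE executes the query and reports the plan operators used
PROFILE
MATCH (c:Customer {id: 'ALFKI'})-[:PLACED]->(o:Order)-[:CONTAINS]->(p:Product)
RETURN DISTINCT p.name AS product
```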
Instructor demo
Show this visually by running a query and explaining how Neo4j finds the starting customer, then follows the PLACED and CONTAINS pointers - only touching relevant data, not scanning entire tables.
Indexes and constraints
When you import data using the Import tool, you will set unique identifiers.
This automatically creates:
- Constraint - Ensures no duplicate IDs (data quality)
- Index - Enables fast anchor node lookups (performance)
You will see this in action when you import customers and products.
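For reference, a uniqueness constraint can also be written by hand. The sketch below uses Neo4j 5 syntax, and the constraint name `customer_id_unique` is a hypothetical choice. The Import tool may name its constraints differently.

```cypher
// A uniqueness constraint is backed by an index on the same property
CREATE CONSTRAINT customer_id_unique IF NOT EXISTS
FOR (c:Customer) REQUIRE c.id IS UNIQUE;

// Run separately to inspect what exists in the database
SHOW CONSTRAINTS;
SHOW INDEXES;
```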
Summary
In this lesson, you learned:
- Graph components - Nodes (things), labels (types), relationships (connections), properties (attributes)
- How to differentiate:
  - Node - A distinct instance you can connect to (customer ALFKI, order #10248)
  - Label - The type/category (:Customer, :Order) - nodes have labels
  - Relationship - A connection between nodes, typically a verb (PLACED, CONTAINS)
  - Property - An attribute on a node or relationship (name, quantity)
- Decision rules:
  - Node vs. property → If multiple things share it and you’d query FROM it, make it a node
  - Node property vs. relationship property → If the value varies per connection, put it on the relationship
- Cypher basics - MATCH finds patterns, WHERE filters, RETURN specifies output
Next, you can optionally test your understanding with a quick Cypher patterns quiz, or continue to learn how to identify what should be a node in your graph model.