Introduction
You’ve built the core Customer→Order→Product graph and learned how to query it.
Now you’ll build a complete recommendation query that answers: "What products do people like me buy, that I haven’t bought yet?"
In this lesson, you will learn how to:
-
Build queries incrementally to understand traversal patterns
-
Optimize graph queries using list expressions
-
Measure query performance with
PROFILE
Loading the complete dataset
Need the complete dataset?
This lesson works with the core graph (Customer, Order, Product). If you also completed the optional Category import, that’s great, but it’s not required for recommendations.
If you skipped earlier lessons or want to start fresh, download the complete model:
-
Download complete-model.zip
-
In the Import tool, click the 3-dot menu (…) and select Open model (with data)
-
Select the downloaded zip file
-
Click the Run Import button to load the data
This model includes all entities from the workshop (core and optional).
Understanding the recommendation algorithm
Collaborative filtering finds recommendations based on similar customers:
-
Find the customer’s products - What they’ve purchased
-
Find customers who bought those products - Who else bought the same items
-
What did those customers purchase? - Their other purchases
-
Exclude the customer’s products - Only show things they haven’t bought
-
Optimize the query - Make it efficient for production
You will build this query step by step, running each version to see how it works.
Finding the customer’s products
Start by finding what customer ALFKI has purchased.
MATCH path = (me:Customer {id: 'ALFKI'})-[:PLACED]->(order:Order)-[:CONTAINS]->(product:Product) // (1)
RETURN path // (2)
LIMIT 10-
Start at customer - Find
ALFKIand traverse through orders to products -
Return the path - Shows the full visualization of customer, orders, and products
Finding customers who bought those products
Extend the pattern to traverse from the customer’s products back to other customers.
MATCH path = (me:Customer {id: 'ALFKI'})-[:PLACED]->(order:Order)-[:CONTAINS]->(product:Product)
<-[:CONTAINS]-(other:Order)<-[:PLACED]-(otherCustomer) // (1)
RETURN path // (2)
LIMIT 10-
Backward traversal - From products, follow
:CONTAINSback to other orders, then:PLACEDback to other customers
Finding what those customers purchased
Add another traversal to find what similar customers bought.
MATCH (me:Customer {id: 'ALFKI'})-[:PLACED]->(order:Order)-[:CONTAINS]->(product:Product)
<-[:CONTAINS]-(other:Order)<-[:PLACED]-(otherCustomer)
MATCH (otherCustomer)-[:PLACED]->(otherOrder)-[:CONTAINS]->(otherProduct) // (1)
WHERE product <> otherProduct // (2)
RETURN product.name AS product,
otherProduct.name AS boughtWith,
count(*) AS count // (3)
ORDER BY count DESC
LIMIT 20-
Forward traversal - From similar customers, traverse to their orders and products
-
Exclude same product - Don’t recommend the product they already share
-
Count occurrences - How many times this product pair appears across all customers
Excluding the customer’s products
Add a filter to only show products the customer hasn’t bought yet.
MATCH (me:Customer {id: 'ALFKI'})-[:PLACED]->(order:Order)-[:CONTAINS]->(product:Product)
<-[:CONTAINS]-(other:Order)<-[:PLACED]-(otherCustomer)
MATCH (otherCustomer)-[:PLACED]->()-[:CONTAINS]->(otherProduct)
WHERE product <> otherProduct
AND NOT (me)-[:PLACED|CONTAINS*2]->(otherProduct) // (1)
RETURN otherProduct.name AS recommendation, // (2)
count(*) AS score
ORDER BY score DESC
LIMIT 20-
Exclusion filter added - Check if customer already purchased this product using a 2-hop pattern match
-
Changed return - Now only returns new recommendations, not products already owned
Optimizing the query
The query works, but it’s inefficient. Every row returned from the first MATCH multiplies the work downstream.
To optimize the query, you can retrieve the list of products the customer has bought once at the start of the query, rather than once per product purchased.
To do this in a single line, you can use a list expression.
Pre-calculating the customer’s products
MATCH (me:Customer {id: 'ALFKI'})
WITH me, [ (me)-[:PLACED|CONTAINS*2]->(product) | product ] AS myProducts // (1)
MATCH (me)-[:PLACED]->()-[:CONTAINS]->(product:Product)<-[:CONTAINS]-()
<-[:PLACED]-(otherCustomer)
MATCH (otherCustomer)-[:PLACED]->()-[:CONTAINS]->(otherProduct)
WHERE product <> otherProduct
AND NOT otherProduct IN myProducts // (2)
RETURN otherProduct.name AS recommendation,
count(*) AS score
ORDER BY score DESC
LIMIT 20-
Pre-calculate products - The list expression calculates all customer’s products once at query start
-
Simpler exclusion check - The combination of
NOTandINfilters out products matched in the list expression
Measuring query performance
Prefixing PROFILE to a query shows will execute the query and return the results along with the query plan and execution statistics.
PROFILE
MATCH (me:Customer {id: 'ALFKI'})-[:PLACED]->()-[:CONTAINS]->(product:Product)
<-[:CONTAINS]-()<-[:PLACED]-(otherCustomer)
MATCH (otherCustomer)-[:PLACED]->()-[:CONTAINS]->(otherProduct)
WHERE product <> otherProduct
AND NOT (me)-[:PLACED|CONTAINS*2]->(otherProduct)
RETURN otherProduct.name, count(*) AS score
ORDER BY score DESC
LIMIT 20Understanding PROFILE
The profile returns:
-
Database accesses - How many times the query reads from storage
-
Allocated memory - Memory used during execution
-
Time - Total execution time (varies based on cache)
Lower database accesses mean better performance, especially on larger datasets.
Performance improvement
The optimized query reduces database accesses by 80%:
| Metric | Before | After | Improvement |
|---|---|---|---|
Database accesses |
366,128 |
49,177 |
80% reduction |
Allocated memory |
387,560 bytes |
214,204 bytes |
45% reduction |
Time |
138 ms |
137 ms |
~same |
Timing
While database access and allocated memory metrics stay the same, the timing of the query may vary due to the page cache. Running the query a second time will reduce the execution time as the data will be cached in memory.
You can learn more about the page cache and performance optimization in the Aura In Production course.
Combining with content-based filtering
You can combine collaborative filtering with content-based filtering by excluding categories.
First, create an EXCLUDE relationship for customer ALFKI who doesn’t want dairy products.
Run this query to add the exclusion:
MATCH (c:Customer {id: 'ALFKI'})
MATCH (cat:Category {name: "Dairy Products"})
MERGE (c)-[:EXCLUDE]->(cat)Now modify the recommendation query to exclude products in that category.
Recommendations with category exclusions
Use the EXCLUDE relationship to filter out products in excluded categories.
MATCH (me:Customer {id: 'ALFKI'})-[:PLACED]->(:Order)-[:CONTAINS]->(myProduct:Product)
WITH me,
collect(DISTINCT myProduct) AS myProducts, // (1)
[ (me)-[:EXCLUDE]->(cat) | cat ] AS excludedCategories // (2)
UNWIND myProducts AS product // (3)
MATCH (other)-[:PLACED]->(o:Order)-[:CONTAINS]->(p:Product)
WHERE NOT p IN myProducts
AND NOT any(cat IN [ (p)-[:IN_CATEGORY]->(cat) | cat ] // (4)
WHERE cat IN excludedCategories)
RETURN p.name AS recommendation, // (5)
count(DISTINCT o) AS orderCount
ORDER BY orderCount DESC
LIMIT 10-
Collect products - Use
collect()to aggregate products -
Get excluded categories - List expression for customer preferences
-
Unwind products - Check each product individually
-
Category filter - Use
any()to check excluded categories -
Return results - Show recommendations with order count
Summary
In this lesson, you built a recommendation query step by step:
-
Finding the customer’s products - Starting point for the traversal
-
Finding customers who bought those products - Backward traversal to similar customers
-
Finding what those customers purchased - Forward traversal to find recommendations
-
Excluding the customer’s products - Filter out products already owned
-
Optimizing the query - Use list expressions to reduce database accesses by 80%
You’ve modeled your data and written an efficient recommendation query by thinking about how graph traversals work!