Introduction
CPU is a critical resource for database performance. Your database uses it for planning and executing queries.
In this lesson, you will learn how to monitor CPU usage, identify bottlenecks, and determine when you need to scale your instance.
What Consumes CPU in Neo4j
Before you can monitor CPU usage effectively, you need to understand what consumes it. In Neo4j, CPU is consumed by several key processes.
Query Planning and Execution
The query planner analyzes your Cypher statements and creates an execution plan - a series of operations to retrieve your data. This planning process consumes CPU, especially for complex queries.
Once planned, Neo4j executes your query using various operators, each with different CPU costs.
- Index seeks are the most efficient operations, directly accessing specific nodes through indexes.
- Label scans consume moderate CPU by scanning all nodes with a particular label.
- Full node scans are the most expensive operations, fetching every node in your database.
- Relationship traversals expand relationships by type and direction between nodes, with CPU cost proportional to the number of relationships expanded.
- Filtering operations apply WHERE clauses to data held in memory.
- Sorting and aggregations can be CPU-intensive for large datasets.
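The difference in cost is easiest to see in an actual plan. The queries below are a minimal sketch against a hypothetical Person label (assuming an index exists on Person.email); prefix any of them with PROFILE to confirm which operator the planner chooses.

```cypher
// Index seek: jumps directly to the matching node through the index (cheapest).
MATCH (p:Person {email: 'alice@example.com'})
RETURN p.name;

// Label scan: reads every Person node, then filters in memory (moderate cost).
MATCH (p:Person)
WHERE p.age > 65
RETURN count(p);

// Full node scan: no label, so every node in the database is fetched (most expensive).
MATCH (n)
WHERE n.createdAt IS NOT NULL
RETURN count(n);
```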
You can read more about the different operators in the Cypher Manual or enroll in the Cypher Optimization course.
Connection and Thread Management
Every client connection to your database uses CPU through Neo4j’s Bolt protocol.
Worker threads execute your queries and process client requests, I/O threads manage network communication with clients, and transaction threads handle the transaction lifecycle.
Each active connection or transaction holds a thread, and that thread consumes CPU while processing. When you have hundreds of concurrent connections, this can put significant pressure on the CPU and, in turn, degrade performance.
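If you suspect connection pressure, you can see what the worker threads are busy with. This is a minimal sketch using the SHOW TRANSACTIONS command available in recent Neo4j versions (4.4 and later); it lists each active transaction together with the client that opened it.

```cypher
// List active transactions and the clients holding them.
SHOW TRANSACTIONS
YIELD transactionId, username, clientAddress, status, currentQuery
RETURN transactionId, username, clientAddress, status, currentQuery;
```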
Background Operations
Neo4j periodically performs maintenance tasks that consume CPU.
Checkpointing writes modified data from memory to disk, while index updates keep indexes synchronized as data changes. Statistics collection gathers counts of nodes, relationships and properties for the query planner to create efficient execution plans.
Garbage collection occurs when the Java Virtual Machine (JVM) reclaims memory, often indicating high workloads that lead to memory pressure.
Transaction log management processes and rotates transaction logs.
Read the CPU usage chart
The CPU usage chart displays the minimum, maximum, and average percentage of your CPU capacity being used within the timeframe.
You’ll typically see a steady baseline like this from background operations and regular queries. You may also notice periodic peaks from batch jobs or scheduled tasks, gradual increases as your workload grows over time, and sudden spikes from large queries or unexpected load.
In this example, the CPU usage jumps from around 10% average to over 80%, which could either be an increase in workload or a sign of a problem.
Identify CPU issues
Understanding normal patterns helps you spot problems. CPU issues manifest in three distinct patterns, each requiring different diagnostic approaches.
- Consistently High CPU (70-90%)
- Frequent CPU Spikes
- Sustained 100% CPU
Use this quick decision guide to identify which pattern you’re experiencing:
Pattern 1: Consistently High CPU (70-90%)
When your CPU stays consistently high (70-90%), queries start queuing and waiting for available CPU time. Your application users will notice slower response times and increased latency.
In this case, you should:
- Review query logs to identify resource-intensive queries.
- Review the execution plan to identify inefficient queries (see the example after this list).
- Optimize the queries.
- Scale your instance if the queries are still resource-intensive.
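For example, once the query log points at a suspect query, you can inspect its plan before touching anything else. The query below is a placeholder; EXPLAIN shows the planned operators without executing, while PROFILE executes the query and reports actual rows and database hits per operator.

```cypher
// Inspect the planned operators without running the query.
EXPLAIN
MATCH (c:Customer)-[:PLACED]->(o:Order)
WHERE o.total > 1000
RETURN c.name, count(o) AS orders;
```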
This pattern usually indicates optimization opportunities rather than a genuine need for more resources. Follow this diagnostic path to identify and fix the root cause:
Pattern 2: Frequent CPU Spikes
Seeing regular spikes that affect all queries equally? Unlike slow individual queries, these spikes impact your entire database at once. This is often a sign of memory pressure rather than query problems.
Think of spikes as your database taking a "pause" to clean up - when those pauses happen too often or take too long, you’ll feel it everywhere. Here’s how to diagnose what’s causing them:
Pattern 3: Sustained 100% CPU
This is a critical situation - your database has hit its limit. Queries are timing out, users can’t complete transactions, and things are breaking. You need to act fast to restore service, then figure out why this happened.
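Restoring service usually starts with finding and stopping the runaway work. Here is a hedged sketch using the transaction commands in Neo4j 5.x; the transaction ID in the second statement is a placeholder you would replace with a real ID from the first result.

```cypher
// Find the longest-running transactions and what they are executing.
SHOW TRANSACTIONS
YIELD transactionId, username, currentQuery, elapsedTime
RETURN transactionId, username, currentQuery, elapsedTime
ORDER BY elapsedTime DESC
LIMIT 10;

// Stop a specific transaction once you have confirmed it is safe to do so.
TERMINATE TRANSACTIONS 'neo4j-transaction-123';
```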
This flowchart walks you through the emergency response and recovery process:
Query Optimization and Monitoring Strategies
Before scaling your instance, apply the query optimization techniques from the Optimizing Query Performance lesson - use PROFILE to identify expensive operations, add strategic indexes, use query parameters for plan caching, and optimize query structure to resolve CPU bottlenecks.
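As a brief illustration (the Product label and sku property are placeholders, not part of this lesson's dataset), profiling a slow lookup and then backing it with an index typically turns a NodeByLabelScan into a NodeIndexSeek:

```cypher
// Profile the query to see rows processed and database hits per operator.
PROFILE
MATCH (p:Product {sku: 'ABC-123'})
RETURN p;

// If the plan shows a label scan, an index lets the planner seek directly.
CREATE INDEX product_sku IF NOT EXISTS
FOR (p:Product) ON (p.sku);
```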
Monitor CPU patterns by workload type
Once you can diagnose and fix CPU issues, implement these ongoing monitoring strategies tailored to your workload type. Different workloads create distinct CPU patterns that help you identify what’s consuming CPU and whether you need optimization or scaling.
Read-Heavy Workloads
Read-heavy workloads show regular spikes during complex queries or aggregations, with baseline CPU generally lower than write workloads. Simple indexed lookups consume minimal CPU, but analytics queries scanning large portions of the graph can spike usage to 100%. Page cache misses add overhead as CPU waits for disk I/O.
Optimize by caching frequently accessed data at the application layer and ensuring your hot data fits in memory. Add indexes for common query patterns to reduce expensive scans.
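To see which access patterns are already covered before adding anything, you can list existing indexes. The composite index below is only an example shape (the Review label and its properties are placeholders) for a lookup that filters on two properties at once.

```cypher
// Review the indexes that already exist and what they cover.
SHOW INDEXES;

// Example: a composite index for a frequent two-property lookup.
CREATE INDEX review_product_rating IF NOT EXISTS
FOR (r:Review) ON (r.productId, r.rating);
```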
Scale horizontally by adding read replicas to distribute read queries across multiple instances. This is effective for both temporary spikes (like end-of-quarter reports) and sustained high read CPU.
Write-Heavy Workloads
Write-heavy workloads show sustained high CPU during peak write periods with more consistent patterns than reads. Writes consume CPU through transaction log writes, index updates for all indexed properties (a major contributor), data structure updates on disk, statistics collection for the query planner, and consistency checks.
Optimize by batching operations - process 1,000 updates in one transaction instead of 1,000 individual transactions. Review your indexed properties since each index adds write overhead. Schedule bulk operations during off-peak hours to reduce impact on normal workloads.
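Here is a sketch of the batching idea, assuming the updates arrive as a $rows parameter (a list of maps with id and status fields): one parameterized statement replaces thousands of single-row transactions, and CALL { ... } IN TRANSACTIONS commits in chunks for very large batches.

```cypher
// One transaction, many rows: pass all updates as a single $rows parameter.
UNWIND $rows AS row
MATCH (p:Person {id: row.id})
SET p.status = row.status;

// For very large batches, commit in chunks to bound memory and lock time.
// Note: CALL { ... } IN TRANSACTIONS must run in an implicit (auto-commit) transaction.
UNWIND $rows AS row
CALL {
  WITH row
  MATCH (p:Person {id: row.id})
  SET p.status = row.status
} IN TRANSACTIONS OF 1000 ROWS;
```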
Scale vertically by increasing the primary instance size for more CPU cores. Write operations must go through the primary instance to maintain consistency, so adding read replicas does not reduce write-related CPU load.
Mixed Workloads (OLTP + OLAP)
Mixed workloads show high CPU with many active connections but low query throughput - queries are waiting rather than executing. Background writes hold locks that block concurrent reads, while long-running analytics queries occupy thread pool threads for extended periods. This causes short transactional queries to queue despite available CPU capacity.
Optimize by separating workloads - schedule heavy batch operations and analytics queries during dedicated time windows, away from peak transactional traffic. Implement query timeouts to prevent long-running queries from monopolizing thread pool resources and blocking shorter transactions.
Scale based on your dominant workload: if writes drive your CPU consumption, scale the primary instance vertically for more cores. If reads are the bottleneck, add read replicas to distribute the load horizontally.
We will cover how to check the transaction counts for read and write transactions at the database level later in this course.
Proactive vs Reactive Scaling
Monitor CPU trends over weeks and months. Scale when you’ll reach 80% sustained utilization within your planning horizon, not when you hit 100% and users are experiencing problems.
Proactive scaling prevents performance degradation. Reactive scaling means users suffer through slow queries while you scramble to add capacity.
Check Your Understanding
High CPU Usage Response
You notice your Aura instance CPU usage has been consistently at 90-95% for the past 3 hours during normal business operations.
What should you do first?
- ❏ Wait and monitor for another day to confirm it’s not temporary
- ❏ Restart the instance to clear any issues
- ✓ Review query logs to identify resource-intensive queries, then consider scaling
- ❏ Immediately scale up the instance without investigation
Hint
When CPU is consistently high, you need to understand the cause before taking action. Consider what information would help you make an informed decision.
Solution
The correct answer is to review query logs to identify resource-intensive queries, then consider scaling.
This is the best approach because:
- 90-95% sustained usage indicates the instance is under-provisioned or queries are inefficient.
- Query logs will show whether specific queries are consuming excessive CPU.
- You may be able to optimize queries instead of scaling.
- Understanding the root cause ensures the right fix.
Why the alternatives are less effective: Waiting another day prolongs poor performance for users, restarting doesn’t address the underlying cause, and scaling without investigation might be unnecessary if queries can be optimized.
After reviewing query logs, you may find you can optimize problematic queries, or you may confirm that scaling is needed for the workload.
Summary
You now understand how CPU is used in Neo4j and how to monitor CPU usage for your Aura instances. You’ve learned which operations consume CPU - from efficient index seeks to expensive full scans - and how query execution operators differ in CPU cost. You also understand the role of thread pools in managing concurrent connections, how to use PROFILE to identify expensive operations, specific optimization techniques to reduce CPU consumption, and how to recognize and diagnose CPU issues through their characteristic patterns.
With this knowledge, you can identify whether high CPU usage is due to inefficient queries, too many connections, or genuine capacity limits - and take appropriate action.
In the next lesson, you’ll learn how to monitor storage consumption and query rates.