Introduction
Neo4j performs internal operations that help maintain database health and optimize query performance.
In this lesson, you will learn how to monitor checkpoint and replan events to understand your database activity.
Understanding Checkpoints and Replan Events
Checkpointing is the process of flushing pending updates from memory to disk. Checkpoints keep your data safe and durable by creating recovery points that allow Neo4j to restart quickly after an unexpected shutdown. This is a normal, essential operation that occurs automatically.
Replan events occur when Neo4j recreates the execution plan for a Cypher query. Neo4j caches query plans for efficiency and rebuilds them when database statistics change significantly, for example when the number of nodes or relationships grows or index characteristics shift. Replanning keeps cached plans efficient as your data grows.
Monitoring Checkpoints
The Monitoring dashboard shows four checkpoint metrics that help you understand database write activity and health.
Checkpoint Events - Total Count
Total checkpoint events shows the total number of checkpoint events executed since the server started.
The count depends on your write activity—busier databases checkpoint more often. This is a cumulative counter that helps you understand overall checkpoint frequency.
Checkpoint Events - Rate
Checkpoint rate shows the number of checkpoint events per minute.
A consistent, moderate rate is normal and healthy. The rate naturally correlates with write transaction volume.
Checkpoint Cumulative Time
Cumulative checkpoint time shows the total time in milliseconds spent in checkpointing since the server started.
This metric helps you understand the overall time investment in checkpointing operations over the lifetime of the instance.
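Dividing cumulative checkpoint time by the total checkpoint count gives a rough average duration per checkpoint. The sketch below assumes you have copied both values from the dashboard by hand; the function name and the sample readings are hypothetical.

```python
from typing import Optional

def average_checkpoint_ms(cumulative_time_ms: float, total_count: int) -> Optional[float]:
    """Mean time per checkpoint since server start, or None if none have run."""
    if total_count <= 0:
        return None
    return cumulative_time_ms / total_count

# Hypothetical readings: 90 checkpoints taking 540,000 ms in total.
print(average_checkpoint_ms(540_000, 90))  # 6000.0
```

An average of about 6 seconds per checkpoint would sit comfortably inside the range this lesson describes as typical.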
Last Checkpoint Duration
Last checkpoint duration shows the duration of the most recent checkpoint event in milliseconds.
The typical duration is several seconds to several minutes, which is normal and healthy. As a general guideline, if checkpoint duration is consistently over 30 minutes (1,800,000 milliseconds on this chart), review your storage performance and write patterns. This threshold will vary depending on your specific workload.
Checkpoint count and cumulative time values may drop if background maintenance is performed by Aura. This is normal and doesn’t indicate a problem.
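If you derive a rate from the cumulative counter yourself, a drop in the counter needs special handling, since subtracting across a reset would give a negative value. This is a minimal sketch, assuming you collect `(timestamp, count)` pairs on your own; the function name and the readings are hypothetical.

```python
from typing import List, Optional, Tuple

def checkpoint_rates(samples: List[Tuple[float, int]]) -> List[Optional[float]]:
    """Per-minute checkpoint rate for each consecutive pair of samples.

    samples: (timestamp_seconds, cumulative_checkpoint_count), oldest first.
    An interval in which the counter drops is reported as None, because the
    counter was reset in between and the difference is not meaningful.
    """
    rates: List[Optional[float]] = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        minutes = (t1 - t0) / 60.0
        if c1 < c0 or minutes <= 0:
            rates.append(None)
        else:
            rates.append((c1 - c0) / minutes)
    return rates

# Hypothetical readings taken every 5 minutes; the counter resets before the last one.
readings = [(0, 100), (300, 102), (600, 105), (900, 1)]
print(checkpoint_rates(readings))  # [0.4, 0.6, None]
```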
Monitoring Replan Events
The Monitoring dashboard shows two replan metrics that help you understand query plan caching efficiency.
Replan Events - Total Count
Total replan events shows the total number of times Cypher has replanned a query since the server started.
A low count with occasional spikes is normal and healthy. You’ll naturally see replanning when executing new queries for the first time, after schema changes, or when database statistics change significantly as your data grows—this is Neo4j adapting to your evolving database.
Replan Events - Rate
Replan rate shows the number of replanning events per minute.
As a general guideline, consistently high replan rates suggest an optimization opportunity: your queries may benefit from using parameters instead of literal values. What constitutes "high" will vary depending on your query patterns.
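To see why literal values increase planning work, consider a toy model in which a plan is cached per distinct query text. This is only an illustration of the idea described in this lesson, not Neo4j's actual plan cache; the `ToyPlanCache` class and the `Person` label and `name` property are hypothetical.

```python
from typing import Dict, Optional

class ToyPlanCache:
    """Toy model: one cached 'plan' per distinct query text."""

    def __init__(self) -> None:
        self._plans: Dict[str, str] = {}
        self.planning_events = 0

    def run(self, query: str, params: Optional[dict] = None) -> None:
        # Only the query text is the cache key; parameter values are not.
        if query not in self._plans:
            self._plans[query] = f"plan for: {query}"
            self.planning_events += 1

names = ["Alice", "Bob", "Carol"]

# Literal values: every name produces a different query text.
literal_cache = ToyPlanCache()
for name in names:
    literal_cache.run(f"MATCH (p:Person {{name: '{name}'}}) RETURN p")

# Parameters: the query text is identical for every name.
param_cache = ToyPlanCache()
for name in names:
    param_cache.run("MATCH (p:Person {name: $name}) RETURN p", {"name": name})

print(literal_cache.planning_events)  # 3
print(param_cache.planning_events)    # 1
```

In this model the literal version plans once per distinct value, while the parameterized version plans once in total.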
Troubleshooting Checkpoint and Replan Issues
Monitor checkpoint and replan metrics to identify database health issues and optimization opportunities.
Long Checkpoint Duration
As a general guideline, if checkpoint duration is consistently over 30 minutes, you should investigate potential causes. These thresholds will vary depending on your specific write patterns and storage configuration.
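One way to make "consistently" concrete is to require that several consecutive readings exceed the threshold, so that a single slow checkpoint does not raise a flag. The sketch below assumes you record the last checkpoint duration yourself; the window of three readings and the sample values are arbitrary assumptions.

```python
from typing import Sequence

THRESHOLD_MS = 30 * 60 * 1000  # the lesson's 30-minute guideline, in milliseconds

def consistently_long(durations_ms: Sequence[int], window: int = 3,
                      threshold_ms: int = THRESHOLD_MS) -> bool:
    """True if the most recent `window` readings all exceed the threshold."""
    if window < 1 or len(durations_ms) < window:
        return False
    return all(d > threshold_ms for d in durations_ms[-window:])

# Hypothetical readings of the last checkpoint duration, in milliseconds.
one_spike = [45_000, 2_100_000, 60_000, 52_000]
sustained = [45_000, 1_900_000, 2_000_000, 2_400_000]
print(consistently_long(one_spike))  # False
print(consistently_long(sustained))  # True
```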
Long checkpoints can indicate heavy write load, storage I/O limitations, or a large backlog of pending updates waiting to be flushed to disk. Review your write patterns and consider batching large updates into smaller transactions.
If checkpoints are slow during periods of normal activity, this may indicate storage performance issues. Contact Neo4j support for assistance with persistent checkpoint performance problems.
High Replan Rates
A low replan rate with occasional spikes is normal—you’ll naturally see replanning when executing new queries, after schema changes, or when database statistics change significantly as data volumes grow.
As a general guideline, if replan rates are consistently high, this suggests an optimization opportunity. High replan rates typically indicate queries using literal values instead of parameters.
Review your query logs to identify frequently executed queries. Look for queries with hardcoded values that could be replaced with parameters. This simple change can significantly improve query performance and reduce planning overhead.
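A rough way to spot such queries is to replace literals with a placeholder and count how many logged queries collapse to the same shape. This is a heuristic sketch, assuming you have already extracted the query text from your logs into a list; the regular expression covers only simple quoted strings and numbers, and the sample queries are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Simple quoted strings (single or double quotes) and standalone numbers.
_LITERAL = re.compile(r"""'[^']*'|"[^"]*"|\b\d+(?:\.\d+)?\b""")

def query_shapes(queries: Iterable[str]) -> Counter:
    """Count queries after replacing literal values with a '?' placeholder."""
    return Counter(_LITERAL.sub("?", q) for q in queries)

logged = [
    "MATCH (p:Person {name: 'Alice'}) RETURN p",
    "MATCH (p:Person {name: 'Bob'}) RETURN p",
    "MATCH (o:Order) WHERE o.total > 100 RETURN o",
    "MATCH (o:Order) WHERE o.total > 250 RETURN o",
    "MATCH (p:Person {name: $name}) RETURN p",
]

# Shapes that contain a placeholder and repeat are candidates for parameters.
for shape, count in query_shapes(logged).most_common():
    if count > 1 and "?" in shape:
        print(count, shape)
```

Here the two `Person` lookups and the two `Order` filters each collapse to a single shape, while the query that already uses `$name` is left unchanged.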
Check Your Understanding
Checkpoint Duration Threshold
What checkpoint duration value warrants investigation?
- ❏ Over 5 minutes
- ❏ Over 10 minutes
- ✓ Over 30 minutes
- ❏ Over 60 minutes
Hint
Normal checkpoints take several seconds to several minutes.
Solution
Over 30 minutes is correct.
Checkpoint duration over 30 minutes indicates potential issues such as I/O problems, heavy write load, or storage performance problems. This threshold signals that the checkpoint process is taking significantly longer than the expected several seconds to several minutes.
Over 5 minutes and over 10 minutes are within normal ranges for checkpoints. Over 60 minutes is too high a threshold and would miss performance issues that should be investigated earlier.
Replan Event Causes
What is the primary cause of consistently high replan rates?
- ❏ Too many schema changes
- ❏ Insufficient memory
- ✓ Queries not using parameters
- ❏ Heavy write load
Hint
Consider how query plan caching works and what prevents plan reuse.
Solution
Queries not using parameters is correct.
When queries use literal values instead of parameters, each unique value triggers a new execution plan. This prevents Neo4j from reusing cached plans and causes unnecessary replanning. Using parameterized queries allows the database to reuse the same plan for all executions.
Too many schema changes cause occasional spikes, not consistent high rates. Insufficient memory affects query execution but not plan caching. Heavy write load may trigger more frequent checkpoints but doesn’t directly cause replan events.
Summary
Checkpoints flush pending updates from memory to disk, creating recovery points that enable faster database restarts. Typical duration is several seconds to several minutes; as a general guideline, duration consistently over 30 minutes warrants investigation. Replan events occur when Neo4j recreates query execution plans as database statistics change or when new queries are executed. Both are normal operations that help maintain database health and performance. High replan rates often indicate queries that use literal values instead of parameters; parameterizing these queries, as covered earlier in the course, reduces planning overhead.