Introduction
In this lesson you will learn:
- How Cypher queries are processed
- The Cypher runtimes
- Query execution phases
Understanding Query Optimization
Query optimization is the process of improving query performance by identifying slow queries and applying specific techniques to make them faster. Slow queries impact user experience and consume valuable database resources.
The most important metric for prioritization is total time spent, calculated as frequency multiplied by duration. A query that runs 1,000 times at 500ms each has more total impact than a query that runs once for 60 seconds, even though each individual run is faster.
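The comparison can be checked with simple arithmetic; for example, as a Cypher expression:

```cypher
// frequency * duration = total time spent
RETURN 1000 * 0.5 AS frequentQuerySeconds,  // 500.0 seconds in total
       1 * 60.0 AS slowQuerySeconds         // 60.0 seconds in total
```

The frequent query accounts for more than eight times the total database time, so it is the better optimization target.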
How Cypher Queries Work
Cypher queries execute in two phases: first finding anchor nodes, then expanding relationships from them.
Anchor nodes are the starting points of a query. Neo4j locates these nodes first, then expands outward by traversing relationships.
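As a sketch (the `Person` and `Movie` labels and the `ACTED_IN` relationship type are illustrative, taken from the standard movies example dataset):

```cypher
// Neo4j first finds the anchor node - the Person filtered by name -
// then expands outward along ACTED_IN relationships to Movie nodes.
MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie)
RETURN m.title
```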
Quick tips
- The fewer anchor nodes you have, the faster your query will run.
- Filter data as early as possible to reduce work in later stages.
- The fewer relationships you traverse, the faster your query will run.
- The more specific the relationship types you use, the faster your query will run.
- Using indexes helps Neo4j quickly find anchor nodes.
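The last tip can be sketched as follows; the index name and the `Person.name` property are illustrative:

```cypher
// Create an index so Neo4j can locate anchor nodes directly,
// rather than scanning every Person node.
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);
```

With this index in place, a query that filters its anchor on `p.name` (as in the earlier `MATCH` example) can look the node up directly instead of performing a label scan.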
Cypher runtimes
There are three Cypher runtimes that Neo4j uses to execute queries, each with different performance characteristics and use cases:
- Slotted - Community Edition and Aura Free
- Pipelined - default for Enterprise Edition and Aura production
- Parallel - available in Enterprise Edition and Aura for analytical queries
Slotted Runtime
The slotted runtime:

- Interpreted runtime using the pull-based "Volcano" execution model
- Processes queries row by row with single-threaded execution
- Faster planning phase but slower execution
- Best for applications with short, non-cached queries where fast planning matters more than execution speed
Pipelined Runtime
The pipelined runtime:

- Push-based execution model that processes data in batches
- Uses a compiled approach with code generation for better performance
- Transforms logical operators into execution pipelines with improved CPU cache utilization
- Ideal for transactional use cases with high concurrency and most general query workloads
Parallel Runtime
The parallel runtime:

- Multi-threaded runtime that allows a single query to utilize multiple CPU cores simultaneously
- Designed for analytical, graph-global read queries that process large sections of the graph
- Uses partitioned operators to segment and process data in parallel
- Best for long-running analytical queries (>500ms) on systems with many CPUs and low-concurrency workloads
- Does not support write operations or procedures that are not thread-safe
Runtime tips
- The pipelined runtime generally provides the best performance.
- The slotted runtime can be useful when planning time matters more than execution time, such as for short, non-cached queries.
- When multiple CPU cores are available, long-running queries can be expected to run faster on the parallel runtime.
Lesson Summary
In this lesson, you learned about the Cypher query lifecycle, including the parsing, planning, and execution phases that Neo4j goes through when processing your queries.
In the next lesson, you will learn how to use PROFILE and EXPLAIN to analyze query performance.