The Complete Memory API

In the previous lesson, you installed neo4j-agent-memory and set up its three configuration objects. In this lesson, you will learn every public method the library exposes — across all three memory layers — and see a single end-to-end example that uses them all in the order you would call them in a real application.

The methods are grouped in call order: session management first, then short-term memory, then long-term memory, then context retrieval, then reasoning traces.

Setting up the environment

Every example in this lesson uses the same settings object. Store your credentials in environment variables and build MemorySettings once:

```python
# Configure the library from environment variables
import os
import asyncio
from neo4j_agent_memory import MemoryClient, MemorySettings
from neo4j_agent_memory.config import Neo4jConfig, EmbeddingConfig

settings = MemorySettings(
    neo4j=Neo4jConfig(
        uri=os.environ["NEO4J_URI"],
        username=os.environ["NEO4J_USERNAME"],
        password=os.environ["NEO4J_PASSWORD"]
    ),
    embedding=EmbeddingConfig(
        api_key=os.environ["OPENAI_API_KEY"]
    )
)
```

All three configuration objects are required: Neo4jConfig (database connection), EmbeddingConfig (vector embeddings for semantic search), and MemorySettings (the container that binds them together). Set NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD, and OPENAI_API_KEY in your environment before running any code in this course.
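
Because the settings object reads all four variables with os.environ[...], a single missing variable raises a KeyError mid-setup. It can help to check them up front instead. A minimal standard-library sketch — the helper name check_environment is ours, not part of the library:

```python
import os

# The four variables the setup block above reads
REQUIRED_VARS = ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "OPENAI_API_KEY"]

def check_environment(env=os.environ):
    """Return the names of any required variables that are not set."""
    return [name for name in REQUIRED_VARS if name not in env]

missing = check_environment()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this before building MemorySettings gives one clear message listing everything that still needs to be set.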

Session management

Sessions group messages into conversations. You choose the session ID — any string works. The library creates the Conversation node automatically the first time you store a message for that ID.
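
Since any string works, the naming scheme is up to you. One common pattern — our convention here, not a library requirement — is a per-user prefix plus a random suffix, so one user can hold several concurrent conversations:

```python
import uuid

def new_session_id(user_id: str) -> str:
    """Build a session ID like 'user_123-9f3a1c2b'."""
    return f"{user_id}-{uuid.uuid4().hex[:8]}"

session_id = new_session_id("user_123")
print(session_id)
```

The prefix also makes it easy to find all of a user's sessions later when iterating over list_sessions().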

```python
# List and clear sessions
async with MemoryClient(settings) as memory:
    # List all existing sessions
    sessions = await memory.short_term.list_sessions()
    for s in sessions:
        print(s.session_id)

    # Remove a session and all its messages when the conversation ends
    await memory.short_term.clear_session("user_123")
```

list_sessions() returns all stored Conversation nodes — useful for building a session history view or managing sessions across users. clear_session() removes the session and all its linked Message nodes.
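
A note on running these snippets: every example in this lesson uses async with, so in a standalone script the calls must run inside an event loop — that is why the setup block imports asyncio. A minimal runner pattern, where the coroutine body is a placeholder for the calls shown above:

```python
import asyncio

async def main():
    # Replace this placeholder with the MemoryClient calls from the
    # examples, e.g. the list_sessions() / clear_session() snippet above
    await asyncio.sleep(0)
    return "done"

result = asyncio.run(main())
print(result)
```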

Short-term memory

Short-term memory stores the active conversation as a linked message chain. All short-term methods live on memory.short_term:

```python
# Add, retrieve, search, and summarise messages
async with MemoryClient(settings) as memory:
    # Store messages — the session is created automatically on first use
    await memory.short_term.add_message(
        session_id="user_123",
        role="user",
        content="Review Jessica Norris account for a credit limit increase"
    )
    await memory.short_term.add_message(
        session_id="user_123",
        role="assistant",
        content="I will retrieve Jessica's full profile now."
    )

    # Retrieve the full conversation and its messages
    conversation = await memory.short_term.get_conversation("user_123")

    # Semantic search over message history
    results = await memory.short_term.search_messages(
        query="credit limit increase",
        session_id="user_123",
        limit=10
    )

    # Auto-generate a plain-English summary of the conversation
    summary = await memory.short_term.get_conversation_summary("user_123")
    print(summary)
```

add_message() does more than store text — it runs the entity extraction pipeline on each message, creating or merging entity nodes in long-term memory and linking them back to the message. get_conversation_summary() uses the stored messages to produce a concise summary, useful for seeding long-term memory at session end.

Long-term memory

Long-term memory stores a persistent entity knowledge graph that survives across sessions. The six long-term methods cover creating, enriching, and retrieving the graph:

```python
# Store and retrieve entities, facts, and preferences
async with MemoryClient(settings) as memory:
    # Store a typed entity using the POLE+O classification
    await memory.long_term.add_entity(
        name="Jessica Norris",
        entity_type="PERSON",
        subtype="CUSTOMER",
        description="High-value customer, flagged for compliance review April 2025",
        properties={"risk_score": 0.415}
    )

    # Store a temporal fact linking two entities
    await memory.long_term.add_fact(
        subject="Jessica Norris",
        predicate="manages",
        object="Acme Corp account",
        valid_from="2024-01-01",
        valid_until="2025-03-31"
    )

    # Store a user or agent preference
    await memory.long_term.add_preference(
        category="communication",
        preference="Prefers concise responses",
        context="Confirmed during onboarding"
    )

    # Semantic search across all entity types
    entities = await memory.long_term.search_entities(
        query="Jessica Norris accounts",
        limit=10
    )

    # Retrieve preferences matching a query
    prefs = await memory.long_term.search_preferences(query="communication")

    # Retrieve the full neighborhood subgraph around an entity (2-hop by default)
    subgraph = await memory.long_term.get_entity_graph(
        entity_id="jessica-norris",
        depth=2
    )
```

add_fact() records a temporal relationship between two named entities — the valid_from and valid_until fields enable time-aware queries ("who managed this account in Q1 2024?"). search_preferences() is the retrieval half of add_preference() — you store preferences during setup and retrieve them at the start of each session to personalise the agent's behaviour.
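
The interval logic behind such time-aware queries can be sketched with plain datetime arithmetic — this illustrates the idea, not the library's internal query:

```python
from datetime import date

def fact_valid_on(valid_from, valid_until, day):
    """True if a fact's validity interval covers `day` (dates as ISO strings)."""
    start = date.fromisoformat(valid_from)
    end = date.fromisoformat(valid_until) if valid_until else date.max
    return start <= day <= end

# The fact stored above is valid 2024-01-01 to 2025-03-31, so it answers
# "who managed this account in Q1 2024?" affirmatively
print(fact_valid_on("2024-01-01", "2025-03-31", date(2024, 2, 15)))
```

Leaving valid_until unset models a fact that still holds today, which is why the sketch treats a missing end date as open-ended.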

Combined context retrieval

get_context() is the top-level method that pulls from all three memory layers at once and formats the result for injection into an LLM prompt:

```python
# Retrieve combined context for LLM prompt injection
async with MemoryClient(settings) as memory:
    context = await memory.get_context(
        query="What do I know about Jessica Norris?",
        session_id="user_123",
        include_short_term=True,
        include_long_term=True,
        include_reasoning=True,
        max_items=10
    )

    # context is a formatted string ready to inject into an LLM prompt
    print(context)
```

get_context() returns a formatted string combining recent conversation history, relevant entities and preferences from long-term memory, and similar past reasoning traces. It is the recommended method for injecting memory into a prompt when you are not using the Pydantic AI integration tools. Any agent framework — LangChain, CrewAI, a custom prompt builder — can call get_context() to retrieve the most relevant memory for the current query.
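
Outside any framework, injecting the result is plain string work. A sketch, in which the context value is a placeholder standing in for the string get_context() returned:

```python
# Placeholder for the string returned by memory.get_context()
context = "## Relevant memory\n- Jessica Norris: high-value customer, risk score 0.415"

SYSTEM_TEMPLATE = (
    "You are a banking assistant.\n"
    "Use the following retrieved memory when answering:\n\n"
    "{context}"
)

system_prompt = SYSTEM_TEMPLATE.format(context=context)
print(system_prompt)
```

The resulting system_prompt is what you would pass as the system message to whichever LLM client you are using.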

Reasoning traces

Reasoning memory records the agent’s full decision process. The four core methods form a start-record-complete lifecycle:

```python
# Record a reasoning trace manually
async with MemoryClient(settings) as memory:
    # 1. Start a new trace — creates the ReasoningTrace node
    trace = await memory.reasoning.start_trace(
        task="Evaluate credit limit for Jessica Norris",
        session_id="user_123"
    )

    # 2. Record a reasoning step
    step = await memory.reasoning.add_step(
        trace_id=trace.id,
        thought="Retrieving customer entity from long-term memory",
        action="search_entities"
    )

    # 3. Record a tool call within the step
    await memory.reasoning.record_tool_call(
        step_id=step.id,
        tool_name="search_entities",
        arguments={"query": "Jessica Norris", "limit": 5},
        result={"entities": ["Jessica Norris (EntityPerson)"]},
        status="success"
    )

    # 4. Complete the trace with an outcome
    await memory.reasoning.complete_trace(
        trace.id,
        outcome="Approved — risk score within threshold",
        success=True
    )
```

start_trace() creates the ReasoningTrace node and returns a trace object whose id you pass to all subsequent calls. add_step() records one reasoning iteration. record_tool_call() attaches a ToolCall node to a step. complete_trace() marks the trace as finished and records the outcome.

When using the Pydantic AI integration, record_agent_trace() calls all four of these automatically — so you only need the low-level API when integrating with other frameworks or recording traces for non-agent workflows.
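
When you do call the low-level API from another framework, it is worth guaranteeing that complete_trace() runs even when the task fails, so no trace is left open. A sketch using only the lifecycle calls shown above — traced and body are our names, and memory is any client exposing memory.reasoning:

```python
async def traced(memory, task, session_id, body):
    """Run `body(trace)` inside a start_trace/complete_trace pair."""
    trace = await memory.reasoning.start_trace(task=task, session_id=session_id)
    try:
        outcome = await body(trace)
    except Exception as exc:
        # Record the failure before re-raising so the trace is never left open
        await memory.reasoning.complete_trace(trace.id, outcome=str(exc), success=False)
        raise
    await memory.reasoning.complete_trace(trace.id, outcome=outcome, success=True)
    return outcome
```

The body coroutine receives the trace object, so it can still call add_step() and record_tool_call() with trace.id as it works.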

Querying and analysing traces

Three additional methods let you analyse traces after they are recorded:

```python
# Query and analyse reasoning traces
async with MemoryClient(settings) as memory:
    # Find past traces for semantically similar tasks
    similar = await memory.reasoning.get_similar_traces(
        task="What do you know about me?",
        limit=3
    )

    # List all traces, with optional filtering
    traces = await memory.reasoning.list_traces()

    # Retrieve pre-aggregated tool usage statistics
    stats = await memory.reasoning.get_tool_stats()
    for tool_name, count in stats.items():
        print(f"{tool_name}: {count} calls")

    # Retrieve the complete causal chain for one trace
    # (trace is the object returned by start_trace() in the previous example)
    provenance = await memory.reasoning.get_trace_provenance(trace.id)
```

get_similar_traces() uses vector similarity on task_embedding to find past traces for related tasks. list_traces() retrieves all traces and supports date and status filtering. get_tool_stats() returns pre-aggregated counts — how many times each tool has been called — without requiring a Cypher query. get_trace_provenance() returns the complete audit record for a single trace: originating message, every step and thought, every tool call with parameters and results, and the final outcome.
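
Conceptually, the aggregation behind get_tool_stats() is a count over ToolCall records grouped by tool name. A standard-library sketch of that idea over made-up sample records:

```python
from collections import Counter

# Hypothetical tool-call records, shaped like the record_tool_call() inputs
tool_calls = [
    {"tool_name": "search_entities", "status": "success"},
    {"tool_name": "search_entities", "status": "success"},
    {"tool_name": "get_entity_graph", "status": "success"},
]

stats = Counter(call["tool_name"] for call in tool_calls)
for tool_name, count in stats.items():
    print(f"{tool_name}: {count} calls")
```

The library does this server-side in the graph, so you get the counts without writing a Cypher aggregation yourself.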

Complete API reference

| Namespace | Method | Purpose |
|---|---|---|
| memory.short_term | add_message(session_id, role, content) | Store a message and run entity extraction (creates the session automatically) |
| memory.short_term | get_conversation(session_id) | Retrieve the full conversation and its messages |
| memory.short_term | search_messages(query, session_id, limit) | Semantic search over message history |
| memory.short_term | get_conversation_summary(session_id) | Auto-generate a summary of the conversation |
| memory.short_term | list_sessions() | List all sessions with optional pagination |
| memory.short_term | clear_session(session_id) | Remove a session and all its messages |
| memory | get_context(query, session_id, …) | Retrieve a combined context string from all three memory layers |
| memory.long_term | add_entity(name, entity_type, …) | Store or update a POLE+O typed entity |
| memory.long_term | add_fact(subject, predicate, object, valid_from, valid_until) | Store a temporal fact between two entities |
| memory.long_term | add_preference(category, preference, context) | Store a user or agent preference |
| memory.long_term | search_entities(query, limit) | Semantic search across the entity graph |
| memory.long_term | search_preferences(query) | Retrieve preferences matching a query |
| memory.long_term | get_entity_graph(entity_id, depth) | Retrieve the neighborhood subgraph for an entity |
| memory.reasoning | start_trace(task, session_id) | Begin a new reasoning trace |
| memory.reasoning | add_step(trace_id, thought, action) | Record one reasoning step within a trace |
| memory.reasoning | record_tool_call(step_id, tool_name, …) | Attach a tool call to a reasoning step |
| memory.reasoning | complete_trace(trace_id, outcome, success) | Finalise a trace with its outcome |
| memory.reasoning | get_similar_traces(task, limit) | Find past traces for semantically similar tasks |
| memory.reasoning | list_traces() | List all traces with optional date and status filters |
| memory.reasoning | get_tool_stats() | Retrieve pre-aggregated tool usage counts |
| memory.reasoning | get_trace_provenance(trace_id) | Retrieve the complete causal chain for a trace |

Check your understanding

What Does get_context() Return?

Question

What does the memory.get_context() method return?

  • ❏ A structured object with .messages, .entities, .preferences, and .traces fields

  • ✓ A formatted string combining context from all three memory layers

  • ❏ A list of Message objects from short-term memory

  • ❏ A dictionary keyed by memory type

Hint

get_context() is designed to be injected directly into an LLM prompt. Think about what format is most useful for that purpose.

Solution

get_context() returns a formatted string. It combines recent conversation history, relevant entities and preferences from long-term memory, and similar past reasoning traces into a single text block ready to include in an LLM prompt. You access the result with print(context), not context.messages or similar field access.

Summary

In this lesson, you learned the complete neo4j-agent-memory API across all three memory layers:

  • Session management — sessions are created automatically; memory.short_term.list_sessions() and memory.short_term.clear_session() manage existing sessions

  • Short-term memory — memory.short_term.add_message(), get_conversation(), search_messages(), and get_conversation_summary() store and retrieve the active conversation

  • Long-term memory — memory.long_term.add_entity(), add_fact(), and add_preference() write to the knowledge graph; search_entities(), search_preferences(), and get_entity_graph() read from it

  • Context retrieval — memory.get_context() combines all three layers into a formatted string ready to inject into an LLM prompt

  • Reasoning traces — memory.reasoning.start_trace(), add_step(), record_tool_call(), and complete_trace() record the full decision lifecycle; get_similar_traces(), list_traces(), get_tool_stats(), and get_trace_provenance() retrieve and analyse it

In the next module, you will learn the short-term memory graph schema in detail and see the Cypher that each short-term method generates.
