Query database

Overview

In this lesson, you will add a tool to your agent so it can query the database directly.

The tool will use a Text2CypherRetriever to convert user queries into Cypher statements, run them against the database, and return the results as context.

Text to Cypher tools are useful for "catch all" scenarios where the user may ask a question not covered by other tools, such as:

  • Finding specific data in the graph.

  • Performing aggregations or calculations.

  • Exploring relationships between entities.

Query database tool

You will modify the agent.py code to:

  1. Create a Text2CypherRetriever that uses an LLM to convert user queries into Cypher.

  2. Define a new tool function that uses this retriever to query the database.

  3. Add the new tool to the agent’s list of available tools.

Continue with the lesson to create the text to Cypher retriever.

Modify the agent

Open workshop-genai/agent.py and make the following changes:

  1. Instantiate an LLM that will be used to generate the Cypher.

    python
    Cypher generating llm
    from neo4j_graphrag.llm import OpenAILLM
    
    # Create LLM for Text2CypherRetriever
    llm = OpenAILLM(
        model_name="gpt-4o", 
        model_params={"temperature": 0}
    )
  2. Create a Text2CypherRetriever using the LLM and the Neo4j driver.

    python
    Text2CypherRetriever
    from neo4j_graphrag.retrievers import Text2CypherRetriever
    
    # Cypher examples as input/query pairs
    examples = [
        "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
    ]
    
    # Build the retriever
    text2cypher_retriever = Text2CypherRetriever(
        driver=driver,
        neo4j_database=os.getenv("NEO4J_DATABASE"),
        llm=llm,
        examples=examples,
    )
  3. Define a tool function to query the database using the retriever.

    python
    Query-database tool
    # Define a tool to query the database
    @tool("Query-database")
    def query_database(query: str):
        """A catchall tool to get answers to specific questions about lesson content."""
        result = text2cypher_retriever.get_search_results(query)
        return result
  4. Update the tools list to include the new database query tool.

    python
    tools
    tools = [get_schema, search_lessons, query_database]
  5. Modify the query variable to test the new database query tool.

    python
    query
    query = "How many lessons are there?"
Complete code
python
import os
from dotenv import load_dotenv
load_dotenv()

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorCypherRetriever
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import Text2CypherRetriever
from langchain.chat_models import init_chat_model
from langchain.agents import create_agent
from langchain_core.tools import tool

# Initialize the chat model
model = init_chat_model("gpt-4o", model_provider="openai")

# Connect to Neo4j database
driver = GraphDatabase.driver(
    os.getenv("NEO4J_URI"), 
    auth=(
        os.getenv("NEO4J_USERNAME"), 
        os.getenv("NEO4J_PASSWORD")
    )
)

# Create embedder
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")

# Define retrieval query
retrieval_query = """
MATCH (node)-[:FROM_DOCUMENT]->(d)-[:PDF_OF]->(lesson)
RETURN
    node.text as text, score,
    lesson.url as lesson_url,
    collect { 
        MATCH (node)<-[:FROM_CHUNK]-(entity)-[r]->(other)-[:FROM_CHUNK]->()
        WITH toStringList([
            labels(entity)[2], 
            entity.name, 
            entity.type, 
            entity.description, 
            type(r), 
            labels(other)[2], 
            other.name, 
            other.type, 
            other.description
            ]) as values
        RETURN reduce(acc = "", item in values | acc || coalesce(item || ' ', ''))
    } as associated_entities
"""

# Create vector retriever
vector_retriever = VectorCypherRetriever(
    driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    index_name="chunkEmbedding",
    embedder=embedder,
    retrieval_query=retrieval_query,
)

# Create LLM for Text2CypherRetriever
llm = OpenAILLM(
    model_name="gpt-4o", 
    model_params={"temperature": 0}
)

# Cypher examples as input/query pairs
examples = [
    "USER INPUT: 'Find a node with the name $name?' QUERY: MATCH (node) WHERE toLower(node.name) CONTAINS toLower($name) RETURN node.name AS name, labels(node) AS labels",
]

# Build the retriever
text2cypher_retriever = Text2CypherRetriever(
    driver=driver,
    neo4j_database=os.getenv("NEO4J_DATABASE"),
    llm=llm,
    examples=examples,
)


# Define functions for each tool in the agent

@tool("Get-graph-database-schema")
def get_schema():
    """Get the schema of the graph database."""
    results, summary, keys = driver.execute_query(
        "CALL db.schema.visualization()",
        database_=os.getenv("NEO4J_DATABASE")
    )
    return results

# Define a tool to retrieve lesson content
@tool("Search-lesson-content")
def search_lessons(query: str):
    """Search for lesson content related to the query."""
    # Use the vector retriever to find relevant chunks
    result = vector_retriever.search(
        query_text=query, 
        top_k=5
    )
    context = [item.content for item in result.items]
    return context

# Define a tool to query the database
@tool("Query-database")
def query_database(query: str):
    """A catchall tool to get answers to specific questions about lesson content."""
    result = text2cypher_retriever.get_search_results(query)
    return result



# Define a list of tools for the agent
tools = [get_schema, search_lessons, query_database]

# Create the agent with the model and tools
agent = create_agent(
    model, 
    tools
)

# Run the application
query = "How many lessons are there?"

for step in agent.stream(
    {
        "messages": [{"role": "user", "content": query}]
    },
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Run the agent. The agent should use the new database query tool to answer the question.

The generated Cypher query appears in the metadata of the tool's result.
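
As a minimal sketch of where to look, the helper below pulls the generated Cypher out of a retriever result. It assumes the result object exposes a metadata dictionary with a "cypher" key; confirm this against your installed neo4j_graphrag version. The name extract_cypher is hypothetical, and the SimpleNamespace object is a stand-in for a real result so the snippet runs without a database.

```python
from types import SimpleNamespace
from typing import Any, Optional


def extract_cypher(result: Any) -> Optional[str]:
    """Return the generated Cypher from a retriever result, or None.

    Assumption: the result has a `metadata` dict with a "cypher" key.
    """
    metadata = getattr(result, "metadata", None) or {}
    return metadata.get("cypher")


# Stand-in for a real retriever result (hypothetical, no database needed)
fake_result = SimpleNamespace(
    records=[],
    metadata={"cypher": "MATCH (l:Lesson) RETURN count(l) AS lessons"},
)

print(extract_cypher(fake_result))
```

While debugging, you could call extract_cypher on the value returned by text2cypher_retriever.get_search_results(query) inside query_database and print it before returning the result.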

Experiment

Experiment with the agent by modifying the query to ask different questions, for example:

  • "Each lesson is part of a module. How many lessons are in each module?"

  • "Search the graph and return a list of challenges."

  • "What benefits are associated with the technologies described in the knowledge graph?"

Questions about the graph schema or lesson content should still be handled by the other tools, for example:

  • "What entities exist in the graph?"

  • "What are the benefits of using GraphRAG?"

You may find that the agent will execute multiple tools to answer some questions.
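
To check which tools ran for a given question, you can collect the tool names from the agent's messages. The sketch below assumes that AI messages carry a tool_calls list of dictionaries with a "name" key, which is the shape LangChain message objects use. The name called_tools is hypothetical, and the stand-in messages replace real agent output so the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def called_tools(messages: Iterable[Any]) -> List[str]:
    """List tool names in call order, based on each message's `tool_calls`."""
    names: List[str] = []
    for message in messages:
        # Messages without tool calls (user or tool messages) contribute nothing
        for call in getattr(message, "tool_calls", None) or []:
            names.append(call["name"])
    return names


# Stand-in messages (hypothetical); a real run would pass step["messages"]
messages = [
    SimpleNamespace(content="How many lessons are in each module?"),
    SimpleNamespace(
        content="",
        tool_calls=[{"name": "Get-graph-database-schema", "args": {}}],
    ),
    SimpleNamespace(
        content="",
        tool_calls=[{"name": "Query-database", "args": {"query": "..."}}],
    ),
]

print(called_tools(messages))
```

In agent.py, the final step yielded by agent.stream holds the full message history, so passing step["messages"] to such a helper after the loop would list every tool the agent used.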

Specific tools

You can create multiple Text to Cypher tools that are specialized for different types of queries.

For example, you could create one tool for querying lessons and another for querying technologies.

Each tool could have different prompt templates or examples to help the LLM generate more accurate Cypher for specific domains.
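
One way to organize this is to keep a separate example list per domain and build one retriever from each. The sketch below only prepares the example strings, in the same USER INPUT / QUERY format used earlier in this lesson. The make_example helper, the Lesson and Technology labels, and the Cypher statements are assumptions for illustration and need to be adapted to your graph's schema.

```python
def make_example(question: str, cypher: str) -> str:
    """Format one question/Cypher pair as a retriever example string."""
    return f"USER INPUT: '{question}' QUERY: {cypher}"


# Hypothetical examples; the labels and properties are assumptions
lesson_examples = [
    make_example(
        "How many lessons are there?",
        "MATCH (l:Lesson) RETURN count(l) AS lessons",
    ),
]

technology_examples = [
    make_example(
        "Find a technology with the name $name",
        "MATCH (t:Technology) WHERE toLower(t.name) CONTAINS toLower($name) "
        "RETURN t.name AS name",
    ),
]

# Each list would be passed as `examples=` to its own Text2CypherRetriever,
# and each retriever wrapped in its own @tool function with a specific docstring.
print(lesson_examples[0])
print(technology_examples[0])
```

Giving each tool a narrow docstring (for example, one that mentions only lessons) helps the agent choose between the specialized tools and the catch-all one.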

Lesson Summary

In this lesson, you added a query database tool to your agent using a text to Cypher retriever.

In the next optional challenge, you will create your own agent with a custom set of tools.
