Shapes Before Queries

Introduction

Most people start by asking "what is my money query?" and end up thinking in SQL: tables, joins, one result set. This workshop trains the opposite reflex: first decide what shape the context needs, then write a spec for that shape, and only then derive the query logic that materializes it.

In this lesson, you will learn the containment model the library is built on, and read the spec for the first shape you will build: the outline.

One containment relationship

The load pipeline parsed the PDF library into a tree with a single relationship type, HAS:

mermaid
graph TD
    L((Library)) -->|HAS| F((Folder <br> bulletins/))
    F -->|HAS| D((Document <br> TSB-21-114))
    D -->|HAS| S1((Section <br> Condition))
    D -->|HAS| S2((Section <br> Repair <br> Procedure))

Why one type instead of HAS_FOLDER / HAS_DOCUMENT / HAS_SECTION? Every containment edge means the same thing - "parent in the tree" - and the node labels already say what kind of child it is. One type lets every tree walk be written as [:HAS*], whatever it passes through.
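To make the single-type argument concrete, here is a minimal Python sketch rather than the workshop's actual Cypher. It assumes a hypothetical in-memory edge list that mirrors the diagram above, and the function name descendants is invented for illustration. Because every edge is a HAS edge, one walk function covers folders, documents, and sections alike.

```python
from collections import defaultdict

# Hypothetical stand-in for the graph: (parent, child) pairs, every one a HAS edge.
HAS_EDGES = [
    ("Library", "Folder bulletins/"),
    ("Folder bulletins/", "Document TSB-21-114"),
    ("Document TSB-21-114", "Section Condition"),
    ("Document TSB-21-114", "Section Repair Procedure"),
]


def descendants(root, edges):
    """Return every node reachable from root over one or more HAS edges."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            found.append(child)
            stack.append(child)
    return found
```

The walk never inspects a label or an edge type, which is the property the [:HAS*] pattern relies on.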

Two more structural relationships complete the document half:

  • (Section)-[:NEXT_SECTION]->(Section) threads every document’s sections in reading order

  • (Section)-[:LINKS_TO]->(…) crosses trees. It carries {citation: true} where a section explicitly cites another document ("per safety recall RC-2021-04"), and {derived: true, sharedKeys, strength} where sections in different documents reference the same part or code
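The NEXT_SECTION thread described above behaves like a linked list. The sketch below is an illustration only: it assumes a hypothetical dictionary that maps each section to the one that follows it, using the two sections from the diagram, and the function name reading_order is invented.

```python
def reading_order(first_section, next_section):
    """Follow NEXT_SECTION pointers from the first section to the last."""
    ordered = []
    seen = set()  # guards against an accidental cycle in the data
    current = first_section
    while current is not None and current not in seen:
        ordered.append(current)
        seen.add(current)
        current = next_section.get(current)
    return ordered


# Hypothetical chain for one document: section -> the section that follows it.
NEXT_SECTION = {"condition": "repair-procedure"}
```

Calling reading_order("condition", NEXT_SECTION) returns the sections in the order a reader would meet them.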

URIs carry the hierarchy

Every node has a uri - a hierarchical slug that encodes its place in the tree:

technical-library
technical-library/bulletins
technical-library/bulletins/tsb-21-114.pdf
technical-library/bulletins/tsb-21-114.pdf#condition
technical-library/manuals/man-fal-3.pdf#engine/ignition-coil-replacement

Because URIs sort hierarchically, any subtree is a string prefix - scoping a search to the bulletins is STARTS WITH 'technical-library/bulletins'. Section fragments carry the full heading path, so a URI is always copy-pasteable back into any tool.
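The prefix rule can be tried on the example URIs listed above. This is a minimal Python sketch of the same idea as STARTS WITH, not the workshop's query; the function name scope is an assumption made for illustration.

```python
# The example URIs from this lesson.
URIS = [
    "technical-library",
    "technical-library/bulletins",
    "technical-library/bulletins/tsb-21-114.pdf",
    "technical-library/bulletins/tsb-21-114.pdf#condition",
    "technical-library/manuals/man-fal-3.pdf#engine/ignition-coil-replacement",
]


def scope(uris, prefix):
    """Keep only the URIs in the subtree named by prefix, in sorted order."""
    return sorted(uri for uri in uris if uri.startswith(prefix))


bulletins = scope(URIS, "technical-library/bulletins")
```

One design point to keep in mind: a bare prefix would also match a sibling whose name begins with the same characters (a hypothetical bulletins-archive folder, say). Appending "/" to the prefix narrows the match to strict descendants.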

One more rule worth knowing: a section’s content holds only its own body text, followed by uri: pointer lines for its children. Wherever an agent sees a uri: line, deeper content exists - retrievable by exactly one more lookup, never duplicated upward.
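A consumer of this rule might separate body text from pointers as sketched below. The exact layout of a pointer line is not given in this lesson, so the sketch assumes each one starts with "uri:" followed by the child's URI. The sample body text and the name split_content are hypothetical.

```python
def split_content(content):
    """Separate a section's own body text from its uri: pointer lines."""
    body_lines = []
    pointers = []
    for line in content.splitlines():
        stripped = line.strip()
        if stripped.startswith("uri:"):
            # Assumed format: "uri: <child uri>"
            pointers.append(stripped[len("uri:"):].strip())
        else:
            body_lines.append(line)
    return "\n".join(body_lines).strip(), pointers


# Hypothetical section content: one line of body text, then one child pointer.
SAMPLE = (
    "Engine procedures for this model.\n"
    "uri: technical-library/manuals/man-fal-3.pdf#engine/ignition-coil-replacement"
)

body, pointers = split_content(SAMPLE)
```

Each entry in pointers is a URI that can be passed straight to the next lookup, which is how deeper content stays one step away without being copied into the parent.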

Read the spec first

Open docs/outline-format.md in your Codespace. That document is not documentation of something that exists - it is the spec you are about to build against. It pins down the shape an agent’s navigation context must take:

  • a ToC with display names on the left and full, verbatim URIs on the right

  • rows for outbound links that never expand their target

  • sibling order: folders and documents alphabetical, sections in reading order

Read it and ask the shape-first question: what graph reasoning materializes this? The answer - one variable-length [:HAS*] walk emitting flat rows the renderer assembles - is the heart of the next challenge. Notice what is impossible here in fixed-join thinking: the tree’s depth is not known in advance, so no fixed number of joins can walk it.
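The flat-rows idea can be previewed without a database. The Python sketch below is an assumption-laden illustration, not the format defined in docs/outline-format.md: the row layout (depth, name, uri), the two-space indent, and the names outline_rows and render are all hypothetical. It assumes each child list is already in the sibling order the spec asks for.

```python
# Hypothetical stand-in for the graph: parent uri -> ordered (name, uri) children.
CHILDREN = {
    "technical-library": [
        ("bulletins", "technical-library/bulletins"),
    ],
    "technical-library/bulletins": [
        ("tsb-21-114.pdf", "technical-library/bulletins/tsb-21-114.pdf"),
    ],
    "technical-library/bulletins/tsb-21-114.pdf": [
        ("Condition", "technical-library/bulletins/tsb-21-114.pdf#condition"),
    ],
}


def outline_rows(root, children, depth=0):
    """Yield flat (depth, name, uri) rows, each parent before its children."""
    for name, uri in children.get(root, []):
        yield depth, name, uri
        # Recurse to whatever depth the data has; no fixed level count is assumed.
        yield from outline_rows(uri, children, depth + 1)


def render(rows):
    """Assemble flat rows into an indented ToC: name on the left, URI on the right."""
    return "\n".join(f"{'  ' * depth}{name}  {uri}" for depth, name, uri in rows)


rows = list(outline_rows("technical-library", CHILDREN))
```

The walk itself knows nothing about how many levels exist, and the renderer only needs the depth carried on each row, which is the division of labor the paragraph above describes.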

Summary

In this lesson, you learned the model behind the shapes:

  • One HAS type - every containment edge means the same thing; trees walk as [:HAS*]

  • Hierarchical URIs - any subtree is a prefix; fragments carry the full heading path

  • Shallow content with uri: pointers - deeper content is one lookup away, never duplicated

  • Shape-first - the spec (docs/outline-format.md) comes before any Cypher

In the next challenge, you and your agent build the outline and search tools from their specs.
