Reading Themes Like an Agent

Introduction

Your themes tool prints evidence blocks, not answers. That is deliberate - and it is the part of the shape that makes agent decisions trustworthy.

In this lesson, you will read the blocks the way an agent does, name the themes yourself, and see what the view unlocks for Morgan’s operations questions.

The tool never names a theme

Run python skill/scripts/themes.py --gamma 2.0 and look at one block:

T2  3 docs (30%) · moderately interlinked
    top shared targets   [IC-2042-A] in 3 docs · [IC-2042-B] in 2 docs · [P0301] in 2 docs
    most-linked docs     Revised Ignition Coil for Repeated Misfire .. D   technical-library/bulletins/tsb-21-114.pdf
                         Falcon Service Manual (3rd Edition) ........ D   technical-library/manuals/man-fal-3.pdf

A generated label like "Theme: coils" would anchor you to whatever the tool guessed. Instead the block carries naming evidence: the shared parts and codes, and the member titles. You - or the agent - name it: ignition misfire. Bad labels anchor worse than no labels; evidence travels.
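To make "naming evidence" concrete, here is a minimal sketch that pulls the shared targets and member titles out of the block shown above. The regular expressions are inferred from that single example, not from the tool's spec, and the function name naming_evidence and the field name marker are hypothetical. The lesson does not say what the D column means, so the sketch only carries it along.

```python
import re

# Sample block copied from the lesson; real output may differ in spacing.
BLOCK = """\
T2  3 docs (30%) · moderately interlinked
    top shared targets   [IC-2042-A] in 3 docs · [IC-2042-B] in 2 docs · [P0301] in 2 docs
    most-linked docs     Revised Ignition Coil for Repeated Misfire .. D   technical-library/bulletins/tsb-21-114.pdf
                         Falcon Service Manual (3rd Edition) ........ D   technical-library/manuals/man-fal-3.pdf
"""

# Patterns are assumptions inferred from the one block above.
HEADER = re.compile(r"^(T\d+)\s+(\d+) docs \((\d+)%\) · (.+)$")
TARGET = re.compile(r"\[([^\]]+)\] in (\d+) docs")
DOC_ROW = re.compile(r"^(?:\s+most-linked docs)?\s+(.*?) \.{2,} (\S+)\s+(\S+)$")


def naming_evidence(block: str) -> dict:
    """Collect the signals a reader would use to name one theme."""
    lines = block.splitlines()
    head = HEADER.match(lines[0])
    if head is None:
        raise ValueError("unrecognised theme header")

    docs = []
    for line in lines[1:]:
        row = DOC_ROW.match(line)
        if row:
            title, marker, uri = row.groups()
            docs.append({"title": title, "marker": marker, "uri": uri})

    return {
        "theme": head.group(1),
        "docs_in_theme": int(head.group(2)),
        "share_pct": int(head.group(3)),
        "cohesion": head.group(4),
        # (target, number of docs that reference it), in printed order
        "targets": [(t, int(n)) for t, n in TARGET.findall(block)],
        "docs": docs,
    }


evidence = naming_evidence(BLOCK)
print(evidence["targets"])
print([d["title"] for d in evidence["docs"]])
```

The sketch stops at evidence on purpose: it returns targets and titles and leaves the name ("ignition misfire") to whoever reads them.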

Every line defends itself

Notice the format’s honesty rules, straight from the spec:

The header reconciles: 10 docs · 9 grouped into 4 themes · 1 ungrouped - an agent can never claim "the library is about these 4 things" when one document sits outside them
Counts carry units inline (in 3 docs); cohesion is a word backed by conductance, never a bare float to misread
Every document row ends in a full URI - the drill handle. Follow one with python skill/scripts/outline.py <uri> and you are inside that document’s tree
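Two of the rules above can be checked mechanically. The sketch below is illustrative only: reconcile parses the header line quoted in the first rule and confirms that grouped plus ungrouped equals the total, and conductance uses the standard graph definition (edges leaving the group, divided by the smaller of the two sides' total degree). The function names, the toy graph, the cut-offs 0.2 and 0.5, and the words other than "moderately interlinked" are all assumptions; the lesson does not state how the tool builds its graph or picks its thresholds.

```python
import re

HEADER = re.compile(r"(\d+) docs · (\d+) grouped into (\d+) themes · (\d+) ungrouped")


def reconcile(header: str) -> dict:
    """Parse the summary header and check that no document is unaccounted for."""
    m = HEADER.search(header)
    if m is None:
        raise ValueError("unrecognised summary header")
    total, grouped, themes, ungrouped = map(int, m.groups())
    return {
        "total": total,
        "grouped": grouped,
        "themes": themes,
        "ungrouped": ungrouped,
        "reconciles": grouped + ungrouped == total,
    }


def conductance(edges, members) -> float:
    """Standard conductance of a node set in an undirected graph."""
    members = set(members)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in != v_in:
            cut += 1  # edge crosses the group boundary
        # Each edge adds one to the degree of each endpoint.
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    denom = min(vol_in, vol_out)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has no edges")
    return cut / denom


def cohesion_word(phi: float) -> str:
    """Map conductance to a word. Thresholds are hypothetical."""
    if phi < 0.2:
        return "tightly interlinked"
    if phi < 0.5:
        return "moderately interlinked"
    return "loosely interlinked"


print(reconcile("10 docs · 9 grouped into 4 themes · 1 ungrouped"))

# Toy graph: a triangle a-b-c with a tail c-d-e hanging off it.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
print(cohesion_word(conductance(toy_edges, {"a", "b", "c"})))
```

Lower conductance means fewer links leave the group relative to its size, which is why a word such as "moderately interlinked" can stand in for the number without hiding what backs it.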

This is what "a view the agent can reason over" means: small enough for one prompt, and every claim traceable one tool-call deeper.
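"One tool-call deeper" can be sketched as follows: take the last token of each document row and turn it into the outline.py command the lesson mentions. This is a rough illustration, not the agent's actual logic. The helper name drill_commands is hypothetical, and treating any trailing token that contains a slash as the URI is an assumption based on the two rows shown in this lesson. The sketch only builds the command strings; it does not run them.

```python
import shlex

# Rows copied from the lesson's example block, plus one non-document line.
ROWS = """\
    top shared targets   [IC-2042-A] in 3 docs · [IC-2042-B] in 2 docs · [P0301] in 2 docs
    most-linked docs     Revised Ignition Coil for Repeated Misfire .. D   technical-library/bulletins/tsb-21-114.pdf
                         Falcon Service Manual (3rd Edition) ........ D   technical-library/manuals/man-fal-3.pdf
"""


def drill_commands(block: str) -> list[str]:
    """Build one outline.py command per document row in a theme block."""
    commands = []
    for line in block.splitlines():
        tokens = line.split()
        # Assumption: a document row ends in a URI, and only URIs contain "/".
        if tokens and "/" in tokens[-1]:
            argv = ["python", "skill/scripts/outline.py", tokens[-1]]
            commands.append(shlex.join(argv))
    return commands


for command in drill_commands(ROWS):
    print(command)
```

Because the handle is printed on the row itself, the agent never has to guess or search for the document behind a claim.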

Morgan’s questions

With themes as data, operations questions become one look:

Where are issues concentrated? The ignition theme spans a manual, a bulletin, and a recall - the shop’s highest-stakes topic by document gravity
Where is documentation thin? A theme held together by strong shared targets but few member documents is a training-material gap
Where does a new bulletin belong? Its part and code references connect it to an existing theme the moment it is loaded - no manual tagging
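Once themes are held as data, the second and third questions above reduce to small functions. The sketch below is one possible reading, not the tool's implementation. The dictionary layout, the function names place and thin_themes, the theme T9 with its targets, and the rule that a "thin" theme has at most two documents that all share at least one target are assumptions made for illustration. Only T2's counts come from the block shown earlier in this lesson.

```python
# Themes as data: member count plus {shared target: docs referencing it}.
themes = {
    "T2": {"docs": 3, "targets": {"IC-2042-A": 3, "IC-2042-B": 2, "P0301": 2}},
    # Hypothetical theme, invented for this example.
    "T9": {"docs": 2, "targets": {"BP-7710": 2, "C1234": 2}},
}


def place(references, themes):
    """Return the theme whose shared targets best overlap the new document's
    part and code references, or None when nothing overlaps."""
    best_theme, best_score = None, 0
    for theme_id, theme in themes.items():
        # Weight each match by how many member docs already cite the target.
        score = sum(theme["targets"].get(ref, 0) for ref in references)
        if score > best_score:
            best_theme, best_score = theme_id, score
    return best_theme


def thin_themes(themes, max_docs=2):
    """Flag themes with few members where some target is cited by every member."""
    flagged = []
    for theme_id, theme in themes.items():
        counts = theme["targets"].values()
        strongly_shared = any(n == theme["docs"] for n in counts)
        if theme["docs"] <= max_docs and strongly_shared:
            flagged.append(theme_id)
    return flagged


# A new bulletin that mentions an ignition coil part and a misfire code.
print(place({"IC-2042-A", "P0301"}, themes))
print(thin_themes(themes))
```

A document with no overlapping references returns None here, which mirrors the ungrouped line in the header: it stays visible instead of being forced into a theme.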

And in the next module, the warehouse joins the reasoning: "which theme generates the most comebacks?" becomes one more hop.

Summary

In this lesson, you read the themes shape like an agent:

Evidence, not labels - shared targets + member titles are the naming signal; the agent does the naming
Honest accounting - headers reconcile, ungrouped is always visible, URIs make every claim drillable
Operations view - concentration, gaps, and automatic placement of new documents

In the next module, your agent crosses into the warehouse - judging live, with Text2SQL that it grounds in the connections graph.
