Optional Practice
You built the themes tool and read its evidence blocks.
In this lesson, you will work with the written-back themeId directly - three questions that combine themes with the rest of the graph.
Advanced learners: Skip to Module 5.
Beginners: These exercises deepen your feel for what doc-level themes mean before the warehouse joins in.
Exercise 1: Theme membership, by hand
List every theme with its member documents - the raw data under the tool’s blocks.
MATCH (d:Document) WHERE d.themeId IS NOT NULL
RETURN d.themeId AS theme, collect(d.id) AS documents
ORDER BY size(documents) DESC
The ungrouped remainder is just as informative:
MATCH (d:Document) WHERE d.themeId IS NULL
RETURN d.id AS document, d.displayName AS title
Try experimenting:
- Re-run python skill/scripts/themes.py --gamma 2.0 and watch both results change - then remember the contract: store URIs, never theme numbers
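The "store URIs, never theme numbers" contract can be sketched in plain Python. This is an illustration only: the URIs and theme numbers below are invented, and the helper names members_by_theme and current_theme are hypothetical, not part of the lesson's scripts. It assumes that a re-run at a different gamma may assign a document a different themeId, which is what the experiment above lets you observe.

```python
# Hypothetical uri -> themeId assignments from two runs of the themes script.
# All values are invented for illustration; real runs will differ.
run_gamma_1 = {
    "docs/battery-service": 0,
    "docs/charging-faq": 0,
    "docs/brake-recall": 1,
}
run_gamma_2 = {
    "docs/battery-service": 2,
    "docs/charging-faq": 0,
    "docs/brake-recall": 1,
}


def members_by_theme(assignments):
    """Group document URIs by theme, largest theme first (like the first query)."""
    themes = {}
    for uri, theme_id in assignments.items():
        themes.setdefault(theme_id, []).append(uri)
    return sorted(themes.items(), key=lambda item: len(item[1]), reverse=True)


def current_theme(stored_uri, assignments):
    """Resolve a stored URI to its theme in the given run (None if ungrouped)."""
    return assignments.get(stored_uri)


# A bookmark saved as a URI stays valid across runs; a saved theme number would not.
bookmark = "docs/battery-service"
print(current_theme(bookmark, run_gamma_1))
print(current_theme(bookmark, run_gamma_2))
```

The design point is that the URI is a stable key while the theme number is only meaningful within one run, so anything persisted outside the graph should hold the URI and look the theme up when needed.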
Exercise 2: Which themes carry a safety recall?
Recalls are the highest-stakes documents. Which theme does each recall sit in, and how much supporting material surrounds it?
MATCH (r:RecallNotice) WHERE r.themeId IS NOT NULL
MATCH (member:Document {themeId: r.themeId})
RETURN r.themeId AS theme, r.title AS recall,
       count(member) AS documentsInTheme,
       collect(member.id) AS members
For Morgan’s team, a recall in a well-documented theme is routine; a recall in a thin theme is a documentation gap with safety stakes.
Try experimenting:
- Check whether the two recalls share a theme at gamma 1.0 versus gamma 2.0
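The "thin theme" idea can be made concrete with a small post-processing sketch. Everything here is an assumption for illustration: the rows are invented, the helper recall_context is hypothetical, and the cutoff THIN_THEME_MAX is an arbitrary placeholder, since the lesson does not define how few documents count as thin.

```python
from collections import Counter

# Hypothetical rows: (document id, themeId, is_recall). Invented sample data.
documents = [
    ("recall-a", 0, True),
    ("manual-1", 0, False),
    ("manual-2", 0, False),
    ("bulletin-1", 0, False),
    ("recall-b", 1, True),
    ("manual-3", 1, False),
]

THIN_THEME_MAX = 2  # assumed cutoff; tune it for your own corpus


def recall_context(rows, thin_max=THIN_THEME_MAX):
    """For each recall, report its theme, the theme's size, and a thin-theme flag."""
    theme_sizes = Counter(theme for _, theme, _ in rows)
    report = []
    for doc_id, theme, is_recall in rows:
        if is_recall:
            size = theme_sizes[theme]  # counts the recall itself, like count(member)
            report.append((doc_id, theme, size, size <= thin_max))
    return report


for row in recall_context(documents):
    print(row)
```

Note that the size includes the recall itself, which matches the query above if recall notices also carry the Document label.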
Exercise 3: Route a trouble code to its theme
A technician scans P0562.
Which theme should the agent read - and what is in it?
MATCH (s:Section)-[:REFERENCES_CODE]->(:DTC {code: 'P0562'})
MATCH (d:Document {uri: split(s.uri, '#')[0]})
WHERE d.themeId IS NOT NULL
MATCH (member:Document {themeId: d.themeId})
RETURN DISTINCT d.themeId AS theme,
       collect(DISTINCT member.displayName) AS readingList
Two hops from a scanned code to a theme’s full reading list - and note the middle hop: ownership by URI prefix, the same collapse your projection used. At gamma 1.0 this code routes to two themes - the code itself sits on a theme boundary, and that is information: a symptom that two clusters of documentation care about.
Try experimenting:
- Try C0035 - does it route to a different theme?
- Append the outline drill: pick a member document and run python skill/scripts/outline.py <its uri>
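The routing chain (scanned code, owning document by URI prefix, theme, reading list) can be traced in plain Python. This is a sketch on invented data: the URIs, theme numbers, and the helpers owning_document and route_code are hypothetical, and the output says nothing about how P0562 or C0035 route in the real graph. The one step taken directly from the query is the ownership rule, where everything before the '#' in a section URI is treated as the document URI.

```python
# Hypothetical section URI -> trouble codes it references. Invented sample data.
section_codes = {
    "docs/charging-system#low-voltage": ["P0562"],
    "docs/battery-service#testing": ["P0562"],
    "docs/abs-manual#wheel-speed": ["C0035"],
}

# Hypothetical document URI -> themeId (None would mean ungrouped).
document_theme = {
    "docs/charging-system": 0,
    "docs/battery-service": 1,
    "docs/abs-manual": 2,
    "docs/wiring-guide": 0,
}


def owning_document(section_uri):
    """Ownership by URI prefix: same rule as split(s.uri, '#')[0] in the query."""
    return section_uri.split("#")[0]


def route_code(code):
    """Map a code to {themeId: sorted reading list of member document URIs}."""
    themes = set()
    for section_uri, codes in section_codes.items():
        if code in codes:
            theme = document_theme.get(owning_document(section_uri))
            if theme is not None:  # skip ungrouped documents
                themes.add(theme)
    return {
        theme: sorted(uri for uri, t in document_theme.items() if t == theme)
        for theme in sorted(themes)
    }


print(route_code("P0562"))
print(route_code("C0035"))
```

In this invented data, the first code is referenced by sections of documents in two different themes, which is the boundary situation the exercise describes.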
Summary
In this optional practice, you put themeId to work:
- Membership and ungrouped - the raw data behind the evidence blocks
- Risk concentration - where the recalls sit, and how thick their themes are
- Code routing - scanned code → owning document → theme → reading list
In Module 5, the warehouse joins - and themes gain an outcomes dimension.