Workshop Overview

Introduction

Your agent can reach your data but still cannot use it reliably. In this workshop, you will fix that by building three reusable graph shapes from lakehouse data - and a service-advisor skill that lets your coding agent decide and act on them.

The scenario - AutoFix Group

AutoFix Group is a national auto-repair chain running on a cloud lakehouse:

Repair manuals, service bulletins, and recall notices live as PDFs in cloud storage
Every repair, part, and vehicle lives in BigQuery tables (vehicles, work_orders, parts, procedures)

Dani Reyes, a master technician, has a 2020 Falcon in the bay throwing misfire code P0301. There was a bulletin about a revised ignition coil - but which model years? The manual portal returns 40 keyword hits that all look right; none say whether the fix applies to this car. The shop system knows every misfire ever booked and the part that fixed it - but that is a different screen, and you cannot ask it "show me cars like this one".

The knowledge exists. It is scattered across documents that do not talk to the data.

The three-layer wall

Morgan Tao, VP of Service Ops, wants a copilot technicians can ask in plain language. Sam Okafor, the AI engineer, builds the obvious solution - and hits three walls:

Vector search returns the right meaning in the wrong shape. It finds passages similar to the question, but the answer needs a connected set of bulletins, parts, work orders, and vehicles - not a pile of paragraphs.
Text2SQL is quietly wrong. It works on one table, but on multi-table, multi-hop joins ("prior repairs on similar vehicles where this part fixed this code") it fails silently: the query runs and the result looks plausible, but it is subtly wrong.
Neither tool crosses the boundary. The bulletin is in a PDF; the vehicle is in a BigQuery table. The part number and trouble code that connect them appear in both - but nothing treats that as a link.

That is not a model problem or a query problem. It is a context problem.
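To make the missing link concrete, here is a minimal sketch, not the workshop's implementation. It pulls OBD-II trouble codes out of document text with a regular expression and pairs each passage with the warehouse-style rows that carry the same code. The sample passages, the row fields, and the function names are hypothetical, and the pattern is simplified to a letter followed by four digits.

```python
import re

# OBD-II trouble codes: one system letter (P, B, C, U) plus four characters.
# Simplified to four digits for illustration; some real codes use hex digits.
DTC_PATTERN = re.compile(r"\b[PBCU]\d{4}\b")


def extract_codes(text: str) -> set[str]:
    """Return the set of trouble codes mentioned in a passage."""
    return set(DTC_PATTERN.findall(text))


def link_passages_to_rows(passages, rows):
    """Pair each passage with the rows that share a trouble code with it."""
    links = []
    for passage in passages:
        codes = extract_codes(passage["text"])
        for row in rows:
            if row["trouble_code"] in codes:
                links.append(
                    (passage["id"], row["work_order_id"], row["trouble_code"])
                )
    return links


# Hypothetical sample data standing in for PDF passages and warehouse rows.
passages = [
    {"id": "bulletin-1", "text": "Revised ignition coil addresses misfire code P0301."},
    {"id": "manual-7", "text": "Torque the wheel nuts in a star pattern."},
]
rows = [
    {"work_order_id": "WO-1", "trouble_code": "P0301"},
    {"work_order_id": "WO-2", "trouble_code": "P0420"},
]

links = link_passages_to_rows(passages, rows)
print(links)
```

Only the bulletin that mentions P0301 is linked to the work order booked under P0301. The key was already present on both sides; the sketch simply treats it as an edge.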

Context comes in shapes

The fix is to give the agent the shapes its answers need. You will build three reusable shapes:

Connections (Paths) - how the warehouse tables join, in Module 2, where neocarta reads the foreign keys from BigQuery
Table of Contents (Trees) - navigate the documents, in Module 3
Themes (Communities) - surface patterns nobody named, in Module 4

And you learn a second axis: where each shape should live. The documents and the warehouse’s connections become graphs in Neo4j; the warehouse rows stay in BigQuery; and the agent crosses between them on the keys they already share - part numbers and trouble codes.
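As a rough illustration of the Connections shape, the sketch below treats tables as nodes and foreign keys as edges, then uses breadth-first search to find the shortest chain of join conditions between two tables. It is a sketch under stated assumptions: the table names come from the scenario, but the columns and foreign keys are hypothetical, and this is not how neocarta itself is implemented.

```python
from collections import deque

# Hypothetical foreign keys between the four tables named in the scenario.
# Each entry: (child_table, child_column, parent_table, parent_column).
FOREIGN_KEYS = [
    ("work_orders", "vehicle_id", "vehicles", "vehicle_id"),
    ("work_orders", "part_number", "parts", "part_number"),
    ("work_orders", "procedure_id", "procedures", "procedure_id"),
]


def build_join_graph(foreign_keys):
    """Build an undirected graph: table -> [(neighbor_table, join_condition)]."""
    graph = {}
    for child, child_col, parent, parent_col in foreign_keys:
        condition = f"{child}.{child_col} = {parent}.{parent_col}"
        graph.setdefault(child, []).append((parent, condition))
        graph.setdefault(parent, []).append((child, condition))
    return graph


def join_path(graph, start, goal):
    """Breadth-first search for the shortest list of join conditions."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbor, condition in graph.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [condition]))
    return None  # the tables are not connected


graph = build_join_graph(FOREIGN_KEYS)
path = join_path(graph, "vehicles", "parts")
print(path)
```

An agent that looks up the join path before writing SQL does not have to guess how vehicles relate to parts, which is where multi-hop Text2SQL tends to go wrong.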

The goal

By the end of this workshop, a work order opens for Dani’s Falcon and your agent handles it end to end: ground the symptom in the documents, read the real repair history from the warehouse, pick the evidence-backed fix, catch an open recall, order the part, and leave an auditable trail.

The hard question at its heart - "what fixed this code on cars like this one?" - is the one Text2SQL gets quietly wrong. Your agent answers it by federating, never migrating:

Neo4j - which documents cover P0301, and which parts they name
BigQuery - for those parts, the real outcomes on similar vehicles: how often used, how often the car came back
Join them - the top answer is the revised coil: five repairs, zero comebacks

The documents are a graph, the warehouse stays in BigQuery, and the agent crosses the boundary on the shared part number. You build it as a service-advisor skill, shape by shape.
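The three steps above can be sketched with in-memory stand-ins for the two stores. This is an illustration of the federation pattern only. The dictionary plays the role of the document graph, the tuple list plays the role of the warehouse table, and all part numbers, document ids, and rows are hypothetical sample data chosen to mirror the scenario. The "similar vehicles" filter is left out to keep the sketch short.

```python
from collections import defaultdict

# Stand-in for the document graph: trouble code -> parts named by the
# documents that cover it. Part numbers and document ids are hypothetical.
DOC_GRAPH = {
    "P0301": {"COIL-REV": ["bulletin-1"], "PLUG-STD": ["manual-7"]},
}

# Stand-in for warehouse rows: (part_number, trouble_code, came_back).
WORK_ORDERS = (
    [("COIL-REV", "P0301", False)] * 5
    + [("PLUG-STD", "P0301", True), ("PLUG-STD", "P0301", False)]
    + [("COIL-REV", "P0420", True)]  # different code, ignored for P0301
)


def parts_for_code(code):
    """Step 1 (graph side): which parts do the documents name for this code?"""
    return DOC_GRAPH.get(code, {})


def outcomes(code, part_numbers):
    """Step 2 (warehouse side): repairs and comebacks per part for this code."""
    stats = defaultdict(lambda: {"repairs": 0, "comebacks": 0})
    for part, row_code, came_back in WORK_ORDERS:
        if row_code == code and part in part_numbers:
            stats[part]["repairs"] += 1
            stats[part]["comebacks"] += int(came_back)
    return dict(stats)


def rank_fixes(code):
    """Step 3: join on part number, lowest comeback rate first."""
    named = parts_for_code(code)
    stats = outcomes(code, set(named))
    ranked = sorted(
        stats.items(),
        key=lambda kv: (kv[1]["comebacks"] / kv[1]["repairs"], -kv[1]["repairs"]),
    )
    return [{"part": part, "documents": named[part], **s} for part, s in ranked]


ranking = rank_fixes("P0301")
print(ranking[0])
```

Neither store is copied into the other. The only thing that crosses the boundary is the set of part numbers, which keeps each side authoritative for its own data.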

Prerequisites and duration

You should be able to read and run basic Cypher queries. No Graph Data Science experience is required - you will learn what you need in Module 4.

Bring the coding agent you already use - Claude Code, Cursor, Codex, or similar. In each hands-on module, you and your agent build the skill’s tools together, and every query is also runnable by hand in the sandbox if you prefer.

The core path takes 90 minutes, with around 20 more minutes of optional practice - ideal for revisiting after a live session.

The patterns are BigQuery-first but portable - Module 6 shows how to swap the connector for Snowflake, Databricks, or anywhere your data lives.

Summary

In this workshop, you will:

Build navigation tools over a document tree parsed from real PDFs
Build theme tools with community detection - patterns nobody named
Watch the agent reason across the document-to-table boundary, using live Text2SQL that it grounds in the connections graph
Watch your agent decide and act on a live work order, leaving an auditable trail

In the next lesson, you will open your workshop environment and load the AutoFix lakehouse.

AI on Your Lakehouse: Context Comes in Shapes, Not Queries

The Context Problem

Connections - the structured shape

Navigate What’s There - Trees

Surface Themes - Communities

Put It Together - the federated finale

Port the Pattern