Introduction
The warehouse holds the answer to "what did we do last time on cars like this?" - across vehicles, work orders, and parts. The obvious tool, Text2SQL, can write the query, but on a multi-table join it is often quietly wrong: it guesses which columns join, and a plausible-looking join on the wrong key returns plausible-looking nonsense.
In this lesson, you will learn the shape that fixes it - the connections graph - and the rule for where it should live.
The join paths are the missing context
A warehouse is a set of tables wired together by foreign keys: a work order’s vin points at a vehicle, its dtc_code at a trouble code, a parts line at both a work order and a part. Those foreign keys are the legal joins.
Text2SQL does not see them as a structure - it infers joins from column names and hopes. The connections shape makes them explicit and traversable: an agent that can read "`work_orders` joins to `vehicles` on `vin`" writes the right join instead of guessing it.
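To make "explicit and traversable" concrete, here is a minimal Python sketch that stores the lesson's foreign keys as plain data and searches them for a join path. It is an illustration, not how neocarta or any Text2SQL tool works internally. The names `FOREIGN_KEYS`, `build_adjacency`, and `find_join_path` are hypothetical, and the sketch assumes each referenced column has the same name as the foreign-key column that points at it.

```python
from collections import deque

# Foreign keys from the lesson's diagram: (child_table, fk_column, parent_table).
# Assumption: the referenced column on the parent has the same name.
FOREIGN_KEYS = [
    ("work_order_parts", "wo_id", "work_orders"),
    ("work_order_parts", "part_number", "parts"),
    ("work_orders", "vin", "vehicles"),
    ("work_orders", "dtc_code", "dtc_codes"),
    ("work_orders", "procedure_id", "procedures"),
]


def build_adjacency(foreign_keys):
    """Index every foreign key in both directions so joins can be walked either way."""
    adjacency = {}
    for child, column, parent in foreign_keys:
        adjacency.setdefault(child, []).append((parent, column))
        adjacency.setdefault(parent, []).append((child, column))
    return adjacency


def find_join_path(start, goal, foreign_keys=FOREIGN_KEYS):
    """Breadth-first search for the shortest chain of foreign keys from start to goal.

    Returns a list of (from_table, column, to_table) hops, or None if the
    two tables are not connected by declared keys.
    """
    adjacency = build_adjacency(foreign_keys)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbour, column in adjacency.get(table, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(table, column, neighbour)]))
    return None


if __name__ == "__main__":
    for hop in find_join_path("parts", "vehicles"):
        print(hop)
```

For `parts` to `vehicles`, the search walks through `work_order_parts` and `work_orders`, because those are the only declared keys that connect them. Nothing is inferred from column names.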
Not "the tool can’t" - "the tool guesses"
A modern Text2SQL system, including Databricks Genie, can write this join. The problem is reliability: on an independent enterprise benchmark, models hit a 78.57% error rate on queries touching four or more tables (Falcon, 2025) - the exact shape of this workshop’s finale join. The connections graph does not make the join possible; it makes it deterministic and auditable - the same correct path every time, one you can inspect.
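One way to picture "auditable" is a check that every join in a generated query is backed by a declared foreign key. The Python sketch below is an assumption-laden illustration: `DECLARED_KEYS`, `is_declared`, and `audit_joins` are hypothetical names, joins are passed in as already-parsed tuples rather than extracted from SQL text, and each parent column is assumed to share its child column's name.

```python
# Declared foreign keys from the lesson's diagram, stored as
# (child_table, child_column, parent_table, parent_column).
# Assumption: each parent column shares the child column's name.
DECLARED_KEYS = {
    ("work_order_parts", "wo_id", "work_orders", "wo_id"),
    ("work_order_parts", "part_number", "parts", "part_number"),
    ("work_orders", "vin", "vehicles", "vin"),
    ("work_orders", "dtc_code", "dtc_codes", "dtc_code"),
    ("work_orders", "procedure_id", "procedures", "procedure_id"),
}


def is_declared(join):
    """A join is legal if it matches a declared foreign key in either direction."""
    left_table, left_col, right_table, right_col = tuple(join)
    forward = (left_table, left_col, right_table, right_col)
    reverse = (right_table, right_col, left_table, left_col)
    return forward in DECLARED_KEYS or reverse in DECLARED_KEYS


def audit_joins(joins):
    """Return the joins that no declared foreign key backs up."""
    return [tuple(join) for join in joins if not is_declared(join)]


if __name__ == "__main__":
    proposed = [
        ("work_orders", "vin", "vehicles", "vin"),                # declared
        ("work_orders", "procedure_id", "parts", "part_number"),  # guessed
    ]
    print(audit_joins(proposed))
```

The guessed join is the only one returned, which is the inspection step a name-based guess never gets.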
graph LR
WOP((work_order_parts)) -->|wo_id| WO((work_orders))
WOP -->|part_number| P((parts))
WO -->|vin| V((vehicles))
WO -->|dtc_code| D((dtc_codes))
WO -->|procedure_id| PR((procedures))
neocarta reads it from the warehouse
You do not hand-build this graph. neocarta - a Neo4j Labs library - reads a warehouse’s information schema and writes the metadata graph for you:
(:Database)-[:HAS_SCHEMA]->(:Schema)-[:HAS_TABLE]->(:Table)-[:HAS_COLUMN]->(:Column)
(:Column)-[:REFERENCES]->(:Column) // one per foreign key - the join paths
The AutoFix warehouse lives in BigQuery with its primary and foreign keys declared. neocarta turns each foreign key into a REFERENCES edge - the connections shape, ready to query.
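The mapping from schema rows to that pattern can be sketched in a few lines of Python. This is not neocarta's code or API. It is a hypothetical in-memory illustration: `build_metadata_graph` is an invented name, the input tuples stand in for whatever an information schema returns, and the database and schema names in the usage lines are placeholders.

```python
def build_metadata_graph(database, schema, columns, foreign_keys):
    """Turn schema rows into (source_node, relationship, target_node) triples.

    columns:      iterable of (table, column)
    foreign_keys: iterable of (table, column, ref_table, ref_column)
    Nodes are (label, name) pairs; column nodes are named "table.column".
    """
    edges = {(("Database", database), "HAS_SCHEMA", ("Schema", schema))}
    for table, column in columns:
        edges.add((("Schema", schema), "HAS_TABLE", ("Table", table)))
        edges.add((("Table", table), "HAS_COLUMN", ("Column", f"{table}.{column}")))
    # One REFERENCES edge per foreign key: these are the join paths.
    for table, column, ref_table, ref_column in foreign_keys:
        edges.add((
            ("Column", f"{table}.{column}"),
            "REFERENCES",
            ("Column", f"{ref_table}.{ref_column}"),
        ))
    return edges


if __name__ == "__main__":
    # Placeholder names; only the table and key names come from the lesson.
    graph = build_metadata_graph(
        database="example_project",
        schema="example_dataset",
        columns=[("work_orders", "wo_id"), ("work_orders", "vin"), ("vehicles", "vin")],
        foreign_keys=[("work_orders", "vin", "vehicles", "vin")],
    )
    for edge in sorted(graph):
        print(edge)
```

Note what the input contains: table names, column names, and keys. No row from `work_orders` or `vehicles` is read at any point.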
Migrate the map, not the territory
Here is the rule this module teaches, the second axis of the whole workshop: context comes in shapes - and each shape has a place it should live.
neocarta copies the warehouse’s metadata - table and column names, the foreign-key graph. It does not copy the rows. That is deliberate. Run any migration through the four-pains test:
| Pain | Warehouse rows vs. metadata |
|---|---|
| Sync | rows change constantly (resync forever); schema barely changes |
| Performance | millions of rows vs. a few dozen columns |
| Modeling | rows are already modeled in the warehouse; metadata models the connections |
| Security | rows are the sensitive layer; names and keys are not |
Rows fail all four - so they stay in BigQuery, and the finale queries them live with SQL. Metadata passes all four - so it migrates. That is what neocarta is for: you get the connections shape without migrating the rows.
The security knob
neocarta also pulls sample column values by default - handy for routing ("`model` is `Falcon`, `Heron`, or `Osprey`"), but it is the one place metadata brushes the security pain. Naming that trade-off is part of using it honestly.
How the agent reads it
You do not hand-write a tool to query this graph. neocarta ships an MCP server that exposes the connections graph to your agent directly - get_full_metadata_schema returns every table with its columns and foreign-key references, and full-text tools find the right tables by keyword. No embeddings, no setup.
So the agent’s job at runtime is exactly the shape-first move: read the connections from the MCP, then write the SQL along those foreign keys instead of guessing them. In the next challenge you build the graph and watch your agent do this on a real question.
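As a rough picture of "write the SQL along those foreign keys", the Python sketch below renders `FROM`/`JOIN` lines from a schema payload. The payload shape in `SCHEMA` is a hypothetical stand-in: the real output of the MCP tool may look quite different. `join_clause` and `render_from` are invented helper names, and the referenced column names are assumed to match the foreign-key columns.

```python
# Hypothetical payload shape; the real MCP tool output may differ.
# Each table lists its foreign keys as column -> "parent_table.parent_column".
SCHEMA = {
    "work_order_parts": {"references": {"wo_id": "work_orders.wo_id",
                                        "part_number": "parts.part_number"}},
    "work_orders": {"references": {"vin": "vehicles.vin",
                                   "dtc_code": "dtc_codes.dtc_code",
                                   "procedure_id": "procedures.procedure_id"}},
    "vehicles": {"references": {}},
    "parts": {"references": {}},
    "dtc_codes": {"references": {}},
    "procedures": {"references": {}},
}


def join_clause(left, right, schema):
    """Return the ON condition for two tables linked by a declared foreign key."""
    for child, parent in ((left, right), (right, left)):
        for column, target in schema[child]["references"].items():
            target_table, target_column = target.split(".")
            if target_table == parent:
                return f"{child}.{column} = {parent}.{target_column}"
    raise ValueError(f"no foreign key between {left} and {right}")


def render_from(tables, schema):
    """Join tables in order, attaching each one to an earlier table by a declared key."""
    lines = [f"FROM {tables[0]}"]
    for index, table in enumerate(tables[1:], start=1):
        for earlier in tables[:index]:
            try:
                condition = join_clause(earlier, table, schema)
            except ValueError:
                continue  # no key to this earlier table; try the next one
            lines.append(f"JOIN {table} ON {condition}")
            break
        else:
            raise ValueError(f"{table} has no declared key to any earlier table")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_from(["work_order_parts", "work_orders", "vehicles", "parts"], SCHEMA))
```

A table that has no declared key to anything already in the query raises an error instead of being joined on a look-alike column, which is the difference between routing and guessing.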
Summary
In this lesson, you learned the connections shape:
- Join paths - foreign keys made explicit, so the agent routes instead of guessing
- neocarta - reads the warehouse information schema into a Neo4j metadata graph
- Migrate vs. federate - the four-pains test; metadata migrates, rows stay in BigQuery
- The connections MCP - neocarta exposes the graph to your agent as schema tools, no embeddings
In the next challenge, you run neocarta and watch your agent query the connections graph through the MCP.