
AI on Your Lakehouse: Context Comes in Shapes, Not Queries

Course Duration
2 hours
Categories
Developing Practitioner Skills in Neo4j

Course Description

Your agent can reach your data but still can’t use it reliably. Vector search and Text2SQL each hand it a slice, but not the wider view it needs to judge what is relevant and how the pieces connect. Without that view, answers come back confident but wrong. That is not a model or query problem - it is a context problem, and thinking in terms of shapes is what cracks it.

In this hands-on workshop, you give an agent three reusable shapes - and learn a second axis: where each shape should live.

  • Connections (Paths) - how the warehouse tables join, read from BigQuery by neocarta

  • Table of Contents (Trees) - navigate the documents

  • Themes (Communities) - surface patterns nobody named
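
As a rough illustration of the Connections shape, the sketch below treats warehouse tables as nodes and joins as edges, then searches for the shortest chain of joins between two tables. It is not neocarta and does not read from BigQuery: the table names come from the AutoFix scenario, but the join keys and the `join_path` helper are assumptions made up for this example.

```python
from collections import deque

# Hypothetical join edges between warehouse tables; the key names are
# illustrative assumptions, not the workshop's actual schema.
JOINS = {
    ("vehicles", "work_orders"): "vehicle_id",
    ("work_orders", "procedures"): "procedure_id",
    ("procedures", "parts"): "part_number",
}


def build_adjacency(joins):
    """Turn the edge list into an undirected adjacency map."""
    adjacency = {}
    for (left, right), key in joins.items():
        adjacency.setdefault(left, []).append((right, key))
        adjacency.setdefault(right, []).append((left, key))
    return adjacency


def join_path(joins, start, goal):
    """Breadth-first search for the shortest chain of joins from start to goal.

    Returns a list of (from_table, to_table, join_key) steps, or None if the
    two tables are not connected.
    """
    adjacency = build_adjacency(joins)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbour, key in adjacency.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(table, neighbour, key)]))
    return None


path = join_path(JOINS, "vehicles", "parts")
for left, right, key in path:
    print(f"{left} -> {right} on {key}")
```

An agent that can look up a path like this before writing SQL does not have to guess which tables join or on which column.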

You will work with data from AutoFix Group, a fictional national auto-repair chain: service manuals, bulletins, and recall notices as PDFs in cloud storage, and vehicles, work orders, parts, and procedures in a BigQuery warehouse. The two halves share part numbers and diagnostic trouble codes - and that overlap is what you build on.

You build the shapes the way you would on the job: in a hosted Codespace, you and your coding agent author a service-advisor skill - tools and policy, module by module - and finish by handing it a live work order. The agent grounds the symptom in the documents, reads the real repair history from BigQuery, picks the evidence-backed fix, catches an open recall, orders the part through an API, and leaves an auditable trail in the graph.

At the heart of it is the question Text2SQL gets quietly wrong - "what fixed this code on cars like this one?" - answered by federating: the documents are a graph in Neo4j, the warehouse rows stay in BigQuery, and the agent crosses the boundary on the shared key. Nothing is migrated that does not need to be.
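
The federated lookup described above can be sketched in a few lines. This is a minimal stand-in, assuming in-memory dictionaries in place of the Neo4j document graph and the BigQuery warehouse; the sample rows, the field names, and the `federated_lookup` helper are hypothetical and exist only to show the two sides meeting on a shared diagnostic trouble code.

```python
from collections import Counter

# Stand-in for the document graph (Neo4j in the workshop): trouble code ->
# document sections that mention it. Titles are invented for illustration.
DOCUMENT_GRAPH = {
    "P0420": [
        "Service manual: catalyst efficiency diagnosis",
        "Bulletin: downstream oxygen sensor checks",
    ],
}

# Stand-in for warehouse rows (BigQuery in the workshop). Rows are invented.
WAREHOUSE_ROWS = [
    {"work_order": 1, "dtc": "P0420", "procedure": "Replace catalytic converter", "resolved": True},
    {"work_order": 2, "dtc": "P0420", "procedure": "Replace catalytic converter", "resolved": True},
    {"work_order": 3, "dtc": "P0420", "procedure": "Replace downstream oxygen sensor", "resolved": True},
    {"work_order": 4, "dtc": "P0420", "procedure": "Replace downstream oxygen sensor", "resolved": False},
    {"work_order": 5, "dtc": "P0301", "procedure": "Replace ignition coil", "resolved": True},
]


def federated_lookup(dtc):
    """Combine both sides on the shared key without copying either store."""
    # Side 1: what the documents say about this code.
    documents = DOCUMENT_GRAPH.get(dtc, [])
    # Side 2: which procedures actually resolved this code in past work orders.
    fixes = Counter(
        row["procedure"]
        for row in WAREHOUSE_ROWS
        if row["dtc"] == dtc and row["resolved"]
    )
    return {"dtc": dtc, "documents": documents, "fixes": fixes.most_common()}


result = federated_lookup("P0420")
print(result["documents"])
print(result["fixes"])
```

Each store is queried where it lives, and only the shared key crosses the boundary, which is the sense in which nothing is migrated that does not need to be.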

The pattern is BigQuery-first but portable: swap the connector and the same shapes work on Snowflake, Databricks, or anywhere your data lives.

Prerequisites

Before taking this workshop, you should have:

  • A basic understanding of graph databases and Neo4j

  • The ability to read and run basic Cypher and SQL queries

  • Familiarity with data warehouse or lakehouse concepts (tables, keys)

We recommend completing the Neo4j Fundamentals course first. No Graph Data Science experience is required.

For the hands-on path as designed, you will need a coding agent such as Claude Code, Cursor, Codex, or Gemini CLI. The Codespace provides read-only access to the workshop’s BigQuery dataset and a Neo4j sandbox; every challenge can also be completed in the integrated sandbox.

Duration

2 hours core path, plus around 20 minutes of optional practice.

What you will learn

  • Why agent context is a problem of shapes, not queries - and where vector search and Text2SQL fall short

  • How to build the connections shape with neocarta - the warehouse join paths, read from BigQuery as metadata

  • How to navigate documents as a tree and surface themes with Leiden community detection

  • When to derive a graph and when to federate - the four-pains decision

  • How an agent judges live - retrieving warehouse schema from the connections graph and writing Text2SQL grounded in Neo4j, on the shared key

  • How an agent decides and acts on policy - grounding, evidence, recalls, escalation - with an auditable trail

  • How to build an agent skill a coding agent runs through the Neo4j CLI, and port the pattern to another lakehouse

This workshop includes

  • 11 lessons

  • 6 hands-on challenges

  • 1 knowledge check

Get Support

If you get stuck at any stage, our friendly community will be happy to help. You can ask for help on the Neo4j Community Site, or head over to the Neo4j Discord server for real-time discussions.

Feedback

If you have any comments or feedback on this course, you can email us at graphacademy@neo4j.com.