Introduction
You saw the three-layer wall in the previous lesson. You will break through it the way you would on the job: by building a service-advisor agent - a skill your coding agent runs to decide and act on AutoFix work orders.
In this lesson, you will open your workshop environment and connect it to both halves of the lakehouse: the Neo4j sandbox and the BigQuery warehouse.
Open your Codespace
The workshop environment runs in a GitHub Codespace - a cloud development environment in your browser with everything preinstalled: Python, the Neo4j CLI with its agent skills, neocarta, and the Claude Code coding agent.
While it builds (a few minutes), look at what is inside:
| Folder | What |
|---|---|
| | The document half: real PDF manuals, bulletins, and recall notices |
| | Builds the warehouse connections graph from BigQuery (Module 2) |
| | The pipeline that parses the PDFs into the document graph (Modules 3-4) |
| | The shape specs - connections, outline, and theme formats: the contracts you build against |
| | What you build - the service-advisor playbook and its tool scripts |
| | A mock parts-ordering API your agent will act through |
| | Incoming work orders for the finale |
| | Complete scripts if you need to catch up |
Prefer your own setup?
Everything runs locally too: clone the repository, pip install -r requirements.txt, install neo4j-cli, and run neo4j-cli skill install for the agent of your choice.
Connect to both halves
GraphAcademy has provisioned a Neo4j sandbox for you (with the Graph Data Science library you will use in Module 4). Paste its credentials into the .env file in your Codespace:
NEO4J_URI=bolt://18.212.18.244:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=filler-stem-lift
NEO4J_DATABASE=neo4j
The BigQuery warehouse is already wired up: your .env ships with GCP_PROJECT_ID and BIGQUERY_DATASET_ID pointing at the workshop’s read-only AutoFix dataset. You never write to it - you only read, and only its metadata crosses into the graph.
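Before moving on, it can help to confirm that the .env file carries every setting named above. The sketch below is one way to do that with the standard library only. It assumes the file uses plain KEY=VALUE lines, and the helper names parse_env and missing_keys are illustrative, not part of the workshop repository.

```python
from pathlib import Path

# Settings this lesson names. The full contents of the workshop's .env
# are an assumption; adjust the tuple if your file differs.
REQUIRED_KEYS = (
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
    "GCP_PROJECT_ID",
    "BIGQUERY_DATASET_ID",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in declared order."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    path = Path(".env")
    env = parse_env(path.read_text()) if path.exists() else {}
    missing = missing_keys(env)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All settings present.")
```

Run from the repository root, the script only reads the file and reports which names are missing. It does not contact Neo4j or BigQuery.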
Claude access
If you are signed in to Claude Code, you are done.
Otherwise set ANTHROPIC_API_KEY in the same .env file - use the key provided in the workshop, or create your own at console.anthropic.com.
Load the documents
One command builds the document graph - documents only. The warehouse rows stay in BigQuery; you connect to them in the next module.
python load/load_graph.py
You should see the pipeline parse 10 PDFs into one Library tree - 3 folders, 10 documents, 37 sections, with citation and shared-key links - and finish with the message "warehouse rows stay in BigQuery". The document half comes from real PDFs; the parser has the same shape as one a production pipeline would run over cloud storage.
Then start the parts API in a second terminal - your agent acts through it in Module 5:
uvicorn api.parts_api:app --port 8800
Meet your agent
Start your coding agent in the repository root and ask it something only the graph can answer:
claude
> What is in the Neo4j database described by .env? Give me counts by label.
The agent uses the Neo4j skills to inspect the graph: 1 Library, 10 documents, 37 sections, and the parts and codes they reference. The warehouse rows are not here - they are in BigQuery, where you will reach them in the finale. The sandbox window on the right is your inspection surface throughout: when a tool returns ids or URIs, look at them there as a graph.
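If you want to cross-check the agent's answer yourself, the same counts can be read with the Neo4j Python driver. The sketch below is a minimal illustration, not the query the agent skills actually run. It assumes the neo4j package is installed and that the NEO4J_* settings from .env have been exported into the process environment. The function names are hypothetical.

```python
import os

# Cypher that tallies nodes per label. A node with several labels is
# counted once under each of them.
LABEL_COUNT_QUERY = """
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS total
ORDER BY label
"""


def counts_by_label(driver, database="neo4j"):
    """Return {label: node count} using the driver's execute_query API."""
    records, _summary, _keys = driver.execute_query(
        LABEL_COUNT_QUERY, database_=database
    )
    return {record["label"]: record["total"] for record in records}


def connect_from_env():
    """Open a driver from NEO4J_* environment variables (assumed exported)."""
    # Imported lazily so the helpers above load without the driver installed.
    from neo4j import GraphDatabase

    return GraphDatabase.driver(
        os.environ["NEO4J_URI"],
        auth=(os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"]),
    )
```

A typical session would call connect_from_env(), pass the driver to counts_by_label along with the NEO4J_DATABASE value, print the resulting dictionary, and close the driver. The totals should line up with what the agent reports.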
Summary
In this lesson, you set up your environment:
- Codespace - Python, neo4j-cli with agent skills, neocarta, and Claude Code, preinstalled
- Both halves connected - Neo4j sandbox credentials + read-only access to the BigQuery warehouse
- Documents loaded - the PDFs parsed into the graph; the warehouse rows stay in BigQuery
- Parts API + smoke test - your agent can already inspect the document graph
In the next module, you build the connections shape: neocarta reads the warehouse’s join paths from BigQuery.