Apply it to your data
You’ve built a complete pipeline from PDF extraction to graph import — on a specific dataset with specific quirks. The techniques generalize, but adapting them to a new corpus means investigating new data, handling new noise patterns, and making new modeling decisions.
The prompt pack is a project folder of LLM instructions that does this work for you. Open it in Claude Code, Cursor, VS Code, or another IDE with an LLM extension, and point the LLM at your documents. It uses the best practices from this course to build a pipeline for your data.
What it does
- Investigates your data: samples documents, identifies structure, assesses quality
- Designs a pipeline: recommends extraction, parsing, and modeling strategies with tradeoffs
- Builds the pipeline: implements each component with checkpoint files
- Validates the result: verifies the graph answers your questions
The assistant adapts the techniques to your data — it does not assume your documents look like the Enron email corpus.
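To give a feel for what the investigation step involves, here is a minimal sketch of sampling a corpus and summarizing it by file type. It is not the prompt pack's actual code: the function names `sample_documents` and `summarize_corpus` are hypothetical, and the only assumption is that your documents sit as files under a single folder.

```python
import random
from collections import Counter
from pathlib import Path


def sample_documents(source_dir, k=5, seed=0):
    """Return up to k randomly chosen files under source_dir (recursive)."""
    files = sorted(p for p in Path(source_dir).rglob("*") if p.is_file())
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    return rng.sample(files, min(k, len(files)))


def summarize_corpus(source_dir):
    """Count files by lowercase extension and total their size in bytes."""
    files = [p for p in Path(source_dir).rglob("*") if p.is_file()]
    by_ext = Counter(p.suffix.lower() or "(none)" for p in files)
    total_bytes = sum(p.stat().st_size for p in files)
    return {
        "files": len(files),
        "by_extension": dict(by_ext),
        "total_bytes": total_bytes,
    }
```

Calling `summarize_corpus("data/source")` and then reading a handful of files from `sample_documents("data/source")` gives a rough first look at formats and volume before any extraction strategy is chosen.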
Download
Place your documents in data/source/, open the project in Claude Code, and tell the assistant what questions you want the graph to answer.
Getting started
- Download and unzip the prompt pack
- Place your documents in data/source/
- Open the project folder
- Tell the assistant what questions you want the graph to answer
- The assistant investigates your data before building anything
The LLM.md file contains the instructions the assistant follows. The reference/ folder contains the full technique reference, data model guide, and reusable code patterns from this course.
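Before starting a session, it can help to confirm that the unzipped folder has the pieces mentioned above. The sketch below checks only the three names given in this lesson (data/source/, LLM.md, reference/); the helper `missing_items` is hypothetical, and nothing is assumed about any other contents of the pack.

```python
from pathlib import Path

# Names taken from this lesson; other contents of the pack are not assumed.
EXPECTED = {"data/source": "dir", "LLM.md": "file", "reference": "dir"}


def missing_items(project_dir):
    """Return the expected paths that are absent under project_dir."""
    root = Path(project_dir)
    missing = []
    for rel, kind in EXPECTED.items():
        path = root / rel
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(rel)
    return missing
```

An empty list from `missing_items(".")` suggests the project folder is laid out as described; any names returned point to what still needs to be created or unzipped.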
Summary
Course complete. The next course, Entity Extraction: Communication Networks, adds thread decomposition, chunking, and entity extraction.