Unstructured data

Unstructured data is information that doesn’t fit neatly into predefined structures and types. Examples include text files, emails, social media posts, videos, photos, audio files, and web pages.

Unstructured data is often rich in information but challenging to analyze because it has no fixed schema to query against.

Vectors and Graphs

Embeddings are vectors (lists of numbers) that represent unstructured data. Items with similar meaning have vectors that are close together, which makes it easier to identify similarities and search for related data.
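As a rough illustration of similarity search, the sketch below ranks documents against a query using cosine similarity. The three-dimensional vectors, the document names, and the helper functions are all made up for this example; real embeddings come from an embedding model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
documents = {
    "reset-password guide": [0.9, 0.1, 0.0],
    "billing FAQ": [0.1, 0.9, 0.1],
    "shipping policy": [0.0, 0.2, 0.9],
}

# Made-up embedding of a customer query such as "I can't log in".
query = [0.8, 0.2, 0.1]


def most_similar(query_vector, candidates):
    """Return the name of the candidate whose vector is closest to the query."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(query_vector, candidates[name]),
    )


best_match = most_similar(query, documents)
print(best_match)
```

A score close to 1 means the two vectors point in nearly the same direction, so the query and the document are likely to be about similar things.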

Graphs represent data as nodes connected by relationships, which makes them well suited to modelling how pieces of unstructured data relate to one another.

For example, you can use vectors to find the correct documentation to support a customer query and a graph to understand the relationships between different products and customer feedback.

Chunking

When dealing with large amounts of data, breaking it into smaller, more manageable chunks is helpful. This process is called chunking.

There are many strategies for splitting data into chunks, such as splitting by a fixed size, by sentence, or by paragraph. The best approach depends on the data and the problem you are trying to solve.
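One simple strategy is fixed-size chunking with overlap. The sketch below is a minimal, hand-rolled version for illustration; the function name `chunk_text` and its default parameters are assumptions made here, not part of any particular library.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a boundary
    appears in both neighbouring chunks.
    """
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 10  # 100 characters of placeholder text
for chunk in chunk_text(sample):
    print(len(chunk), chunk)
```

Splitting on characters is the crudest option. Splitting on sentence or paragraph boundaries usually keeps related text together, at the cost of uneven chunk sizes.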

Later in this workshop, you will import some unstructured data and split it into chunks.

You can store an embedding for each chunk and create relationships between chunks to capture the order and context in which each chunk appears.
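To make this concrete, the sketch below builds a small in-memory picture of such a graph: one node per chunk, each holding its text and an embedding, linked to its document and to the next chunk. The labels `Document` and `Chunk`, the relationship types `PART_OF` and `NEXT_CHUNK`, and the `toy_embedding` function are all hypothetical names chosen for this example. A real pipeline would call an embedding model and write to a graph database instead.

```python
def toy_embedding(text):
    """Placeholder for a real embedding model: counts of each vowel."""
    lowered = text.lower()
    return [lowered.count(letter) for letter in "aeiou"]


def build_chunk_graph(document_id, chunks):
    """Return (nodes, relationships) describing a document and its chunks."""
    nodes = [{"id": document_id, "label": "Document"}]
    relationships = []
    previous_id = None

    for index, text in enumerate(chunks):
        chunk_id = f"{document_id}-chunk-{index}"
        nodes.append({
            "id": chunk_id,
            "label": "Chunk",
            "text": text,
            "embedding": toy_embedding(text),
        })
        # Every chunk belongs to its source document.
        relationships.append((chunk_id, "PART_OF", document_id))
        # Link consecutive chunks so the original order can be recovered.
        if previous_id is not None:
            relationships.append((previous_id, "NEXT_CHUNK", chunk_id))
        previous_id = chunk_id

    return nodes, relationships


nodes, relationships = build_chunk_graph(
    "guide",
    ["Graphs store relationships.", "Vectors capture similarity."],
)
for relationship in relationships:
    print(relationship)
```

With this structure, a vector search can find the most relevant chunk, and the relationships let you pull in the neighbouring chunks or the parent document for extra context.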

Summary

You learned how unstructured data can be stored in a graph.

In the next task, you will use Python and the GraphRAG Python package to split text into chunks and create embeddings from them.