Eliminating Complex Data in Nodes

Example: Complex data

Because nodes store data about specific entities, your initial model may, for example, have a Production node that holds the full address details of the production company as properties.

Complex data in nodes

Storing complex data in nodes like this can be a problem for two reasons:

  1. Duplicate data. Many production companies may be in the same location, so the same address values are repeated across many nodes.

  2. Full retrieval. Queries that depend on the address information must retrieve every Production node to inspect its properties.
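The two problems above can be illustrated with a minimal in-memory sketch. It is not Neo4j code: plain Python dictionaries stand in for Production nodes, and the company names and the `city`, `state`, and `country` property names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical Production nodes, each carrying its full address as properties.
productions = [
    {"name": "Alpha Films", "city": "Los Angeles", "state": "CA", "country": "US"},
    {"name": "Beta Pictures", "city": "Los Angeles", "state": "CA", "country": "US"},
    {"name": "Gamma Studios", "city": "Austin", "state": "TX", "country": "US"},
]

# Reason 1: the same state value is stored once per node that uses it.
state_copies = Counter(node["state"] for node in productions)

# Reason 2: filtering by state has to inspect the properties of every node.
nodes_examined = 0
in_california = []
for node in productions:
    nodes_examined += 1
    if node["state"] == "CA":
        in_california.append(node["name"])
```

Here `state_copies` counts how many nodes repeat each state value, and `nodes_examined` always equals the total number of nodes, no matter how few of them match the filter.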

Refactoring complex data

If the nodes contain a large amount of duplicate data, or if key queries for your use cases would perform better without retrieving every node to reach the complex data, consider refactoring the graph as shown here.

Complex data in their own nodes

In this refactoring, queries that filter production companies by state will be faster if they match on the State.name value rather than evaluating the state property of every Production node.
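As a rough sketch of the idea, the following plain-Python example (again not Neo4j code) moves the state value out of each Production node into a separate State node keyed by its name. The company names, the `productions` list on each State node (standing in for relationships), and all property names other than State.name are assumptions made for illustration.

```python
# Hypothetical Production nodes that still hold the state as a property.
productions = [
    {"name": "Alpha Films", "state": "CA"},
    {"name": "Beta Pictures", "state": "CA"},
    {"name": "Gamma Studios", "state": "TX"},
]

# Refactor: create one State node per distinct name and connect each
# Production node to it, removing the duplicated property from the node.
state_nodes = {}
for node in productions:
    state_name = node.pop("state")
    state = state_nodes.setdefault(state_name, {"name": state_name, "productions": []})
    state["productions"].append(node["name"])

# Filtering by state now starts from a single State node looked up by name,
# instead of evaluating a property on every Production node.
in_california = state_nodes["CA"]["productions"]
```

Each state value is now stored once, and the lookup touches only the State node and the Production nodes connected to it.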

How you refactor your graph to handle complex data depends on how your queries perform as the graph scales.

Check your understanding

1. Refactoring complex data?

Why do you refactor a graph that has complex data in nodes?

  • ✓ Eliminate duplication of data in multiple nodes.

  • ✓ Improve query performance.

  • ❏ Reduce the number of relationships in the graph.

Hint

There are two main reasons you refactor a graph to model complex data.

Solution

Refactoring a graph allows you to eliminate duplication of data and to improve query performance.

Summary

In this lesson, you learned why refactoring complex data into its own nodes can eliminate duplication and improve query performance. In the next module, you will learn about refactoring to create specific relationships.