Modeling Situations

This lesson covers some tips and tricks to working with Spring Data Neo4j, especially when it comes to modeling and designing your domain in an application.

Greedy findAll()

The findAll() method is a very useful method for initially retrieving your graph. It is quick, available out-of-the-box, and works great for small play demo data sets. However, it is designed to be greedy, so it is not always the best choice.

The built-in findAll() method in Spring Data Neo4j will attempt to retrieve the entire modeled graph reachable from the domain entity of the repository, so if you have a large number of nodes and relationships, it can easily take a long time to retrieve or even error out. Instead, a better practice with larger data sets (perhaps more than 100 nodes) is to use custom queries, as you will see later in this course. This means using the @Query annotation with a Cypher query that limits the return results in some way.

For instance, you could define a findMoviesSubset() query, which limits the results to 20 nodes:

java
@Query("MATCH (m:Movie)<-[r:ACTED_IN]-(p:Person)" +
        "RETURN m, collect(r), collect(p) LIMIT 20;")
    Iterable<Movie> findMoviesSubset();

Retrieving all data and connections likely isn’t helpful anyway. Most use cases only need a subset of data to answer a question or solve a problem, so this is a good practice to get into.

Defining bidirectional relationships

Another common mistake is to define unnecessary bidirectional relationships. In Neo4j, nodes are connected by relationships that can be traversed in either direction (Movie ← Person, Person → Movie). Creating bidirectional relationships in the application model may seem tempting, but they are more difficult to debug and maintain and may not actually provide any value to your use case.

For example, this course’s application currently has a relationship defined from Movie class to Person class. You could define the reciprocal relationship from Person to Movie as well, so that when you query persons, you would retrieve their related movies. However, you could then end up in situations where you pull a movie and then pull the persons related to that movie, and then pull other movies those persons acted in, and so on. This can lead fetching the entire graph and cause performance issues. While you can define some safe guards to limit this "rabbit hole" behavior, it is better to avoid it altogether by not defining the bidirectional relationship.

The use case also probably does not require retrieving more than a minimal depth. You probably don’t want to look up a movie and get everyone each actor has ever worked with and their coworkers' resumés, as well! If you do need this behavior, you can always define a custom query to retrieve up to a specific level of depth. As an alternative, you could also use Spring Data Neo4j projections, which we cover in a later lesson of this course.

Define your use case and build your model to support that use case. As with most advanced features, in the correct use case and designed properly, bidirectional relationships are extremely powerful.

Building the entire model into the application

A similar temptation is to build the entire model into the application. This is not a good practice, as it can lead to a lot of overhead and unnecessary data retrieval.

For instance, this course’s application has main data classes for Movie, Person, and Role entities. However, you could define classes, repositories, and controllers for Actor, Director, as well as other entities in the graph such as User and Genre. However, are all these truly needed?

From a business perspective, you likely do not need to build in functionality for multiple use cases into the same app. Most business cases need an app for users to look up movies (MovieRepository) or to find details on users providing movie ratings (UserRepository). Therefore, you wouldn’t need to define repositories and controllers for User, Person, or another entity on top of the existing Movie ones.

Ensure you determine the use case and build the model to support what it needs to achieve. A larger, more complete app is absolutely possible, but it is better to start small and build up than to start big and have to refactor later - or worse, debug functionality that isn’t being used in a large application because it impacts main functionality.

Next Steps

The Spring Data Neo4j documentation contains more information on these ideas:

Summary

In this optional lesson, you learned about some of the modeling decisions you may encounter and issues that can arise in certain situations. If these decisions are made without understanding the implications, they can lead to performance issues and other problems. However, if they are made purposely, then Spring Data Neo4j with a Neo4j graph database can be very a powerful and useful combination.

Next, you will learn how to write data to the database from the application.