Load your own CSV file

In this optional challenge, you will load your own CSV file into Neo4j.

To complete the challenge, you will have to:

  1. Create or find a CSV file to load into Neo4j

  2. Determine the field terminator used in your CSV file

  3. Determine whether the file contains headers

  4. Upload the file to a cloud-hosting service (Google Drive, Dropbox, S3, etc.)

  5. Construct and run a LOAD CSV statement to load the file into Neo4j

Obtain a CSV file

Your first task is to obtain a CSV file to load into Neo4j. You could:

  • Export a CSV file from an existing spreadsheet or data application.

  • Create a simple CSV file from scratch using a text editor.

  • Download a CSV file from a public data source. I recommend the Kaggle datasets site, which has a range of public datasets available for download.

Inspect the file

Once you have your CSV file, you should determine the following:

  1. What is the field terminator? A comma, or another character?

  2. Are headers included in the file?

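You may have chosen the field terminator (delimiter) and whether to include headers when you exported the CSV file.

If unsure, open the CSV file in a text editor and inspect it manually. If the first line contains column names rather than data, the file has headers; the character that separates the values on each line is the field terminator.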

Upload the file

LOAD CSV can access files on the Neo4j server’s file system or on a remote machine.

The course Sandbox and the AuraDB cloud service only allow access to remote files, so you will need to host your CSV file somewhere that provides a direct download link.

You could upload your CSV file to a cloud-hosting service like Google Drive, Dropbox, or GitHub and get a direct download link.

Google Drive
  1. Upload your CSV file to Google Drive.

  2. Change the sharing settings so that anyone with the link can view the file.

  3. Get the share link for the file.

  4. Use gdocs2direct to get a direct download link to your file.

Dropbox
  1. Upload your CSV file to Dropbox.

  2. Change the sharing settings so that anyone with the link can view the file.

  3. Get the share link for the file.

  4. Add ?dl=1 to the end of the link to get a direct download link to your file. If the link already contains dl=0, change it to dl=1 instead.

See the Dropbox documentation for more information.

GitHub
  1. Push your CSV file to a public GitHub repository.

  2. Navigate to the file on GitHub.

  3. Add ?raw=true to the end of the link to get a direct download link to your file.

You can find more information on file access in the Reading CSV Files section of the Neo4j documentation.
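
Before writing the full import, it can help to confirm that Neo4j can read the link at all. The query below is a minimal sketch: the URL is a hypothetical placeholder (the your-user, your-repo, and data.csv parts are assumptions), so substitute your own direct download link.

```cypher
// Hypothetical URL: replace it with your own direct download link.
// Returns only the first line of the file, as a list of strings.
LOAD CSV FROM 'https://github.com/your-user/your-repo/blob/main/data.csv?raw=true' AS row
RETURN row
LIMIT 1
```

If the link points to a web page rather than the file itself, the query will typically fail or return HTML instead of your data, which suggests the link is not a direct download link.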

Construct and run the LOAD CSV statement

The structure of your LOAD CSV statement should take into account the following:

  1. The location of your CSV file.

  2. The field terminator.

  3. Whether the file contains headers.

The LOAD CSV syntax is:

cypher
LOAD CSV [WITH HEADERS] FROM url AS alias [FIELDTERMINATOR char]
RETURN alias
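
As an illustration of how the three considerations above map onto the syntax, here are two minimal sketches. The URLs are hypothetical placeholders, and the second example assumes a semicolon-separated file; adjust both to match your own file.

For a comma-separated file with a header row, each row is returned as a map keyed by column name:

```cypher
// Hypothetical URL: replace it with your own direct download link.
// WITH HEADERS treats the first line as column names, so row is a map.
LOAD CSV WITH HEADERS FROM 'https://example.com/my-data.csv' AS row
RETURN row
LIMIT 5
```

For a file without headers that uses a semicolon as the field terminator, each row is returned as a list, and values are accessed by position:

```cypher
// Hypothetical URL and terminator: adjust both to match your file.
// Without WITH HEADERS, row is a list of strings indexed from 0.
LOAD CSV FROM 'https://example.com/my-data.csv' AS row FIELDTERMINATOR ';'
RETURN row[0] AS firstColumn, row[1] AS secondColumn
LIMIT 5
```

The LIMIT clause keeps the output small while you check that the columns are split as expected.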

Summary

In this optional challenge, you loaded your own CSV file into Neo4j.

In the next module, you will learn how to create nodes and relationships from CSV files.