Community Detection with GDS

Course Duration: 2 hours
Categories: Data Scientist

Identify communities of people based on their survey responses

Course Description

This course teaches you how to preprocess and explore a dataset in order to complete a user segmentation task based on their survey response.

You will prepare and import the results of a survey conducted in 2013 by a Statistics class at FSEV UK, containing 1010 responses to 150 questions on all aspects of life, including music and movie preferences, hobbies, and phobias.

This course will teach you how to encode categorical variables and perform essential dimensionality reduction, such as eliminating low variance and highly correlating features.

You will learn how to use the following community detection algorithms provided by the Graph Data Science library:

KMeans Algorithm: KMeans is a centroid-based clustering algorithm that partitions data into distinct, non-overlapping subsets or 'clusters'. In community detection, it identifies groups of individuals with similar attributes or behaviors.
kNN, or k Nearest Neighbors: kNN is a distance-based algorithm that classifies a data point based on the majority class of its 'k' nearest neighbors. In community detection, it allows for the identification of closely related groups based on feature similarity.
The Leiden Algorithm: The Leiden algorithm is a community detection method renowned for its resolution sensitivity and performance. It partitions a network into communities by optimizing a modularity score, making it valuable for detecting compact groups within complex networks.

Prerequisites

This course is intended for analysts and data scientists who have basic knowledge of:

Duration

2 hours

What you will learn

Encoding categorial features
Basic dimensionality reduction techniques
How to use KMeans algorithm
Constructing a nearest neighbor graph with kNN algorithm
How to use Leiden algorithm

Get Support

If you find yourself stuck at any stage then our friendly community will be happy to help. You can reach out for help on the Neo4j Community Site, or head over to the Neo4j Discord server for real-time discussions.