Introduction
You’ve learned how Louvain finds communities by optimizing modularity.
Now let’s put it to work. You’ll project the fraud network, run Louvain, and see how it reduces your search space by 98%.
What You’ll Learn
By the end of this lesson, you’ll be able to:
- Create a heterogeneous graph projection for fraud analysis
- Run Louvain in stats and write modes
- Identify communities containing known fraudsters
- Focus investigation on high-priority communities
Step 1: Project the Graph
Create a projection of Users, Cards, and Devices:
MATCH (source:UserP2P)-[r:HAS_CC|USED|P2P]->(target:User|Card|Device)
RETURN gds.graph.project(
  'fraud-graph',
  source,
  target,
  {relationshipType: type(r)}, // (1)
  {undirectedRelationshipTypes: ['HAS_CC', 'USED']} // (2)
)

- (1) Preserves the original relationship types (HAS_CC, USED, P2P) in the projection so they can be referenced later
- (2) Makes sharing relationships undirected: if User A and User B share a card, the connection propagates both ways
Understanding the Projection
This projection makes several important choices:
Why include Cards and Devices?
Fraudsters share infrastructure. Including these nodes lets Louvain find communities based on shared cards and devices—not just direct transactions.
Why undirected for HAS_CC and USED?
Sharing is bidirectional—if User A and User B both have the same card, that connection works both ways. Making these undirected ensures the relationship propagates through the shared infrastructure back to other users.
Why keep P2P directed?
Transaction direction matters. Money flows from sender to receiver, and that asymmetry can reveal fraud patterns.
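Before running any algorithm, it can help to confirm that the projection exists and is roughly the size you expect. The following is a minimal sketch that assumes the graph name 'fraud-graph' from Step 1 and a GDS version that exposes these fields on gds.graph.list:

```cypher
// List the in-memory projection created in Step 1 and report its size.
CALL gds.graph.list('fraud-graph')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

The relationship count may be higher than the count in the database, since relationships projected as undirected are typically stored in both directions.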
Step 2: Run Louvain in Stats Mode
Before writing results, let’s preview what Louvain will find:
CALL gds.louvain.stats('fraud-graph', {})
YIELD communityCount, communityDistribution, modularity
RETURN communityCount, communityDistribution, modularity
Interpreting Stats Results
You should see approximately:
- communityCount: ~11,500 communities
- modularity: ~0.98
Remember from Lesson 2: modularity above 0.4 indicates useful structure. A score of 0.98 means extremely well-defined communities—nodes within communities are far more connected to each other than to outsiders.
The communityDistribution shows size statistics (min, max, mean, percentiles).
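Since communityDistribution is returned as a map, individual statistics can be pulled out by key. The sketch below assumes the map contains the keys min, max, mean, p50, and p99, which is how recent GDS versions document it; check the raw map from the previous query if a key comes back null:

```cypher
// Unpack selected size statistics from the communityDistribution map.
CALL gds.louvain.stats('fraud-graph', {})
YIELD communityDistribution
RETURN communityDistribution.min  AS smallest,
       communityDistribution.p50  AS median,
       communityDistribution.p99  AS p99,
       communityDistribution.max  AS largest,
       communityDistribution.mean AS mean
```

A large gap between the median and the maximum would suggest that most communities are small while a few are very large.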
Step 3: Run Louvain in Write Mode
Now write the community IDs back to the database:
CALL gds.louvain.write('fraud-graph', {
  writeProperty: 'louvainCommunityId' // (1)
})
YIELD communityCount, modularity // (2)
RETURN communityCount, modularity

- (1) Each node receives a louvainCommunityId property; nodes in the same community share the same ID
- (2) Returns summary stats to confirm the algorithm ran successfully
What Just Happened?
Louvain analyzed the projected nodes and found natural groupings.
Each node now has a louvainCommunityId property indicating which community it belongs to.
Nodes in the same community are more densely connected to each other than to the rest of the network.
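One way to confirm that the write succeeded is to count the nodes that now carry the property, grouped by label. This is a rough check rather than part of the lesson's workflow, and it scans every node in the database:

```cypher
// Count nodes that received a community ID, grouped by their labels.
MATCH (n)
WHERE n.louvainCommunityId IS NOT NULL
RETURN labels(n) AS labels,
       count(*) AS nodesWithCommunity,
       count(DISTINCT n.louvainCommunityId) AS communities
ORDER BY nodesWithCommunity DESC
```

Only nodes that were part of the projection should appear in the result.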
A Note on Community IDs
Your community IDs will differ from any examples shown.
Louvain is non-deterministic—the specific IDs assigned depend on processing order. What matters is the grouping, not the ID numbers.
When following along, always use the IDs from your results.
Step 4: Visualize Communities
See the community structure:
MATCH path = (u:UserP2P)-[*1..2]-(n:Card|Device)
WHERE u.louvainCommunityId = n.louvainCommunityId // (1)
RETURN path
LIMIT 100

- (1) Filters to only show nodes that Louvain assigned to the same community, confirming they cluster around shared infrastructure
Click on nodes to see their louvainCommunityId. Nodes in the same visual cluster should share the same community ID.
Notice how users cluster around shared cards and devices—this is exactly the infrastructure sharing we want to detect.
Step 5: Count Fraudulent Communities
How many communities contain known fraudsters?
MATCH (u:UserP2P)
WITH u.louvainCommunityId AS community,
     sum(u.fraudMoneyTransfer) AS flaggedCount // (1)
RETURN
  sum(CASE WHEN flaggedCount > 0 THEN 1 ELSE 0 END) AS communitiesWithFraud, // (2)
  sum(CASE WHEN flaggedCount = 0 THEN 1 ELSE 0 END) AS communitiesWithoutFraud

- (1) Aggregates fraud flags per community; fraudMoneyTransfer is 1 for known fraudsters, 0 otherwise
- (2) Uses conditional aggregation to split communities into those containing fraud vs. clean ones
The Power of Community Detection
You should find approximately:
- ~200 communities with at least one flagged fraudster
- ~11,500 communities with no flagged fraudsters
That’s roughly 1.7% of communities containing known fraud.
Louvain just reduced your search space by 98%.
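The percentages can be reproduced from the approximate counts quoted above. The literals 200 and 11500 below are those rounded figures, not values read from your database, so substitute your own results from Step 5:

```cypher
// Worked arithmetic: share of communities with and without flagged users.
WITH 200 AS withFraud, 11500 AS withoutFraud
WITH withFraud, withoutFraud, withFraud + withoutFraud AS total
RETURN total,
       round(100.0 * withFraud / total, 1) AS percentWithFraud,
       round(100.0 * withoutFraud / total, 1) AS percentWithoutFraud
```

With these inputs the total is 11,700 communities, about 1.7% of which contain flagged users and about 98.3% of which do not.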
Why This Matters
Before Louvain: 204,000 users to investigate
After Louvain: ~200 communities worth examining
The vast majority of users are in communities with no fraud flags. We can deprioritize them entirely and focus on the suspicious minority.
Step 6: Rank Communities by Fraud
Not all fraudulent communities are equal. Find the most suspicious ones:
MATCH (u:UserP2P)
WITH u.louvainCommunityId AS community,
count(u) AS userCount,
sum(u.fraudMoneyTransfer) AS flaggedCount
WHERE flaggedCount > 0 // (1)
RETURN community,
userCount,
flaggedCount,
round(100.0 * flaggedCount / userCount, 1) AS flaggedPercent // (2)
ORDER BY flaggedCount DESC
LIMIT 10

- (1) Filters to only communities with at least one known fraudster
- (2) Calculates the fraud concentration: a community where 50% of users are flagged is more suspicious than one where 1% are
Interpreting the Rankings
The results show:
- community: The Louvain community ID
- userCount: Total users in that community
- flaggedCount: Known fraudsters in that community
- flaggedPercent: Percentage of the community that's flagged
High flaggedCount = More known fraud (larger rings)
High flaggedPercent = More concentrated fraud (tighter rings)
Note the community ID at the top of your results—you’ll investigate it in the next steps. Remember, your ID will differ from others'.
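As a complementary view, the same aggregation can be sorted by concentration instead of raw count. The minimum community size of 10 used here is an arbitrary threshold chosen for illustration, so that two-user communities with one flagged user do not dominate the list:

```cypher
// Rank communities by fraud concentration rather than by flagged count.
MATCH (u:UserP2P)
WITH u.louvainCommunityId AS community,
     count(u) AS userCount,
     sum(u.fraudMoneyTransfer) AS flaggedCount
WHERE flaggedCount > 0
  AND userCount >= 10 // assumed cutoff; adjust to your data
RETURN community,
       userCount,
       flaggedCount,
       round(100.0 * flaggedCount / userCount, 1) AS flaggedPercent
ORDER BY flaggedPercent DESC
LIMIT 10
```

The remaining steps still use the top community from the flaggedCount ranking.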
Step 7: Set a Parameter for Investigation
Pick the top community from your results and set it as a parameter:
:param louvainCommunityId => 179061

The :param command is specific to Neo4j Browser. If you're using a different client, you may need to pass parameters differently.
Replace 179061 with the community ID from the top of your results.
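If your client does not support setting parameters, one possible workaround is to select the top community inside the query itself. This sketch reuses the ranking logic from Step 6 and is only an alternative for illustration, not the approach the later steps rely on:

```cypher
// Pick the community with the most flagged users, then break it down
// by fraud flag without needing a client-side parameter.
MATCH (u:UserP2P)
WITH u.louvainCommunityId AS community,
     sum(u.fraudMoneyTransfer) AS flaggedCount
ORDER BY flaggedCount DESC
LIMIT 1
MATCH (m:UserP2P)
WHERE m.louvainCommunityId = community
RETURN community,
       m.fraudMoneyTransfer AS isFlagged,
       count(*) AS userCount
ORDER BY isFlagged
```

If two communities tie on flagged count, which one is picked is not guaranteed.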
Step 8: Examine the Community
See the breakdown of flagged vs unflagged users:
MATCH (u:UserP2P)
WHERE u.louvainCommunityId = $louvainCommunityId // (1)
RETURN u.fraudMoneyTransfer AS isFlagged, // (2)
count(*) AS userCount
ORDER BY isFlagged

- (1) Uses the parameter set in the previous step to filter to a single community
- (2) Groups by fraud flag to show how many users are flagged (1) vs unflagged (0); unflagged users in fraud-heavy communities are our investigation targets
What This Tells Us
You should see two rows:
- Users with fraudMoneyTransfer = 0 (unflagged)
- Users with fraudMoneyTransfer = 1 (flagged)
The unflagged users are our investigation targets—they’re in a fraud-heavy community but haven’t been identified yet.
Are they accomplices? Victims? Mules? That’s what we need to find out.
Step 9: Visualize the Community
See how users in this community connect:
MATCH path = (u1:UserP2P)-[:HAS_CC|USED|P2P*1..4]-(u2:UserP2P) // (1)
WHERE u1.louvainCommunityId = $louvainCommunityId
AND u2.louvainCommunityId = $louvainCommunityId
AND u1 <> u2 // (2)
RETURN path
LIMIT 200

- (1) Traverses up to 4 hops through shared cards, devices, and P2P transactions to reveal the full community structure
- (2) Prevents self-matching, ensuring we only see paths between distinct users
Expand nodes to explore the connections. Look for:
- Flagged users (fraudMoneyTransfer = 1) clustered together
- Unflagged users connected to multiple flagged users
- Shared cards or devices linking suspicious accounts
These patterns suggest which unflagged users deserve closer scrutiny.
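The second and third patterns can also be checked with a tabular query instead of by eye. The sketch below assumes the $louvainCommunityId parameter from Step 7 is set and that users link to cards and devices through HAS_CC and USED, as in the projection; the limit of 20 is arbitrary:

```cypher
// Unflagged users in the community who share a card or device
// with one or more flagged users.
MATCH (u:UserP2P)-[:HAS_CC|USED]-(shared:Card|Device)-[:HAS_CC|USED]-(f:UserP2P)
WHERE u.louvainCommunityId = $louvainCommunityId
  AND u.fraudMoneyTransfer = 0
  AND f.fraudMoneyTransfer = 1
RETURN u AS unflaggedUser,
       count(DISTINCT f) AS flaggedNeighbors,
       count(DISTINCT shared) AS sharedItems
ORDER BY flaggedNeighbors DESC, sharedItems DESC
LIMIT 20
```

This only covers shared infrastructure, not P2P transfers, so treat it as a starting point rather than a ranking of suspects.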
What We’ve Achieved
| Stage | Scale |
|---|---|
| Starting point | ~790,000 nodes, 204,000 users |
| After Louvain | ~200 suspicious communities |
| Focused community | A few hundred users |
We’ve gone from an impossible manual task to a focused investigation.
The Remaining Question
We’ve found communities containing fraud. But within each community:
- Which users are most suspicious?
- Who should we investigate first?
- How do we prioritize hundreds of potential suspects?
What’s Next
In Lesson 4, you’ll learn two algorithms that help with formal community assignment:
- Degree Centrality: Identify high-connection nodes (potential hubs or noise)
- Weakly Connected Components (WCC): Deterministic community assignment for auditable results
These tools will help you move from exploration to actionable suspect lists.
Cleanup
Drop the projection:
CALL gds.graph.drop('fraud-graph')

You can keep the projection if you want to experiment further. The louvainCommunityId property is already written to nodes, so the projection is no longer needed for the analysis we've done.
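To check whether the projection is currently in memory, for example before dropping it or before experimenting further, a query along these lines should work, assuming the graph name 'fraud-graph' used throughout this lesson:

```cypher
// Report whether the named projection exists in the graph catalog.
CALL gds.graph.exists('fraud-graph')
YIELD graphName, exists
RETURN graphName, exists
```

A result of false after the drop confirms the cleanup succeeded.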
Summary
You’ve used Louvain to dramatically reduce your search space:
- Created a heterogeneous projection capturing users and shared infrastructure
- Found ~11,500 communities with modularity of 0.98
- Identified ~200 communities (1.7%) containing known fraud
- Focused on the most fraudulent community for investigation
Louvain transformed an impossible 204,000-user investigation into a manageable set of suspicious communities.