[Update 2] Louvain-Performance and Self-Loops
Since I decided not to modify the performance metric to account for self-loops, we are left with a Louvain-performance algorithm that always arrives at a very unhelpful “best” clustering: put all of the nodes in the same community. Luckily, the nature…
[Update 1] Louvain-Performance and Self-Loops
For background, please see my earlier post. Recall that we sometimes want the presence of self-loops to be stronger than inter-community links. That is, despite the weight of some inter-community edge (e_1, e_2), the presence of heavy self-loops (e_1, e_1) and (e_2,…
Louvain Method Based on the Silhouette Index Metric
Anastasia and I recently finished implementing the Louvain algorithm based on the silhouette index (SI) metric. This post discusses our process. For a full implementation of any metric in the Louvain algorithm, the algorithm must be able to calculate the…
Louvain-Performance and Self-Loops
In phase 2 of each pass, the Louvain algorithm creates a collapsed version G’ of the current clustering G. Nodes in a community C in G are represented as a single vertex V_c in G’. Information about the weights of…
[Update 1] Louvain-Performance: Weights of Nonexistent Edges
Earlier, I wrote a post on how performance on weighted graphs uses a meaningful maximum M to account for the weights of nonexistent edges. At that time, my challenges were (1) figuring out how to calculate M, and (2) changing my…
Creating the Adjacency Matrix
We are now able to construct an adjacency matrix of the network. As Anastasia explained in a previous post, we will use the adjacency matrix in silhouette index to find the shortest path between all pairs of vertices. Our matrix…
Input/Output Issue
The input file represents the network. It should be formatted as a space-delimited file, where each line contains the start node, destination node, and an optional edge weight. The test network. It is undirected and unweighted. The input file for…
Louvain-Performance: Weights of Nonexistent Edges
One of the tricky things about calculating performance is that it factors in nonexistent edges between vertices in different communities. The more of these edges exist, the higher the overall performance score, because they count as correctly-interpreted edges. That is, since the two…
Switch to Java
We recently switched from the C++ to the Java implementation of the Louvain method. Our main concern with the C++ implementation was how it handled serialization. With C++, we had to deal with the serialized version of the network to…
Louvain Algorithm
At our meeting on 09/18/15, I presented the Louvain Algorithm as outlined in the paper “Fast unfolding of communities in large networks”. This method centers around optimizing modularity, which measures the density of the links inside of the communities as…