Since I decided not to modify the performance metric to account for self-loops, we are left with a Louvain-performance algorithm that always arrives at a very unhelpful “best” clustering: put all of the nodes in the same community. Luckily, the nature…
For background, please see my earlier post. Recall that we sometimes want the presence of self-loops to be stronger than inter-community links. That is, despite the weight of some inter-community edge (e_1, e_2), the presence of heavy self-loops (e_1, e_1) and (e_2,…
Anastasia and I recently finished implementing the Louvain algorithm based on the silhouette index (SI) metric. This post discusses our process. For a full implementation of any metric in the Louvain algorithm, the algorithm must be able to calculate the…
In phase 2 of each pass, the Louvain algorithm creates a collapsed version G’ of the current clustering G. Nodes in a community C in G are represented as a single vertex V_c in G’. Information about the weights of…
Earlier, I wrote a post on how performance on weighted graphs uses a meaningful maximum M to account for the weights of nonexistent edges. At that time, my challenges were (1) figuring out how to calculate M, and (2) changing my…
We are now able to construct an adjacency matrix of the network. As Anastasia explained in a previous post, we will use the adjacency matrix in silhouette index to find the shortest path between all pairs of vertices. Our matrix…
The input file represents the network. It should be formatted as a space-delimited file, where each line contains the start node, destination node, and an optional edge weight. The test network. It is undirected and unweighted. The input file for…
One of the tricky things about calculating performance is that it factors in nonexistent edges between vertices in different communities. The more of these edges exist, the higher the overall performance score, because they count as correctly-interpreted edges. That is, since the two…
We recently switched from the C++ to the Java implementation of the Louvain method. Our main concern with the C++ implementation was how it handled serialization. With C++, we had to deal with the serialized version of the network to…