Community Detection in Social Networks
Not peer reviewed
MetadataShow full item record
Social networks usually display a hierarchy of communities and it is the task of community detection algorithms to detect these communities and preferably also their hierarchical relationships. One common class of such hierarchical algorithms are the agglomerative algorithms. These algorithms start with one community per vertex in the network and keep agglomerating vertices together to form increasingly larger communities. Another common class of hierarchical algorithms are the divisive algorithms. These algorithms start with a single community comprising all the vertices of the network and then split the network into several connected components that are viewed as communities. We start this thesis by giving an introductory overview of the field of com- munity detection in part I, including complex networks, the basic groups of com- munity definitions, the modularity function, and a description of common com- munity detection techniques, including agglomerative and divisive algorithms. Then we proceed, in part II, with community detection algorithms that have been implemented and tested, with refined use of data structures, as part of this thesis. We start by describing, implementing and testing against benchmark graphs the greedy hierarchical agglomerative community detection algorithm proposed by Aaron Clauset, M. E. J. Newman, and Cristopher Moore in 2004 in the article Finding community structure in very large networks . We continue with describing and implementing the hierarchical divisive algorithm proposed by Filippo Radicchi, Claudio Castellano, Federico Cecconi, Vittorio Loreto, and Domenico Parisi in 2004 in the article Defining and identifying communities in networks . Instead of testing this algorithm against benchmark graphs we present a community detection web service that runs the algorithm by Radicchi et al. on the collaboration networks in the DBLP database of scientific publi- cations and co- authorships in the area of computer science. We allow the user to freely set the many parameters that we have defined for this algorithm. The final judgment on the results is measured by the modularity value or can be left to the knowledgeable user. A rough description of the design of the algorithms and of the web service is given, and all code is available at GitHub  . Lastly, a few improvements both to the algorithm by Radicchi et al. and to the web service are presented.