Abstract
Community detection is a classic and difficult task in complex network analysis. With the explosive growth of social media, scaling community detection methods to large networks has attracted considerable recent interest. In this paper, we propose a novel SIMPLifying and Ensembling (SIMPLE) framework for parallel community detection. It employs random link sampling to simplify the network and obtains a basic partitioning on each sampled graph. Then, K-means-based Consensus Clustering is used to combine the basic partitionings into high-quality community structures. All phases of SIMPLE, including random sampling, sampled graph partitioning, and consensus clustering, are encapsulated into MapReduce for parallel execution. Experiments on six real-world social networks analyze the key parameters and factors inside SIMPLE, and demonstrate both its effectiveness and its efficiency.
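The three phases named above (link sampling, per-sample partitioning, consensus by K-means) can be illustrated with a minimal single-machine sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: each link is kept independently with probability `rate`, connected components stand in for the unspecified basic partitioner, the consensus step is plain squared-Euclidean K-means on concatenated one-hot label vectors, and all function names and default parameters are hypothetical. The MapReduce encapsulation is not modeled.

```python
import random


def sample_links(edges, rate, rng):
    """Keep each link independently with probability `rate` (assumed scheme)."""
    return [e for e in edges if rng.random() < rate]


def basic_partition(nodes, edges):
    """Placeholder partitioner: label nodes by connected component (union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    roots = {}
    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    return [roots.setdefault(find(v), len(roots)) for v in nodes]


def one_hot_rows(partitions):
    """One row per node: concatenated one-hot encodings of all basic partitionings."""
    rows = [[] for _ in range(len(partitions[0]))]
    for labels in partitions:
        k = max(labels) + 1
        for i, lab in enumerate(labels):
            rows[i].extend(1.0 if c == lab else 0.0 for c in range(k))
    return rows


def kmeans(rows, k, rng, iters=20):
    """Plain Lloyd's K-means with squared Euclidean distance."""
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centers[c])))
            for r in rows
        ]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # leave an empty cluster's center unchanged
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def simple_sketch(nodes, edges, k, n_samples=10, rate=0.7, seed=0):
    """Sample links, partition each sampled graph, then ensemble with K-means."""
    rng = random.Random(seed)
    partitions = [
        basic_partition(nodes, sample_links(edges, rate, rng))
        for _ in range(n_samples)
    ]
    return kmeans(one_hot_rows(partitions), k, rng)


if __name__ == "__main__":
    # Two triangles joined by a single bridge link (toy data, not from the paper).
    nodes = list(range(6))
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(simple_sketch(nodes, edges, k=2))
```

In this sketch each basic partitioning is independent of the others, which is the property that would let the sampling and partitioning phases run as parallel tasks; only the final K-means step needs all partitionings together.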