Community detection in social networks using consensus clustering

  • Masoumeh Kheirkhahzadeh Iran University of Science and Technology
  • Morteza Analoui Iran University of Science and Technology
Keywords: Community detection, machine learning, Consensus clustering, Ensemble clustering, networks, graphs

Abstract

Community detection is one of the most appealing research fields in computer science. Although many different methods have been proposed to cluster the nodes of a graph, none of these methods is complete. Each method has strengths and weaknesses in extracting highly coherent groups of nodes (i.e. communities or clusters). The differences that various methods show are typically due to two main factors: 1) structure of the network they operate on, and 2) the strategy they use to find clusters. Since none of these methods is optimal, it seems a good idea to combine them to take advantage of their strengths while minimizing their weaknesses. In this paper, we present a new approach for the community detection problem by considering an ensemble of community detection methods. We refer to our approach as “Mitra”. The base methods employed in Mitra, use different techniques and strategies to find communities for different applications of network data analysis. Mitra employs some known base community detection methods, receives their results on a network and builds a bipartite network to combine the communities found by the base methods. Then the fast projection technique compresses and summarizes the bipartite network to a new unipartite one. Then we find the communities of the unipartite network in the final step. We evaluate our approach against real and artificial datasets and compare our method with each one of the base methods. Artificial datasets include a diverse collection of large scale benchmark graphs. In this work, the main experimental evaluation function is Normalized Mutual Information (NMI ). We also use several measures to compare the quality and properties of final community structures of the partitions found by all methods.

References

http://micans.org/mcl/

http://networkrepository.com/polbooks.php

https://cs.unm.edu/ aaron/research/fastmodularity.htm

https://perso.uclouvain.be/vincent.blondel/research/louvain.html

https://sites.google.com/site/andrealancichinetti/files

https://sites.google.com/site/andrealancichinetti/software

https://www-complexnetworks.lip6.fr/ latapy/pp/walktrap.html

https://www.cs.bris.ac.uk/ steve/networks/software/copra.html

http://www.emilio.ferrara.name/conclude/

http://www.mapequation.org/code.html

http://www.oslom.org/software.html

Abbe, E.: Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research 18(1), 6446–6531 (2017)

Almeida, H., Guedes, D., Meira Jr, W., Zaki, M.J.: Is there a best quality metric for graph clusters? In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 44–59. Springer (2011)

Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment 2008(10),P10,008 (2008)

Brandes, U., Delling, D., Gaertler, M., G¨orke, R., Hoefer, M., Nikoloski, Z., Wagner, D.:Maximizing modularity is hard. arXiv preprint physics/0608255 (2006)

Chakraborty, T., Dalmia, A., Mukherjee, A., Ganguly, N.:Metrics for community analysis:A survey. arXiv preprint arXiv:1604.03512 (2016)

Clauset, A., Newman, M.E., Moore, C.: Finding community structure in very large networks. Physical review E 70(6), 066,111 (2004)

Dahlin, J., Svenson, P.: Ensemble approaches for improving community detection methods.arXiv preprint arXiv:1309.0242 (2013)

Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment 2005(09), P09,008(2005)

De Meo, P., Ferrara, E., Fiumara, G., Provetti, A.: Mixing local and global information for community detection in large networks. Journal of Computer and System Sciences80(1), 72–87 (2014)

Dongen, S.M.: Graph clustering by flow simulation (2000)

Duan, L., Binbasioglu, M.: An ensemble framework for community detection. Journal of Industrial Information Integration 5, 1–5 (2017)

Dudoit, S., Fridlyand, J.: Bagging to improve the accuracy of a clustering procedure.Bioinformatics 19(9), 1090–1099 (2003) Community detection in social networks using consensus clustering 25

Emmons, S., Kobourov, S., Gallant, M., B¨orner, K.: Analysis of network clustering algorithms and cluster quality metrics at scale. arXiv preprint arXiv:1605.05797 (2016)

Fern, X.Z., Brodley, C.E.: Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the twenty-first international conference on Machine learning,p. 36. ACM (2004)

Fortunato, S.: Community detection in graphs. Physics reports 486(3), 75–174 (2010)

Fortunato, S., Barthelemy, M.: Resolution limit in community detection. Proceedings of the National Academy of Sciences 104(1), 36–41 (2007)

Fortunato, S., Castellano, C.: Community structure in graphs. In: Computational Complexity, pp. 490–512. Springer (2012)

Fortunato, S., Hric, D.: Community detection in networks: A user guide. Physics Reports 659, 1–44 (2016)

Fortunato, S., Latora, V., Marchiori, M.: Method to find community structures based on information centrality. Physical review E 70(5), 056,104 (2004)

Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proceedings of the national academy of sciences 99(12), 7821–7826 (2002)

Good, B.H., de Montjoye, Y.A., Clauset, A.: Performance of modularity maximization in practical contexts. Physical Review E 81(4), 046,106 (2010)

Gregory, S.: Finding overlapping communities in networks by label propagation. New Journal of Physics 12(10), 103,018 (2010)

Harenberg, S., Bello, G., Gjeltema, L., Ranshous, S., Harlalka, J., Seay, R., Padmanabhan, K., Samatova, N.: Community detection in large-scale networks: a survey and empirical evaluation. Wiley Interdisciplinary Reviews: Computational Statistics 6(6), 426–439(2014)

Jeub, L.G., Sporns, O., Fortunato, S.: Multiresolution consensus clustering in networks.Scientific reports 8(1), 3259 (2018)

Kanawati, R.: Community detection in social networks: the power of ensemble methods.In: Data Science and Advanced Analytics (DSAA), 2014 International Conference on, pp.46–52. IEEE (2014)

Kanawati, R.: Ensemble selection for community detection in complex networks. In: International Conference on Social Computing and Social Media, pp. 138–147. Springer (2015)

Kannan, R., Vempala, S., Vetta, A.: On clusterings: Good, bad and spectral. Journal of the ACM (JACM) 51(3), 497–515 (2004)

Khatoon, M., Banu, W.A.: A survey on community detection methods in social networks.International Journal of Education and Management Engineering (IJEME) 5(1), 8 (2015)

Kheirkhahzadeh, M., Lancichinetti, A., Rosvall, M.: Efficient community detection of network flows for varying markov times and bipartite networks. Physical Review E 93(3),032,309 (2016)

Lancichinetti, A., Fortunato, S.: Consensus clustering in complex networks. Scientific reports 2 (2012)

Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Physical review E 78(4), 046,110 (2008)

Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S.: Finding statistically significant communities in networks. PloS one 6(4), e18,961 (2011)

Leskovec, J., Lang, K.J., Mahoney, M.: Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World wide web, pp. 631–640. ACM (2010)

MacKay, D.J.: Information theory, inference and learning algorithms. Cambridge university press (2003)

Meil˘a, M.: Comparing clusteringsan information based distance. Journal of multivariate analysis 98(5), 873–895 (2007)

Newman, M.: Networks: an introduction. Oxford university press (2010)

Newman, M.E.: Fast algorithm for detecting community structure in networks. Physical review E 69(6), 066,133 (2004)

Newman, M.E., Girvan, M.: Finding and evaluating community structure in networks.Physical review E 69(2), 026,113 (2004)

Ovelg¨onne, M., Geyer-Schulz, A.: An ensemble learning strategy for graph clustering.Graph Partitioning and Graph Clustering 588, 187 (2012)

Pizzuti, C., Socievole, A.: A genetic algorithm for community detection in attributed graphs. In: International Conference on the Applications of Evolutionary Computation,pp. 159–170. Springer (2018)

Pons, P., Latapy, M.: Computing communities in large networks using andom walks. J.Graph Algorithms Appl. 10(2), 191–218 (2006)

Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proceedings of the National Academy of Sciences of the United States of America 101(9), 2658–2663 (2004)

Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Physical review E 76(3), 036,106 (2007)

Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences 105(4), 1118–1123(2008)

Schaub, M.T., Delvenne, J.C., Rosvall, M., Lambiotte, R.: The many facets of community detection in complex networks. Applied network science 2(1), 4 (2017)

Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Transactions on pattern analysis and machine intelligence 22(8), 888–905 (2000)

Strehl, A., Ghosh, J.: Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of machine learning research 3(Dec), 583–617 (2002)

Tagarelli, A., Amelio, A., Gullo, F.: Ensemble-based community detection in multilayer networks. Data Mining and Knowledge Discovery 31(5), 1506–1543 (2017)

Touretzky, D.S.: Advances in Neural Information Processing Systems 8: Proceedings of the 1995 Conference, vol. 8. Mit Press (1996)

Waltman, L., van Eck, N.J.: A smart local moving algorithm for large-scale modularitybased community detection. The European Physical Journal B 86(11), 1–14 (2013)

Xie, J., Kelley, S., Szymanski, B.K.: Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Computing Surveys (csur) 45(4), 43 (2013)

Yang, J., Leskovec, J.: Defining and evaluating network communities based on groundtruth. Knowledge and Information Systems 42(1), 181–213 (2015)

Zachary, W.W.: An information flow model for conflict and fission in small groups. Journal of anthropological research 33(4), 452–473 (1977)

Published
2019-12-01
How to Cite
Kheirkhahzadeh, M., & Analoui, M. (2019). Community detection in social networks using consensus clustering. Statistics, Optimization & Information Computing, 7(4), 864-884. https://doi.org/10.19139/soic-2310-5070-801
Section
Research Articles