A Consensus Clustering Method for Clustering Social Networks
AbstractDetecting Communities in networks is one of the appealing fields in computer science. A wide range of methods are proposed for this problem. These methods employ different strategies and optimization functions to detect communities (or clusters). Therefore, it seems a good idea to combine these strategies to take advantage of the strengths of the methods and overcome their problems. This is the idea behind consensus clustering technique which combines several clustering results into one. In this paper, we propose a very good-performing method based on consensus clustering to detect communities of a network. Our method, called “Azar”, employed several community detection methods as base methods. Then Azar generates a new compressed network based on the common views of the used base methods and, gives this new compressed network to the last community detection method to find the final partition. We evaluate our approach by employing real and artificial datasets. The implementation results compare the base methods with Azar according to accuracy measures such as modularity and Normalized Mutual Information (NMI). The results show the good-performing behavior of Azar even for the most difficult networks. The results show the brilliant power of Azar in comparison with all the other methods.
M. E. Newman, Fast algorithm for detecting community structure in networks, Physical review E 69 (6) (2004) 066133.
S. Fortunato, M. Barthelemy, Resolution limit in community detection, Proceedings of the National Academy of Sciences 104 (1)(2007) 36–41.
M. Rosvall, C. T. Bergstrom, Maps of random walks on complex networks reveal community structure, Proceedings of the National Academy of Sciences 105 (4) (2008) 1118–1123.
V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of statistical mechanics: theory and experiment 2008 (10) (2008) P10008.
E. Abbe, Community detection and stochastic block models: recent developments, The Journal of Machine Learning Research 18 (1)(2017) 6446–6531.
E. Macedo , Two-Step Semidefinite Programming approach to clustering and dimensionality reduction, Statistics, Optimization & Information Computing 3 (3) (2015) 294–311.
R. Alguliyev,R. Aliguliyev, L Sukhostat, Anomaly detection in Big data based on clustering, Statistics, Optimization & Information Computing 5 (4) (2017) 325–340.
F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, D. Parisi, Defining and identifying communities in networks, Proceedings of the National Academy of Sciences of the United States of America 101 (9) (2004) 2658–2663.
S. Fortunato, V. Latora, M. Marchiori, Method to find community structures based on information centrality, Physical review E 70 (5)(2004) 056104.
S. Fortunato, Community detection in graphs, Physics reports 486 (3) (2010) 75–174.
J. Xie, S. Kelley, B. K. Szymanski, Overlapping community detection in networks: The state-of-the-art and comparative study, ACM Computing Surveys (csur) 45 (4) (2013) 43.
S. Harenberg, G. Bello, L. Gjeltema, S. Ranshous, J. Harlalka, R. Seay, K. Padmanabhan, N. Samatova, Community detection in large-scale networks: a survey and empirical evaluation, Wiley Interdisciplinary Reviews: Computational Statistics 6 (6) (2014) 426-439.
M. Khatoon, W. A. Banu, A survey on community detection
methods in social networks, International Journal of Education and Management Engineering (IJEME) 5 (1) (2015) 8.
S. Fortunato, C. Castellano, Community structure in graphs, in: Computational Complexity, Springer, 2012, pp. 490–512.
S. Dudoit, J. Fridlyand, Bagging to improve the accuracy of a clustering procedure, Bioinformatics 19 (9) (2003) 1090–1099.
M. Kheirkhahzadeh, A. Lancichinetti, M. Rosvall, Efficient community detection of network flows for varying markov times and bipartite networks, Physical Review E 93 (3) (2016) 032309.
D. S. Touretzky, Advances in Neural Information Processing Systems 8: Proceedings of the 1995 Conference, Vol. 8, Mit Press, 1996.
A. Lancichinetti, S. Fortunato, Consensus clustering in complex networks, Scientific reports 2.
M. E. Newman, M. Girvan, Finding and evaluating community structure in networks, Physical review E 69 (2) (2004) 026113.
L. Danon, A. Diaz-Guilera, J. Duch, A. Arenas, Comparing community structure identification, Journal of Statistical Mechanics: Theory and Experiment 2005 (09) (2005) P09008.
S. Fortunato, D. Hric, Community detection in networks: A user guide, Physics Reports 659 (2016) 1–44.
L. G. Jeub, O. Sporns, S. Fortunato, Multiresolution consensus clustering in networks, Scientific reports 8 (1) (2018) 3259.
A. Strehl, J. Ghosh, Cluster ensembles—a knowledge reuse framework for combining multiple partitions, Journal of machine learning research 3 (Dec) (2002) 583–617.
M. Newman, Networks: an introduction, Oxford university press, 2010.
M. Ovelg¨onne, A. Geyer-Schulz, An ensemble learning strategy for graph clustering., Graph Partitioning and Graph Clustering 588 (2012) 187.
R. Kanawati, Community detection in social networks: the power of ensemble methods, in: Data Science and Advanced Analytics (DSAA), 2014 International Conference on, IEEE, 2014, pp. 46–52.
R. Kanawati, Ensemble selection for community detection in complex networks, in: International Conference on Social Computing and Social Media, Springer, 2015, pp. 138–147.
A. Tagarelli, A. Amelio, F. Gullo, Ensemble-based community detection in multilayer networks, Data Mining and Knowledge Discovery 31 (5) (2017) 1506–1543.
C. Pizzuti, A. Socievole, A genetic algorithm for community detection in attributed graphs, in: International Conference on the Applications of Evolutionary Computation, Springer, 2018, pp. 159-170.
A. Clauset, M. E. Newman, C. Moore, Finding community structure in very large networks, Physical review E 70 (6) (2004) 066111.
P. Pons, M. Latapy, Computing communities in large networks using random walks., J. Graph Algorithms Appl. 10 (2) (2006) 191-218.
P. De Meo, E. Ferrara, G. Fiumara, A. Provetti, Mixing local and global information for community detection in large networks,Journal of Computer and System Sciences 80 (1) (2014) 72–87.
S. Gregory, Finding overlapping communities in networks by label propagation, New Journal of Physics 12 (10) (2010) 103018.
A. Lancichinetti, F. Radicchi, J. J. Ramasco, S. Fortunato, Finding statistically significant communities in networks, PloS one 6 (4)(2011) e18961.
U. N. Raghavan, R. Albert, S. Kumara, Near linear time algorithm to detect community structures in large-scale networks, Physical review E 76 (3) (2007) 036106.
S. M. Dongen, Graph clustering by flow simulation, 2000.
J. Huang, H. Sun, J. Han, H. Deng, Y. Sun,Y. Liu, Shrink: a structural clustering algorithm for detecting hierarchical communities in networks. In: Proceedings of the 19th ACM international conference on Information and knowledge management, pp. 219–228.ACM (2010)
W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of anthropological research 33 (4) (1977) 452–473.
M. Girvan, M. E. Newman, Community structure in social and biological networks, Proceedings of the national academy of sciences 99 (12) (2002) 7821–7826.
A. Lancichinetti, S. Fortunato, F. Radicchi, Benchmark graphs for testing community detection algorithms, Physical review E 78 (4)(2008) 046110.
L. Waltman, N. J. van Eck, A smart local moving algorithm for large-scale modularity-based community detection, The European Physical Journal B 86 (11) (2013) 1–14.
S. Emmons, S. Kobourov, M. Gallant, K. B¨orner, Analysis of network clustering algorithms and cluster quality metrics at scale, arXiv preprint arXiv:1605.05797.
U. Brandes, D. Delling, M. Gaertler, R. G¨orke, M. Hoefer, Z. Nikoloski, D. Wagner, Maximizing modularity is hard, arXiv preprint physics/0608255.
B. H. Good, Y.-A. de Montjoye, A. Clauset, Performance of modularity maximization in practical contexts, Physical Review E 81(4)(2010) 046106.
T. Chakraborty, A. Dalmia, A. Mukherjee, N. Ganguly, Metrics for community analysis: A survey, arXiv preprint arXiv:1604.03512.
M. Meil˘a, Comparing clusteringsłan information based distance, Journal of multivariate analysis 98 (5) (2007) 873–895.
D. J. MacKay, Information theory, inference and learning algorithms, Cambridge university press, 2003.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).