DINGA: A Genetic-algorithm-based Method for Finding Important Nodes in Social Networks

Rahmani, H.; Kamali, H.; Shah-Hosseini, H.

doi:10.22044/jadm.2020.9025.2040

Document Type : Applied Article

Authors

¹ Department of Computer Engineering, Faculty of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

² Department of Computer Engineering, Faculty of Mechanic, Electrical and Computer, Science and Research Branch, Islamic Azad University, Tehran, Iran.

https://doi.org/10.22044/jadm.2020.9025.2040

Abstract

Nowadays, a significant amount of studies are devoted to discovering important nodes in graph data. Social networks as graph data have attracted a lot of attention. There are various purposes for discovering the important nodes in social networks such as finding the leaders in them, i.e. the users who play an important role in promoting advertising, etc. Different criteria have been proposed in discovering important nodes in graph data. Measuring a node’s importance by a single criterion may be inefficient due to the variety of graph structures. Recently, a combination of criteria has been used in the discovery of important nodes. In this paper, we propose a system for the Discovery of Important Nodes in social networks using Genetic Algorithms (DINGA). In our proposed system, important nodes in social networks are discovered by employing a combination of eight informative criteria and their intelligent weighting. We compare our results with a manually weighted method, that uses random weightings for each criterion, in four real networks. Our method shows an average of 22% improvement in the accuracy of important nodes discovery.

Keywords

References

[1] Zhang, X., & Dong, D. (2008). Ways of identifying the opinion leaders in virtual communities. International Journal of Business and Management, nol. 3, no.7, pp. 21-27.

[2] Karimi Zandian, Z., & Keyvanpour, M. R. (2019). MEFUASN: a helpful method to extract features using analyzing social network for fraud detection. Journal of AI and Data Mining, vol. 7, no. 2, pp. 213-224.

[3] Yada, K., Motoda, H., Washio, T., & Miyawaki, A. (2006). Consumer behavior analysis by graph mining technique. New Mathematics and Natural Computation, vol. 2, no. 01, pp. 59-68.

[4] Parthasarathy, S., Tatikonda, S., & Ucar, D. (2010). A survey of graph mining techniques for biological datasets. In Managing and mining graph data, Springer, Boston, MA, pp. 547-580.

[5] Rahmani, H., Blockeel, H., & Bender, A. (2016). Using a human drug network for generating novel hypotheses about drugs. Intelligent Data Analysis, vol. 20, no. 1, pp. 183-197.

[6] Klopman, G. (1984). Artificial intelligence approach to structure-activity studies. Computer automated structure evaluation of biological activity of organic molecules. Journal of the American Chemical Society, vol. 106, no. 24, pp. 7315-7321.

[7] Phua, C., Lee, V., Smith, K., & Gayler, R. (2010). A comprehensive survey of data mining-based fraud detection research. arXiv preprint arXiv: 1009.6119.

[8] Xiaojun, L., Song, H., & Weikun, X. (2017). The Analysis of Logistics Influence of the Important Node Cities of Beijing-Tianjin-Hebei. International Journal of Business and Economics Research, vol. 6, no.5, pp. 88.

[9] Rahmani H, Blockeel H, Bender A. (2009). Predicting the functions of proteins in protein-protein interaction networks from global information. In: Machine Learning in Systems Biology, Ljubljana, Slovenia. pp. 82-97.

[10] Cook, D. J., Holder, L. B., & Ketkar, N. (2006). Unsupervised and supervised pattern learning in graph data. Mining Graph Data, pp. 159-180.

[11] Liu, J. G., Ren, Z. M., & Guo, Q. (2013). Ranking the spreading influence in complex networks. Physica A: Statistical Mechanics and its Applications, vol. 392, no. 18, pp. 4154-4159.

[12] Blondel, V. D., Decuyper, A., & Krings, G. (2015). A survey of results on mobile phone datasets analysis. EPJ data science, vol. 4, no. 1, pp. 10.

[13] Garton, L., Haythornthwaite, C., & Wellman, B. (1997). Studying online social networks. Journal of computer-mediated communication, vol. 3, no. 1, JCMC313.

[14] Hanneman, R. A., & Riddle, M. (2005). Introduction to social network methods. University of California, Riverside. CA (Online book).

[15] Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications, Cambridge university press, vol. 8.

[16] Borgatti, S. P. (2006). Identifying sets of key players in a social network. Computational & Mathematical Organization Theory, vol. 12, no. 1, pp. 21-34.

[17] Yang WS, Dia JB, Cheng HC, Lin HT.) 2006(. Mining social networks for targeted advertising. In: Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06), Kauia, HI, USA, pp. 137a-137a.

[18] Kazienko, P., & Musial, K. (2007). On utilising social networks to discover representatives of human communities. IJIIDS, vol. 1, no. 3/4, pp. 293-310.

[19] Brandes, U. (2001). A faster algorithm for betweenness centrality. Journal of mathematical sociology, vol. 25, no. 2, pp. 163-177.

[20] Noble CC, Cook DJ. (2003). Graph-based anomaly detection. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining (ACM), New York, NY, USA, pp. 631-636.

[21] Shetty, J., & Adibi, J. (2005). Discovering important nodes through graph entropy the case of enron email database. In Proceedings of the 3rd international workshop on Link discovery, ACM, pp. 74-81.

[22] Kajdanowicz, T., & Morzy, M. (2016). Using graph and vertex entropy to compare empirical graphs with theoretical graph models. Entropy, vol. 18, no. 9, pp. 320.

[23] Bashiri v, Rahmani H, Bashiri H. (2017). Finding Important Nodes in Social Networks. In: Third International Conference on Web Research (ICWR), Tehran, Iran.

[24] Hu, J., Han, Y., & Hu, J. (2010). Topological potential: modeling node importance with activity and local effect in complex networks. In 2010 Second International Conference on Computer Modeling and Simulation, IEEE, vol. 2, pp. 411-415.

[25] Xu, Y., Gao, Z., Xiao, B., Meng, F., & Lin, Z. (2013). Key nodes evaluation with multi-criteria in complex networks based on AHP analysis. In 2013 5th IEEE International Conference on Broadband Network & Multimedia Technology, IEEE, pp. 105-109.

[26] Bian T, Hu J, Deng Y. (2017). Identifying influential nodes in complex networks based on AHP. Physica A: Statistical Mechanics and its Applications, vol. 479, pp. 422-36.

[27] Yu, H., Liu, Z., & Li, Y. J. (2013). Key nodes in complex networks identified by multi-attribute decision-making method.

[28] Du, Y., Gao, C., Hu, Y., Mahadevan, S., & Deng, Y. (2014). A new method of identifying influential nodes in complex networks based on TOPSIS. Physica A: Statistical Mechanics and its Applications, vol. 399, pp. 57-69.

[29] Yang, Y., & Xie, G. (2016). Efficient identification of node importance in social networks. Information Processing & Management, vol. 52, no. 5, pp. 911-922.

[30] Kaur, M., & Singh, S. (2016). Analyzing negative ties in social networks: A survey. Egyptian Informatics Journal, vol. 17, no. 1, pp. 21-43.

[31] Li, M., Wang, J., Wang, H., & Pan, Y. (2012). Identification of essential proteins based on edge clustering coefficient. IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, pp. 1070-1080.

[32] Li, M., Wang, J., Chen, X., Wang, H., & Pan, Y. (2011). A local average connectivity-based method for identifying essential proteins from the network level. Computational biology and chemistry, vol. 35, no. 3, pp. 143-150.

[33] Crandall, D., Cosley, D., Huttenlocher, D., Kleinberg, J., & Suri, S. (2008). Feedback effects between similarity and social influence in online communities. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp. 160-168.

[34] Wang, S., Du, Y., & Deng, Y. (2017). A new measure of identifying influential nodes: Efficiency centrality. Communications in Nonlinear Science and Numerical Simulation, vol. 47, pp. 151-163.

[35] Chen, D., Lu, L., Shang, M. S., Zhang, Y. C., & Zhou, T. (2012). Identifying influential nodes in complex networks. Physica a: Statistical mechanics and its applications, vol. 391, no. 4, pp. 1777-1787.

[36] Newman, M. E. (2000). Who is the best connected scientist? A study of scientific coauthorship networks, arXiv preprint cond-mat/0011144.

[37] Getoor, L., Segal, E., Taskar, B., & Koller, D. (2001). Probabilistic models of text and link structure for hypertext classification. In IJCAI workshop on text learning: beyond supervision, pp. 24-29.

[38] Emshoff, J. R., & Saaty, T. L. (1982). Applications of the analytic hierarchy process to long range planning processes. European Journal of Operational Research, vol. 10, no. 2, pp. 131-143.

[39] Boldi, P., Luongo, A., & Vigna, S. (2017). Rank monotonicity in centrality measures. Network Science, vol. 5, no. 4, 529-550.

[40] Boldi, P. (2015). Large-scale Network Analytics: Diffusion-based Computation of Distances and Geometric Centralities. In Proceedings of the 24th International Conference on World Wide Web, pp. 1313-1313.

[41] De Meo, P., Levene, M., & Provetti, A. (2019). Potential gain as a centrality measure. In IEEE/WIC/ACM International Conference on Web Intelligence, pp. 418-422.

[42] De Meo, P., Levene, M., Messina, F., & Provetti, A. (2019). A general centrality framework based on node navigability. IEEE Transactions on Knowledge and Data Engineering.

[43] Roy, M., & Pan, I. (2018). Most Influential Node Selection in Social Network using Genetic Algorithm. In 2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), IEEE, pp. 214-220.

[44] Weskida, M., & Michalski, R. (2019). Finding influentials in social networks using evolutionary algorithm. Journal of Computational Science, vol. 31, pp. 77-85.

[45] Gen M, Cheng R. (2000). Genetic algorithms and engineering optimization, Hoboken, NJ, USA, John Wiley \& Sons.

[46] Eiben, A. E., & Smith, J. E. (2003). Introduction to evolutionary computing, Berlin, springer, vol. 53, pp. 18.

[47] Vertan, C., Vertan, C. I., & Buzuloiu, V. (1997, July). Reduced computation genetic algorithm for noise removal, In: 1997 Sixth International Conference on Image Processing and Its Applications, IET, vol. 1, pp. 313-316.

[48] Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of anthropological research, vol. 33, no. 4, pp. 452-473.

[49] Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the national academy of sciences, vol. 99, no. 12, pp. 7821-7826.

[50] Auerbach, D. M., Darrow, W. W., Jaffe, H. W., & Curran, J. W. (1984). Cluster of cases of the acquired immune deficiency syndrome: Patients linked by sexual contact. The American journal of medicine, vol. 76, no. 3, pp. 487-492.

[51] Rahmani, H., Weiss, G., Mendez-Lucio, O., & Bender, A. (2016). ARWAR: A network approach for predicting Adverse Drug Reactions. Computers in biology and medicine, vol. 68, 101-108.

[52] Creamer, G., Rowe, R., Hershkop, S., & Stolfo, S. J. (2007). Segmentation and automated social hierarchy detection through email network analysis. In International Workshop on Social Network Mining and Analysis, Springer, Berlin, Heidelberg, pp. 40-58.

[53] Gilbert, E. (2012). Phrases that signal workplace hierarchy. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work, ACM, pp. 1037-1046.

[54] Peri, S., Navarro, J. D., Kristiansen, T. Z., Amanchy, R., Surendranath, V., Muthusamy, B., ... & Rashmi, B. P. (2004). Human protein reference database as a discovery resource for proteomics. Nucleic acids research, vol. 32, D497-D501.

[55] Stark, C., Breitkreutz, B. J., Reguly, T., Boucher, L., Breitkreutz, A., & Tyers, M. (2006). BioGRID: a general repository for interaction datasets. Nucleic acids research, vol. 34, D535-D539.

[56] Radivojac, P., Peng, K., Clark, W. T., Peters, B. J., Mohan, A., Boyle, S. M., & Mooney, S. D. (2008). An integrated approach to inferring gene–disease associations in humans. Proteins: Structure, Function, and Bioinformatics, vol. 72 no. 3, 1030-1037.

[57] Chen, W. H., Lu, G., Chen, X., Zhao, X. M., & Bork, P. (2016). OGEE v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines. Nucleic acids research, gkw1013.

[58] Lu, Z. M., & Feng, Y. P. (2015). Critical Nodes and Links Evaluation with Multi-Criteria Based on Entropy-Weighted Method.

DINGA: A Genetic-algorithm-based Method for Finding Important Nodes in Social Networks

References

References

Volume 8, Issue 4October 2020Pages 545-555

Volume 8, Issue 4
October 2020
Pages 545-555