Document Type: Technical Paper

Authors

Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran.

Abstract

Link prediction (LP) has become a prominent topic in the data mining, machine learning, and deep learning communities. This study applies bibliometric analysis to assess the current status of LP research and to examine it from several perspectives. It provides a Scopus-based bibliometric overview of the LP literature since 1987, when the first LP studies were published. The analysis covers the distribution of documents by type, subject, and country, together with author productivity, citation analysis, and keyword analysis; Bradford’s law is applied to identify the main journals in the field. Most documents were published in conference proceedings, and the majority fall within the computer science and mathematics subject areas. So far, China has been the leading publishing country. The most active sources of LP publications are Lecture Notes in Computer Science (including its subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) and IEEE Access. The keyword analysis shows that social networks attracted attention in the early period, whereas knowledge graphs have attracted more attention recently. Because the LP problem has recently been approached using machine learning (ML), the findings may encourage researchers to concentrate on ML techniques. This is the first bibliometric study of the “link prediction” literature, and it provides a broad view of the field.
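
As an illustration of how Bradford’s law can be used to single out the main journals, the sketch below ranks sources by document count and splits them into zones that each hold roughly an equal share of the documents; the first zone is the small "core" set. This is a minimal sketch and not the authors' procedure: the function name bradford_zones, the source names, and the counts are hypothetical and are not taken from the study's data.

```python
from typing import Dict, List


def bradford_zones(counts: Dict[str, int], n_zones: int = 3) -> List[List[str]]:
    """Split sources into zones holding roughly equal shares of documents.

    `counts` maps a source title to its number of documents (hypothetical input).
    Zone 0 is the core: the few most productive sources.
    """
    zones: List[List[str]] = [[] for _ in range(n_zones)]
    total = sum(counts.values())
    if total == 0:
        return zones

    # Rank sources from most to least productive (ties broken by name).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    cumulative = 0
    for name, n_docs in ranked:
        # The zone is decided by the cumulative share reached before this source.
        zone = min(n_zones - 1, cumulative * n_zones // total)
        zones[zone].append(name)
        cumulative += n_docs
    return zones


# Hypothetical counts, for illustration only.
example_counts = {"A": 30, "B": 15, "C": 15, "D": 6, "E": 6, "F": 6, "G": 6, "H": 6}
zones = bradford_zones(example_counts)
for index, members in enumerate(zones, start=1):
    print(f"Zone {index}: {len(members)} sources -> {members}")
```

With these made-up counts, one source supplies the first third of the documents, two supply the second third, and five supply the last third, which is the widening pattern Bradford’s law describes.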
