Optimizing CELF Algorithm for Influence Maximization Problem in Social Networks

Taherinia, M.; Esmaeili, M.; Minaei Bidgoli, B.

doi:10.22044/jadm.2021.10964.2254

Document Type : Original/Review Paper

Authors

¹ Department of Computer, Kashan Branch, Islamic Azad University, Kashan, Iran.

² School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

https://doi.org/10.22044/jadm.2021.10964.2254

Abstract

The Influence Maximization Problem in social networks aims to find a minimal set of individuals to produce the highest influence on other individuals in the network. In the last two decades, a lot of algorithms have been proposed to solve the time efficiency and effectiveness challenges of this NP-Hard problem. Undoubtedly, the CELF algorithm (besides the naive greedy algorithm) has the highest effectiveness among them. Of course, the CELF algorithm is faster than the naive greedy algorithm (about 700 times). This superiority has led many researchers to make extensive use of the CELF algorithm in their innovative approaches.
However, the main drawback of the CELF algorithm is the very long running time of its first iteration. Because it has to estimate the influence spread for all nodes by expensive Monte-Carlo simulations, similar to the naive greedy algorithm. In this paper, a heuristic approach is proposed, namely Optimized-CELF algorithm, to improve this drawback of the CELF algorithm by avoiding unnecessary Monte-Carlo simulations. It is found that the proposed algorithm reduces the CELF running time, and subsequently improves the time efficiency of other algorithms that employed the CELF as a base algorithm. Experimental results on the wide spectral of real datasets showed that the Optimized-CELF algorithm provided better running time gain, about 88-99% and 56-98% for k=1 and k=50, respectively, compared to the CELF algorithm without missing effectiveness.

Keywords

20.1001.1.23225211.2022.10.1.3.1

References

[1] Y. Li, J. Fan, Y. Wang, and K.-L. L. Tan, “Influence Maximization on Social Graphs: A Survey,” IEEE Trans. Knowl. Data Eng., vol. 30, no. 10, pp. 1852–1872, Oct. 2018.

[2] K. Li, L. Zhang, and H. Huang, “Social Influence Analysis: Models, Methods, and Evaluation,” Engineering, vol. 4, no. 1, pp. 40–46, Feb. 2018.

[3] P. Domingos and M. Richardson, “Mining the network value of customers,” in Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001, pp. 57–66.

[4] D. Kempe, J. Kleinberg, and É. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003, pp. 137–146.

[5] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. Vanbriesen, and N. Glance, “Cost-effective outbreak detection in networks,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007, pp. 420–429.

[6] A. Goyal, W. Lu, and L. V. S. Lakshmanan, “CELF++: Optimizing the greedy algorithm for influence maximization in social networks,” in Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011, 2011, pp. 47–48.

[7] W. Chen, Y. Wang, and S. Yang, “Efficient influence maximization in social networks,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2009, pp. 199–207.

[8] S. Cheng, H. Shen, J. Huang, G. Zhang, and X. Cheng, “StaticGreedy: Solving the scalability-accuracy dilemma in influence maximization,” in International Conference on Information and Knowledge Management, Proceedings, 2013, pp. 509–518.

[9] C. Zhou, P. Zhang, W. Zang, and L. Guo, “On the Upper Bounds of Spread for Greedy Algorithms in Social Network Influence Maximization,” IEEE Trans. Knowl. Data Eng., vol. 27, no. 10, pp. 2770–2783, Oct. 2015.

[10] Y. Tang, X. Xiao, and Y. Shi, “Influence maximization: Near-optimal time complexity meets practical efficiency,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, 2014, pp. 75–86.

[11] Y. Tang, Y. Shi, and X. Xiao, “Influence maximization in near-linear time: A martingale approach,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, 2015, vol. 2015-May, pp. 1539–1554.

[12] S. Banerjee, M. Jenamani, and D. K. Pratihar, “A survey on influence maximization in a social network,” Knowl. Inf. Syst., vol. 62, no. 9, pp. 3417–3455, Sep. 2020.

[13] S. Peng, Y. Zhou, L. Cao, S. Yu, J. Niu, and W. Jia, “Influence analysis in social networks: A survey,” J. Netw. Comput. Appl., vol. 106, pp. 17–32, Mar. 2018.

[14] L. Page, L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank Citation Ranking: Bringing Order to the Web,” -, 1998.

[15] J. M. Kleinberg, “Authoritative sources in a hyperlinked environment,” J. ACM, vol. 46, no. 5, pp. 604–632, 1999.

[16] A. Goyal, W. Lu, and L. V. S. Lakshmanan, “SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model,” in 2011 IEEE 11th International Conference on Data Mining, 2011, pp. 211–220.

[17] M. Kimura, K. Saito, R. Nakano, and H. Motoda, “Extracting influential nodes on a social network for information diffusion,” Data Min. Knowl. Discov., vol. 20, no. 1, pp. 70–97, Jan. 2010.

[18] W. Chen, C. Wang, and Y. Wang, “Scalable influence maximization for prevalent viral marketing in large-scale social networks,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010, pp. 1029–1038.

[19] J. Kim, S. K. Kim, and H. Yu, “Scalable and parallelizable processing of influence maximization for large-scale social networks?,” in Proceedings - International Conference on Data Engineering, 2013, pp. 266–277.

[20] W. Chen, Y. Yuan, and L. Zhang, “Scalable influence maximization in social networks under the linear threshold model,” in Proceedings - IEEE International Conference on Data Mining, ICDM, 2010, pp. 88–97.

[21] R. Narayanam and Y. Narahari, “A shapley value-based approach to discover influential nodes in social networks,” IEEE Trans. Autom. Sci. Eng., vol. 8, no. 1, pp. 130–147, Jan. 2011.

[22] K. Jung, W. Heo, and W. Chen, “IRIE: Scalable and robust influence maximization in social networks,” in Proceedings - IEEE International Conference on Data Mining, ICDM, 2012, pp. 918–923.

[23] Q. Liu, B. Xiang, E. Chen, H. Xiong, F. Tang, and J. X. Yu, “Influence maximization over large-scale social networks: A bounded linear approach,” in CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management, 2014, pp. 171–180.

[24] S. Cheng, H.-W. Shen, J. Huang, W. Chen, and X.-Q. Cheng, “IMRank: Influence Maximization via Finding Self-Consistent Ranking,” SIGIR 2014 - Proc. 37th Int. ACM SIGIR Conf. Res. Dev. Inf. Retr., pp. 475–484, Feb. 2014.

[25] N. Ohsaka, T. Akiba, Y. Yoshida, and K. Kawarabayashi, “Fast and accurate influence maximization on large networks with pruned Monte-Carlo simulations,” in Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014, pp. 138–144.

[26] S. Galhotra, A. Arora, and S. Roy, “Holistic Influence Maximization: Combining Scalability and Efficiency with Opinion-Aware Models,” Proc. ACM SIGMOD Int. Conf. Manag. Data, vol. 26-June-20, pp. 743–758, Feb. 2016.

[27] N. Sumith, B. Annappa, and S. Bhattacharya, “Influence maximization in large social networks: Heuristics, models and parameters,” Futur. Gener. Comput. Syst., vol. 89, pp. 777–790, Dec. 2018.

[28] M. Zarezade, E. Nourani, and A. Bouyer, “Community Detection using a New Node Scoring and Synchronous Label Updating of Boundary Nodes in Social Networks,” J. AI Data Min., vol. 8, no. 2, pp. 201–212, 2020.

[29] Y. Wang, G. Cong, G. Song, and K. Xie, “Community-based Greedy Algorithm for Mining top-K Influential Nodes in Mobile Social Networks,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010, pp. 1039–1048.

[30] H. Li, S. S. Bhowmick, A. Sun, and J. Cui, “Conformity-aware influence maximization in online social networks,” VLDB J., vol. 24, no. 1, pp. 117–141, Jan. 2015.

[31] A. Bozorgi, H. Haghighi, M. Sadegh Zahedi, and M. Rezvani, “INCIM: A community-based algorithm for influence maximization problem under the linear threshold model,” Inf. Process. Manag., vol. 52, no. 6, pp. 1188–1199, Nov. 2016.

[32] Y. Y. Ko, K. J. Cho, and S. W. Kim, “Efficient and effective influence maximization in social networks: A hybrid-approach,” Inf. Sci. (Ny)., vol. 465, pp. 144–161, Oct. 2018.

[33] J. Shang, H. Wu, S. Zhou, J. Zhong, Y. Feng, and B. Qiang, “IMPC: Influence maximization based on multi-neighbor potential in community networks,” Phys. A Stat. Mech. its Appl., vol. 512, pp. 1085–1103, Dec. 2018.

[34] S. Jendoubi, A. Martin, L. Liétard, H. Ben Hadji, and B. Ben Yaghlane, “Two evidential data based models for influence maximization in Twitter,” Knowledge-Based Syst., vol. 121, pp. 58–70, Apr. 2017.

[35] G. Corneuejols, M. L. Fisher, and G. L. Nemhauser, “Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms.,” Manage. Sci., vol. 23, no. 8, pp. 789–810, Apr. 1977.

[36] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher, “An analysis of approximations for maximizing submodular set functions-I,” Math. Program., vol. 14, no. 1, pp. 265–294, Dec. 1978.

[37] R. A. Rossi and N. K. Ahmed, “The Network Data Repository with Interactive Graph Analytics and Visualization,” in AAAI, 2015. [Online]. Available: https://networkrepository.com/. [Accessed: 25-Oct-2020].

[38] J. Leskovec and A. Krevl, “SNAP Datasets,” Stanford, Jun-2014. [Online]. Available: http://snap.stanford.edu/data. [Accessed: 25-Oct-2020].