Document Type: Methodologies

Authors

1 Computer Engineering, Sheikh Bahaei University, Isfahan, Iran.

2 Computer Science and Computer Engineering, University of Isfahan, Khansar Campus, Isfahan, Iran.

3 MSE, University Canada West, Vancouver, Canada.

Abstract

Efficiently mining regular-frequent patterns from sensor-produced data has become a challenge. The large volume of data prolongs runtime, delaying vital predictions and decisions that require an immediate response. Big data platforms and parallel algorithms are therefore an appropriate solution. In addition, an incremental technique is better suited than static methods to mining patterns from big data streams. This study presents an incremental parallel approach and a compact tree structure for extracting regular-frequent patterns from wireless sensor network data. The approach also performs fewer database scans in order to reduce mining runtime. Experiments were run on the Intel 5-day and 10-day datasets using clusters of 6, 4, and 2 nodes. The results show that runtime improved in all three cluster configurations, by 14%, 18%, and 34% for the 5-day dataset and by 22%, 55%, and 85% for the 10-day dataset, respectively.
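
The abstract does not define a regular-frequent pattern, so the following Python sketch assumes the definition commonly used in the regular-pattern literature: a pattern is frequent if its support (number of transactions containing it) is at least a minimum threshold, and regular if the largest gap between its consecutive occurrences, including the gaps from the start and to the end of the database, is at most a maximum threshold. The function names, the thresholds `min_sup` and `max_reg`, and the toy sensor transactions are all hypothetical. The brute-force enumeration is for illustration only and is not the incremental parallel tree-based approach presented in the study.

```python
from itertools import combinations


def regularity(tids, db_size):
    """Largest gap between consecutive occurrences of a pattern.

    Assumed definition: the gap from the start of the database (0) to the
    first occurrence and from the last occurrence to the end are included.
    """
    points = [0] + sorted(tids) + [db_size]
    return max(b - a for a, b in zip(points, points[1:]))


def regular_frequent_patterns(transactions, min_sup, max_reg, max_len=2):
    """Brute-force search for patterns of up to max_len items.

    Returns {pattern: (support, regularity)} for every pattern whose
    support >= min_sup and whose regularity <= max_reg.
    """
    tid_lists = {}
    # Transaction ids start at 1, in arrival order.
    for tid, items in enumerate(transactions, start=1):
        for size in range(1, max_len + 1):
            for pattern in combinations(sorted(items), size):
                tid_lists.setdefault(pattern, []).append(tid)

    n = len(transactions)
    result = {}
    for pattern, tids in tid_lists.items():
        sup, reg = len(tids), regularity(tids, n)
        if sup >= min_sup and reg <= max_reg:
            result[pattern] = (sup, reg)
    return result


# Hypothetical data: each transaction is the set of sensors that reported
# an event in one time slot.
transactions = [
    {"s1", "s2"},
    {"s1", "s3"},
    {"s1", "s2"},
    {"s2", "s3"},
    {"s1", "s2"},
    {"s1"},
]

result = regular_frequent_patterns(transactions, min_sup=3, max_reg=2)
for pattern, (sup, reg) in sorted(result.items()):
    print(pattern, "support =", sup, "regularity =", reg)
```

In this toy data, sensor `s3` occurs at regular intervals but in only two transactions, so it is regular without being frequent and is filtered out when `min_sup=3`.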
