Document Type : Original/Review Paper

Authors

1 Department of Computer Engineering, Science and Research branch, Islamic Azad University,Tehran, Iran.

2 Department of Computer Engineering Amirkabir University of Technology,Tehran, Iran.

Abstract

Spatio-temporal (ST) clustering is a relatively new field in data mining with great popularity, especially in geographic information. Moving objects are a type of ST data where the available information on these objects includes their last position. The strategy of performing the clustering operation on all-time sequences is used for clustering moving objects. The problem with density-based clustering, which uses this strategy, is that the density of clusters may change at any point in time because of the displacement of points. Hence, the input parameters of an algorithm like DBSCAN used to cluster moving objects will change and have to be determined again. The DBSCAN-based methods have been proposed so far, assuming that the value of input parameters is fixed over time and does not provide a solution for their automatic determination. Nonetheless, with the objects moving and the density of the clusters changing, these parameters have to be determined appropriately again at each time interval. The paper used a dynamic multi-objective genetic algorithm to determine the parameters of the DBSCAN algorithm dynamically and automatically to solve this problem. The proposed algorithm in each time interval uses the clustering information of the previous time interval to determine the parameters. Beijing traffic control data was used as a moving dataset to evaluate the proposed algorithm. The experiments show that using the proposed algorithm for dynamic determination of DBSCAN input parameters outperforms DBSCAN with fixed input parameters over time in terms of the Silhouette and Outlier indices.

Keywords

[1] S. Kisilevich, F. Mansmann, M. Nanni, and S. Rinzivillo, "Spatio-temporal clustering," in Data mining and knowledge discovery handbook: Springer, 2009, pp. 855-874.
 
[2] Z. Falahiazar, A. BAGHERF, and M. Reshadi, "Determining the Parameters of DBSCAN Automatically Using the Multi-Objective Genetic Algorithm," Journal of Information Science & Engineering, vol. 37, no. 1, 2021.
 
[3] P. Kalnis, N. Mamoulis, and S. Bakiras, "On discovering moving clusters in spatio-temporal data," in International Symposium on Spatial and Temporal Databases, 2005, pp. 364-381: Springer.
 
[4] C. S. Jensen, D. Lin, and B. C. Ooi, "Continuous clustering of moving objects," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 9, 2007.
 
[5] M. Ester, H.-P. Kriegel, J. Sander, M. Wimmer, and X. Xu, "Incremental clustering for mining in a data warehousing environment," in VLDB, 1998, vol. 98, pp. 323-333: Citeseer.
 
[6] N. Goyal, P. Goyal, K. Venkatramaiah, P. Deepak, and P. Sanoop, "An efficient density based incremental clustering algorithm in data warehousing environment," in 2009 International Conference on Computer Engineering and Applications, IPCSIT, 2011, Vol. 2, pp. 482-486.
 
[7] A. M. Bakr, N. M. Ghanem, and M. A. Ismail, "Efficient incremental density-based algorithm for clustering large datasets," Alexandria engineering journal, Vol. 54, No. 4, pp. 1147-1154, 2015.
 
[8] P. Yadav and P. Sharma, "An Efficient Incremental Density based Clustering Algorithm Fused with Noise Removal and Outlier Labelling Technique," Indian Journal of Science and Technology, Vol. 9, No. 48, 2016.
 
[9] L. Pradeep and A.M. Sowjanya, "Multi-Density based Incremental Clustering" International Journal of Computer Applications, Vol. 116, No. 17, pp. 0975–8887, 2015.
 
[10] Y. Gong, R. O. Sinnott, and P. Rimba, "RT-DBSCAN: Real-Time Parallel Clustering of Spatio-Temporal Data Using Spark-Streaming," in International Conference on Computational Science, 2018, pp. 524-539: Springer.
 
[11] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Kdd, 1996, Vol. 96, No. 34, pp. 226-231.
 
[12] L. Falahiazar, V. Seydi, and M. Mirzarezaee, "Sequential Multi-objective Genetic Algorithm," Journal of AI and Data Mining, vol. 9, no. 3, pp. 369-381, 2021.
 
[13] U. Maulik, S. Bandyopadhyay, and A. Mukhopadhyay, Multiobjective Genetic Algorithms for Clustering: Applications in Data Mining and Bioinformatics. Springer Science & Business Media, 2011.
 
[14] C. A. C. Coello, G. B. Lamont, and D. A. Van Veldhuizen, Evolutionary algorithms for solving multi-objective problems. Springer, 2007.
 
[15] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE transactions on evolutionary computation, vol. 6, no. 2, pp. 182-197, 2002.
 
[16] B. Delaunay, "Sur la  sphère  vide," Izvestia  Akademii  Nauk  SSSR,  Otdelenie  Matematicheskikh  i  Estestvennykh Nauk, pp. 793–800, 1934.
 
[17] P. Roy and J. Mandal, "A novel spatial fuzzy clustering using delaunay triangulation for large scale gis data (nsfcdt)," Procedia Technology, Vol. 6, pp. 452-459, 2012.
 
[18] K. M. Ramachandran and C. P. Tsokos, Mathematical statistics with applications in R. Elsevier, 2014.
 
[19] R. L. Ott and M. T. Longnecker, An introduction to statistical methods and data analysis. Nelson Education, 2015.
 
[20] P. J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis," Journal of computational and applied mathematics, vol. 20, pp. 53-65, 1987.
 
[21] J. Yuan et al., "T-drive: driving directions based on taxi trajectories," in Proceedings of the 18th SIGSPATIAL International conference on advances in geographic information systems, 2010, pp. 99-108: ACM.
 
[22] J. Yuan, Y. Zheng, X. Xie, and G. Sun, "Driving with knowledge from the physical world," in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, 2011, pp. 316-324: ACM.