Document Type: Original/Review Paper
Authors
- J. Tayyebi ^{1}
- E. Hosseinzadeh ^{2}
^{1} Department of Industrial Engineering, Birjand University of Technology, Birjand, Iran.
^{2} Department of Mathematics, Kosar University of Bojnord, Bojnord, Iran.
Abstract
The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is suitable only for crisp, complete data. This article proposes an extension of the algorithm for clustering trapezoidal fuzzy data, in which a linear ranking function is used to define a distance between trapezoidal fuzzy numbers. As an application, a method based on the proposed algorithm is presented for clustering incomplete fuzzy data. The method replaces each missing attribute with a trapezoidal fuzzy number determined from the corresponding attribute of the q nearest neighbors. Comparison and analysis of the experimental results demonstrate the capability of the proposed method.
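The imputation idea described above can be illustrated with a minimal sketch. The abstract does not give the exact ranking function, distance, or rule for building the substitute fuzzy number, so everything specific here is an assumption: `rank` takes the mean of the four defining points of a trapezoid (one common linear ranking function), `partial_distance` compares ranked values over the attributes observed in both records, and `impute` averages the q nearest donors component-wise. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Optional, Tuple

# A trapezoidal fuzzy number (a1, a2, a3, a4) with a1 <= a2 <= a3 <= a4.
Trapezoid = Tuple[float, float, float, float]
# A record is a list of attributes; None marks a missing attribute.
Record = List[Optional[Trapezoid]]


def rank(t: Trapezoid) -> float:
    """Assumed linear ranking function: mean of the four defining points."""
    return sum(t) / 4.0


def partial_distance(x: Record, y: Record) -> float:
    """Assumed distance: root mean squared difference of ranked values,
    taken only over attributes observed in both records."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((rank(a) - rank(b)) ** 2 for a, b in shared) / len(shared))


def impute(data: List[Record], i: int, j: int, q: int) -> Trapezoid:
    """Fill attribute j of record i from its q nearest neighbors.
    Assumption: the substitute is the component-wise mean of the
    neighbors' trapezoids, which is again a valid trapezoid."""
    donors = [r for k, r in enumerate(data) if k != i and r[j] is not None]
    donors.sort(key=lambda r: partial_distance(data[i], r))
    nearest = [r[j] for r in donors[:q]]
    if not nearest:
        raise ValueError("no record has attribute j observed")
    return tuple(sum(t[c] for t in nearest) / len(nearest) for c in range(4))


# Toy data: record 0 is missing its second attribute.
data = [
    [(1.0, 2.0, 3.0, 4.0), None],
    [(1.0, 2.0, 3.0, 5.0), (0.0, 1.0, 2.0, 3.0)],
    [(2.0, 3.0, 4.0, 5.0), (2.0, 3.0, 4.0, 5.0)],
    [(9.0, 10.0, 11.0, 12.0), (8.0, 9.0, 9.0, 10.0)],
]
filled = impute(data, i=0, j=1, q=2)
print(filled)
```

With q = 2, the two records closest to record 0 on the shared first attribute supply the substitute value. After every missing attribute is filled this way, the completed data set could be passed to a fuzzy c-means routine that uses the same ranking-based distance.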