Document Type : Original/Review Paper


1 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.

2 Faculty of Electronic and Computer Engineering, K.N. Toosi University of Technology, Tehran, Iran.


Owing to the growing number of data-driven approaches, especially in artificial intelligence and machine learning, extracting the most informative features from gathered data is a considerable challenge; storage cost is another important aspect of this issue. Principal component analysis (PCA) and autoencoders (AEs) are representative feature extraction methods in data science and machine learning that are widely used across applications. The current work integrates the advantages of AEs and PCA to present an online supervised feature extraction method. Accordingly, the labels required by the final model are involved in the feature extraction procedure and embedded in the PCA method as well. Moreover, stacking nonlinear autoencoder layers with the PCA algorithm eliminates the kernel selection required by traditional kernel PCA methods. Besides the performance improvement demonstrated by the experimental results, the main advantage of the proposed method is that, in contrast with traditional PCA approaches, the model does not require access to all samples to perform feature extraction. Compared with previous works, the proposed method outperforms other state-of-the-art approaches in terms of accuracy and reliability of feature extraction.
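The core idea of stacking a nonlinear encoder in front of PCA, so that the nonlinearity comes from learned layers rather than a hand-picked kernel, can be sketched schematically. The snippet below is a minimal illustration only, not the paper's method: the toy data, the encoder weights `W`, `b` (a fixed random tanh layer standing in for a trained AE encoder), and the component count `k` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10 features (illustrative only; the paper's
# datasets and trained architecture are not reproduced here).
X = rng.normal(size=(200, 10))

# --- Nonlinear "encoder" stage (stand-in for trained AE layers) ---
# A random tanh layer is a placeholder for an autoencoder's learned
# encoder; the weights W and bias b are assumptions for this sketch.
W = rng.normal(size=(10, 6))
b = rng.normal(size=6)
H = np.tanh(X @ W + b)           # hidden representation, shape (200, 6)

# --- PCA stage on the encoded representation ---
# Centering followed by SVD yields the principal directions of H;
# no kernel function needs to be chosen, because the nonlinearity
# was already supplied by the encoder stage.
H_centered = H - H.mean(axis=0)
U, S, Vt = np.linalg.svd(H_centered, full_matrices=False)
k = 3                            # number of retained components (assumed)
Z = H_centered @ Vt[:k].T        # extracted features, shape (200, 3)

print(Z.shape)                   # (200, 3)
```

Note that replacing the random layer with an encoder trained jointly with label information, as the abstract describes, is what makes the full method supervised; this sketch only shows the structural composition of the two stages.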


