Document Type: Original/Review Paper


1 Department of Computer Science, Yazd University, Yazd, Iran.

2 Department of Computer Engineering, Yazd University, Yazd, Iran.



Feature selection has been widely studied by computer scientists as a technique for improving the performance of classification methods. Because the dimensions of a data matrix have a large impact on the cost of processing it, reducing the number of features by choosing the best subset of all features affects the performance of the algorithms. Finding the best subset by comparing all possible subsets of n features is intractable even when n is small, so many researchers turn to heuristic methods to find near-optimal solutions. In this paper, we introduce a novel feature selection technique that selects the most informative features and omits the redundant or irrelevant ones. Our method is embedded in Particle Swarm Optimization (PSO). To omit the redundant or irrelevant features, it is necessary to determine the relationships between features. Many correlation functions can reveal these relationships; in our proposed method, we use mutual information. We evaluate the performance of our method on three classification benchmarks: Glass, Vowel, and Wine. A comparison with four state-of-the-art methods demonstrates the superiority of the proposed method.
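As an illustration of the mutual information measure mentioned in the abstract, the following is a minimal sketch of the standard plug-in estimate of I(X;Y) for two discrete variables. It is not the paper's implementation: the function name `mutual_information`, the use of base-2 logarithms, and the assumption that features are already discretized are all choices made here for the example.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for two equal-length
    sequences of discrete (hashable) values.

    Hypothetical helper for illustration only; assumes the inputs
    are already discretized.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("x and y must not be empty")

    count_x = Counter(x)            # marginal counts of X
    count_y = Counter(y)            # marginal counts of Y
    count_xy = Counter(zip(x, y))   # joint counts of (X, Y)

    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


# Two identical binary features share one full bit of information,
# so one of them is redundant given the other.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
# Two independent binary features share no information.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

Under these assumptions, a high value between two features suggests that one is redundant given the other, while a low value between a feature and the class label suggests that the feature is irrelevant. How such values enter the PSO search is specific to the proposed method and is not shown here.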

