D. Data
M. Hajizadeh-Tahan; M. Ghasemzadeh
Abstract
Learning models and related results depend on the quality of the input data. If raw data is not properly cleaned and structured, the results are tending to be incorrect. Therefore, discretization as one of the preprocessing techniques plays an important role in learning processes. The most important ...
Read More
Learning models and related results depend on the quality of the input data. If raw data is not properly cleaned and structured, the results are tending to be incorrect. Therefore, discretization as one of the preprocessing techniques plays an important role in learning processes. The most important challenge in the discretization process is to reduce the number of features’ values. This operation should be applied in a way that relationships between the features are maintained and accuracy of the classification algorithms would increase. In this paper, a new evolutionary multi-objective algorithm is presented. The proposed algorithm uses three objective functions to achieve high-quality discretization. The first and second objectives minimize the number of selected cut points and classification error, respectively. The third objective introduces a new criterion called the normalized cut, which uses the relationships between their features’ values to maintain the nature of the data. The performance of the proposed algorithm was tested using 20 benchmark datasets. According to the comparisons and the results of nonparametric statistical tests, the proposed algorithm has a better performance than other existing major methods.