Document Type: Original/Review Paper

Authors

Department of Electrical and Computer Engineering, Yazd University, Yazd, Iran.

Abstract

Artificial neural networks are among the most important models in machine learning, but they operate on numeric inputs. This study presents a new single-layer perceptron model that works directly on categorical inputs. In the proposed model, every categorical value in the training dataset receives a trainable weight. An input sample is classified by looking up the weight vector that corresponds to the categorical values it contains. To evaluate the proposed algorithm, we used 10 datasets and compared its performance with that of other machine learning models, including neural networks, support vector machines, naïve Bayes classifiers, and random forests. According to the results, the proposed model reduced memory usage by 36% relative to the baseline models across all datasets. Moreover, it improved training speed by 54.5% on datasets that contained more than 1000 samples. The accuracy of the proposed model is comparable to that of the other machine learning models.
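The idea of giving each categorical value its own trainable weight can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it assumes a standard multiclass perceptron update rule, one weight per (feature position, category value, class) triple, and a per-class bias. The class name `CategoricalPerceptron`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class CategoricalPerceptron:
    """Illustrative sketch: one trainable weight per (feature, value, class).

    Assumption: the classic multiclass perceptron update is used; the paper's
    actual training rule may differ.
    """

    def __init__(self, classes, lr=1.0):
        self.classes = list(classes)
        self.lr = lr
        # weights[(feature_index, value)][cls] -> float, created on first update
        self.weights = defaultdict(lambda: defaultdict(float))
        self.bias = {c: 0.0 for c in self.classes}

    def scores(self, row):
        # Sum the weights of the categorical values present in the row.
        s = dict(self.bias)
        for j, value in enumerate(row):
            w = self.weights.get((j, value))  # unseen values contribute nothing
            if w is None:
                continue
            for c in self.classes:
                s[c] += w.get(c, 0.0)
        return s

    def predict(self, row):
        s = self.scores(row)
        # Ties are resolved in favour of the class listed first.
        return max(self.classes, key=lambda c: s[c])

    def fit(self, rows, labels, epochs=10):
        for _ in range(epochs):
            mistakes = 0
            for row, label in zip(rows, labels):
                guess = self.predict(row)
                if guess == label:
                    continue
                mistakes += 1
                # Reward the true class and penalise the wrongly predicted one.
                for j, value in enumerate(row):
                    self.weights[(j, value)][label] += self.lr
                    self.weights[(j, value)][guess] -= self.lr
                self.bias[label] += self.lr
                self.bias[guess] -= self.lr
            if mistakes == 0:
                break  # every training row is classified correctly
        return self


# Hypothetical toy data: two categorical features, two classes.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]

model = CategoricalPerceptron(classes=["no", "yes"]).fit(rows, labels)
predictions = [model.predict(r) for r in rows]
```

In this sketch no one-hot encoding is materialised: only category values that actually appear in training rows receive a weight entry, which is one plausible way such a model could keep its memory footprint low. A value never seen in training, such as `"cold"` in `("rainy", "cold")`, simply contributes zero to every class score.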
