Document Type: Technical Paper

Authors

Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran.

Abstract

In recent years, sign language recognition has emerged as a major challenge in image processing and machine learning. People with hearing impairments use sign language to communicate, but the lack of automated translation tools creates significant communication barriers. This study presents a hybrid model that combines convolutional neural networks (CNNs), a Transformer, and a hidden Markov model (HMM) to recognize sign language gestures in the Sign Language MNIST dataset. The model first uses a CNN to extract features from the hand gesture images and then feeds these features into the Transformer, which models complex, long-range dependencies in the feature sequence. Finally, an HMM smooths the predictions, adjusting each final prediction based on the preceding sequence to improve accuracy. The results show that the proposed model with the HMM achieves an accuracy of 99% and a sign error rate of 0.0098, demonstrating its effectiveness in recognizing hand gestures. This research is a step toward assistive devices for deaf people and improved human interaction.
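
The HMM smoothing step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `viterbi_smooth`, the `stay_prob` parameter, the uniform prior, and the use of classifier probabilities as emission scores are all assumptions made for illustration. The sketch runs Viterbi decoding over per-frame class probabilities, so that an isolated, weakly supported label change is overridden by its neighbours.

```python
import math


def viterbi_smooth(frame_probs, stay_prob=0.9):
    """Most likely label sequence under a simple "sticky" HMM.

    frame_probs[t][k] is the upstream classifier's probability of class k
    at step t (assumed here to stand in for HMM emission scores).
    stay_prob is a hypothetical probability of keeping the same label
    between consecutive steps; the rest is split evenly across the other
    classes. Requires at least two classes.
    """
    n_classes = len(frame_probs[0])
    eps = 1e-12  # avoids log(0)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_classes - 1))

    # Initial scores: uniform prior times the first step's probabilities.
    score = [math.log(p + eps) - math.log(n_classes) for p in frame_probs[0]]
    backpointers = []

    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for k in range(n_classes):
            # Best previous state for landing in state k.
            best_j = max(
                range(n_classes),
                key=lambda j: score[j] + (log_stay if j == k else log_switch),
            )
            trans = log_stay if best_j == k else log_switch
            new_score.append(score[best_j] + trans + math.log(probs[k] + eps))
            pointers.append(best_j)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_classes), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Per-step argmax would give [0, 0, 1, 0, 0]; smoothing removes the blip.
noisy = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.8, 0.1, 0.1],
    [0.9, 0.05, 0.05],
]
print(viterbi_smooth(noisy))  # [0, 0, 0, 0, 0]
```

A higher `stay_prob` smooths more aggressively, while a value close to `1 / n_classes` leaves the per-step argmax essentially unchanged. A trained HMM would learn its transition probabilities from data rather than fix them by hand.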
