H.5.9. Scene Analysis
Navid Raisi; Mahdi Rezaei; Behrooz Masoumi
Abstract
Human Activity Recognition (HAR) using computer vision is an expanding field with diverse applications, including healthcare, transportation, and human-computer interaction. Classical approaches such as Support Vector Machines (SVM), Histogram of Oriented Gradients (HOG), and Hidden Markov Models (HMM) rely on manually extracted features and struggle with complex motion patterns. Deep learning-based models (e.g., Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Transformer-based models) have improved performance but still face challenges with occlusions, noisy environments, and computational cost. This paper introduces Attention-HAR, a novel deep neural network model designed to enhance HAR performance through three key components: Conv3DTranspose for spatial upsampling, ConvLSTM2D for capturing spatiotemporal patterns, and a custom attention mechanism that prioritizes critical frames within sequences. Unlike conventional attention mechanisms, our approach dynamically assigns weights to key frames, which reduces the impact of redundant frames and improves interpretability and computational efficiency. Experimental results on the UCF-101 dataset demonstrate that Attention-HAR outperforms state-of-the-art models, achieving an accuracy of 97.61%, a precision of 97.95%, a recall of 97.49%, an F1-score of 97.64%, and an AUC of 99.9%. With only 1.26 million parameters, the model is computationally efficient and well-suited for deployment on lightweight platforms. These findings suggest that integrating spatiotemporal feature learning with attention mechanisms can significantly improve HAR in dynamic and complex environments.
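To make the idea of frame-level attention concrete, the following is a minimal sketch of how per-frame weights could be computed and used to pool a sequence. It is not the Attention-HAR mechanism itself, whose details the abstract does not give. The dot-product scoring, the `query` vector (standing in for a learned parameter), the function names, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frame_attention(frame_features, query):
    """Weight each frame by its dot-product score against `query`.

    frame_features: list of per-frame feature vectors (hypothetical inputs).
    query: vector standing in for a learned parameter (an assumption).
    Returns the per-frame weights and the weighted sum of the features.
    """
    scores = [
        sum(f * q for f, q in zip(frame, query))
        for frame in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, pooled


# Toy sequence of three frames; the middle frame has the strongest features.
frames = [[0.1, 0.0], [2.0, 1.0], [0.1, 0.1]]
query = [1.0, 1.0]
weights, pooled = frame_attention(frames, query)
```

In this toy sequence the middle frame receives the largest weight, so the pooled vector is dominated by it and the two near-empty frames contribute little. This mirrors, in simplified form, the stated goal of reducing the influence of redundant frames.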