H.3. Artificial Intelligence
M. Kurmanji; F. Ghaderi
Abstract
Despite considerable advances in recognizing hand gestures from still images, classifying hand gestures in videos remains challenging. Video brings additional difficulties, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing all the frames of a video. So far, both 2D and 3D convolutional neural networks (CNNs) have been used to model the temporal dynamics of video frames. 3D CNNs can extract the changes across consecutive frames and tend to be more suitable for the video classification task; however, they usually need more time. On the other hand, techniques such as tiling make it possible to aggregate all the frames into a single matrix while preserving the temporal and spatial features. This way, 2D CNNs, which are inherently simpler than 3D CNNs, can be used to classify the video instances. In this paper, we compare the application of 2D and 3D CNNs for representing temporal features and classifying hand gesture sequences. Additionally, using a two-stage, two-stream architecture, we efficiently combine color and depth modalities as well as 2D and 3D CNN predictions. The effect of different types of augmentation techniques is also investigated. Our results confirm that appropriate usage of 2D CNNs outperforms a 3D CNN implementation in this task.
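The tiling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the grid layout, padding, or frame sampling used by the authors, so the function name `tile_frames`, the near-square grid, and the zero-padding of unused cells below are assumptions made only for illustration. Frames are plain nested lists of grayscale values.

```python
from math import ceil, sqrt


def tile_frames(frames, cols=None):
    """Arrange equally sized grayscale frames into one 2D grid image.

    frames: list of frames, each a list of rows of pixel values.
    cols:   number of frames per grid row (hypothetical parameter;
            defaults to a near-square grid).
    Frames are placed left to right, top to bottom, so temporal order
    becomes spatial order in the tiled matrix.
    """
    if not frames:
        raise ValueError("need at least one frame")
    t = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    cols = cols or ceil(sqrt(t))
    rows = ceil(t / cols)

    # Assumption: unused grid cells are filled with black (zero) frames.
    blank = [[0] * w for _ in range(h)]
    padded = list(frames) + [blank] * (rows * cols - t)

    tiled = []
    for r in range(rows):
        row_frames = padded[r * cols:(r + 1) * cols]
        for y in range(h):
            line = []
            for frame in row_frames:
                line.extend(frame[y])  # concatenate row y of each frame
            tiled.append(line)
    return tiled


# Four 2x2 frames filled with the values 1..4 become one 4x4 matrix.
frames = [[[i] * 2 for _ in range(2)] for i in range(1, 5)]
grid = tile_frames(frames)
```

The resulting single matrix could then be passed to an ordinary 2D CNN as one image, which is the property the abstract relies on.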
Mohammad Mehdi Hosseini; Jalal Hassanian
Abstract
Hand gesture recognition is very important for communicating in sign language. In this paper, an effective object tracking and hand gesture recognition method is proposed. The method combines two well-known approaches: the mean shift algorithm and motion detection. The mean shift algorithm tracks objects based on color, so occlusion occurs when the hand passes over the face. Several solutions, such as the particle filter, the Kalman filter, and dynamic programming tracking, have been used, but they are complicated, time-consuming, and expensive. The proposed method is simple, fast, efficient, and low-cost. In the first step, the motion detection algorithm subtracts the previous frame from the current frame to obtain the changes between the two images, and white pixels (motion level) are detected by applying a threshold. Then the mean shift algorithm is applied to track the hand motion. Simulation results show that this method is more than two times faster than the older, commonly used algorithms.
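The two steps described in the abstract, frame differencing followed by mean shift, can be sketched roughly as follows. This is not the authors' implementation: the function names, the threshold value, the window size, and the choice to run mean shift directly on the binary motion mask (rather than on a color model) are all assumptions made for illustration.

```python
from math import floor


def motion_mask(prev, curr, threshold=30):
    """Mark pixels whose absolute frame difference exceeds the threshold.

    prev, curr: grayscale frames as lists of rows. The threshold of 30
    is an arbitrary illustrative value, not one taken from the paper.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev, curr)
    ]


def mean_shift(mask, window, max_iter=20):
    """Shift a window (x, y, w, h) toward the centroid of motion pixels.

    Each iteration recentres the window on the centroid of the mask
    pixels it currently covers, and stops once the window no longer moves.
    """
    x, y, w, h = window
    n_rows, n_cols = len(mask), len(mask[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0
        for r in range(max(0, y), min(n_rows, y + h)):
            for c in range(max(0, x), min(n_cols, x + w)):
                if mask[r][c]:
                    total += 1
                    sum_x += c
                    sum_y += r
        if total == 0:
            break  # no motion inside the window; keep the last position
        new_x = floor(sum_x / total - (w - 1) / 2 + 0.5)
        new_y = floor(sum_y / total - (h - 1) / 2 + 0.5)
        if (new_x, new_y) == (x, y):
            break  # converged
        x, y = new_x, new_y
    return x, y, w, h


# A 3x3 bright patch appears at rows 5-7, columns 6-8 of a 10x10 frame.
prev = [[0] * 10 for _ in range(10)]
curr = [[200 if 5 <= r <= 7 and 6 <= c <= 8 else 0 for c in range(10)]
        for r in range(10)]
mask = motion_mask(prev, curr)
track = mean_shift(mask, window=(3, 3, 4, 4))
```

In this toy example the window starts partially overlapping the moving patch and drifts until it covers it. A production version would more likely use library routines for differencing and tracking on real video frames.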