Sports movements modification based on 2D joint position using YOLO to 3D skeletal model adaptation

Rahati, A.; Rahbar, K.

doi:10.22044/jadm.2022.11975.2344

Document Type: Original/Review Paper

Authors

Department of Computer Engineering, South Tehran Branch, Islamic Azad University, Tehran, Iran.

https://doi.org/10.22044/jadm.2022.11975.2344

Abstract

Performing sports movements correctly is very important for maintaining body health. This article attempts to correct movements with a different approach, based on adapting the 2D positions of the joints in the image to a skeletal model in 3D space. The input image shows a person performing in front of the camera with landmarks on his/her joints. The coordinates of the joints are measured in 2D space and then adapted to the 2D skeletons extracted from the reference sparse skeletal model of the modified movements. The accuracy and precision of the approach are evaluated on the standard Adidas dataset. Its efficiency is also studied under the influence of cumulative Gaussian and impulse noise. The average error of the model in detecting the wrong exercise in the set of sports movements is reported to be 5.69 pixels.
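To make the reported quantity concrete, the following is a minimal sketch of how an average joint error in pixels might be computed and how noise could be injected into measured 2D joints. It is not the authors' implementation: the function names, the four example joints, and the noise parameters are all hypothetical, and the sparse-model adaptation step itself is not shown.

```python
import numpy as np

def mean_joint_error(detected, reference):
    """Average Euclidean distance, in pixels, between matching 2D joints."""
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.linalg.norm(detected - reference, axis=1).mean())

def add_gaussian_noise(joints, sigma, rng):
    """Perturb every coordinate with zero-mean Gaussian noise (std sigma, in pixels)."""
    joints = np.asarray(joints, dtype=float)
    return joints + rng.normal(0.0, sigma, size=joints.shape)

def add_impulse_noise(joints, prob, magnitude, rng):
    """With probability prob, shift a joint by +/- magnitude pixels on each axis."""
    joints = np.asarray(joints, dtype=float)
    hit = rng.random(joints.shape[0]) < prob           # which joints are corrupted
    signs = rng.choice([-1.0, 1.0], size=joints.shape)  # direction of each shift
    return joints + hit[:, None] * signs * magnitude

# Hypothetical (x, y) joint positions in pixels; not taken from the paper's dataset.
reference = np.array([[100.0, 50.0], [100.0, 120.0], [80.0, 200.0], [120.0, 200.0]])
detected = reference + np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 5.0], [6.0, 8.0]])

rng = np.random.default_rng(0)
clean_error = mean_joint_error(detected, reference)
gaussian_error = mean_joint_error(add_gaussian_noise(detected, 2.0, rng), reference)
impulse_error = mean_joint_error(add_impulse_noise(detected, 0.25, 15.0, rng), reference)
print(clean_error, gaussian_error, impulse_error)
```

Comparing the clean error with the two noisy errors gives a rough sense of how sensitive such a metric is to measurement noise, under the assumed noise models.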

DOR: 20.1001.1.23225211.2022.10.4.9.3

Journal of AI and Data Mining

Volume 10, Issue 4
November 2022
Pages 549-557
