Document Type: Original/Review Paper

Authors

1 Electrical and Computer Engineering Faculty, K.N.Toosi University of Technology, Tehran, Iran.

2 Faculty of Engineering and Information Sciences, University of Wollongong, New South Wales, Australia.

Abstract

Stereo machine vision can be used as a space sampling technique, and the camera parameters and configuration strongly affect the number of samples in each volume of space, called the Space Sampling Density (SSD). Using the concept of voxels, this paper presents a method for optimizing the geometric configuration of the cameras to maximize the SSD, which is equivalent to minimizing the voxel volume and therefore reduces the uncertainty in localizing an object in 3D space. Each pixel’s field of view (FOV) is modeled as a skew pyramid, and the uncertainty region is formed by the intersection of two such pyramids, one from each camera. A mathematical expression for the uncertainty region is then derived from the correspondence field and used as a criterion for the localization error, including the depth error as well as the errors along the X and Y axes. This field depends entirely on the intrinsic and extrinsic parameters of the cameras. Given this expression for the localization error, the optimization of the camera configuration in a stereo vision system is addressed. Finally, the validity of the proposed method is examined through simulation and empirical results, which show that the localization error decreases significantly in the optimized camera configuration.
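As a rough illustration of how camera configuration changes the sampling density, the sketch below considers only the simplest case: a rectified, parallel-axis stereo pair with integer disparities, where depth follows the standard relation Z = fB/d. It is not the skew-pyramid intersection or the correspondence-field model of the paper, and the function names and numeric values (`f_px`, `baseline_m`, `depth_m`) are assumptions chosen for illustration. It computes the extent along Z of the quantization cell that contains a given depth, which is a one-dimensional stand-in for the voxel size.

```python
import math


def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified, parallel-axis stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / disparity_px


def depth_cell_length(f_px: float, baseline_m: float, depth_m: float) -> float:
    """Length along Z of the integer-disparity cell that contains depth_m.

    Points whose true disparity lies in [d, d + 1) cannot be told apart
    by integer disparity alone, so the cell spans the depths of d and d + 1.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    d = math.floor(f_px * baseline_m / depth_m)  # integer disparity bin
    if d < 1:
        return math.inf  # beyond the last resolvable disparity level
    z_far = depth_from_disparity(f_px, baseline_m, d)
    z_near = depth_from_disparity(f_px, baseline_m, d + 1)
    return z_far - z_near


if __name__ == "__main__":
    f_px = 1000.0   # assumed focal length in pixels
    depth_m = 2.0   # assumed object depth in metres
    for baseline_m in (0.25, 0.5):  # assumed baselines in metres
        cell = depth_cell_length(f_px, baseline_m, depth_m)
        print(f"baseline {baseline_m} m: depth cell {cell * 1000:.2f} mm")
```

In this simplified setting, doubling the baseline roughly halves the depth cell at a fixed distance, which corresponds to a denser sampling of space along Z. The general configuration treated in the paper also accounts for camera orientation and for the errors along the X and Y axes, which this sketch ignores.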
