Document Type: Original/Review Paper


Electrical and Computer Engineering Faculty, Semnan University, Semnan, Iran.



Image colorization is an interesting yet challenging task because a natural-looking color image must be inferred from a grayscale image alone. To tackle this challenge with a fully automatic procedure, we propose a deep Convolutional Neural Network (CNN)-based model for grayscale image colorization that benefits from the strong performance of CNNs in image processing tasks. We fuse three pre-trained convolutional models, VGG16, ResNet50, and Inception-v2, to improve performance: the outputs of the three models are averaged to obtain richer features. The fused features are fed to an encoder-decoder network that produces a color image from the grayscale input. We perform a step-by-step analysis of different pre-trained models and fusion methodologies to include a more accurate combination of these models in the proposed model. Results on the LFW and ImageNet datasets confirm the effectiveness of our model compared to state-of-the-art alternatives in the field.
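
The averaging-based fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the three pre-trained models produce feature vectors of equal length, which the abstract does not state. The function name `average_fusion` and the toy four-dimensional vectors are hypothetical and stand in for real network outputs; this is not the authors' implementation.

```python
from typing import List, Sequence


def average_fusion(features: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of equally sized feature vectors."""
    if not features:
        raise ValueError("at least one feature vector is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")
    count = len(features)
    # zip(*features) groups the i-th entry of every vector together.
    return [sum(values) / count for values in zip(*features)]


# Hypothetical toy outputs standing in for VGG16, ResNet50 and Inception-v2
# (assumption: real features would be far larger and come from the networks).
vgg16_out = [0.0, 3.0, 6.0, 9.0]
resnet50_out = [3.0, 3.0, 3.0, 3.0]
inception_out = [6.0, 3.0, 0.0, 3.0]

fused = average_fusion([vgg16_out, resnet50_out, inception_out])
print(fused)  # [3.0, 3.0, 3.0, 5.0]
```

In a full pipeline, the fused vector would then be passed to the encoder-decoder network. If the backbones emit features of different sizes, some projection to a common size would be needed before averaging.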

