Document Type: Technical Paper

Authors

1 Department of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran.

2 Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran.

3 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Nokhbegan Boulevard, Qazvin, Iran.

Abstract

Deep convolutional neural networks (CNNs) have attained remarkable success in numerous visual recognition tasks. Two challenges arise when adopting CNNs in real-world applications: a) existing CNNs are computationally expensive and memory intensive, which impedes their use in edge computing; b) there is no standard methodology for designing a CNN architecture for a given problem. Network pruning/compression has emerged as a research direction to address the first challenge and has been shown to reduce the computational load of CNNs successfully. For the second challenge, various evolutionary algorithms have been proposed. The algorithm proposed in this paper can be viewed as a solution to both challenges. Instead of using constant, predefined criteria to evaluate the filters of CNN layers, the proposed algorithm establishes its evaluation criteria online during network training, based on a combination of each filter’s profit in its own layer and in the next layer. In addition, the proposed method can insert new filters into the CNN layers. The algorithm is therefore not simply a pruning strategy; it determines the optimal number of filters. Experiments on multiple CNN architectures demonstrate the efficacy of the approach empirically. Compared with current pruning algorithms, our algorithm yields a network with a notably high pruning ratio and accuracy. Although each epoch of the proposed algorithm is relatively expensive during pruning, overall it reaches the final network faster than other algorithms.

Keywords

[1] N. Elyasi and M. Hosseini Moghadam, “Classification of Skin Lesions by Tda Alongside Xception Neural Network,” J. AI Data Min., vol. 10, no. 3, pp. 333–344, 2022.
 
[2] F. Salimian Najafabadi and M. T. Sadeghi, “AgriNet: a New Classifying Convolutional Neural Network for Detecting Agricultural Products’ Diseases,” J. AI Data Min., vol. 10, no. 2, pp. 285–302, 2022.
 
[3] R. Ranjan, V. M. Patel, and R. Chellappa, “HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 1, pp. 121–135, Jan. 2019.
 
[4] H. Filali, J. Riffi, I. Aboussaleh, A. M. Mahraz, and H. Tairi, “Meaningful Learning for Deep Facial Emotional Features,” Neural Process. Lett. 2021, pp. 1–18, Sep. 2021.
 
[5] M. Alam, J.-F. Wang, C. Guangpei, L. Yunrong, and Y. Chen, “Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images,” Mob. Networks Appl. 2021 261, vol. 26, no. 1, pp. 200–215, Feb. 2021.
 
[6] J. Guo, J. Yang, H. Yue, H. Tan, C. Hou, and K. Li, “CDnetV2: CNN-Based Cloud Detection for Remote Sensing Imagery with Cloud-Snow Coexistence,” IEEE Trans. Geosci. Remote Sens., vol. 59, no. 1, pp. 700–713, Jan. 2021.
 
[7] X. Zhang, G. Chen, K. Saruta, and Y. Terata, “A Guideline for Object Detection Using Convolutional Neural Networks,” Lect. Notes Electr. Eng., vol. 572 LNEE, pp. 157–164, 2020.
 
[8] BoukercheAzzedine and HouZhijun, “Object Detection Using Deep Learning Methods in Traffic Scenarios,” ACM Comput. Surv., vol. 54, no. 2, Mar. 2021.
 
[9] P. Wang, Q. Wu, C. Shen, A. Dick, and A. Van Den Hengel, “FVQA: Fact-Based Visual Question Answering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 10, pp. 2413–2427, Oct. 2018.
 
[10] N. Takahashi, M. Gygli, and L. van Gool, “AENet: Learning Deep Audio Features for Video Analysis,” IEEE Trans. Multimed., vol. 20, no. 3, pp. 513–524, Mar. 2018.
 
[11] N. Kruger et al., “Deep hierarchies in the primate visual cortex: What can we learn for computer vision?,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1847–1871, 2013.
 
[12] Y. Bengio, “Learning Deep Architectures for AI,” Found. Trends® Mach. Learn., vol. 2, no. 1, pp. 1–127, Nov. 2009.
 
[13] KrizhevskyAlex, SutskeverIlya, and H. E., “ImageNet classification with deep convolutional neural networks,” Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
 
[14] S. Liu and W. Deng, “Very deep convolutional neural network-based image classification using small training sample size,” Proc. - 3rd IAPR Asian Conf. Pattern Recognition, ACPR 2015, pp. 730–734, Jun. 2016.
 
[15] C. Szegedy et al., “Going deeper with convolutions,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 07-12-June-2015, pp. 1–9, Oct. 2015.
 
[16] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2016-December, pp. 770–778, Dec. 2016.
 
[17] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” Proc. - 30th IEEE Conf. Comput. Vis. Pattern Recognition, CVPR 2017, vol. 2017-January, pp. 2261–2269, Nov. 2017.
 
[18] Z. Q. Zhao, P. Zheng, S. T. Xu, and X. Wu, “Object Detection with Deep Learning: A Review,” IEEE Trans. Neural Networks Learn. Syst., vol. 30, no. 11, pp. 3212–3232, Nov. 2019.
 
[19] T. Choudhary, V. Mishra, A. Goswami, and J. Sarangapani, “A comprehensive survey on model compression and acceleration,” Artif. Intell. Rev. 2020 537, vol. 53, no. 7, pp. 5113–5155, Feb. 2020.
 
[20] D. Blalock, J. J. Gonzalez Ortiz, J. Frankle, and J. Guttag, “What is the State of Neural Network Pruning?,” Proc. Mach. Learn. Syst., vol. 2, pp. 129–146, Mar. 2020, Accessed: Sep. 29, 2021. [Online]. Available: https://github.com/jjgo/shrinkbench.
 
[21] X. Chen, J. Mao, and J. Xie, “Comparison Analysis for Pruning Algorithms of Neural Networks,” Proc. - 2021 2nd Int. Conf. Comput. Eng. Intell. Control. ICCEIC 2021, pp. 50–56, 2021.
 
[22] Y. He, X. Dong, G. Kang, Y. Fu, C. Yan, and Y. Yang, “Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks,” IEEE Trans. Cybern., vol. 50, no. 8, pp. 3594–3604, Aug. 2020.
 
[23] L. Cai, Z. An, C. Yang, and Y. Xu, "Softer Pruning, Incremental Regularization," in: Proc. 2020 25th International Conf. on Pattern Recognition (ICPR), 2021, pp. 224-230.
 
[24] M. Mousa-Pasandi, M. Hajabdollahi, N. Karimi, S. Samavi, and S. Shirani, "Convolutional Neural Network Pruning Using Filter Attenuation," in: Proc. 2020 IEEE International Conf. on Image Processing (ICIP), 2020, pp. 2905-2909.
 
[25] Z. Wang, C. Li, and X. Wang, “Convolutional neural network pruning with structural redundancy reduction,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 14908–14917, 2021.
[26] PeiSongwen, WuYusheng, GuoJin, and QiuMeikang, “Neural Network Pruning by Recurrent Weights for Finance Market,” ACM Trans. Internet Technol., vol. 22, no. 3, pp. 1–23, Jan. 2022.
 
[27] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, “Pruning Filters for Efficient ConvNets,” in: Proceedings of the 5th International Conf. on Learning Representations (ICLR). Nov. 2017. Toulon, France.
 
[28] S. Han, J. Pool, J. Tran, and W. J. Dally, “Learning Both Weights and Connections for Efficient Neural Networks,” in Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, 2015, pp. 1135–1143.
 
[29] X. Liu, B. Li, Z. Chen, and Y. Yuan, “Exploring Gradient Flow Based Saliency for DNN Model Compression,” in Proceedings of the 29th ACM International Conference on Multimedia, New York, NY, USA: Association for Computing Machinery, 2021, pp. 3238–3246.
 
[30] P. Molchanov, A. Mallya, S. Tyree, I. Frosio, and J. Kautz, “Importance estimation for neural network pruning,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2019-June, pp. 11256–11264, Jun. 2019.
 
[31] C. H. Sarvani, M. Ghorai, S. R. Dubey, and S. H. S. Basha, “HRel: Filter pruning based on High Relevance between activation maps and class labels,” Neural Networks, vol. 147, pp. 186–197, Mar. 2022.
 
[32] M. Soltani, S. Wu, J. Ding, R. Ravier, and V. Tarokh, “On the information of feature maps and pruning of deep neural networks,” Proc. - Int. Conf. Pattern Recognit., pp. 6988–6995, 2020.
 
[33] C. Hur and S. Kang, “Entropy-based pruning method for convolutional neural networks,” J. Supercomput. 2018 756, vol. 75, no. 6, pp. 2950–2963, Nov. 2018.
 
[34] Y. Si and W. Guo, “Application of A Taylor Expansion Criterion-based Pruning Convolutional Network for Bearing Intelligent Diagnosis,” 2020 Glob. Reliab. Progn. Heal. Manag. PHM-Shanghai 2020, Oct. 2020.
 
[35] C. Yu, J. Wang, Y. Chen, and X. Qin, “Transfer channel pruning for compressing deep domain adaptation models,” Int. J. Mach. Learn. Cybern. 2019 1011, vol. 10, no. 11, pp. 3129–3144, Sep. 2019.
 
[36] Z. Huang, L. Li, and H. Sun, “Global biased pruning considering layer contribution,” IEEE Access, vol. 8, pp. 173521–173529, 2020.
 
[37] B. Wang, F. Ma, L. Ge, H. Ma, H. Wang, and M. A. Mohamed, “Icing-EdgeNet: A Pruning Lightweight Edge Intelligent Method of Discriminative Driving Channel for Ice Thickness of Transmission Lines,” IEEE Trans. Instrum. Meas., vol. 70, 2021.
 
[38] T. Xu et al., “CDP: Towards Optimal Filter Pruning via Class-Wise Discriminative Power,” in Proceedings of the 29th ACM International Conference on Multimedia, New York, NY, USA: Association for Computing Machinery, 2021, pp. 5491–5500.
 
[39] Z. Chen, T. B. Xu, C. Du, C. L. Liu, and H. He, “Dynamical Channel Pruning by Conditional Accuracy Change for Deep Neural Networks,” IEEE Trans. Neural Networks Learn. Syst., vol. 32, no. 2, pp. 799–813, Feb. 2021.
 
[40] A. Gonzalez-Garcia, D. Modolo, and V. Ferrari, “Do Semantic Parts Emerge in Convolutional Neural Networks?,” Int. J. Comput. Vis. 2017 1265, vol. 126, no. 5, pp. 476–494, Oct. 2017.
 
[41] Y. Le Cun, Y. Le Cun, J. S. Denker, and S. A. Solla, “Optimal Brain Damage,” Adv. Neural Inf. Process. Syst., vol. 2, pp. 598--605, 1990, Accessed: Sep. 18, 2022. [Online]. Available: http://130.203.136.95/viewdoc/summary?doi=10.1.1.32.7223.
 
[42] Z. Wang, W. Hong, Y. P. Tan, and J. Yuan, “Pruning 3D Filters for Accelerating 3D ConvNets,” IEEE Trans. Multimed., vol. 22, no. 8, pp. 2126–2137, Aug. 2020.
 
[43] Y. Zhang, Y. Yuan, and Q. Wang, “ACP: Adaptive Channel Pruning for Efficient Neural Networks,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2022, pp. 4488–4492.
 
[44] B. Zhou, D. Bau, A. Oliva, and A. Torralba, “Interpreting Deep Visual Representations via Network Dissection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 9, pp. 2131–2145, Sep. 2019.
 
[45] C. Zhao, B. Ni, J. Zhang, Q. Zhao, W. Zhang, and Q. Tian, “Variational convolutional neural network pruning,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2019-June, pp. 2775–2784, Jun. 2019.
 
[46] R. Q. Quiroga, L. Reddy, G. Kreiman, C. Koch, and I. Fried, “Invariant visual representation by single neurons in the human brain,” Nat. 2005 4357045, vol. 435, no. 7045, pp. 1102–1107, Jun. 2005.
 
[47] D. Bau, J.-Y. Zhu, H. Strobelt, A. Lapedriza, B. Zhou, and A. Torralba, “Understanding the role of individual units in a deep neural network,” Proc. Natl. Acad. Sci., vol. 117, no. 48, pp. 30071–30078, Dec. 2020.
 
[48] C. Li, M. Z. Zia, Q. H. Tran, X. Yu, G. D. Hager, and M. Chandraker, “Deep Supervision with Intermediate Concepts,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 8, pp. 1828–1843, Aug. 2019.
 
[49] C. Y. Lee, S. Xie, P. Gallagher, Z. Zhang, and Z. Tu, “Deeply-Supervised Nets,” in Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015, vol. 38, pp. 562–570, [Online]. Available: https://proceedings.mlr.press/v38/lee15a.html.
[50] Z. Zhuang et al., “Discrimination-Aware Channel Pruning for Deep Neural Networks,” in Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018, pp. 883–894.
 
[51] Z. Hou and S. Y. Kung, “A discriminant information approach to deep neural network pruning,” Proc. - Int. Conf. Pattern Recognit., pp. 9553–9560, 2020, doi: 10.1109/ICPR48806.2021.9412693.
 
[52] E. Saraee, M. Jalal, and M. Betke, “Visual complexity analysis using deep intermediate-layer features,” Comput. Vis. Image Underst., vol. 195, p. 102949, Jun. 2020.
 
[53] A. S. Morcos, D. G. T. Barrett, N. C. Rabinowitz, and M. Botvinick, “On the importance of single directions for generalization,” 6th Int. Conf. Learn. Represent. ICLR 2018 - Conf. Track Proc., Mar. 2018.
 
[54] J. Ukita, “Causal importance of low-level feature selectivity for generalization in image recognition,” Neural Networks, vol. 125, pp. 185–193, May 2020.
 
[55] Y. Wen, K. Zhang, Z. Li, and Y. Qiao, “A discriminative feature learning approach for deep face recognition,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 9911 LNCS, pp. 499–515, 2016.
 
[56] H. Peng and S. Yu, “Beyond softmax loss: Intra-concentration and inter-separability loss for classification,” Neurocomputing, vol. 438, pp. 155–164, May 2021.
 
[57] H. M. Yang, X. Y. Zhang, F. Yin, and C. L. Liu, “Robust Classification with Convolutional Prototype Learning,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 3474–3482, Dec. 2018.
 
[58] S. Son, S. Nah, and K. M. Lee, “Clustering Convolutional Kernels to Compress Deep Neural Networks,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 11212 LNCS, pp. 225–240, 2018.
 
[59] Z. Zhou, W. Zhou, H. Li, and R. Hong, “Online Filter Clustering and Pruning for Efficient Convnets,” Proc. - Int. Conf. Image Process. ICIP, pp. 11–15, Aug. 2018.
 
[60] S. Yu, K. Wickstrom, R. Jenssen, and J. Principe, “Understanding Convolutional Neural Networks with Information Theory: An Initial Exploration,” IEEE Trans. Neural Networks Learn. Syst., vol. 32, no. 1, pp. 435–442, Jan. 2021.
 
[61] Y. Li et al., “Exploiting kernel sparsity and entropy for interpretable CNN compression,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2019-June, pp. 2795–2804, Jun. 2019.
 
[62] E. Elhamifar and R. Vidal, “Sparse subspace clustering: Algorithm, theory, and applications,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 11, pp. 2765–2781, 2013.
 
[63] B. McWilliams and G. Montana, “Subspace clustering of high-dimensional data: a predictive approach,” Data Min. Knowl. Discov. 2013 283, vol. 28, no. 3, pp. 736–772, May 2013.
 
[64] M. Liu, Y. Wang, and Z. Ji, “Self-Supervised Convolutional Subspace Clustering Network with the Block Diagonal Regularizer,” Neural Process. Lett. 2021, pp. 1–27, Aug. 2021.
 
[65] P. Ji, T. Zhang, H. Li, M. Salzmann, and I. Reid  “Deep Subspace Clustering Networks,” in: Proceedings of the 31st International Conf. on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA, 2017, pp 23–32.
 
[66] S. Roy, P. Panda, G. Srinivasan, and A. Raghunathan, “Pruning Filters while Training for Efficiently Optimizing Deep Learning Networks,” Proc. Int. Jt. Conf. Neural Networks, Jul. 2020.
 
[67] Y. He, Y. Ding, P. Liu, L. Zhu, H. Zhang, and Y. Yang, “Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 2006–2015, 2020.
 
[68] Z. Zhou, W. Zhou, R. Hong, and H. Li, “Online Filter Weakening and Pruning for Efficient Convnets,” Proc. - IEEE Int. Conf. Multimed. Expo, vol. 2018-July, Oct. 2018.
[69] P. Singh, V. K. Verma, P. Rai, and V. P. Namboodiri, “Acceleration of Deep Convolutional Neural Networks Using Adaptive Filter Pruning,” IEEE J. Sel. Top. Signal Process., vol. 14, no. 4, pp. 838–847, May 2020.
 
[70] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1798–1828, 2013.
[71] B. Aaron, D. E. Tamir, N. D. Rishe, and A. Kandel, “Dynamic incremental K-means clustering,” Proc. - 2014 Int. Conf. Comput. Sci. Comput. Intell. CSCI 2014, vol. 1, pp. 308–313, 2014.
 
[72] U. von Luxburg, “A tutorial on spectral clustering,” Stat. Comput. 2007 174, vol. 17, no. 4, pp. 395–416, Aug. 2007.
 
[73] L. Rosasco, M. Belkin, and E. De Vito, “On Learning with Integral Operators,” J. Mach. Learn. Res., vol. 11, no. 30, pp. 905–934, 2010, Accessed: Sep. 30, 2021. [Online]. Available: http://jmlr.org/papers/v11/rosasco10a.html.
 
[74] C. Xia, W. Hsu, M. L. Lee, and B. C. Ooi, “BORDER: Efficient computation of boundary points,” IEEE Trans. Knowl. Data Eng., vol. 18, no. 3, pp. 289–303, Mar. 2006.
 
[75] A. Achille and S. Soatto, “Emergence of invariance and disentanglement in deep representations,” 2018 Inf. Theory Appl. Work. ITA 2018, Oct. 2018.
[76] L. Decreusefond, I. Flint, N. Privault, and G. L. Torrisi, “Determinantal Point Processes,” Bocconi Springer Ser., vol. 7, pp. 311–342, 2016.
 
[77] H. Wang, P. Chen, and S. Kwong, “Building Correlations between Filters in Convolutional Neural Networks,” IEEE Trans. Cybern., vol. 47, no. 10, pp. 3218–3229, Oct. 2017.
 
[78] W. Liu, Y. Wen, Z. Yu, and M. Yang, “Large-Margin Softmax Loss for Convolutional Neural Networks,” in Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, 2016, pp. 507–516.
 
[79] A. Krizhevsky and A. Krizhevsky, “Learning multiple layers of features from tiny images,” 2009, Accessed: May 10, 2022. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.9220.
 
[80] M. Lin et al., “Hrank: Filter pruning using high-Rank feature map,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 1526–1535, 2020.
 
[81] H. Pan, Z. Chao, J. Qian, B. Zhuang, S. Wang, and J. Xiao, “Network pruning using linear dependency analysis on feature maps,” ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. 2021-June, pp. 1720–1724, 2021.
 
[82] C. Zhao, B. Ni, J. Zhang, Q. Zhao, W. Zhang, and Q. Tian, “Variational convolutional neural network pruning,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2019-June, pp. 2775–2784, Jun. 2019.
 
[83] Z. Huang and N. Wang, “Data-Driven Sparse Structure Selection for Deep Neural Networks,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 11220 LNCS, pp. 317–334, 2018.
 
[84] R. Yu et al., “NISP: Pruning Networks Using Neuron Importance Score Propagation,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 9194–9203, Dec. 2018.
 
[85] Y. He, X. Zhang, and J. Sun, “Channel Pruning for Accelerating Very Deep Neural Networks,” Proc. IEEE Int. Conf. Comput. Vis., vol. 2017-Octob, pp. 1398–1406, Dec. 2017.
 
[86] T. Wu, X. Li, D. Zhou, N. Li, and J. Shi, “Differential Evolution Based Layer-Wise Weight Pruning for Compressing Deep Neural Networks,” Sensors 2021, Vol. 21, Page 880, vol. 21, no. 3, p. 880, Jan. 2021.