Review of Acceleration and Compression Methods for Deep Neural Networks in Intelligent Decision Systems (智能决策系统的深度神经网络加速与压缩方法综述)
  • Authors: HUANG Di (黄迪); LIU Chang (刘畅)
  • Affiliation: School of Computer Science and Technology, University of Chinese Academy of Sciences
  • Keywords: deep neural network; low-rank decomposition; network pruning; quantization; knowledge distillation
  • Journal: Command Information System and Technology (指挥信息系统与技术); journal code ZHXT
  • CN: 32-1818/TP
  • Year / issue: 2019, Vol. 10, Issue 02 (No. 56)
  • Publication date: 2019-05-22
  • Pages: 12-17 (6 pages)
  • Article ID: ZHXT201902002
  • Funding: Supported by an Equipment Development Department "13th Five-Year Plan" pre-research project (31511090402)
  • Language: Chinese
Abstract
Owing to their excellent feature extraction and representation capabilities, deep neural networks perform remarkably well in fields such as image classification, semantic segmentation and object detection, and are of great significance to the development of information decision support systems. However, large model storage requirements and high computation latency make deep neural networks difficult to apply in such systems. This paper reviews acceleration and compression methods for deep neural networks, including low-rank decomposition, network pruning, quantization and knowledge distillation. These methods reduce model size and speed up computation while preserving accuracy, offering a practical path toward applying deep neural networks in information decision support systems.
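To make two of the surveyed method families concrete, the sketch below uses plain NumPy to show (a) low-rank decomposition of a fully connected layer's weight matrix via truncated SVD and (b) magnitude-based weight pruning. It is a minimal illustrative sketch, not code from the paper or the cited works: the function names and the rank/sparsity settings are assumptions chosen for the example, and in practice both transformations are normally followed by fine-tuning to recover accuracy.

```python
# Minimal NumPy sketches of two surveyed ideas (illustrative only).
import numpy as np


def low_rank_factorize(W, rank):
    """Approximate a dense weight matrix W (m x n) with two factors
    A (m x rank) and B (rank x n) via truncated SVD, so one large
    matmul becomes two cheaper ones when rank << min(m, n)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # fold the singular values into A
    B = Vt[:rank, :]
    return A, B


def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude weights so that roughly the
    given fraction of entries (e.g. 0.9) becomes zero."""
    threshold = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) < threshold, 0.0, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((512, 1024)).astype(np.float32)

    A, B = low_rank_factorize(W, rank=64)
    rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    print(f"rank-64 approximation, relative error: {rel_err:.3f}")
    print(f"parameters: {W.size} -> {A.size + B.size}")

    W_sparse = magnitude_prune(W, sparsity=0.9)
    print(f"zero fraction after pruning: {np.mean(W_sparse == 0):.2f}")
```

Note that for a random matrix the truncated SVD discards little redundancy, so the printed approximation error is large; trained weight matrices typically have faster-decaying singular values, which is precisely what makes low-rank decomposition attractive for deployed models.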