Abstract
Batch normalization is one of the most important techniques for training modern neural networks. It reduces the occurrence of gradient explosion or vanishing by normalizing the mean and variance of each hidden layer. However, because these statistics depend heavily on the data distribution of each mini-batch, the network is unstable during training. This paper presents the BN-cluster algorithm and designs a convolutional neural network framework for image classification based on the idea of building blocks. The problem with batch normalization is analyzed by computing the variance of the mean of each batch normalization layer's output, and a convolutional neural network ensemble algorithm based on clustering of the batch normalization parameters is designed. Experimental results show that when the batch normalization parameters are determined with this ensemble learning method, the training fluctuation of the network on all datasets is reduced, and the network converges more stably and quickly without degrading the original performance.
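The abstract only summarizes the method, so the following is a minimal sketch, not the paper's BN-cluster implementation: it shows the standard batch normalization transform and one way the fluctuation of per-batch statistics described above could be quantified, namely the variance of the per-batch means across many mini-batches. The batch size, feature dimension, and synthetic activations are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard batch normalization over one mini-batch.

    x: activations of shape (batch_size, features);
    gamma, beta: learnable scale and shift of shape (features,).
    """
    mu = x.mean(axis=0)                  # per-feature mini-batch mean
    var = x.var(axis=0)                  # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta, mu, var

# The statistics (mu, var) are recomputed for every mini-batch, so they
# fluctuate with the sampled data. The variance of the per-batch means
# across many batches is one measure of that fluctuation.
rng = np.random.default_rng(0)
gamma, beta = np.ones(8), np.zeros(8)
batch_means = []
for _ in range(100):                     # 100 simulated mini-batches
    x = rng.normal(size=(32, 8))         # hypothetical layer activations
    _, mu, _ = batch_norm(x, gamma, beta)
    batch_means.append(mu)
print(np.var(np.stack(batch_means), axis=0))  # fluctuation per feature
```

A smaller batch size makes the printed variances larger, which mirrors the instability the abstract attributes to mini-batch-dependent statistics.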