Abstract
Batch normalization is one of the most important techniques for training modern neural networks. It reduces the occurrence of gradient explosion or vanishing by normalizing the mean and variance of each hidden layer. However, because these statistics depend heavily on the data distribution of each mini-batch, the network is unstable during training. This paper presents the BN-cluster algorithm and designs a convolutional neural network framework for image classification based on the idea of building blocks. The problem with batch normalization is analyzed by computing the variance of the mean of each batch normalization layer's output, and a convolutional neural network ensemble algorithm based on clustering of the batch normalization parameters is designed. Experimental results show that when the batch normalization parameters are determined with this ensemble learning method, the training fluctuation of the network on all datasets is reduced, and the network converges more stably and quickly without degrading the original performance.
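The abstract only summarizes the method, so the following is a minimal sketch, not the paper's BN-cluster implementation: it shows the standard batch normalization transform and one way the fluctuation of per-batch statistics described above could be quantified, namely the variance of the per-batch means across many mini-batches. The batch size, feature dimension, and synthetic activations are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard batch normalization over one mini-batch.

    x: activations of shape (batch_size, features);
    gamma, beta: learnable scale and shift of shape (features,).
    """
    mu = x.mean(axis=0)                  # per-feature mini-batch mean
    var = x.var(axis=0)                  # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta, mu, var

# The statistics (mu, var) are recomputed for every mini-batch, so they
# fluctuate with the sampled data. The variance of the per-batch means
# across many batches is one measure of that fluctuation.
rng = np.random.default_rng(0)
gamma, beta = np.ones(8), np.zeros(8)
batch_means = []
for _ in range(100):                     # 100 simulated mini-batches
    x = rng.normal(size=(32, 8))         # hypothetical layer activations
    _, mu, _ = batch_norm(x, gamma, beta)
    batch_means.append(mu)
print(np.var(np.stack(batch_means), axis=0))  # fluctuation per feature
```

A smaller batch size makes the printed variances larger, which mirrors the instability the abstract attributes to mini-batch-dependent statistics.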