Motion detection based on deep auto-encoder networks
  • Original title (Chinese): 基于深度自编码网络的运动目标检测
  • Authors: XU Pei; CAI Xiaolu; HE Wenwei; XIE Yidao (徐培; 蔡小路; 何文伟; 谢易道)
  • Affiliation: School of Computer Science and Engineering, University of Electronic Science and Technology of China (电子科技大学计算机科学与工程学院)
  • Keywords: motion detection; video image; deep auto-encoder network; online learning; sensitivity of cost function
  • Journal: 计算机应用 (Journal of Computer Applications); journal code: JSJY
  • Publication date: 2014-10-10
  • Year, volume, issue: 2014; Vol. 34, No. 10 (cumulative issue No. 290)
  • Pages: 179-182+207 (5 pages)
  • CN: 51-1307/TP
  • Fund: Fundamental Research Funds for the Central Universities (ZYGX2012YB028)
  • Language: Chinese
  • Database record number: JSJY201410036
Abstract
To address the poor performance of foreground extraction from dynamic backgrounds, a motion detection method based on deep auto-encoder networks was proposed. First, a three-layer deep auto-encoder network, whose cost function took the background image as a variable, was used to extract background images containing no moving objects from the video frames. A separation function was then constructed to obtain the background image of each input frame, and a second three-layer deep auto-encoder network was used to learn the extracted background images. To allow the deep auto-encoder networks to extract moving objects online, an online learning algorithm was also proposed: weights to which the cost function is least sensitive are identified and merged, so that more video frames can be processed. Experimental results show that, for extracting foreground moving objects from dynamic backgrounds, the proposed method improves detection accuracy by 6% and lowers the false alarm rate by 4.5% compared with the foreground detection work of Lu et al. (LU C, SHI J, JIA J. Online robust dictionary learning. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2013: 415-422). In practical applications the method achieves better foreground-background separation, providing a firmer basis for research such as video analysis.
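The abstract only sketches the pipeline. As a rough illustration of the general idea (not the authors' implementation), the following minimal NumPy sketch trains a small auto-encoder on flattened video frames, treats the reconstruction as a background estimate, and thresholds the frame-minus-background difference to obtain a foreground mask. The single hidden layer, learning rate, and threshold are illustrative assumptions; the paper's background-dependent cost function, separation function, and sensitivity-based weight merging are not reproduced here.

```python
# Minimal sketch: auto-encoder background estimate + thresholded difference.
# All hyperparameters below are illustrative assumptions, not the paper's values.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BackgroundAutoEncoder:
    def __init__(self, n_pixels, n_hidden=64, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.01, (n_hidden, n_pixels))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.01, (n_pixels, n_hidden))
        self.b2 = np.zeros(n_pixels)
        self.lr = lr

    def forward(self, x):
        h = sigmoid(self.W1 @ x + self.b1)   # encoder
        r = sigmoid(self.W2 @ h + self.b2)   # decoder output = background estimate
        return h, r

    def train_step(self, x):
        # One gradient step on the squared reconstruction error 0.5 * ||r - x||^2.
        h, r = self.forward(x)
        err = r - x
        d_r = err * r * (1 - r)
        d_h = (self.W2.T @ d_r) * h * (1 - h)
        self.W2 -= self.lr * np.outer(d_r, h)
        self.b2 -= self.lr * d_r
        self.W1 -= self.lr * np.outer(d_h, x)
        self.b1 -= self.lr * d_h
        return float(0.5 * np.sum(err ** 2))

def foreground_mask(ae, frame, threshold=0.15):
    """Pixels far from the learned background estimate are marked as foreground."""
    _, background = ae.forward(frame)
    return np.abs(frame - background) > threshold

# Toy usage: 20 synthetic "frames" of a 16x16 video, pixel values in [0, 1].
frames = np.clip(np.random.default_rng(1).normal(0.5, 0.1, (20, 256)), 0, 1)
ae = BackgroundAutoEncoder(n_pixels=256)
for epoch in range(50):
    for f in frames:
        ae.train_step(f)
mask = foreground_mask(ae, frames[0])
print("foreground pixels:", int(mask.sum()))
```

Because the auto-encoder is trained on many frames, its reconstruction tends toward the stable (background) content, which is what makes the simple difference-and-threshold step a plausible stand-in for the separation function described in the abstract.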
References
[1]STAUFFER C,GRIMSON W E L.Adaptive background mixture models for real-time tracking[C]//Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE,1999,2:246-253.
    [2]MITTAL A,PARAGIOS N.Motion-based background subtraction using adaptive kernel density estimation[C]//Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE Press,2004,2:302-309.
    [3]MATSUYAMA T,OHYA T,HABE H.Background subtraction for non-stationary scenes[C]//Proceedings of the 2000 Asian Conference on Computer Vision.Berlin:Springer-Verlag,2000:662-667.
    [4]KIM K,CHALIDABHONGSE T,HARWOOD D,et al.Real-time foreground-background segmentation using codebook model[J].Real-time Imaging,2005,11(3):172-185.
    [5]RITTSCHER J,KATO J,JOGA S,et al.A probabilistic background model for tracking[C]//Proceedings of the 2000 European Conference on Computer Vision,LNCS 6312.Berlin:Springer-Verlag,2000:336-350.
    [6]ZHONG J,SCLAROFF S.Segmenting foreground objects from a dynamic textured background via a robust Kalman filter[C]//Proceedings of the 2003 IEEE International Conference on Computer Vision.Piscataway:IEEE Press,2003:44-50.
    [7]TIAN Y,TIAN S,XU Y,et al.Image object detection based on local feature and sparse representation[J].Journal of Computer Applications,2013,33(6):1670-1673.(田元荣,田松,许悦雷,等.基于局部特征和稀疏表示的图像目标检测算法[J].计算机应用,2013,33(6):1670-1673.)
    [8]BENGIO Y,LAMBLIN P,POPOVICI D,et al.Greedy layer-wise training of deep networks[C]//Proceedings of the 20th Annual Conference on Neural Information Processing Systems.Cambridge:MIT Press,2007:153-160.
    [9]HINTON G E,OSINDERO S,TEH Y W.A fast learning algorithm for deep belief nets[J].Neural Computation,2006,18(7):1527-1554.
    [10]VINCENT P,LAROCHELLE H,BENGIO Y,et al.Extracting and composing robust features with denoising autoencoders[C]//Proceedings of the 25th International Conference on Machine Learning.New York:ACM,2008:1096-1103.
    [11]YUAN F.Codebook generation based on self-organizing incremental neural network for image classification[J].Journal of Computer Applications,2013,33(7):1976-1979.(袁飞云.基于自组织增量神经网络的码书产生方法在图像分类中的应用[J].计算机应用,2013,33(7):1976-1979.)
    [12]OUYANG W,WANG X.Joint deep learning for pedestrian detection[C]//Proceedings of the 2013 IEEE International Conference on Computer Vision.Piscataway:IEEE Press,2013:2056-2063.
    [13]LE Q V,ZOU W Y,YEUNG S Y,et al.Learning hierarchical invariant spatiotemporal features for action recognition with independent subspace analysis[C]//Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE Press,2011:3361-3368.
    [14]TAYLOR G W,HINTON G E,ROWEIS S T.Modeling human motion using binary latent variables[C]//Proceedings of the 20th Annual Conference on Neural Information Processing Systems.Cambridge:MIT Press,2007:1345-1353.
    [15]HEESS N,ROUX N L,WINN J.Weakly supervised learning of foreground-background segmentation using masked RBMs[C]//Proceedings of the 2011 International Conference on Artificial Neural Networks,LNCS 6312.Berlin:Springer-Verlag,2011:9-16.
    [16]ZHAO C,WANG X,CHAM W K.Background subtraction via robust dictionary learning[EB/OL].[2014-02-22].http://www.docin.com/p-233234564.html
    [17]HUANG J,HUANG X,METAXAS D N.Learning with dynamic group sparsity[C]//Proceedings of the 2009 IEEE International Conference on Computer Vision.Piscataway:IEEE Press,2009:64-71.
    [18]LU C,SHI J,JIA J.Online robust dictionary learning[C]//Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE Press,2013:415-422.
    [19]CEVHER V,SANKARANARAYANAN A,DUARTE M,et al.Compressive sensing background subtraction[C]//Proceedings of the 2008 European Conference on Computer Vision,LNCS 6312.Berlin:Springer-Verlag,2008:155-168.
    [20]XU J,HO D W C.A new training and pruning algorithm based on node dependence and Jacobian rank deficiency[J].Neurocomputing,2006,70(1/2/3):544-558.
    [21]LI L,HUANG W,GU I,et al.Statistical modeling of complex backgrounds for foreground object detection[J].IEEE Transactions on Image Processing,2004,13(11):1459-1472.
    [22]ZHOU X,YANG C,YU W.Moving object detection by detecting contiguous outliers in the low-rank representation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(3):597-610.
    [23]STAUFFER C,GRIMSON W.Adaptive background mixture models for real-time tracking[C]//Proceedings of the 1999 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE,1999:2246-2252.
    [24]GUTCHESS D,TRAJKOVICS M,COHEN-SOLAL E,et al.A background model initialization algorithm for video surveillance[C]//Proceedings of the 2001 8th IEEE International Conference on Computer Vision.Piscataway:IEEE Press,2001:733-740.
    [25]CANDES E,LI X,MA Y,et al.Robust principal component analysis?[EB/OL].[2014-02-01].http://wenku.baidu.com/view/95964f3243323968011c9261.html.
