Research on Moving Object Detection Based on Spatiotemporal Saliency in Dynamic Scenes
Abstract
Saliency is one of the current research hotspots in computer vision. It models the human visual attention and information-processing mechanisms in order to design analogous saliency computation models. To date, relatively mature salient-region extraction methods exist for static images. Saliency emphasizes local feature contrast rather than global feature changes; extending it to the processing of moving objects in video therefore avoids the difficulty of building motion-compensation models for dynamic scenes.
     Motion detection is a classical problem in computer vision. Highly dynamic backgrounds and camera motion are two difficulties of this task. Spatiotemporal saliency detection has been applied to moving-object detection and has been shown to be robust to both highly dynamic backgrounds and camera motion.
     This thesis first reviews the state of the art in saliency algorithms, which can be roughly divided into three categories: those based on low-level features, on image complexity, and on biological vision models. It then describes the important biological foundations related to saliency; from these biological working mechanisms we derive basic principles for saliency algorithm design. The thesis also reviews classical moving-object detection algorithms and proposes a spatiotemporal saliency method based on mixtures of dynamic textures to detect moving objects in dynamic scenes.
     The main work and contributions of this thesis are:
     (1) A survey of domestic and international progress in saliency research, analyzing the strengths, weaknesses, and application scenarios of commonly used algorithms;
     (2) Implementations of frame differencing, W4, GMM, and other moving-object detection algorithms, with improvements proposed through experimental analysis;
     (3) Based on the working mechanisms of the human visual system, a summary of existing algorithm-design ideas, distilled into four principles;
     (4) An improved spatiotemporal saliency method based on mixtures of dynamic textures, applied to moving-object detection. The method first models the highly dynamic background with a mixture of dynamic textures (MDT), then computes a saliency map from spatiotemporal information within a center-surround framework; appropriately thresholding the saliency map yields the detection result. Experimental results show that the proposed method effectively improves the accuracy of moving-object detection.
Saliency detection is one of the research focuses in computer vision. It simulates human visual attention and visual information-processing mechanisms to build saliency computation models. For static images, many mature methods for salient-region extraction already exist. Saliency detection focuses on local feature contrast rather than global feature changes, so it can avoid the motion-compensation problem in dynamic scenes.
     Motion detection is a classical problem in computer vision research. Highly dynamic backgrounds and camera motion are two challenges of this task. Spatiotemporal saliency detection has been applied to motion detection and is robust to both challenges.
     This thesis reviews saliency research and roughly divides existing algorithms into three categories: algorithms based on low-level features, on image complexity, and on biological vision models. It then introduces the important biological foundations related to saliency, from which we derive four basic principles of saliency algorithm design. The thesis also reviews classical motion detection algorithms and presents a spatiotemporal saliency method based on mixtures of dynamic textures, which is then used to detect moving objects in dynamic scenes.
     The main work and contributions are:
     (1) A review of progress in saliency algorithm research at home and abroad, covering the advantages, disadvantages, and application scenarios of commonly used algorithms;
     (2) Implementations of classical moving-object detection algorithms, such as frame differencing, W4, and GMM, with improvements made through experimental analysis;
     (3) Based on the mechanisms of the human visual system, a summary of four principles for saliency algorithm design;
     (4) A spatiotemporal saliency detection method based on mixtures of dynamic textures (MDT), applied to the motion detection problem. We first model the highly dynamic background with a mixture of dynamic textures, then compute spatiotemporal saliency within a center-surround framework; thresholding the resulting saliency map yields the detection results. Experiments demonstrate that the method effectively improves the accuracy of motion detection.
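The pipeline in (4) can be illustrated with a minimal sketch. This is not the MDT model itself: as a loud simplification, it replaces the dynamic-texture component likelihoods with per-pixel temporal variance as the "dynamism" feature, but it does show the center-surround comparison and the final thresholding step described above. All function names and the threshold ratio are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def temporal_variance(frames):
    """Per-pixel variance over a short frame window.

    A crude stand-in for the MDT likelihoods: it only measures
    how 'dynamic' each pixel is over time."""
    return np.var(np.stack(frames, axis=0), axis=0)

def box_mean(img, k):
    """Mean filter with a k x k window (edge-padded, naive loop)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = p[i:i + k, j:j + k].mean()
    return out

def center_surround_saliency(feature_map, center=3, surround=11):
    """Saliency as the discrepancy between local (center) and
    contextual (surround) statistics of the feature map."""
    return np.abs(box_mean(feature_map, center)
                  - box_mean(feature_map, surround))

def detect_motion(frames, thresh_ratio=0.5):
    """Threshold the saliency map to obtain a binary motion mask."""
    sal = center_surround_saliency(temporal_variance(frames))
    return sal > thresh_ratio * sal.max()
```

In the actual method the feature map would come from the mixture-of-dynamic-textures background model rather than raw variance, and the threshold would be chosen appropriately for the scene; `thresh_ratio` here is a tunable placeholder.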
