Research on Multiple Object Tracking of Vision Surveillance

Original title: 视觉监控中的多物体跟踪技术研究
Author: Deng Zhihui (邓志辉)
Degree level: Master's
Discipline: Control Theory and Control Engineering
Keywords: motion detection; multiple object tracking; codebook model; particle filter; on-line boosting; global nearest neighbor
Degree year: 2010
Advisor: Lu Linji (路林吉)
Discipline code: 081101
Degree-granting institution: Shanghai Jiao Tong University

Abstract

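Multiple object tracking in visual surveillance is one of the active research topics in computer vision. In recent years in particular, video surveillance systems have played an increasingly important role and are widely used for real-time monitoring of homes, car parks, public places, banks and similar sites. This thesis targets the monitoring of outdoor or indoor scenes with a single fixed, ordinary color camera, and designs an intelligent visual surveillance system that combines motion detection with object tracking. The system automatically detects and tracks a varying number of objects and stores their trajectories.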
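     The difficulty of multiple object tracking lies in the unpredictable changes of the tracked targets, the complexity of real scenes, object deformation and similar factors. Building on the idea of combining motion detection with object tracking, this thesis divides the multiple object tracking system into four parts: a motion detection module, a blob detection module, a tracking module and a trajectory generation module. It studies how to improve the real-time performance, robustness and accuracy of motion detection and object tracking. The main research content is as follows: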
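     Motion detection: the thesis first analyzes in detail the background subtraction method based on the codebook model. Building on the original algorithm, it treats each pixel as Gaussian-distributed in the time domain, which agrees better with the statistics and with the human visual system, and redefines the codebook, the brightness distortion and the update rules, so that more complete foreground objects are detected. It then analyzes in detail the background subtraction method based on Bayesian classification and, using the color-space model of the codebook model, designs a new thresholding method that suppresses trailing (ghosting) artifacts to some extent. Experiments show that both background subtraction methods effectively detect moving objects in the presence of environmental noise, moving backgrounds and similar conditions.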
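     As a rough illustration of the codebook color model mentioned above, the following sketch implements the per-pixel matching test of the original codebook algorithm (Kim et al., Real-Time Imaging, 2005), not the redefined version developed in the thesis. The codeword layout (a dict with `rgb`, `i_min`, `i_max`) and the parameter values `eps`, `alpha` and `beta` are assumptions chosen for illustration.

```python
import math

def color_distortion(pixel, codeword_rgb):
    """Distance from the pixel to the line through the origin and the codeword color."""
    x2 = sum(c * c for c in pixel)
    v2 = sum(c * c for c in codeword_rgb)
    if v2 == 0:
        return math.sqrt(x2)
    dot = sum(p * v for p, v in zip(pixel, codeword_rgb))
    p2 = dot * dot / v2                  # squared projection onto the codeword direction
    return math.sqrt(max(x2 - p2, 0.0))  # clamp guards against rounding below zero

def brightness_ok(pixel, i_min, i_max, alpha=0.5, beta=1.3):
    """Check the pixel brightness against the bounds derived from the codeword."""
    norm = math.sqrt(sum(c * c for c in pixel))
    i_low = alpha * i_max
    i_hi = min(beta * i_max, i_min / alpha)
    return i_low <= norm <= i_hi

def is_background(pixel, codebook, eps=10.0, alpha=0.5, beta=1.3):
    """A pixel is background if at least one codeword matches in color and brightness."""
    return any(
        color_distortion(pixel, cw["rgb"]) <= eps
        and brightness_ok(pixel, cw["i_min"], cw["i_max"], alpha, beta)
        for cw in codebook
    )

# Hypothetical codebook for one pixel location: a single gray codeword.
codebook = [{"rgb": (100, 100, 100), "i_min": 160.0, "i_max": 180.0}]
shaded = is_background((90, 90, 90), codebook)    # same color direction, slightly darker
changed = is_background((200, 50, 50), codebook)  # different color direction
```

     Separating color distortion from brightness lets a darker or brighter version of the same color still match the background, which is why the model tolerates moderate illumination changes.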
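     Object tracking: on the basis of motion detection, the thesis gives a framework for multiple object tracking that decomposes the task into a combination of single-object tracking tasks. On the one hand, tracking is treated as an optimal state estimation problem, and the algorithm flow and implementation of particle-filter-based object tracking are studied in detail. A likelihood function that combines the target's color and motion features is adopted, and MCMC is used to improve the importance distribution of the particles, which improves tracking accuracy and efficiency. On the other hand, tracking is treated as a classification problem, and object tracking with On-line Boosting is studied in detail, covering the On-line Boosting algorithm, Absolute Haar features, Haar-like features, weak classifier design and the On-line Boosting tracking procedure. The thesis improves the importance weight update strategy of the original On-line Boosting algorithm and uses the particle filter described above to speed up On-line Boosting tracking. Data association: because the distance, scale, speed and motion direction of a tracker-observation pair all affect how well the pair matches, the matching function of the traditional global nearest neighbor method is improved, which increases the accuracy and robustness of data association.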
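     The data association idea can be sketched as follows. This is a minimal illustration, not the matching function of the thesis: the cost terms, the weights `w`, the gate value and the dict fields `pos`, `vel` and `size` are all hypothetical. The globally optimal assignment is found by brute force, which is only practical for a handful of targets; a real system would typically use the Hungarian (Munkres) algorithm instead.

```python
import itertools
import math

def match_cost(trk, obs, w=(1.0, 20.0, 1.0, 10.0)):
    """Hypothetical cost mixing distance, scale, speed and direction differences."""
    px, py = trk["pos"]
    vx, vy = trk["vel"]
    ox, oy = obs["pos"]
    # Distance between the predicted tracker position and the observation.
    dist = math.hypot(ox - (px + vx), oy - (py + vy))
    # Scale change, symmetric in growth and shrinkage.
    scale = abs(math.log(obs["size"] / trk["size"]))
    # Displacement implied by accepting this observation.
    dx, dy = ox - px, oy - py
    speed = abs(math.hypot(dx, dy) - math.hypot(vx, vy))
    if math.hypot(dx, dy) > 0 and math.hypot(vx, vy) > 0:
        # Unsigned angle between the tracker velocity and the implied displacement.
        direction = abs(math.atan2(vx * dy - vy * dx, vx * dx + vy * dy))
    else:
        direction = 0.0
    return w[0] * dist + w[1] * scale + w[2] * speed + w[3] * direction

def gnn_associate(trackers, observations, gate=50.0):
    """Return {tracker index: observation index} minimizing the total cost.

    Leaving a tracker unmatched costs `gate`, so pairs costlier than the gate
    are never selected.
    """
    n_t = len(trackers)
    cost = [[match_cost(t, o) for o in observations] for t in trackers]
    # One None slot per tracker so that every tracker may stay unmatched.
    slots = list(range(len(observations))) + [None] * n_t
    best, best_total = {}, math.inf
    for perm in itertools.permutations(slots, n_t):
        total = sum(gate if j is None else cost[i][j] for i, j in enumerate(perm))
        if total < best_total:
            best_total = total
            best = {i: j for i, j in enumerate(perm) if j is not None}
    return best
```

     Unlike greedy nearest-neighbor matching, the global formulation picks the assignment with the lowest total cost, so one ambiguous observation cannot be claimed by the wrong tracker simply because that tracker was processed first.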
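     Combining the parts above, a video surveillance algorithm platform was built on a PC, and extensive experiments and analysis were carried out in indoor and outdoor environments. The experimental results demonstrate the effectiveness of the algorithms above.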
Multi-object tracking is a hot research field in video surveillance. In recent years in particular, video surveillance systems have played an increasingly important role and are widely used in homes, car parks, public places, banks and other places for real-time monitoring. In this thesis, we present an integrated detection and tracking video surveillance system that monitors outdoor or indoor scenes with a fixed color camera. The system automatically tracks a varying number of targets and automatically handles the initialization and termination of each track.
     Based on the integration of motion detection and object tracking, our video surveillance system is divided into four parts: a motion detection module, a blob detection module, a tracking module and a trajectory generation module. The main contents of this thesis are as follows:
     Motion detection: first, we analyze the codebook-based background subtraction algorithm in detail. We assume that pixels follow a Gaussian distribution in the time domain, in accordance with the statistics and the human visual system, and redesign the codebook, the brightness distortion and the update rules of the original approach, so that a more complete foreground object can be detected. Second, we examine the Bayesian-classification-based background subtraction algorithm in depth. Using the color-space model of the codebook algorithm, we propose a new thresholding method, which helps to remove moving shadows to some extent.
     Target tracking: building on motion detection, we propose a framework for the multi-object tracking problem that decomposes the task into a combination of multiple single-object tracking tasks. On the one hand, single-object tracking can be viewed as an optimal state estimation problem. We study in detail how to use a particle filter for object tracking and describe the related algorithms and their implementation. To improve tracking accuracy and efficiency, we use a combination of color and motion features as the likelihood function and apply MCMC after particle resampling. On the other hand, single-object tracking can also be viewed as a classification problem. We study how to use On-line Boosting for object tracking and present the analysis and implementation of the On-line Boosting algorithm, Absolute Haar features, Haar-like features, the weak classifier design and the On-line Boosting tracking procedure. Because the distance, scale, speed and motion direction of a tracker-observation pair affect how well the pair matches, we improve the matching function of the traditional global nearest neighbor method, thereby increasing the accuracy and robustness of data association.
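     To make the state estimation view concrete, here is a minimal sketch of one predict-weight-resample cycle of a bootstrap particle filter over 2-D position. It is an illustration under simplifying assumptions rather than the tracker of the thesis: a Gaussian likelihood around a measured position stands in for the combined color and motion likelihood, the motion model is a plain random walk, and the MCMC move after resampling is omitted. The function names and the parameters `motion_std` and `meas_std` are hypothetical.

```python
import math
import random

def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples with probability proportional to the weights."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step          # single random offset shared by all strata
    out, cum, i = [], weights[0], 0
    for _ in range(n):
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
        u += step
    return out

def particle_filter_step(particles, measurement, rng, motion_std=3.0, meas_std=5.0):
    """One cycle: predict, weight, estimate, resample. Returns (particles, estimate)."""
    mx, my = measurement
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [(x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
                 for x, y in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2.0 * meas_std ** 2))
               for x, y in predicted]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(predicted)] * len(predicted)  # fall back to uniform
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle positions.
    est = (sum(w * x for w, (x, _) in zip(weights, predicted)),
           sum(w * y for w, (_, y) in zip(weights, predicted)))
    return systematic_resample(predicted, weights, rng), est

# Usage: follow a target that drifts by (2, 1) pixels per frame.
rng = random.Random(0)
particles = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(500)]
target = (20.0, 20.0)
for _ in range(15):
    target = (target[0] + 2.0, target[1] + 1.0)
    particles, estimate = particle_filter_step(particles, target, rng)
```

     Systematic resampling is used here because it needs only one random draw per cycle and keeps the resampling variance low; other schemes such as multinomial resampling would fit the same interface.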
     Based on the above, we build a video surveillance platform on a PC that integrates these algorithms and techniques, and carry out a large number of experiments and analyses in indoor and outdoor environments. The experimental results show the effectiveness of our system.

