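Study on Human Target Detection and Tracking Based on Latent SVM (基于Latent SVM的人体目标检测与跟踪方法研究)

English title: Study on Human Target Detection and Tracking Based on Latent SVM
Author: Hu Zhenbang (胡振邦)
Thesis level: Doctoral
Discipline: Geoscience Information Engineering
Keywords (translated from Chinese): Latent SVM; cascade Latent SVM; mean-shift algorithm; differential evolution; self-organizing map background subtraction algorithm; RJMCMC
English keywords: Latent SVM; cascade Latent SVM; mean-shift; differential evolution; self-organizing map background subtraction; RJMCMC
Degree year: 2013
Advisor: Cai Zhihua (蔡之华)
Discipline code: 081802
Degree-granting institution: China University of Geosciences
Submission date: 2013-05-01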

Abstract

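Human target detection and tracking is an active research topic in computer vision, with wide application in intelligent transportation, urban security, human-computer interaction, intelligent robotics, video image analysis, and electronic entertainment. In recent years, with the development of the urban Internet of Things, human target detection and tracking algorithms have attracted growing attention. This thesis decomposes the problem into four aspects, namely image detection, background subtraction, image tracking, and path optimization, and discusses each in turn.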
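     For image detection, the main method studied in this thesis is Latent SVM. During Latent SVM model training, the training samples are first clustered, and a multi-view observation model is then built from the clustering result. In addition, unlike a general SVM pattern recognition algorithm, Latent SVM automatically generates latent-variable features on top of the basic SVM model and adds them to the model. A latent variable carries both displacement and appearance attributes, so in human target detection a trained Latent SVM model can contain multiple latent variables, which can be loosely understood as local appearance features of the human body such as the head, torso, and limbs. The construction of multiple component models and the automatic generation of latent variables make Latent SVM one of the best image detection algorithms currently available. The study of Latent SVM in this thesis covers both model training and the detection procedure.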
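     Background subtraction is an auxiliary method in a video detection and tracking system that can improve the speed and accuracy of the whole system. Its role is to effectively remove the background regions of a video image sequence. The background is the relatively static part of the sequence, which is taken here to include nuisance motion such as swaying leaves, turning fans, and the shadows of moving targets. When Latent SVM is used for image detection, the histogram of oriented gradients (HOG) feature pyramid of the whole image must be computed. Because detection time grows with the size of the scanned area, a video detection system with strict real-time requirements can use information from temporally adjacent frames to quickly discard most of the background and shrink the scanned area, which speeds up detection. Because detection runs only inside the moving foreground regions, this approach can also greatly reduce the false detection rate. This thesis introduces several background subtraction algorithms in turn, including the multivariate Gaussian mixture model, the codebook method, and the self-organizing map background subtraction algorithm, and proposes an improved algorithm that combines their advantages.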
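     Compared with image detection, image tracking must not only locate the tracked target but also trace its trajectory. Tracking algorithms also have a degree of continuity and automation and can compensate for some detections missed by the detector. This thesis proposes an improved Mean-Shift tracking algorithm. Combined with background subtraction, it can accurately locate the tracked target and process frames faster, and it needs only an initial tracking region to complete tracking automatically.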
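     When tracked targets overlap or are occluded, a tracking algorithm inevitably produces tracking errors or loses targets. Tracking path optimization aims to eliminate these errors. Using the position and appearance information of the initially tracked objects, an objective function can be designed and optimized to refine multi-target tracking trajectories. This thesis proposes an improved multi-target tracking algorithm based on reversible jump Markov chain Monte Carlo (RJMCMC) optimization, which can effectively optimize the initial tracking paths when detection accuracy is high. The initial trajectories are obtained from the image tracking algorithm. Before path optimization, background subtraction is first used to obtain the moving foreground regions; Latent SVM detection is then run on each foreground region, and the detections are verified to obtain the tracked objects. This pipeline minimizes the false detection rate and greatly simplifies the path optimization problem.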
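     In summary, this thesis designs and attempts to implement a complete human target detection and tracking scheme, and studies improvements that address the defects and shortcomings of each component module. The main work is summarized as follows: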
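     1) Improvement of the Latent SVM model training algorithm. Because images of human targets are highly variable, the automatic generation of latent variables is critical to model training. In the original Latent SVM training method, a simple classification template is first obtained by an SVM from the HOG features of the sample images, and a greedy algorithm is then applied to the template to generate the latent variables automatically. To obtain a better trained model, this thesis proposes an image segmentation algorithm combining Mean-Shift and differential evolution to generate the latent-variable features automatically. The new method takes the texture distribution of the sample images into account when searching for local-feature latent variables, which yields a better detection model and ultimately better detection performance.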
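     2) Improvement of the Latent SVM human target detection algorithm. The original Latent SVM detector first builds the HOG feature pyramid of the image under test, then convolves the detection model with each level of the pyramid, and finally determines target locations from the convolution scores and pyramid levels. Cascade Latent SVM is an improvement on the original detector. PCA is first applied to the HOG features of the sample set, and both the detection model and the HOG feature pyramid of the test image are reduced in dimension. In cascade detection, the reduced model is convolved with the reduced pyramid, and only locations whose score exceeds a given threshold are passed on; the subsequent stage evaluates the convolution score of the original model and feature pyramid at those locations. The advantage of the cascade is that it quickly filters out non-human regions of the image. To further improve the cascade, this thesis proposes using LDA to analyze the sample-set HOG features and obtain the dimension-reduction vectors. It also proposes an improved local search strategy for latent variables in cascade Latent SVM. Finally, it proposes extracting grid color-similarity features from the latent variables and building a model that makes a second decision on the detection results, in order to reduce the false alarm rate.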
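     3) A new self-organizing map background subtraction algorithm. The classic multivariate Gaussian mixture model and its improved variant, the codebook method, both treat each pixel as the basic processing unit. These methods apply no joint processing to neighboring pixels, adapt poorly to changes in the scene, and produce results with a high false alarm rate. The self-organizing map handles the relationships between pixels in the scene effectively and adapts well to the scene. However, its code length is fixed and must be specified manually, and the codebook cannot be modified promptly when the scene changes abruptly. This thesis therefore proposes a new method that combines the codebook method with the self-organizing map to associate neighboring pixels, in which the background code length of each pixel adapts automatically to the situation. Finally, building on this method, infrared data and color image data are fused and shadows are effectively removed.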
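     4) A new algorithm for tracking multiple moving targets (pedestrians) in image sequences, which combines Mean Shift with background subtraction. Using background subtraction, the classic Mean Shift algorithm is improved in two ways. First, the effective region of the moving target is extracted, and the correlation function of the feature vectors is used as the localization criterion for the tracked object. Second, the tracking region is quickly corrected using the background subtraction result. In multi-target tracking, the improved algorithm outperforms classic Mean-Shift in run time, robustness, and tracking accuracy.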
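     5) An improved multi-target tracking path optimization algorithm. Even with very high detection accuracy, a tracking algorithm inevitably produces tracking errors or loses targets when tracked targets overlap or are occluded. The human target detection and tracking system designed in this thesis first uses background subtraction to obtain the moving foreground regions, then applies cascade Latent SVM for target detection, and then classifies by color similarity to obtain the set of detection results. Finally, this set is combined with the improved Mean Shift algorithm proposed in this thesis to obtain the initial trajectories. The initial tracking results at this stage have very high positional accuracy. For this situation, the improved path optimization algorithm simplifies the general optimization algorithm, in both the optimization formula and the optimization strategy. The simplified algorithm no longer uses the trim, grow, add, and remove strategies to optimize target trajectories, and uses only the split and merge strategies.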
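     In summary, this thesis surveys and analyzes algorithms related to human target detection and tracking and points out the shortcomings of each component module. It focuses on the Latent SVM model optimization algorithm, the cascade Latent SVM detection algorithm, the self-organizing map background subtraction algorithm, the Mean-Shift tracking algorithm, and the RJMCMC multi-target tracking optimization algorithm. Experiments demonstrate the effectiveness of each method.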
Human target detection and tracking is an important research area with many applications, such as intelligent transportation systems, intelligent video surveillance, advanced human-machine interfaces, intelligent robots, video analysis, and electronic entertainment. With the development of the Internet of Things, systems for human target detection and tracking have become increasingly popular. In this thesis, the problem is decomposed into image detection, background subtraction, image tracking, and path optimization, which are discussed separately.
     Latent SVM (support vector machine) is the main image detection method studied in this thesis. In Latent SVM training, the training image samples are clustered into subsets according to aspect ratio. To adapt to different viewing angles, a separate component of the model is trained from each subset. In addition, a further analysis of the SVM model automatically extracts latent variables, which makes Latent SVM more powerful and more complicated than a general SVM algorithm. A latent variable carries displacement and appearance information, and a Latent SVM model of a human can have dozens of latent variables. These latent variables can be regarded as local characteristics of the human body, such as the head, torso, arms, and legs. The multiple components and the latent variables make Latent SVM one of the best image detection algorithms. In this thesis, the research on Latent SVM covers both the training and the detection processes.
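     As an illustration of how latent variables enter the detection score, the sketch below follows the general deformable part model formulation on which Latent SVM detectors are commonly built: a root response plus, for each part, the best part response near its anchor minus a quadratic displacement cost. It is a minimal sketch, not the model trained in this thesis; the function names, the toy response grids, the search radius, and the deformation weights are all assumptions made for the example.

```python
def part_score(response, anchor, deform, radius=2):
    """Best response of one part near its anchor, minus a quadratic
    displacement cost. `response` is a 2D list of filter responses,
    `anchor` is (row, col), `deform` is (weight_dx2, weight_dy2)."""
    ay, ax = anchor
    best, best_d = float("-inf"), (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = ay + dy, ax + dx
            if 0 <= y < len(response) and 0 <= x < len(response[0]):
                cost = deform[0] * dx * dx + deform[1] * dy * dy
                score = response[y][x] - cost
                if score > best:
                    best, best_d = score, (dy, dx)
    return best, best_d


def latent_svm_score(root_response, parts, bias=0.0):
    """Total score of one hypothesis: root response + bias + the sum of the
    best (latent) placement of every part. Returns the score and the chosen
    displacement (dy, dx) of each part."""
    total = root_response + bias
    placements = []
    for response, anchor, deform in parts:
        score, displacement = part_score(response, anchor, deform)
        total += score
        placements.append(displacement)
    return total, placements


# Toy example: two hypothetical parts on 3x3 response grids.
head = [[0, 0, 0], [0, 1, 0], [0, 0, 5]]
legs = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]
parts = [(head, (1, 1), (1.0, 1.0)), (legs, (1, 1), (1.0, 1.0))]
score, placements = latent_svm_score(2.0, parts, bias=-1.0)
```

     In this toy case the first part is displaced by one cell in each direction because the stronger response there outweighs the displacement cost, while the second part stays at its anchor. The displacement chosen for each part plays the role of the latent variable.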
     As an auxiliary method, background subtraction can be used in the system to improve accuracy and speed. Its goal is to effectively remove the background area of the video image sequence, including trees waving in the wind, turning fans, and the shadows of moving targets. Latent SVM detects human targets using the histogram of oriented gradients (HOG) pyramid of the image under test, and detection time increases with the scanned area. In a real-time video detection system, information from temporally adjacent frames can be used to quickly remove most of the background region and improve detection speed. Because scanning is restricted to the moving foreground region, the false detection rate is also greatly reduced. This thesis introduces several background subtraction algorithms, including the multivariate Gaussian mixture model, the codebook method, and the self-organizing map background subtraction algorithm, and proposes an improved algorithm that integrates their advantages.
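     The idea of restricting detection to the moving foreground can be sketched with a much simpler background model than those studied in the thesis. The example below uses a running-average background and a fixed difference threshold as a stand-in for the Gaussian mixture, codebook, and self-organizing map models; the function names, the threshold, and the learning rate are hypothetical.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose grey value differs from the background model
    by more than `threshold` (an assumed, hand-picked value)."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def update_background(background, frame, mask, alpha=0.05):
    """Blend the new frame into the background with learning rate `alpha`,
    but only at pixels currently classified as background."""
    return [[b if m else (1 - alpha) * b + alpha * f
             for b, f, m in zip(brow, frow, mrow)]
            for brow, frow, mrow in zip(background, frame, mask)]


def bounding_box(mask):
    """Smallest box (x_min, y_min, x_max, y_max) covering all foreground
    pixels, or None if there are none. A detector would scan only this box."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, m in enumerate(row) if m]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x4 grey-level frames: a bright 2x2 object enters a flat background.
background = [[100] * 4 for _ in range(4)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 200
frame[0][0] = 110  # small illumination change, below the threshold

mask = foreground_mask(background, frame)
box = bounding_box(mask)
new_background = update_background(background, frame, mask, alpha=0.5)
```

     Only the region inside `box` would then be passed to the detector, which is where the speed gain described above comes from. A per-pixel model like this one ignores relations between neighboring pixels.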
     Compared with image detection, image tracking must find both the location of the target and its movement path. Image tracking has a degree of continuity and automation and can make up for some missed detections of the image detection algorithm. An improved Mean-Shift image tracking algorithm is introduced in this thesis. It is combined with the background subtraction result, which helps to obtain a better tracking position and speed. Given the initial tracking area, the algorithm completes the tracking task automatically.
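     The basic Mean-Shift iteration behind such a tracker can be sketched as follows: given a per-pixel weight map (for example, a color-histogram back-projection, possibly zeroed outside the foreground mask), the window is moved repeatedly to the weighted centroid of the pixels it covers until it stops moving. This is a sketch of the classic iteration only, not the improved algorithm of the thesis; the weight map, window format, and function name are assumptions.

```python
import math


def mean_shift(weights, window, max_iter=20):
    """Shift a fixed-size window (x, y, w, h) over a 2D weight map until the
    weighted centroid of the covered pixels coincides with the window centre."""
    x, y, w, h = window
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for yy in range(max(y, 0), min(y + h, rows)):
            for xx in range(max(x, 0), min(x + w, cols)):
                wt = weights[yy][xx]
                total += wt
                sum_x += wt * xx
                sum_y += wt * yy
        if total == 0:
            break  # nothing to follow inside the window
        cx, cy = sum_x / total, sum_y / total
        # Re-centre the window on the centroid, rounding half up.
        nx = math.floor(cx - (w - 1) / 2 + 0.5)
        ny = math.floor(cy - (h - 1) / 2 + 0.5)
        nx = min(max(nx, 0), cols - w)
        ny = min(max(ny, 0), rows - h)
        if (nx, ny) == (x, y):
            break  # converged
        x, y = nx, ny
    return (x, y, w, h)


# Toy 8x8 weight map with a 3x3 target occupying rows 4-6 and columns 4-6.
weights = [[0.0] * 8 for _ in range(8)]
for yy in range(4, 7):
    for xx in range(4, 7):
        weights[yy][xx] = 1.0

tracked = mean_shift(weights, (2, 2, 3, 3))
```

     Starting from a window that overlaps the target by a single pixel, the iteration walks the window onto the target in a few steps. If the initial window has no overlap with any weighted pixel, it stays where it is, which is one reason a foreground mask is useful for re-initialising the tracking region.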
     When a tracked target is occluded by another target or by the background, an image tracking algorithm inevitably produces tracking errors or loses the target. Tracking path optimization helps to correct this problem. Using the location and appearance information of the tracked targets, an objective function can be designed and optimized to obtain the optimal tracking path. An improved reversible jump Markov chain Monte Carlo (RJMCMC) algorithm is introduced in this thesis, which performs better when detection accuracy is high. Before tracking path optimization, background subtraction is first applied to obtain the foreground region, and Latent SVM is then applied to detect human targets in the foreground area. Finally, the initial tracking path is obtained from the image tracking algorithm. This pipeline minimizes the false detection rate and greatly simplifies the tracking path optimization problem.
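     A rough sketch of sampling-based trajectory optimization with only split and merge moves is given below. It is a simplified Metropolis-style sampler: a full RJMCMC sampler would also include the proposal probability ratios in the acceptance test. The energy function (a per-track penalty plus the length of frame-to-frame jumps), its weights, and all function names are hypothetical and are not taken from the thesis.

```python
import math
import random


def energy(tracks, jump_weight=1.0, track_penalty=2.0):
    """Lower is better: penalise every track and every frame-to-frame jump.
    A track is a list of (frame, x, y) tuples in increasing frame order."""
    e = track_penalty * len(tracks)
    for t in tracks:
        for (_, x0, y0), (_, x1, y1) in zip(t, t[1:]):
            e += jump_weight * math.hypot(x1 - x0, y1 - y0)
    return e


def split(tracks, i, k):
    """Cut track i into two tracks before its k-th observation."""
    t = tracks[i]
    return tracks[:i] + [t[:k], t[k:]] + tracks[i + 1:]


def merge(tracks, i, j):
    """Append track j to the end of track i."""
    rest = [t for n, t in enumerate(tracks) if n not in (i, j)]
    return rest + [tracks[i] + tracks[j]]


def propose(tracks, rng):
    """Pick a random split or merge move; return the input if none applies."""
    if rng.random() < 0.5:
        candidates = [i for i, t in enumerate(tracks) if len(t) >= 2]
        if candidates:
            i = rng.choice(candidates)
            return split(tracks, i, rng.randrange(1, len(tracks[i])))
    else:
        pairs = [(i, j) for i, a in enumerate(tracks)
                 for j, b in enumerate(tracks)
                 if i != j and a[-1][0] < b[0][0]]
        if pairs:
            return merge(tracks, *rng.choice(pairs))
    return tracks


def optimise(tracks, steps=200, temperature=1.0, seed=0):
    """Accept moves that lower the energy, and worse moves with probability
    exp(-increase / temperature); remember the best state visited."""
    rng = random.Random(seed)
    current, e_cur = tracks, energy(tracks)
    best, e_best = current, e_cur
    for _ in range(steps):
        cand = propose(current, rng)
        e_new = energy(cand)
        if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / temperature):
            current, e_cur = cand, e_new
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best


# One initial track that jumps between two people after frame 1.
tracks = [[(0, 0, 0), (1, 1, 0), (2, 50, 50), (3, 51, 50)]]
best, e_best = optimise(tracks, steps=200, seed=1)
```

     For this toy input, splitting the track at the jump gives two short, smooth tracks with a much lower energy than the original single track, which is the kind of correction the split move is meant to find.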
     In this thesis, a prototype of a complete human target detection and tracking system is designed, and improvements are studied for the defects and deficiencies of each module. The main contributions of this thesis are as follows.
     1) The Latent SVM model training algorithm is improved. Human target images vary widely, so the initialization of latent variables in the Latent SVM model is very important. In the original training process, a simple classification template is trained from the HOG features of the samples by an SVM algorithm, and a greedy algorithm is then applied to the template to obtain the latent variables. To obtain a better trained model, a new image segmentation algorithm based on Mean-Shift and differential evolution is proposed to generate better latent variables. The proposed method takes into account the texture distribution of the positive sample images and automatically searches for the local characteristics of the latent variables, giving a better detection model and better performance.
     2) The Latent SVM detection algorithm is improved. Latent SVM detects human targets using the HOG pyramid of the image under test. The original detection algorithm convolves the HOG pyramid with the Latent SVM model, and the location and size of the target are obtained from the convolution score and the pyramid level. The cascade Latent SVM detection algorithm is a fast detector based on the original algorithm. First, PCA is applied to the HOG features of the sample set, and a dimension-reduced HOG pyramid is obtained. The cascade then convolves the dimension-reduced HOG pyramid with the Latent SVM model and selects only locations with large convolution scores for further analysis, which consists of the original Latent SVM detection at those locations. In this way the cascade quickly filters out non-human areas of the image. To further improve the performance of the cascade Latent SVM detection algorithm, a dimension reduction algorithm based on LDA is introduced in this thesis, together with a new local search algorithm for latent variables. Finally, to reduce the false alarm rate, a second decision model based on the color similarity of the latent variables is constructed.
     3) An improved self-organizing map background subtraction algorithm is proposed. The classic multivariate Gaussian mixture model and the codebook method both treat each pixel of the image as the basic processing unit. Such methods do not correlate adjacent pixels, sometimes adapt poorly to changes in the scene, and may produce a high false alarm rate. The self-organizing map handles the relationships between pixels in the scene effectively and adapts better to scene changes. However, it requires the code length to be fixed manually, and the codebook cannot be modified when the scene changes abruptly. A new algorithm combining the codebook method and the self-organizing map is proposed in this thesis; it associates adjacent pixels and supports a variable-length codebook. Finally, an improved algorithm based on this algorithm and SVM is introduced and applied to the fusion of infrared and color image data to effectively remove shadows.
     4) Mean-Shift and background subtraction are used together to track multiple people in image sequences. There are two contributions. First, the area of each moving target is extracted effectively, and the correlation value of the feature vectors is used as the measure of tracking accuracy. Second, a fast region correction step is performed based on the foreground-background segmentation result. The improved algorithm outperforms the conventional mean shift algorithm in run time, robustness, and tracking accuracy.
     5) An improved tracking path optimization algorithm is introduced in this thesis. When a tracked target is occluded by another target or by the background, an image tracking algorithm inevitably produces tracking errors or loses the target. The human target detection and tracking system first uses background subtraction to obtain the foreground area. An improved cascade Latent SVM algorithm is then applied in the foreground area to detect human targets, which are checked by the color similarity model. The initial tracking path is obtained from the detection results and the Mean Shift algorithm. These steps ensure high detection accuracy. In this situation, the improved algorithm performs better than the original tracking path optimization algorithm. It uses a simpler objective function and a simpler optimization strategy, which drops the decrease, increase, add, and delete moves and keeps only merge and split.
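     The two-stage filtering described in contribution 2) can be sketched as follows: each candidate window is first scored cheaply in a low-dimensional projection, and the full-dimensional score is computed only for windows that pass a pruning threshold. This is a minimal sketch under stated assumptions: the projection basis is hard-coded rather than learned by PCA or LDA, the features are toy vectors rather than HOG, and the function names and thresholds are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(vec, basis):
    """Project a feature vector onto a list of basis vectors."""
    return [dot(vec, b) for b in basis]


def cascade_detect(windows, model, basis, prune_threshold, final_threshold):
    """Stage 1: cheap score in the reduced space, discard weak windows.
    Stage 2: full score on the survivors only.
    Returns the detections and the number of full evaluations performed."""
    reduced_model = project(model, basis)
    detections = []
    full_evals = 0
    for idx, feat in enumerate(windows):
        cheap = dot(project(feat, basis), reduced_model)
        if cheap < prune_threshold:
            continue  # rejected without touching the full model
        full_evals += 1
        score = dot(feat, model)
        if score >= final_threshold:
            detections.append((idx, score))
    return detections, full_evals


# Toy data: 4-dimensional features, reduced to their first two coordinates.
basis = [[1, 0, 0, 0], [0, 1, 0, 0]]
model = [1, 1, 1, 1]
windows = [
    [0, 0, 1, 0],  # weak everywhere: pruned in stage 1
    [2, 2, 0, 0],  # passes stage 1, fails the full test
    [2, 2, 1, 1],  # passes both stages
]
detections, full_evals = cascade_detect(windows, model, basis,
                                        prune_threshold=1, final_threshold=5)
```

     In a real detector the reduced features would be computed once per pyramid level rather than per window, so the saving comes from skipping the expensive full-dimensional evaluation at most locations. A pruning threshold that is too aggressive can also discard true targets, so it has to be chosen with the miss rate in mind.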
     In summary, through an analysis of algorithms related to human target detection and tracking, this thesis points out their drawbacks. Based on this analysis, it mainly studies the Latent SVM training algorithm, the cascade Latent SVM detection algorithm, the self-organizing map background subtraction algorithm, the mean-shift image tracking algorithm, and the RJMCMC multi-target tracking optimization algorithm. In addition, experiments were conducted to validate the performance of each improved algorithm.

