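Research on Online Boosting Learning Algorithms for Object Tracking

English title: Online boosting Learning for Object Tracking
Author: 裴玉红 (Pei Yuhong)
Degree level: Master's
Discipline: Computer Science and Technology
Chinese keywords (translated): video object tracking; online boosting; kernel regression; Gabor filtering; tensor features
English keywords: Video object tracking ; Online boosting ; Kernel recursive ; Gabor filter ; Tensor feature
Degree year: 2010
Advisor: 马波 (Ma Bo)
Discipline code: 081202
Degree-granting institution: Beijing Institute of Technology
Thesis submission date: 2010-06-01
Defense committee chair: 陆耀 (Lu Yao)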

Abstract

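Video object tracking is an important research branch of computer vision with important applications in many fields. Recently, learning-based tracking algorithms have attracted growing attention from researchers and have achieved good tracking results. A representative approach treats tracking as a classification problem between the object and the background: rather than building a complex model to describe the object, it finds a decision boundary that separates object from background, so that when the object's appearance changes only the decision boundary needs to be updated, not an appearance model of the object. Typical learning-based video tracking algorithms at present include Ensemble Tracking, proposed by Avidan et al., and boosting of adaptive linear weak classifiers, proposed by Toufiq Parag. Their main idea is to train a set of linear weak classifiers online and to use simple features such as color and intensity to separate the object from the background. For tracking objects in complex scenes, however, these methods may lose their effectiveness. Building on boosting-based visual object tracking, this thesis studies two aspects, features and classifiers, incorporating kernel-based classifiers as well as Gabor filter and tensor features. The main work and contributions are as follows:
     1) An online boosting tracking algorithm based on Gabor filtering is proposed. Because Gabor filters have excellent spatial locality and orientation selectivity, they can extract spatial-frequency and local structural features in multiple orientations within local image regions, and therefore discriminate well between object and background. Using high-dimensional Gabor features directly, however, slows tracking down. To address this, the thesis considers how to reduce the dimensionality of Gabor features effectively, adopting two schemes to reduce dimensionality and extract the most salient discriminative features: (a) using a local Gabor filter bank; (b) using adaptive Gabor filter bank parameters. The reduced Gabor features are then combined with the online boosting tracking framework to track the object.
     2) An online boosting tracking algorithm based on tensor features is proposed. As an extension of and complement to the traditional vector pattern, the tensor pattern has attracted wide attention in machine learning, pattern recognition, and related fields in recent years. Tensor features capture the gradient-orientation characteristics of an object and discriminate well for objects with strong texture. The thesis therefore combines tensor features with the online boosting tracking framework, which gives good tracking results for strongly textured objects.
     3) An online boosting tracking algorithm based on least-squares kernel regression is proposed. In complex tracking scenes, linear classifiers often fail to give good classification results, so a classifier based on least-squares kernel regression is used in place of the linear classifier. The main idea is to use a Mercer kernel to map patterns that are not linearly separable in the low-dimensional space, through a nonlinear mapping, into a high-dimensional feature space where they become linearly separable. Because kernel-based classification yields a classifier of high dimension, an online sparsification algorithm is used: the least-squares kernel regression classifier is trained on a selected subset of the samples and then combined with the boosting-based tracking framework. Experiments show that the method tracks objects in complex scenes accurately.
     Feature selection and classifier design are two important directions in pattern recognition, and this thesis studies both. First, discriminative features are selected, namely Gabor filter and tensor features; then a kernel-based classifier is used in place of the linear classifier. Compared with traditional algorithms, the proposed algorithm achieves more accurate tracking results in complex tracking scenes and tracks stably and in real time. Experimental results verify the effectiveness of the algorithm.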
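The classification view of tracking described above, in which a set of linear weak classifiers is combined by boosting to separate object from background, can be illustrated with a small sketch. It is not the thesis's implementation: the weak learner (weighted least squares with a bias term), the batch AdaBoost loop, the function names, and the toy two-dimensional features are all assumptions chosen for brevity, and an online tracker would update the weak classifiers frame by frame instead.

```python
import numpy as np


def fit_weak(X, y, w):
    """Weighted least-squares linear classifier (assumed weak learner)."""
    A = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


def predict_weak(coef, X):
    """Return +1 (object) or -1 (background) for each feature vector."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.where(A @ coef >= 0.0, 1.0, -1.0)


def adaboost(X, y, rounds=3):
    """Batch AdaBoost over linear weak classifiers (illustrative only)."""
    w = np.full(len(X), 1.0 / len(X))             # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        coef = fit_weak(X, y, w)
        pred = predict_weak(coef, X)
        # Weighted error, clipped so the log below stays finite.
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        ensemble.append((alpha, coef))
        w = w * np.exp(-alpha * y * pred)         # emphasize mistakes
        w = w / w.sum()
    return ensemble


def strong_score(ensemble, X):
    """Weighted vote of the weak classifiers; the sign is the decision."""
    return sum(alpha * predict_weak(coef, X) for alpha, coef in ensemble)


# Toy features: three "object" samples and three "background" samples.
X = np.array([[2.0, 2.0], [2.0, 3.0], [3.0, 2.0],
              [-2.0, -2.0], [-2.0, -3.0], [-3.0, -2.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
ensemble = adaboost(X, y)
scores = strong_score(ensemble, X)
```

In a tracker, the sign or magnitude of `strong_score` evaluated at each candidate location would serve as a confidence value for the object; how that map is searched is outside this sketch.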
Video object tracking is an important research branch of computer vision and has applications in many fields. Recently, learning-based visual tracking has attracted the attention of many researchers because it can achieve good tracking performance. A representative method treats tracking as a classification problem between object and background. Instead of building a complex model to describe the visual object, this method seeks a decision boundary between object and background. When the appearance of the object changes, only the decision boundary needs to be updated, not the object appearance model. Currently, representative learning-based tracking algorithms include Ensemble Tracking, proposed by Avidan, and boosting of adaptive linear weak classifiers for online learning, proposed by Toufiq Parag. The basic principle behind both methods is to train a set of linear weak classifiers online for visual tracking, using simple image features such as color and intensity; as a result, they may fail to track the visual object in complex scenes. By incorporating better image features, such as tensor and Gabor features, and replacing the linear weak classifier with a nonlinear weak classifier, this thesis carries out research on boosting-based visual object tracking. Specifically, this thesis makes the following contributions:
     1) Gabor filters are used to obtain better image features for visual tracking by online boosting. Compared with intensity and color, Gabor filters have good spatial locality and orientation selectivity, and can extract multi-orientation spatial-frequency features and local structure features. As a result, they discriminate better between background and foreground. However, a naive application of high-dimensional Gabor features slows tracking down. We therefore use two schemes to reduce the dimensionality and select the most discriminative features: (a) using a local Gabor filter bank to extract the Gabor feature vectors; (b) adjusting the filter bank parameters adaptively.
     2) Tensor features are explored for visual tracking by online boosting. As a complement to typical vector patterns, tensor features can capture gradient-direction information and discriminate well for objects with strong texture. By combining tensor features with the online boosting algorithm, we achieve good tracking results for textured visual objects.
     3) An online boosting method using the recursive least-squares (RLS) algorithm is proposed for visual tracking. In complex scenes, linear classifiers often cannot achieve good discriminative power, so we employ a nonlinear (kernel) version of the RLS algorithm. It performs linear regression in a high-dimensional feature space induced by a Mercer kernel, and can therefore be used to recursively construct minimum mean-squared-error solutions to nonlinear least-squares problems. To regularize solutions and keep the complexity of the algorithm bounded, we use a sequential sparsification process that admits a new input sample into the kernel representation only if its feature-space image cannot be sufficiently well approximated by a combination of the images of previously admitted samples. With this sparsification procedure, the weak classifiers can be updated online.
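The sparsification rule in item 3 can be sketched as follows. A new sample x is admitted only when the residual delta = k(x, x) - k_vec^T K^{-1} k_vec exceeds a threshold, where K is the kernel matrix of the current dictionary and k_vec holds the kernel values between the dictionary and x. The Gaussian kernel, the threshold `nu`, the function names, and the batch least-squares fit on the dictionary are assumptions made for illustration; the recursive update of the solution is omitted here.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel, one common Mercer kernel (assumed here)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-np.dot(d, d) / (2.0 * sigma ** 2)))


def build_dictionary(samples, nu=0.1, sigma=1.0):
    """Admit a sample only if the dictionary approximates it poorly."""
    dictionary = []
    for x in samples:
        if not dictionary:
            dictionary.append(x)                  # first sample is always kept
            continue
        K = np.array([[gaussian_kernel(a, b, sigma) for b in dictionary]
                      for a in dictionary])
        k_vec = np.array([gaussian_kernel(a, x, sigma) for a in dictionary])
        coeffs = np.linalg.solve(K, k_vec)        # best combination weights
        # Squared feature-space distance from x to the dictionary's span.
        delta = gaussian_kernel(x, x, sigma) - float(k_vec @ coeffs)
        if delta > nu:
            dictionary.append(x)
    return dictionary


def fit_on_dictionary(samples, labels, dictionary, sigma=1.0):
    """Batch least-squares kernel regression restricted to the dictionary."""
    K = np.array([[gaussian_kernel(x, d, sigma) for d in dictionary]
                  for x in samples])
    alpha, *_ = np.linalg.lstsq(K, np.asarray(labels, dtype=float), rcond=None)
    return alpha


def predict(x, dictionary, alpha, sigma=1.0):
    """Regression output; its sign can serve as a class decision."""
    k_vec = np.array([gaussian_kernel(d, x, sigma) for d in dictionary])
    return float(k_vec @ alpha)


# Toy data: two tight clusters, so only one sample per cluster is admitted.
samples = [[0.0, 0.0], [0.01, 0.0], [3.0, 3.0], [3.0, 3.01], [0.0, 0.0]]
labels = [1, 1, -1, -1, 1]
dictionary = build_dictionary(samples)
alpha = fit_on_dictionary(samples, labels, dictionary)
```

Because near-duplicate samples are rejected, the dictionary size, and with it the cost of each kernel evaluation, stays bounded by the diversity of the data rather than by the number of frames processed.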
     Feature selection and classifier design are two important topics in pattern recognition, and this thesis addresses both. First, Gabor filters and tensor features are used to extract discriminative features. Then a kernel-based classifier is used in place of the linear classifier. Even when the tracking scenario is very complex, this method still achieves good results. Experimental results verify the effectiveness of the algorithm.
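As an illustration of the Gabor feature extraction mentioned above, the following sketch builds a small filter bank and reduces each filter response to a single mean-magnitude value. It assumes a complex Gabor kernel with an isotropic Gaussian envelope, FFT-based circular convolution, and hypothetical parameter names (`wavelengths`, `thetas`, `size`, `sigma`); restricting the bank to a few wavelengths and orientations mimics the idea of a local filter bank, but the thesis's actual parameter choices are not given in this record.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)   # axis along the carrier
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_rot / wavelength)
    return envelope * carrier


def gabor_features(patch, wavelengths, thetas, size=9, sigma=2.0):
    """One mean-magnitude feature per (wavelength, orientation) pair."""
    patch = np.asarray(patch, dtype=float)
    patch_f = np.fft.fft2(patch)
    feats = []
    for wavelength in wavelengths:
        for theta in thetas:
            kern = gabor_kernel(size, wavelength, theta, sigma)
            # Circular convolution via the FFT (kernel is zero-padded).
            resp = np.fft.ifft2(patch_f * np.fft.fft2(kern, s=patch.shape))
            feats.append(np.abs(resp).mean())
    return np.array(feats)


# Toy patch: vertical stripes with a period of 4 pixels along x.
cols = np.arange(32)
patch = np.tile(np.cos(2.0 * np.pi * cols / 4.0), (32, 1))
feats = gabor_features(patch, wavelengths=[4.0], thetas=[0.0, np.pi / 2.0])
```

The filter whose carrier runs along x (theta = 0) responds much more strongly to this patch than the one rotated by 90 degrees, which is the orientation selectivity the abstract refers to.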

References

[1] 胡家升. 光学工程导论[M]. 大连: 大连理工大学出版社, 2005.
[2] Sullivan G, Baker K, Worrall A, et al. Model-based vehicle detection and classification using orthographic approximations. Image and Vision Computing, 1997, 15(8): 649-654.
[3] Ferryman J. Visual surveillance for moving vehicles. International Journal of Computer Vision, 2000, 37(2): 187-197.
[4] Ratches J A, Walters C P, Buser R G. Aided and automatic target recognition based upon sensory inputs from image forming systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(9): 1004-1019.
[5] Maggioni C, Rottge H. Virtual TouchScreen: A Novel User Interface Made of Light-Principles, Metaphors and Experiences. Proceedings of the Eighth International Conference on Human Computer Interaction. Cambridge Press, 1999: 301-305.
[6] Davison A J, Murray D W. Mobile Robot Localisation Using Active Vision. Proc. 5th European Conference on Computer Vision. Springer Verlag, 1998: 385-392.
[7] Espiau B, Chaumette F, Rives P. A New Approach to Visual Servoing in Robotics. IEEE Trans. Robotics and Automation, 1993, 8(3): 313-320.
[8] Crisman J D. Color Region Tracking for Vehicle Guidance. In: Blake A, Yuille A, eds. Active Vision. MIT Press, 1992: 107-120.
[9] Pla F, Juste F, Ferri F, Vicens M. Color Segmentation Based on a Light Reflection Model to Locate Citrus Fruits for Robotic Harvesting. Computers and Electronics in Agriculture, 1993, 9(1): 53-70.
[10] Collins R T, Lipton A J, Kanade T. A system for video surveillance and monitoring[R]. Pittsburgh: Robotics Institute, Carnegie Mellon University, 2000.
[11] Javed O, Zeeshan R, Alatas O, et al. Knight M: A real time surveillance system for multiple overlapping and non-overlapping cameras[J]. ICME, 2003.
[12] 汪亚明, 楼正国, 卞听. 一种非刚体运动图象序列的特征点对应方法. 中国图象图形学报, 2000, 5(3): 232-236.
[13] Remagnino P, Tan T, Baker K. Multi-agent visual surveillance of dynamic scenes[J]. Image and Vision Computing, 1998, 16(8): 529-532.
[14] Porikli F. Integral histogram: A fast way to extract histograms in Cartesian spaces. In: Proc IEEE Conf on Computer Vision and Pattern Recognition. San Diego, CA, USA, 2005: 829-836.
[15] Mohammad G A. A fast globally optimal algorithm for template matching using low-resolution pruning. IEEE Trans on Image Processing, 2001, 10(4): 526-533.
[16] 杨静, 丘江, 刘波. MSEA及其在模板匹配中的应用. 光子学报, 2001, 30(4): 451-454.
[17] Elgammal A, Duraiswami R, Davis L S. Efficient kernel density estimation using fast Gauss transform with applications to color modeling and tracking. IEEE Trans on Pattern Analysis and Machine Intelligence, 2003, 25(11): 1499-1504.
[18] Nguyen H T, Worring M, van den Boomgaard R. Occlusion robust adaptive template tracking. In: Proc of IEEE Int Conf on Computer Vision. New York, 2001: 678-683.
[19] Oberti F, Calcagon S, Zara M. Robust tracking of humans and vehicles in cluttered scenes with occlusions. In: Proc of IEEE Int Conf on Image Processing. New York, 2002: 629-632.
[20] 赵建伟, 刘重庆. 适用于遮挡的网格跟踪算法. 上海交通大学学报, 2003, 37(3): 440-443.
[21] 马颂德, 张正友. 计算机视觉. 科学出版社, 1999: 58-8.
[22] Gennery D B. Tracking of known three-dimensional objects[J]. In: Proc. AAAI 2nd Nat. Conf. AI, 1982: 13-17.
[23] Won K I, Lee C Y, Lee J J. Tracking moving object using Snake's jump based on image flow[J]. Mechatronics, 2001, 11: 199-216.
[24] Zhiqiang Wei, Xiaopeng Ji, Peng Wang. Real-time moving object detection for video monitoring systems[J]. Journal of Systems Engineering and Electronics, 2006, 17(4): 731-736.
[25] Jorge Badenas, Jose Miguel Sanchiz, Filiberto Pla. Motion-based segmentation and region tracking in image sequence[J]. Pattern Recognition, 2001, 34(3): 661-670.
[26] Barron J, Fleet D, Beauchemin S. Performance of optical flow techniques[J]. International Journal of Computer Vision, 1994, 12(1): 42-77.
[27] M. Kass, A. Witkin, D. Terzopoulos. Snakes: Active Contour Models. International Journal of Computer Vision, 1987, 1(4): 321-330.
[28] Terzopoulos D, Szeliski R. Tracking with Kalman Snakes. In: Blake A, Yuille A, eds. Active Vision. MIT Press, 1992: 5-20.
[29] Isard M, Blake A. Condensation-conditional density propagation for visual tracking[J]. International Journal of Computer Vision, 1998, 29(1): 5-28.
[30] Collins R, Liu Y, Leordeanu M. Online selection of discriminative tracking features[J]. PAMI, 2005, 27(10): 1631-1643.
[31] Tang F, Brennan S, Zhao Q, et al. Co-Tracking Using Semi-Supervised Support Vector Machines[C]. IEEE Conference on Computer Vision (ICCV), 2007.
[32] 王震宇, 张可黛, 吴毅等. 基于SVM和AdaBoost的红外目标跟踪[J]. 中国图象图形学报, 2007, 12(11): 2052-2057.
[33] Oza N and Russell S. Online bagging and boosting. In Artificial Intelligence and Statistics, pages 105-112, 2001.
[34] Shai Avidan. Ensemble tracking[J]. PAMI, 2007, 29(2): 261-271.
[35] Toufiq Parag. Boosting Adaptive Linear Weak Classifiers for online learning and Tracking. In CVPR.2008.4587556.
[36] Dietterich T.G. Machine Learning Research: Four Current Directions. AI Magazine, 18(4): 97-136, 1997.
[37] Angluin, D. (1992). Computational learning theory: survey and selected bibliography. Proceedings of the twenty-fourth annual ACM symposium on Theory of computing (pp. 351-369). New York: ACM Press.
[38] Eric M., Dennis D. Machine Learning for Science: State of the Art and Future Prospects. Science, Vol. 293, 2001, pp. 2051-2055.
[39] Thomas G. Dietterich. Ensemble learning. In The Handbook of Brain Theory and Neural Networks, Second Edition, 2002.
[40] Tom M. Mitchell. Machine Learning. McGraw Hill, 1997.
[41] Dietterich T.G. Ensemble Methods in Machine Learning. In Multiple Classifier Systems, Cagliari, Italy, 2000.
[42] Valentini G and Masulli F. "Ensembles of learning machines," in Neural Nets WIRN Vietri-02, Series Lecture Notes in Computer Sciences, M. Marinaro and R. Tagliaferri, Eds.: Springer-Verlag, Heidelberg (Germany), 2002. Invited Review.
[43] Amini A., Weymouth T, Jain R. Using Dynamic Programming for Solving Variational Problems in Vision. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1990, 12(9): 855-867.
[44] Seung H.S, Opper M, and Sompolinsky H. Query by committee. In Proceedings of the Fifth Workshop on Computational Learning Theory, pages 287-294, San Mateo, California, 1992. Morgan Kaufmann.
[45] Kittler, J., Hatef, M., Duin, R. P., and Matas, J. 1998. On Combining Classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 20, 3 (Mar. 1998), 226-239.
[46] Wolpert, D.H. (1992). Stacked Generalization. Neural Networks, Vol. 5, pp. 241-259, Pergamon Press.
[47] Ricardo V., Youssef D. A perspective view and survey of meta-learning. Artificial Intelligence Review, 18(2): 77-95, 2002.
[48] William W. Cohen, Vitor R. Carvalho. Stacked Sequential Learning. IJCAI 2005: 671-676.
[49] Kearns M., & Mansour Y. (1996). On the boosting ability of top-down decision tree learning algorithms. In Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing.
[50] Z.-H. Zhou, J. Wu, and W. Tang. Ensembling Neural Networks: Many Could Be Better than All. Artificial Intelligence, 2002, 137(1-2): 239-263.
[51] Robert E. Schapire. The Strength of Weak Learnability. Machine Learning, 5(2): 197-227, 1990.
[52] Freund Y, Schapire R. E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
[53] 宋春雷, 王龙等. 学习理论与鲁棒控制. 控制理论与应用, Vol 17(5), 633-636, 2000/10.
[54] Valiant L.G. A Theory of the Learnable. Communications of the ACM, Vol 27(11), 1134-1142, 1984/11.
[55] Kearns M. The Computational Complexity of Machine Learning. Cambridge: MIT Press, 1990.
[56] Kearns M, Valiant L. G. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata. Journal of the ACM, 41(1): 67-95, 1994/01.
[57] Schapire R. E. The Strength of Weak Learnability. Machine Learning, 1990, 5(2): 197-227.
[58] Freund Y. Boosting a Weak Learning Algorithm by Majority. Information and Computation, 1995, 121(2): 256-285.
[59] Grabner H and Bischof H. On-line boosting and vision. In CVPR, pages 260-267, 2006.
[60] 邓洪波. 一种基于局部Gabor滤波器组及PCA+LDA的人脸表情识别方法. 中国图象图形学报, 2007.
[61] 陈蓉. 一种基于局部Gabor滤波器组的手写体汉字识别方法. 计算机应用, 2007.
[62] Donato G, Bartlett M S, Hager J C, et al. Classifying facial actions[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(10): 974-989.
[63] Lee T S. Image representation using 2D Gabor wavelets[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996, 18(10): 959-971.
[64] Liu C, Wechsler H. A Gabor feature classifier for face recognition[A]. In: Proceedings of the Eighth IEEE International Conference on Computer Vision[C], Vancouver, Canada, 2001, 2: 270-275.
[65] 赵英男等. 一种实用的Gabor滤波器组参数设置方法. 人工智能及识别技术.
[66] 程万里等. 基于Gabor-2DLDA方法的人脸识别研究.
[67] Yaakov Engel. The Kernel Recursive Least-Squares Algorithm. IEEE, 2004.
[68] Herbrich R. Learning Kernel Classifiers. Cambridge, MA: MIT Press, 2002.
[69] Scholkopf B and Smola A. Learning With Kernels. Cambridge, MA: MIT Press, 2002.
[70] Vapnik V. Statistical Learning Theory. New York: Wiley Interscience, 1998.
[71] Kailath T, Sayed A, and Hassibi B. Linear Estimation. Englewood Cliffs, NJ: Prentice-Hall, 2000.
