Research on Moving Human Tracking Algorithms in Complex Backgrounds

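English title: Research on Moving People Tracking for Complex Environment
Author: Zhao Yu (赵煜)
Thesis level: Master's
Discipline: Computer Application Technology
Chinese keywords: low contrast; detection rate; false-alarm rate; local histogram entropy; local gray entropy; mean shift
English keywords: Visual Tracking; Ratio of Detection; Ratio of False-alarm; Local Histogram Entropy; Local Gray Entropy; Mean Shift
Degree year: 2011
Advisor: Li Wei (黎蔚)
Discipline code: 081203
Degree-granting institution: Henan University of Science and Technology (河南科技大学)
Thesis submission date: 2011-04-01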

Abstract

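Research on tracking moving people is one of the core topics in machine vision and is widely applied in video coding, intelligent transportation, intelligent surveillance, image retrieval, the defense industry, and many other fields. This thesis studies moving-human tracking in complex, low-contrast environments in depth, focusing on how to recognize and extract moving human targets in such environments and how to track them afterwards. The main work is as follows: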
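     1. Fast background construction and updating: In complex scenes, especially when a large area is monitored, a single background generation and maintenance model always spends a large share of system resources processing useless information. To address this, we use a partition-managed background modeling method that models different regions with different methods, making more effective use of system resources. During background generation and maintenance, the background is divided into regions of equal size (similar to "tiles"), and each tile is updated according to the change characteristics of the region it lies in, so the model adapts quickly to environmental changes while using few system resources.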
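     2. Fast and accurate extraction of moving targets: To obtain a fairly detailed target shape while avoiding sensitivity to non-stationary scene changes, this thesis uses a target detection method based on local neighborhood similarity. When analyzing a pixel in the input video, the method also considers the similarity of the surrounding background, and it distinguishes background from foreground by how the image block around the pixel changes over time. Without any preprocessing, the method both reduces noise interference effectively and extracts moving targets quickly and accurately.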
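     3. Recognition of moving people under low contrast: Two main factors make recognition of moving people difficult under low contrast: dim lighting at capture time and a long capture distance. To address them, the concept of local histogram entropy is introduced and a human recognition algorithm based on it is proposed. The experimental data are evaluated with the detection rate and the false-alarm rate to obtain the optimal threshold on the local histogram entropy difference for extracting the human body in each of the two low-contrast settings. Analysis of the theory and the experimental data shows that the accuracy of the algorithm based on local histogram entropy still needs improvement. The concept of local gray entropy is therefore introduced and a human recognition algorithm based on it is proposed; this algorithm is likewise evaluated with the detection rate and the false-alarm rate to obtain the optimal threshold on the local gray entropy difference. An overall evaluation concludes that the algorithm based on local gray entropy is better suited to human recognition under low contrast.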
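     4. Moving-human tracking: Because the Mean Shift algorithm shows high real-time performance when tracking moving people and is insensitive to some interfering factors, this thesis adopts a Mean Shift based algorithm for real-time human tracking. To further improve its robustness, several improvements are made, resulting in an improved Mean Shift moving-human tracking algorithm. In the target modeling stage, human recognition is used to locate the human region, the human target in that region is modeled with multiple features, and the feature sub-model with high contrast is selected for tracking. In the subsequent tracking stage, the features of the target are compared with those of the surrounding region and the optimal sub-model is selected, which strengthens the contrast between the target and its surroundings and thereby achieves robust tracking. In addition, to improve the matching accuracy of moving-human tracking, a generalized distance is introduced for target matching; experiments show that this effectively improves tracking accuracy.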
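The entropy-based recognition step in item 3 can be illustrated with a small sketch. The abstract does not give the formulas, so the definitions below are assumptions taken from common usage: local histogram entropy is the Shannon entropy of the gray-level histogram inside a window, and local gray entropy is the entropy of the gray values normalised by the window's total intensity. The function names, the thresholding helper, and the false-alarm definition (false positives over all true negatives plus false positives) are likewise hypothetical and may differ from those used in the thesis.

```python
import math
from collections import Counter


def local_histogram_entropy(window):
    """Shannon entropy (bits) of the gray-level histogram of a window (assumed definition)."""
    pixels = [v for row in window for v in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def local_gray_entropy(window):
    """Entropy (bits) of gray values normalised by the window's total intensity (assumed definition)."""
    pixels = [v for row in window for v in row]
    total = sum(pixels)
    if total == 0:
        return 0.0  # an all-black window has no intensity to distribute
    return -sum((v / total) * math.log2(v / total) for v in pixels if v > 0)


def entropy_difference_mask(diffs, threshold):
    """Flag windows whose entropy difference exceeds the threshold (hypothetical helper)."""
    return [d > threshold for d in diffs]


def detection_and_false_alarm(predicted, truth):
    """Return (detection rate, false-alarm rate) for boolean predictions against ground truth."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    detection = tp / (tp + fn) if tp + fn else 0.0
    false_alarm = fp / (fp + tn) if fp + tn else 0.0
    return detection, false_alarm
```

Under these assumptions, a threshold could be chosen by sweeping candidate values, building a mask for each, and keeping the value that gives a high detection rate at an acceptable false-alarm rate.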
Tracking moving people is a challenging task in computer vision, with applications in video coding, intelligent transportation, intelligent surveillance, image retrieval, the defense industry, and other areas. This thesis studies the technique in complex, low-contrast environments, with particular attention to recognizing and tracking moving people in such environments. The main content and contributions of the dissertation are as follows:
     1. Background generation and maintenance. In complex backgrounds, especially when monitoring large-scale scenes, a great deal of resources is consumed processing useless information. To solve this problem, the background is partitioned and different regions are handled differently. The background is divided into areas of equal size (similar to "patches"), and each patch is updated according to the changes in its corresponding region. The model can therefore adapt quickly to environmental changes while occupying very few system resources.
     2. Moving object detection and extraction. Background subtraction is widely used in moving object detection. Pixel-based methods are sensitive to non-stationary changes in the scene, while region-based approaches allow only coarse detection of moving objects. In this thesis, an algorithm based on local neighborhood similarity is proposed: when a pixel needs to be classified, the similarity of its surrounding pixels to the background model is taken into account. The performance of the proposed method was evaluated in a series of indoor and outdoor experiments. Compared with the widely used Mixture of Gaussians, the proposed algorithm achieved better results in object detection and extraction.
     3. Recognition of people in low-contrast environments. Two main factors make people recognition difficult in low-contrast environments: shooting in dim light and shooting from a distance. To address them, the concept of local histogram entropy is introduced and a people recognition algorithm based on it is proposed. The algorithm is evaluated by its detection ratio and false-alarm ratio to obtain the optimal thresholds on the local histogram entropy difference for extracting the human body under the two low-contrast conditions. Theoretical analysis and experimental data show that the accuracy of this algorithm could still be improved. The concept of local gray entropy is then introduced and a people recognition algorithm based on it is proposed. This algorithm is also evaluated by its detection ratio and false-alarm ratio to obtain the optimal thresholds on the local gray entropy difference under the two low-contrast conditions. The results show that the algorithm based on local gray entropy performs better than the one based on local histogram entropy in low-contrast environments.
     4. Moving people tracking. We use a tracking algorithm based on mean shift because it offers high real-time performance and is not sensitive to some interfering factors. To improve its robustness, we make several improvements. First, in the target modeling stage, a group of feature models is generated to represent the object after people recognition, and the one with the greatest contrast to the background is used for tracking. In the follow-up tracking phase, the most discriminative model is selected so as to achieve robust results. Finally, a generalized distance is employed to improve the accuracy of target matching. Experiments demonstrate the effectiveness of the new strategy.
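The tracking step in item 4 can be sketched in its basic form. The code below is a minimal, assumption-laden illustration of standard histogram-based mean shift tracking on a grayscale image, not the improved multi-feature algorithm of the thesis. It assumes a square window with uniform pixel weighting, an 8-bin gray histogram, and the Minkowski distance as one possible "generalized distance"; all function and parameter names are hypothetical.

```python
import math


def window_ranges(image, center, radius):
    """Row and column ranges of the square window, clipped to the image."""
    cy, cx = center
    rows = range(max(0, cy - radius), min(len(image), cy + radius + 1))
    cols = range(max(0, cx - radius), min(len(image[0]), cx + radius + 1))
    return rows, cols


def histogram(image, center, radius, bins=8):
    """Normalised gray-level histogram of the window around center."""
    rows, cols = window_ranges(image, center, radius)
    hist = [0.0] * bins
    for y in rows:
        for x in cols:
            hist[image[y][x] * bins // 256] += 1.0
    total = sum(hist)
    return [h / total for h in hist]


def mean_shift_step(image, center, radius, target_hist, bins=8):
    """One update: centroid of window pixels weighted by sqrt(q_u / p_u)."""
    cand = histogram(image, center, radius, bins)
    rows, cols = window_ranges(image, center, radius)
    sw = sy = sx = 0.0
    for y in rows:
        for x in cols:
            u = image[y][x] * bins // 256
            w = math.sqrt(target_hist[u] / cand[u])  # cand[u] > 0 for pixels in the window
            sw += w
            sy += w * y
            sx += w * x
    if sw == 0.0:
        return center  # no pixel resembles the target; stay in place
    return (int(round(sy / sw)), int(round(sx / sw)))


def track(image, start, radius, target_hist, max_iter=20):
    """Iterate mean-shift steps until the window centre stops moving."""
    center = start
    for _ in range(max_iter):
        new_center = mean_shift_step(image, center, radius, target_hist)
        if new_center == center:
            break
        center = new_center
    return center


def minkowski_distance(p, q, order=2):
    """Minkowski distance between two histograms (order=2 is Euclidean)."""
    return sum(abs(a - b) ** order for a, b in zip(p, q)) ** (1.0 / order)
```

In a sketch like this, the target histogram would be built once from the region returned by the recognition step, `track` would be called on each new frame starting from the previous position, and `minkowski_distance` between the target and candidate histograms could serve as the matching score.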
