Research on Model-Based Human Motion Tracking and Pose Analysis
Abstract
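Tracking people in video and estimating their pose is an active problem in computer vision with broad application prospects. Existing methods still have shortcomings, such as inaccurate and slow pose estimation, so an accurate and fast method for tracking and pose estimation is needed.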
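     Human features are highly varied, and feature selection and extraction are the key problems in human motion analysis. Based on a comparative analysis, this thesis uses the histogram of oriented gradients (HOG) as the feature in the human detection stage. Human detection is the prerequisite for tracking and pose estimation, and the quality of the detection result directly affects both, yet the original HOG method can hardly run in real time. This thesis therefore improves it by introducing an integral vector image into the HOG computation, which effectively reduces the time needed to compute HOG features.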
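     The choice of a human motion tracking method depends on the application and on the characteristics of the tracked target. This work requires real-time performance and has fairly accurate foreground observations, so it adopts a tracking method based on the Kalman filter, which is easy to implement. Its novelty lies in combining HOG with the Kalman filter: the combination not only tracks the position of the human target accurately but also determines the size of the body, which lays a good foundation for pose estimation.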
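     For human pose estimation, this thesis adopts and improves a method based on the pictorial structure model. To reduce the search space, some preprocessing is carried out before pose estimation. First, human detection determines the approximate position and scale of the body, and the detection window is used as the input to pose estimation, which greatly reduces the search space and speeds up pose estimation. Second, constraints derived from prior knowledge are imposed inside the detection window: for example, the head and torso usually lie in the horizontal middle of the window, with the head centered above the torso and the torso directly below the head. This further reduces the search space and increases speed. When estimating pose in a video sequence, the thesis exploits not only the appearance continuity between frames but also their geometric continuity, which improves both the accuracy and the speed of pose estimation.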
Tracking the human body in video and estimating its pose is an active topic in computer vision with broad application prospects. However, existing methods have shortcomings, such as inaccurate and slow pose estimation, so it is necessary to develop a better method for tracking and pose estimation.
     Person features are highly varied, so choosing and extracting features is the key issue in human motion analysis. This paper uses the histogram of oriented gradients (HOG) feature for human detection. Human detection is the premise of tracking and pose estimation, but the original method is difficult to run in real time, so the paper introduces the integral histogram into the calculation of HOG features, which reduces the computation time.
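The speed-up from an integral histogram can be illustrated with a minimal sketch. It assumes unsigned gradient orientations quantized into 9 bins and plain Python lists for the image; the function names (`gradient_bins`, `integral_histogram`, `region_histogram`) are hypothetical and are not taken from the thesis. One summed-area table is built per orientation bin, after which the orientation histogram of any rectangular cell costs four lookups per bin, regardless of the cell size.

```python
import math


def gradient_bins(image, n_bins=9):
    """Return per-pixel orientation bin indices and gradient magnitudes."""
    h, w = len(image), len(image[0])
    bins = [[0] * w for _ in range(h)]
    mags = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mags[y][x] = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees (an assumption).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bins[y][x] = min(int(angle / (180.0 / n_bins)), n_bins - 1)
    return bins, mags


def integral_histogram(bins, mags, n_bins=9):
    """ih[y][x][b] = sum of bin-b magnitudes over rows < y and columns < x."""
    h, w = len(bins), len(bins[0])
    ih = [[[0.0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            for b in range(n_bins):
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            cell[bins[y][x]] += mags[y][x]
    return ih


def region_histogram(ih, top, left, bottom, right):
    """Histogram of rows [top, bottom) and columns [left, right)."""
    n_bins = len(ih[0][0])
    return [
        ih[bottom][right][b] - ih[top][right][b]
        - ih[bottom][left][b] + ih[top][left][b]
        for b in range(n_bins)
    ]
```

The table is built once per frame. A sliding-window detector that evaluates many overlapping cells can then reuse it instead of re-accumulating pixel gradients for every window.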
     The tracking method is selected according to the application environment and the characteristics of the tracked target. This paper requires real-time performance, and its foreground observations are fairly accurate, so we adopt an easily implemented tracking method based on the Kalman filter. The innovation is to combine HOG with the Kalman filter, which not only tracks the position of the target but also determines the body size, laying a good basis for pose estimation.
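One way a detector and a Kalman filter might be combined is sketched below. The abstract does not give the state vector or noise settings, so everything here is an assumption: each box parameter (centre x, centre y, width, height) is filtered independently with a constant-velocity model, the detection box is treated as the measurement, and the class names `AxisKalman` and `BoxTracker` and the values of `q` and `r` are hypothetical.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one scalar quantity."""

    def __init__(self, value, q=1e-2, r=1.0):
        self.p, self.v = float(value), 0.0   # position-like value and its rate
        self.P = [[1.0, 0.0], [0.0, 1.0]]    # state covariance
        self.q, self.r = q, r                # process and measurement noise

    def predict(self, dt=1.0):
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.p

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                       # innovation variance, H = [1, 0]
        k0, k1 = a / s, c / s                # Kalman gain
        y = z - self.p                       # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P = (I - K H) P
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]
        return self.p


class BoxTracker:
    """Tracks a detection box given as (cx, cy, w, h)."""

    def __init__(self, box):
        self.filters = [AxisKalman(v) for v in box]

    def step(self, detection=None):
        predicted = [f.predict() for f in self.filters]
        if detection is None:
            # No detection in this frame: fall back on the prediction.
            return tuple(predicted)
        return tuple(f.update(z) for f, z in zip(self.filters, detection))
```

In use, `step` would be called once per frame with the detector's box, or with `None` when the detector misses, so the tracker still reports a position and size to pass on to pose estimation.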
     This paper estimates human pose with an improved approach based on the graph structure (pictorial structure) model. To reduce the search space, we carry out some processing before pose estimation. First, human detection determines the location and scale of the body, and the detection window rather than the whole image is used as the input to pose estimation, which reduces the search space and improves the speed of pose estimation. Second, according to prior knowledge, we add some restrictions inside the detection window: for example, the head and torso are usually in the middle of the detection window, with the head centered above the torso and the torso directly below the head, which further reduces the search space and increases speed. For pose estimation in video sequences, this paper uses not only the appearance continuity between video frames but also the geometric continuity, which improves both accuracy and speed.
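The effect of these restrictions on the search space can be shown with a toy two-part model. This is only a sketch under stated assumptions, not the thesis's model: it handles just a head and a torso, uses a quadratic deformation cost, searches by brute force, and takes the appearance costs as caller-supplied functions. The band fractions, `ideal_offset`, `deform_weight`, and the function names are all hypothetical.

```python
def candidate_positions(window, x_frac=(0.35, 0.65), y_frac=(0.0, 1.0), step=4):
    """Grid positions inside a sub-rectangle of the detection window.

    window is (left, top, width, height); the fractions select a band of it.
    """
    left, top, width, height = window
    xs = range(int(left + x_frac[0] * width),
               int(left + x_frac[1] * width) + 1, step)
    ys = range(int(top + y_frac[0] * height),
               int(top + y_frac[1] * height) + 1, step)
    return [(x, y) for y in ys for x in xs]


def estimate_head_torso(window, head_cost, torso_cost, deform_weight=0.01):
    """Return (total cost, head position, torso position) of the best pair."""
    height = window[3]
    # Prior knowledge: both parts lie in the central vertical band of the
    # window, the head in its upper part and the torso below it.
    heads = candidate_positions(window, y_frac=(0.0, 0.3))
    torsos = candidate_positions(window, y_frac=(0.25, 0.65))
    ideal_offset = 0.3 * height  # assumed head-to-torso-centre distance

    best = None
    for t in torsos:
        def pair_cost(hd):
            # Appearance cost of the head plus a quadratic penalty for
            # deviating from "head centred directly above the torso".
            dx = hd[0] - t[0]
            dy = (t[1] - hd[1]) - ideal_offset
            return head_cost(hd) + deform_weight * (dx * dx + dy * dy)

        h = min(heads, key=pair_cost)
        total = torso_cost(t) + pair_cost(h)
        if best is None or total < best[0]:
            best = (total, h, t)
    return best
```

For a 64 x 128 window with a 4-pixel grid, the head search covers 50 positions instead of the 561 in the full window. A previous frame's pose could shrink the candidate sets further, in the spirit of the geometric continuity described above.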
