Research on Human Motion Modeling and Pose Estimation in Monocular Video
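Abstract
Automatic understanding of human motion and pose estimation in monocular video has long been an active research topic in computer vision. This thesis studies human detection and motion analysis in monocular video from five aspects of video-based human detection technology: 3D human motion capture, pedestrian detection, feature extraction for human motion in video, human motion tracking, and human pose estimation. On this basis, it carries out human motion modeling and pose estimation from monocular images.
     First, people in video are detected with an occlusion-aware human detection method based on window gradient potential energy (Window Gradient Potential Energy, WGPE), which this thesis proposes. During feature-window scanning, a weighted cascade SVM enables detection of partially occluded people, and filtering with sparse-dense window potential energy sets shortens detection time. Because WGPE reuses the gradient information already computed for HOG features, the algorithm adds little computational overhead compared with other fast HOG-based detectors. In images with relatively smooth backgrounds it needs less detection time than the traditional HOG detector, and for more complex backgrounds it is comparable to traditional HOG detection. Experiments show improved accuracy and efficiency of human detection, with a clear gain in accuracy for partially occluded people.
     For pose estimation in images, a Bayesian-model-based method is used to analyze the limbs of a person in a static image. A Bayesian model based on edge contour features is proposed, and, to further improve the accuracy of limb analysis, a skeleton trajectory map is introduced to analyze the pose.
     For pose analysis in video images, static-image pose estimation based on a conditional random field (CRF) model is used. SIFT features of human motion poses are first extracted from the images, and a SIFT human motion feature library is built to estimate the pose. The human body is modeled with a CRF-based deformable limb structure, and the CRF model is used to estimate the human pose.
     To further improve pose estimation accuracy and meet real-time requirements, rhythm feature data are first extracted from human motion data, and an EM-GM-based algorithm for automatically extracting motion rhythm features is proposed. Human motion in video images is modeled by dynamically building a color-edge feature human model, in which the edge information of each limb is matched with Fast Directional Chamfer Matching (FDCM), and a fast limb detection algorithm is proposed. 3D human pose is then estimated by combining the detection results with the rhythmic motion information. In parameter inference, GPLVM is first used to reduce the dimensionality of the human motion data, local dynamics are then modeled, and finally the 3D pose parameters are estimated.
     For human pose estimation in video images, this thesis proposes a constraint-graph-based video pose estimation method. A hierarchically composed human motion model is first established and the limb model is defined. A motion model based on related action clusters is then proposed. To reduce the search space, an RPC node graph spanning tree algorithm is proposed, with detailed node merging, node splitting, and spanning tree balancing algorithms. Based on the RPC node graph spanning tree algorithm, a video human pose estimation algorithm and an inference algorithm based on the RPC spanning tree model are proposed. A data-driven Markov chain Monte Carlo (MCMC) method based on projection images from a 3D human action library is proposed for tracking human pose in monocular video. First, projections of human appearance under different viewpoints, taken from a basic human motion library acquired with motion capture equipment, are clustered. HOG is then used to detect people in the monocular video images, which allows the position of each limb to be segmented fairly accurately. Finally, each frame is analyzed with the appearance model of the 3D pose inference algorithm, and the target is tracked with a temporally constrained analysis model. Constraint-graph-driven MCMC is combined with the basic action library to build a model suited to video data, and the model is applied to data-driven online behavior recognition, improving the capacity to model human pose.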
Automatically analyzing and understanding human motion and estimating human pose has been an important field of computer vision for many years. This thesis focuses on five aspects of video-based human motion analysis: 3D human motion capture, pedestrian detection, human motion feature extraction, human motion tracking, and 3D pose estimation. Building on these methods, a human motion model is constructed and human pose is estimated from monocular images.
     To improve the accuracy of human detection under occlusion, this thesis proposes the concept of Window Gradient Potential Energy (WGPE) and a fast human detection method based on it. Using sparse-dense sets of gradient potential windows shortens the time needed for multi-scale detection. A cascade SVM is trained with weighted positive and negative samples, with occluded human samples given weights so that people under occlusion can be detected. Filtering the detection windows adds little computational overhead. On images with smooth backgrounds, the proposed method takes less detection time than Multi-level HOG detection and Histograms of Oriented Gradients with Local Binary Pattern (HOG-LBP) detection at the same accuracy. Experiments show that both the accuracy and the efficiency of human detection increase, and that accuracy improves markedly for partially occluded people.
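The sparse-dense window filtering described above can be illustrated with a minimal sketch. The abstract does not define WGPE, so the energy used here (the sum of gradient magnitudes inside a window) is an assumption, as are the function names, window size, step sizes, and threshold. The sketch shows only the filtering idea: a coarse pass discards smooth regions cheaply, and a dense pass runs only around the windows that survive. In a full detector the surviving windows would then go to the HOG and SVM stages, which are omitted.

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude (L1 norm) of a 2-D list of numbers."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = abs(gx) + abs(gy)
    return mag


def integral_image(a):
    """Summed-area table with an extra zero row and column."""
    h, w = len(a), len(a[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += a[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def window_energy(ii, x, y, win):
    """Assumed energy: total gradient magnitude in the win x win window at (x, y)."""
    return ii[y + win][x + win] - ii[y][x + win] - ii[y + win][x] + ii[y][x]


def candidate_windows(img, win=8, sparse_step=4, min_energy=1.0):
    """Sparse pass rejects smooth regions; dense pass refines around survivors."""
    h, w = len(img), len(img[0])
    ii = integral_image(gradient_magnitude(img))
    keep = set()
    for y in range(0, h - win + 1, sparse_step):
        for x in range(0, w - win + 1, sparse_step):
            if window_energy(ii, x, y, win) < min_energy:
                continue  # smooth region: no dense scan here
            y_lo, y_hi = max(0, y - sparse_step + 1), min(h - win, y + sparse_step - 1)
            x_lo, x_hi = max(0, x - sparse_step + 1), min(w - win, x + sparse_step - 1)
            for yy in range(y_lo, y_hi + 1):
                for xx in range(x_lo, x_hi + 1):
                    if window_energy(ii, xx, yy, win) >= min_energy:
                        keep.add((xx, yy))
    return sorted(keep)
```

On an image that is mostly flat, most sparse windows fail the energy check, so the dense scan touches only a small part of the image. This matches the abstract's remark that the time saving is largest on smooth backgrounds.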
     For human pose estimation, a Bayesian model based on edge contours is used to estimate human motion. We propose a novel Bayesian method and introduce bone trajectories to improve the accuracy of the analysis.
     For human pose estimation in video, another method based on a Conditional Random Field (CRF) model is proposed. SIFT features are extracted from human body silhouette images, a SIFT feature database is used to estimate the pose of the human motion, and a CRF is used to estimate the human posture.
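As a rough illustration of inference in a part-based CRF, the sketch below runs max-sum (Viterbi) decoding over a chain of body parts. It is not the thesis's model: the chain structure, the score tables, and the name `map_chain` are assumptions chosen to keep the example small. Each part picks one of several candidate states, unary scores stand in for appearance evidence such as feature matches, and pairwise scores stand in for compatibility between neighboring parts.

```python
def map_chain(unary, pairwise):
    """Max-sum decoding on a chain of parts.

    unary[i][s]       score of part i in candidate state s
    pairwise[i][a][b] score of part i in state a next to part i+1 in state b
    Returns (best_total_score, best_state_per_part).
    """
    score = list(unary[0])
    back = []  # back[i-1][b] = best state of part i-1 given part i is in state b
    for i in range(1, len(unary)):
        new_score, ptr = [], []
        for b in range(len(unary[i])):
            best_a = max(range(len(score)),
                         key=lambda a: score[a] + pairwise[i - 1][a][b])
            new_score.append(score[best_a] + pairwise[i - 1][best_a][b] + unary[i][b])
            ptr.append(best_a)
        score = new_score
        back.append(ptr)
    last = max(range(len(score)), key=lambda s: score[s])
    states = [last]
    for ptr in reversed(back):
        states.append(ptr[states[-1]])
    states.reverse()
    return score[last], states
```

With strong pairwise penalties the decoder prefers a consistent configuration even when one part's own evidence points elsewhere, which is the behavior a structured model is meant to provide over independent per-part decisions.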
     To improve pose estimation accuracy and meet real-time requirements, rhythmic data are extracted automatically from human motion with the proposed EM-GM algorithm. Dynamic color-edge features are built to model the human body, with edge information matched by the Fast Directional Chamfer Matching (FDCM) method. The rhythm-based 3D motion information is then used to estimate the human pose: GPLVM reduces the dimensionality of the human motion data, the local dynamics are modeled, and the 3D human body pose is estimated.
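The edge-matching step can be illustrated with plain chamfer matching. This is a simplified stand-in, not FDCM: it ignores edge orientation and uses none of FDCM's speedups. The function names, the binary edge map, and the template given as a list of point offsets are all assumptions for illustration. A distance transform of the image edges is computed once, and the cost of placing a template is the mean distance from its points to the nearest image edge.

```python
INF = float("inf")


def distance_transform(edges):
    """Two-pass city-block distance to the nearest edge pixel (truthy = edge)."""
    h, w = len(edges), len(edges[0])
    d = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    for y in range(h):                      # forward pass: top-left to bottom-right
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    for y in range(h - 1, -1, -1):          # backward pass: bottom-right to top-left
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d


def chamfer_cost(dt, template, ox, oy):
    """Mean distance from template points (dx, dy), shifted by (ox, oy), to the edges."""
    return sum(dt[oy + dy][ox + dx] for dx, dy in template) / len(template)


def best_offset(edges, template):
    """Exhaustive search for the placement with the lowest chamfer cost."""
    h, w = len(edges), len(edges[0])
    tw = max(dx for dx, _ in template) + 1
    th = max(dy for _, dy in template) + 1
    dt = distance_transform(edges)
    best = None
    for oy in range(h - th + 1):
        for ox in range(w - tw + 1):
            cost = chamfer_cost(dt, template, ox, oy)
            if best is None or cost < best[0]:
                best = (cost, ox, oy)
    return best
```

Because the distance transform is shared by all placements, each candidate costs only one lookup per template point, which is what makes chamfer-style matching attractive for scanning many limb hypotheses.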
     For human pose estimation in video, this thesis presents a constraint-graph-based video pose estimation method. A hierarchical human motion model is first established and the human limb model is defined. A motion model based on related action clusters is then put forward. To reduce the search space, an RPC node graph spanning tree algorithm is proposed, with refined algorithms for RPC node merging, node splitting, and spanning tree balancing. Based on the RPC node graph spanning tree algorithm, a human pose estimation algorithm for video and an inference algorithm based on the RPC spanning tree model are proposed. We also propose a Markov chain Monte Carlo (MCMC) method, based on a library of 3D human motion silhouette projections, for tracking human pose in monocular video. Projections of the human silhouette under different viewpoints, taken from a basic motion library acquired with motion capture equipment, are first clustered. HOG is then used to detect people in the monocular video images, so that the positions of the body parts can be segmented more accurately. Finally, the silhouette model of the 3D pose inference algorithm is used to analyze each frame, and a temporally constrained model is used to track the target. Constraint-graph-driven MCMC is combined with the basic motion library to build a model for video data, and this data-driven model is applied to online behavior recognition, improving the capacity to model human pose.
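To make the MCMC step concrete, here is a minimal random-walk Metropolis-Hastings sketch. It is a plain sampler, not the data-driven, constraint-graph-driven scheme of the thesis, which would draw proposals from detections and the motion library. The toy posterior, its parameters (`observed`, `previous`, `sigma`), and the function names are hypothetical: one pose parameter is pulled toward a value suggested by the current frame and toward the previous frame's estimate, which stands in for a temporal constraint.

```python
import math
import random


def toy_log_posterior(theta, observed=1.0, previous=0.6, sigma=0.2):
    """Hypothetical log posterior: image evidence term plus temporal smoothness term."""
    t = theta[0]
    return -((t - observed) ** 2 + (t - previous) ** 2) / (2.0 * sigma ** 2)


def metropolis_hastings(log_post, init, n_samples, step=0.3, seed=0):
    """Random-walk Metropolis-Hastings over a list of pose parameters."""
    rng = random.Random(seed)
    x = list(init)
    lp = log_post(x)
    samples = []
    for _ in range(n_samples):
        proposal = [v + rng.gauss(0.0, step) for v in x]
        lp_new = log_post(proposal)
        diff = lp_new - lp
        # Accept uphill moves always, downhill moves with probability exp(diff).
        if diff >= 0 or rng.random() < math.exp(diff):
            x, lp = proposal, lp_new
        samples.append(list(x))
    return samples
```

For the toy posterior above, the two quadratic terms combine into a Gaussian centered midway between `observed` and `previous`, so the sample mean after burn-in should settle near 0.8.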
