Theory and Methods of Articulated Object and Human Pose Estimation
Abstract
Human vision is a complex and sophisticated system: it can perceive and distinguish objects, and can analyze and judge their structure, pose, and motion, allowing people to move freely and safely through complicated environments. With the development of modern technology, and especially the rapid progress of computing, endowing machines with this visual capability has become one of the most challenging research problems facing scientists, and has given rise to the discipline of computer vision. The overall goal of computer vision is to build or recover a model of the world from visual media (images and video) and thereby understand the real world. In that world, human motion carries a great deal of information that matters to human society: interactions between people, between people and objects, and between people and their environment make up the main content of visual media. Studying human motion in visual media, and representing, analyzing, and understanding it effectively, is therefore of great importance. Pose and motion estimation, an important branch of computer vision research, has seen its object of study evolve from rigid bodies to articulated rigid bodies and finally to the human body. Each step of this evolution has been accompanied by new theory and new methods in practice, and has brought broader applications and greater economic and social value.
     For example, in virtual reality, vision-based human pose estimation can capture a person's motion to generate animation, replacing expensive sensor-based motion-capture equipment. Synthesizing new motions from human motion models and the kinematics of the joints can automatically generate complex human motion scenes, replacing laborious manual animation. In human-computer interaction and advanced user interfaces, we hope that future machines will communicate with us as easily and conveniently as people do, through gesture-driven control, sign-language translation, and so on; a vending machine that understands sign language would offer deaf and mute users more considerate service. In intelligent surveillance, motion analysis based on human pose can play an important role in preventing and reducing crime: such a system can automatically understand human behavior without an operator present and raise an alarm in time to reduce losses. Segmenting the human body in images, extracting its skeleton from image sequences, and estimating and analyzing the joint motions of interest help to build geometric models of the human body and to explain the mechanisms of human movement so as to improve performance, with applications to training in sports, dance, and similar activities.
     To address open problems in articulated-object and human pose estimation, this thesis studies several key issues: automatic extraction of the human foreground from image sequences, pose estimation of surfaces of revolution based on a tangent-circle invariance, head pose estimation in low-resolution images using random trees, 2D human pose estimation with particle belief propagation, improved generative 3D human pose estimation, and 3D human pose estimation combining generative and discriminative algorithms. The main research contents and results are as follows:
     An algorithmic framework is proposed for automatically extracting the foreground from image sequences. First, a motion estimation algorithm computes the inter-frame motion vectors (optical flow) from adjacent frames. Cutting out all moving regions then yields an incomplete image of the background. Principal component analysis with missing data recovers complete background images from these incomplete ones, and finally a simple background subtraction segments the foreground. The advantage of this framework is that the background model is produced automatically from motion information, so no separate background image (or video) is needed as a reference, which makes it applicable to existing footage whose background can no longer be re-captured. Experiments show that it obtains better results than the commonly used Gaussian mixture model approach.
     For image sequences of rigid or articulated objects bounded by surfaces of revolution, we discovered a tangent-circle invariance on the outer silhouette. This invariance provides constraints for pose estimation, so the pose of a surface-of-revolution object can be solved from just two images of it in different poses. Compared with earlier algorithms of the same kind, ours requires fewer conditions (neither distinctive texture on the object nor a visible circular cross-section in the image), making it more general. Simulations and experiments on both rigid and articulated surface-of-revolution objects demonstrate its feasibility and effectiveness.
     An algorithm is proposed for head pose estimation in low-resolution images using Hough forests. Pose is estimated by a Hough-transform-like voting process in which every fixed-size image patch inside the head region votes for the head's position and orientation. The rationale is that patches such as the eyes, hair, or neck carry rich information about head pose. The voting is implemented with a random forest, an efficient and robust classification tool. Comparing the estimates with ground-truth head poses confirms the algorithm's effectiveness.
     A 2D articulated pose estimation algorithm is proposed based on spatio-temporal continuity of motion. The pose of each part of the articulated body is a node of a graphical model, while edges encode both the pose relationships between connected parts and the pose continuity between consecutive frames; particle belief propagation infers the maximum a posteriori state. Because particle belief propagation considers only potentially promising states, inference remains tractable even when the state space is enormous. The algorithm needs neither an initial pose nor training data, as verified experimentally.
     Two representative generative human pose estimation algorithms, the annealed particle filter (APF) and non-parametric belief propagation (NBP), were evaluated and compared experimentally on the HumanEva and PEAR human motion databases. Two improvements to generative methods are then proposed. The first, guided by the conclusions of the APF/NBP comparison, is a pose estimation algorithm that exploits the strengths of both. The second feeds the body-part poses obtained by a discriminative algorithm into the generative algorithm as input to produce the overall pose.
Human vision is a complicated and precise system with the ability to perceive and recognize. With vision, humans can analyze and judge the structure, pose, and movement of objects, and so move and live safely in complex environments. With the development of modern science, and especially of computer science, achieving such vision for machines has become a challenging research area. The goal of computer vision is to create, reconstruct, and understand a model of the world from visual media (images and video). Human activity is our main concern, as it is the dominant content of visual media: how humans interact with each other, with objects, and with their environment is the most important topic. Research on human motion in visual media, including its representation, analysis, and understanding, is therefore vital. Pose and motion estimation is an active area of computer vision whose object of study was at first the rigid body and later became articulated objects and the human body. Every advance in this direction has come with new theory and innovative approaches, bringing new applications to society.
     For example, in virtual reality, vision-based human pose estimation can be used for motion capture when generating animation; compared with traditional motion capture devices, the vision-based method is convenient and less expensive. New motion sequences driven by human motion models and the physical properties of the joints can generate complex human motion scenes automatically, outperforming time-consuming hand-adjusted animation. In human-computer interaction and advanced user interfaces, we hope computers can act like humans, communicating more conveniently through gesture-based control or by understanding sign language; a vending machine that understood the signing of deaf and mute users would make them feel more comfortable. In intelligent surveillance, action analysis based on human pose can play an important role in preventing crime: such a system can understand human activity automatically and raise an alarm when necessary. Segmenting the human body in image sequences and estimating and analyzing joint poses are important for building human body models and explaining the mechanisms of human motion; such technologies can be used in sports and dance training.
     Since many problems in articulated-object and human pose estimation remain unsolved, we studied several key issues: motion-based background subtraction, silhouette-based pose estimation of objects with a surface of revolution, head pose estimation from low-resolution images with Hough forests, 2D human pose estimation by particle belief propagation, improved generative 3D human tracking, and 3D human pose estimation combining generative and discriminative methods. The major research contents and results of this thesis are as follows.
     A framework is proposed to segment the foreground in image sequences by background subtraction against a background image reconstructed for each frame. First, consecutive frames are fed to a motion estimation algorithm to calculate the motion vectors between frames. Second, an incomplete background image is obtained by cutting the moving parts from the original images. All the incomplete background images from the same sequence are then used to model the background with probabilistic principal component analysis with missing data, yielding a background estimate for each frame. Finally, a simple background subtraction segments the foreground. The effectiveness of the method is demonstrated under different illumination conditions and compared with the commonly used Gaussian mixture model method.
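The completion step of this framework can be sketched as an alternating low-rank imputation: pixels cut out as "moving" are treated as missing and repeatedly re-estimated from a truncated SVD of the frame stack. This is a simplified stand-in for probabilistic PCA with missing data, not the thesis's exact formulation; the function name, rank, and iteration count are illustrative.

```python
import numpy as np

def complete_background(frames, masks, rank=2, iters=20):
    """Reconstruct per-frame background images from incomplete observations.

    frames: (T, P) array, each row a flattened frame.
    masks:  (T, P) boolean array, True where a pixel was observed as
            background (i.e. not cut out as a moving region).
    """
    X = frames.astype(float).copy()
    # Initialize missing (foreground) pixels with the per-pixel mean
    # of the observed background values.
    counts = np.maximum(masks.sum(axis=0), 1)
    col_mean = np.where(masks, X, 0.0).sum(axis=0) / counts
    X[~masks] = np.broadcast_to(col_mean, X.shape)[~masks]
    for _ in range(iters):
        # Project the current estimate onto a rank-r subspace ...
        mu = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        # ... keep observed background pixels, re-impute the rest.
        X = np.where(masks, frames, L)
    return X
```

A simple background subtraction then thresholds `frames - complete_background(frames, masks)` to obtain the foreground.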
     A novel pose estimation approach is proposed for objects with a surface of revolution (SOR). The silhouette of the object is the only information the method needs, and no cross-section (latitude) circle is required. We explain the tangent-circle property and use it to establish a constraint between two images of the object in different poses; this constraint solves for the object's pose in both images. We test the method in a simulation experiment and use it to estimate the pose of both a rigid body and an articulated object.
     An approach to head pose estimation is proposed using Hough forests. The pose estimate is generated by voting from image patches, as in a Hough transform. The basic idea is that patches containing the eyes, hair, or neck give rich information about head position and orientation. The voting is implemented with a randomized forest, an efficient and robust tool for classification and regression. The method is evaluated quantitatively by comparing the estimated poses with ground truth.
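The voting stage can be illustrated with a toy accumulator: each patch casts a weighted vote for the head centre (its own position plus a learned offset) and for a discretized orientation bin, and the peaks of the accumulators give the estimate. The forest itself is elided here; per-patch votes are assumed to be precomputed, and all names and shapes are illustrative.

```python
import numpy as np

def hough_head_vote(patch_votes, n_yaw_bins, acc_shape):
    """patch_votes: list of ((px, py), (dx, dy), yaw_bin, weight) —
    each patch at (px, py) predicts the head centre at (px+dx, py+dy)
    and casts a weighted vote for a discretized yaw angle."""
    center_acc = np.zeros(acc_shape)   # Hough accumulator over positions
    yaw_acc = np.zeros(n_yaw_bins)     # accumulator over orientation bins
    for (px, py), (dx, dy), yaw_bin, w in patch_votes:
        cx, cy = px + dx, py + dy
        if 0 <= cx < acc_shape[0] and 0 <= cy < acc_shape[1]:
            center_acc[cx, cy] += w
        yaw_acc[yaw_bin] += w
    center = np.unravel_index(np.argmax(center_acc), acc_shape)
    return (int(center[0]), int(center[1])), int(np.argmax(yaw_acc))
```

In the full method each leaf of the forest stores an offset distribution rather than a single offset, but the argmax-over-accumulator structure is the same.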
     We propose an approach to articulated body tracking that exploits spatial and temporal consistency with particle belief propagation, an extension of discrete belief propagation (BP). By considering only potentially good candidates, it keeps inference practical and effective even when the state space is enormous. The main advantage of this approach is that tracking is self-initializing and requires no training. The algorithm is verified by experiments on monocular image sequences.
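The inference idea can be sketched as max-product belief propagation in which each node's state space is restricted to its sampled particle set, so messages are vectors over particles rather than over the full continuous space. This is a minimal sketch over fixed particle sets under assumed unary and pairwise potentials; the actual algorithm also resamples particles between iterations.

```python
import numpy as np

def particle_bp_map(particles, unary, pairwise, edges, iters=5):
    """Max-product BP where each node's candidates are its particles.

    particles: {node: sequence of candidate states}
    unary(n, x): likelihood of node n being in state x
    pairwise(i, j, xi, xj): compatibility of neighbouring states
    edges: undirected edges as (i, j) pairs
    """
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, []).append(j)
        nbrs.setdefault(j, []).append(i)
    # One message per direction, a weight per particle of the target node.
    msgs = {(i, j): np.ones(len(particles[j]))
            for i, j in list(edges) + [(j, i) for i, j in edges]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = np.zeros(len(particles[j]))
            for b, xj in enumerate(particles[j]):
                best = 0.0
                for a, xi in enumerate(particles[i]):
                    v = unary(i, xi) * pairwise(i, j, xi, xj)
                    for n in nbrs[i]:
                        if n != j:           # incoming messages except from j
                            v *= msgs[(n, i)][a]
                    best = max(best, v)
                out[b] = best
            new[(i, j)] = out / (out.sum() or 1.0)
        msgs = new
    # Belief = unary times all incoming messages; pick the MAP particle.
    result = {}
    for j, ps in particles.items():
        belief = np.array([unary(j, x) for x in ps])
        for n in nbrs.get(j, []):
            belief *= msgs[(n, j)]
        result[j] = ps[int(np.argmax(belief))]
    return result
```

Restricting messages to the particle sets is what keeps inference tractable when the continuous pose space is enormous.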
     Two frequently used generative methods for human pose estimation and tracking, the annealed particle filter (APF) and non-parametric belief propagation (NBP), are evaluated on two human motion databases, HumanEva and PEAR, and compared through analysis of their principles and experimental results. By combining the advantages of the two, an improved pose estimation method is proposed; experiments show it outperforms both APF and NBP. A further improvement to the generative methods is to incorporate the detection results provided by discriminative methods.
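The annealing idea behind APF can be shown on a 1D toy: each annealing layer reweights particles with a softened likelihood p(z|x)^beta, resamples, and diffuses them with shrinking noise, so early layers explore broadly and later layers sharpen around likelihood peaks. The schedule, noise decay, and function names below are illustrative, not the thesis's actual configuration.

```python
import numpy as np

def apf_step(particles, loglik, betas, noise, rng):
    """One annealed-particle-filter time step on a 1D state.

    particles: (N,) array of current state hypotheses
    loglik(x): vectorized log-likelihood of the observation given x
    betas: increasing annealing exponents, e.g. [0.1, 0.4, 1.0]
    noise: initial diffusion standard deviation
    """
    for beta in betas:
        # Soften the likelihood: broad at small beta, peaked at beta=1.
        w = np.exp(beta * loglik(particles))
        w = w / w.sum()
        # Resample in proportion to the softened weights ...
        idx = rng.choice(len(particles), size=len(particles), p=w)
        # ... then diffuse, shrinking the noise as annealing proceeds.
        particles = particles[idx] + rng.normal(0.0, noise, len(particles))
        noise *= 0.5
    return particles
```

In the full tracker the state is a high-dimensional joint-angle vector and the likelihood compares a projected body model against image evidence, but each layer has this reweight-resample-diffuse structure.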