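Research on Real-Time Face Pose Estimation in Digital Video

English title: Real-Time Face Pose Estimation in Video Sequence
Author: Yang Qianqian (杨倩倩)
Degree level: Master's
Discipline: Computer System Architecture
Keywords: Face Pose Estimation; Face Detection; Skin Color Model; Face Tracking; Continuously Adaptive Mean Shift Algorithm
Degree year: 2009
Advisor: Sun Weiping (孙伟平)
Discipline code: 081201
Degree-granting institution: Huazhong University of Science and Technology (华中科技大学)
Submission date: 2009-05-26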

Abstract

Face pose estimation is the process of determining the three-dimensional pose of a human face in a static image or a video sequence. As an important research topic in computer vision, it has broad application prospects and considerable practical value in human-computer interaction, intelligent video surveillance, virtual reality, and face recognition, and it has been an active research area in recent years.
     Most existing face pose estimation methods target static images. They are complex and computationally expensive, which makes them ill-suited to estimating face pose in video. This thesis proposes a face pose estimation method based on the relative position of the mouth within the face. The method is simple, effective, and fast enough to meet the real-time requirement of face pose estimation in video sequences.
     Estimating face pose in video involves face detection, face tracking, and face pose estimation. After comparing and summarizing existing face detection methods, a skin color model, which is insensitive to pose changes, is chosen to detect the face region in the first few frames of the video sequence. The CAMShift (Continuously Adaptive Mean Shift) algorithm then tracks the detected face region, and an ellipse model is fitted to the region to compute its position in each video frame. Next, the mouth is located within the detected face region using lip color information, and the face pose is estimated from the relative position of the mouth within the face region together with the orientation angle of the fitted ellipse. Finally, a prototype face pose estimation system is implemented and tested on recorded video files, and the experiments confirm the feasibility of the method.
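
The tracking and pose steps summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation. The skin-probability map `prob` is taken as given, the tracker is the plain mean-shift step at the core of CAMShift (the adaptive window resizing is left out), and all function names are hypothetical. The abstract does not state how the mouth offset is mapped to a pose, so the left/frontal/right rule and its 0.15 threshold are only one plausible reading.

```python
import math


def window_moments(prob, window):
    """Raw moments (m00, m10, m01, m20, m02, m11) of prob inside window (x, y, w, h)."""
    x, y, w, h = window
    m00 = m10 = m01 = m20 = m02 = m11 = 0.0
    for j in range(max(0, y), min(len(prob), y + h)):
        for i in range(max(0, x), min(len(prob[0]), x + w)):
            p = prob[j][i]
            m00 += p
            m10 += p * i
            m01 += p * j
            m20 += p * i * i
            m02 += p * j * j
            m11 += p * i * j
    return m00, m10, m01, m20, m02, m11


def mean_shift(prob, window, max_iter=20):
    """Re-centre the window on the local centroid of prob until it stops moving."""
    x, y, w, h = window
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, (x, y, w, h))[:3]
        if m00 == 0:
            break  # no skin-like pixels under the window
        nx = int(round(m10 / m00 - (w - 1) / 2))
        ny = int(round(m01 / m00 - (h - 1) / 2))
        if (nx, ny) == (x, y):
            break  # converged
        x, y = nx, ny
    return x, y, w, h


def face_roll(prob, window):
    """In-plane tilt (radians) of the region's major axis away from image vertical.

    0 means an upright face; positive is clockwise on screen (image y points down).
    Assumes the face is not upside down, since an axis has no head/chin direction.
    """
    m00, m10, m01, m20, m02, m11 = window_moments(prob, window)
    if m00 == 0:
        raise ValueError("empty region")
    cx, cy = m10 / m00, m01 / m00
    mu20 = m20 / m00 - cx * cx
    mu02 = m02 / m00 - cy * cy
    mu11 = m11 / m00 - cx * cy
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)  # major axis vs. x-axis
    roll = theta - math.pi / 2
    if roll <= -math.pi / 2:
        roll += math.pi  # fold the 180-degree axis ambiguity
    return roll


def mouth_offset(face_center, semi_axes, roll, mouth):
    """Mouth position in the face's own frame, normalised by (semi_minor, semi_major)."""
    if semi_axes[0] <= 0 or semi_axes[1] <= 0:
        raise ValueError("semi-axes must be positive")
    dx = mouth[0] - face_center[0]
    dy = mouth[1] - face_center[1]
    c, s = math.cos(roll), math.sin(roll)
    across = (dx * c + dy * s) / semi_axes[0]   # sideways shift of the mouth
    along = (-dx * s + dy * c) / semi_axes[1]   # distance toward the chin
    return across, along


def classify_yaw(across, threshold=0.15):
    """Hypothetical rule: a sideways mouth shift suggests the head is turned."""
    if across < -threshold:
        return "left"
    if across > threshold:
        return "right"
    return "frontal"
```

Undoing the roll before measuring the sideways shift keeps an in-plane head tilt from being mistaken for a turn, which is why the ellipse angle and the mouth position are used together. The labels refer to image left and right, not the subject's.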
