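Research on Facial Expression Recognition Algorithms

English title: The Research of Facial Expression Recognition Algorithms
Author: Zhang Qing (张庆)
Degree level: Master's
Discipline: Signal and Information Processing
Chinese keywords (translated): facial expression recognition; chain code; geometric feature; expression feature; support vector machine
English keywords: Facial Expression Recognition; Chain Code; Geometrical Feature; Expression Feature; Support Vector Machine
Degree year: 2012
Advisors: Qu Lei (屈磊); Wei Sui (韦穗)
Discipline code: 081002
Degree-granting institution: Anhui University (安徽大学)
Thesis submission date: 2012-05-01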

Abstract

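As an important part of the face detection field, facial expression recognition is an emerging research topic in artificial intelligence. It draws on computational intelligence, pattern recognition, and image processing, as well as physiology and psychology, which makes it an interdisciplinary subject. The goal of expression recognition research is to let a computer identify human expression information automatically, making human-computer interaction friendlier and more intelligent. Meanwhile, as living standards rise, people place growing demands on safety in everyday life. In many real-world settings, such as driver monitoring and medical monitoring, a computer that recognizes facial expressions automatically could greatly reduce the likelihood of tragic events and help protect human safety. Facial expression recognition research therefore has high potential application value and broad application prospects.
     Frontal face detection is now largely mature, but expression recognition, an extension of face detection, is still in its infancy and has no comparably mature algorithm. Existing expression feature extraction algorithms fall into two broad forms: those for static images and those for image sequences. The classic LBP (Local Binary Patterns) and LBP_TOP (Local Binary Patterns from Three Orthogonal Planes) algorithms are commonly used for the two forms respectively, but neither their real-time performance nor their recognition rate is satisfactory. Building on these two algorithms, this thesis improves the selection of expression regions, which speeds up expression feature extraction without lowering the recognition rate. For geometric feature extraction, the thesis proposes a chain-code-based approach that extracts fairly robust geometric features from both static images and image sequences. Fusing the two kinds of geometric features with the improved LBP and LBP_TOP expression features respectively greatly raises the final recognition rate. Experimental data show that the extracted geometric features are reasonable and effective. Finally, the thesis combines the static-image and image-sequence expression features to build a real-time facial expression recognition system. The specific research content and contributions are as follows: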
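     (1) Image preprocessing, comprising face detection, feature point localization, and face normalization, is applied to static face images or image sequences to obtain the standard face images or image sequences defined in this thesis, laying the necessary groundwork for subsequent feature extraction.
     (2) Following the chain code idea, the key feature points on each facial component in a static image are encoded with a circular chain code, and the codes are concatenated in order and normalized to give the geometric feature of the static image. Starting from the classic LBP expression feature extraction algorithm, the selection of expression regions is partly modified, which reduces the dimensionality of the LBP expression feature without lowering the recognition rate and yields the improved LBP expression feature. The geometric feature of the static image is fused with the improved LBP expression feature to form the final static-image expression feature.
     (3) For image sequences, the positions of corresponding key feature points across the frames are collected in a common coordinate system to form a motion trajectory for each key feature point; together these trajectories describe how a facial expression forms. Each trajectory is encoded with a non-circular chain code, and the codes are combined in order and normalized to give the geometric feature of the image sequence. LBP_TOP expression features are extracted from the image sequence with the classic LBP_TOP algorithm. The geometric feature of the image sequence is fused with the classic LBP_TOP expression feature to form the final image-sequence expression feature.
     (4) Expression templates are trained and expressions are classified from the expression features using the "one-against-one" multi-class support vector machine scheme with a radial basis function kernel.
     (5) A real-time facial expression recognition system is developed. It captures an image sequence of the subject, preprocesses the face images, extracts both static-image and image-sequence expression features, and classifies each kind of feature separately to obtain two expression results. Treating the image-sequence features as primary and the static-image features as secondary, it applies maximum-probability inference to the two results to obtain the most likely final expression.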
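
     The chain-code encoding described in items (2) and (3) can be illustrated with a minimal Python sketch. It assumes the standard 8-direction Freeman code and a simple direction histogram as the normalized feature; the abstract does not give the thesis's exact coding or normalization, and the function names below are hypothetical.

```python
import math


def freeman_code(p, q):
    """Quantize the direction from point p to point q into one of 8 codes.

    Code 0 points along +x; codes increase in 45-degree steps toward +y.
    (In image coordinates, where y grows downward, that is clockwise on screen.)
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    angle = math.atan2(dy, dx)  # radians in [-pi, pi]
    return int(round(angle / (math.pi / 4))) % 8


def chain_code(points, closed):
    """Chain-code an ordered point sequence.

    closed=True also links the last point back to the first (a circular code,
    e.g. for the outline of one facial component); closed=False gives a
    non-circular code (e.g. for the trajectory of one key point over frames).
    """
    pairs = list(zip(points, points[1:]))
    if closed:
        pairs.append((points[-1], points[0]))
    return [freeman_code(p, q) for p, q in pairs]


def normalized_histogram(codes):
    """Turn a code list into an 8-bin direction histogram that sums to 1."""
    hist = [0.0] * 8
    for c in codes:
        hist[c] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical key points: four corners of a square outline.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
circular = chain_code(square, closed=True)
open_path = chain_code(square, closed=False)
feature = normalized_histogram(circular)
```

     Concatenating such per-component (or per-trajectory) vectors in a fixed order would give one geometric feature vector per image or sequence; whether the thesis uses histograms or the raw code strings is not stated in the abstract.
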
As an important part of face detection technology, facial expression recognition is a new research topic in the field of artificial intelligence. It is an interdisciplinary topic, since it relates to computational intelligence, pattern recognition, image processing, and even physiology and psychology. The goal of expression recognition is to enable computers to recognize facial expression information automatically and thereby enhance the friendliness and intelligence of human-machine interaction. At the same time, as living standards improve, people pay more and more attention to safety in daily life. In many real-life scenarios, such as driver monitoring and medical care, a computer capable of recognizing facial expressions automatically could greatly reduce the possibility of tragic events and provide effective protection for human safety. Facial expression recognition research therefore has high potential application value and broad application prospects.
     Frontal face detection technology is now approaching maturity, but facial expression recognition, as an extension of face detection, is still in its infancy and does not yet have a relatively mature algorithm. Current facial feature extraction algorithms can be divided into two categories: static-image based and image-sequence based. The classic LBP (Local Binary Patterns) and LBP_TOP (Local Binary Patterns from Three Orthogonal Planes) are the most commonly used algorithms in these two categories, respectively. However, their computation time and recognition rates are not satisfactory. Based on these two classic algorithms, we modified the expression region selection method. As a result, the expression feature extraction speed was greatly improved without reducing the recognition rate. In addition, for geometric feature extraction, we propose a chain-code-based method that can extract geometric features robustly for both static images and image sequences. By combining the geometric features with the improved LBP and LBP_TOP expression features, the expression recognition rates were greatly improved. Experimental results demonstrate the reasonableness and validity of the proposed geometric feature extraction method. Finally, a real-time facial expression recognition system that combines the features extracted from static images and image sequences was built. The detailed research content and contributions of this thesis are as follows:
     (1) As the preprocessing step, face detection, facial feature point localization, and face image normalization were performed on static images and image sequences. This laid a solid foundation for the subsequent feature extraction.
     (2) Based on the idea of chain coding, the geometrical features of a static image were obtained by circularly chain coding the key feature points, combining the codes sequentially, and normalizing them. Starting from the classical LBP facial expression feature extraction algorithm, we reduced the dimension of the LBP facial expression feature by modifying the facial expression region selection method. The final static-image facial expression features were obtained by combining the geometrical features with the improved LBP features.
     (3) For image sequences, the movement pattern of each key feature point was obtained by tracking the positions of the corresponding key feature point across frames. The combination of these movement patterns can describe the formation of different facial expressions. The geometrical features of an image sequence were obtained by non-circularly chain coding the movement patterns, combining the codes in order, and normalizing them. The final facial expression feature is a combination of the geometrical features extracted from the image sequence and the classical LBP_TOP features.
     (4) A support vector machine using the "one-against-one" multi-class classification scheme and an RBF kernel function was used for expression template training and expression classification.
     (5) A real-time facial expression recognition system was developed. This system provides the following functions: image sequence extraction, face image preprocessing, facial expression feature extraction, and facial expression classification. To obtain a more reasonable result, maximum-probability inference was performed on the results obtained from static images and image sequences, taking the features extracted from image sequences as the main judging factor and the features extracted from static images as a secondary factor.
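
     For reference, the basic LBP operator on which the region-selection improvement builds can be sketched as follows. This is the standard 3x3 operator written in pure Python, not the thesis's implementation; the region rectangle passed to the hypothetical lbp_histogram helper is an assumption, since the abstract does not say which expression regions are selected.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the pixel at row r, column c.

    Each neighbour contributes a 1 bit when its value is >= the centre value.
    Neighbours are visited clockwise starting from the top-left one.
    """
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image, top, left, height, width):
    """256-bin LBP histogram over one rectangular region of a grayscale image.

    Pixels on the image border are skipped because they lack a full
    3x3 neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(max(top, 1), min(top + height, rows - 1)):
        for c in range(max(left, 1), min(left + width, cols - 1)):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Tiny hypothetical grayscale patch; only the centre pixel has 8 neighbours.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
hist = lbp_histogram(patch, top=0, left=0, height=3, width=3)
```

     With this formulation the feature length is 256 times the number of regions whose histograms are concatenated, so restricting extraction to fewer expression-relevant regions shortens the feature vector directly.
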
