Research on Realistic Face Video Coding at Low Bit Rates
Abstract
The human-machine emotional interface (tracking and extraction of facial expression motion parameters, expression recognition, parameter transmission, and synthesis of highly realistic speech-synchronized facial animation) is a research hotspot in computer vision and computer graphics, with numerous applications in human-computer interaction, video coding, entertainment, and virtual reality. Although the related fields have made considerable progress over the past thirty years, many problems remain to be solved. Among them, how to quickly obtain accurate facial motion and expression parameters from face video at the sender, and how to synthesize highly realistic speech-synchronized facial animation from those parameters at the receiver, is a particularly challenging research topic. It involves motion analysis, facial expression recognition, source and channel coding, kinematic and dynamic modeling and representation of the face, modeling of the co-articulation mechanism, text-driven facial animation, and other problems.
     This thesis takes model-based face video encoding and decoding at very low bit rates as its subject, studies the related human-machine emotional interface problems in depth, and concentrates on the tracking and extraction of facial expression motion parameters, parameterized video coding, and the synthesis of highly realistic speech-synchronized facial animation.
     The innovations and main contributions of this thesis are as follows:
     (1) An automatic face adaptation algorithm based on a single frame image is proposed. First, the first frame containing the target face is detected in the input video. On this frame, an improved support vector machine (SVM) algorithm locates the face, and an Adaboost+Camshift+AAM (active appearance model) algorithm locates the facial feature points. Next, using this face and feature-point information, a simple generic 3D face model is specialized at the encoder to obtain the facial definition parameters (FDP) of the face to be processed; on this basis, the individualized fine 3D face model used at the decoder is constructed.
     (2) A 3D facial expression motion tracking algorithm based on online model matching and updating is proposed. Specifically, an adaptive statistical observation model is used to build the online appearance model; an adaptive state transition model and an improved particle filter perform deterministic and stochastic search of the observed scene; and multiple measurements of the target are fused to reduce the influence of illumination and inter-person variation. The proposed tracking algorithm recovers both the global rigid motion parameters that reflect the overall pose of the target face and the local non-rigid motion parameters that reflect changes in facial expression.
     (3) Facial expression recognition algorithms are studied in depth. A static facial expression recognition algorithm is proposed first: after the facial expression motion parameters are extracted, expressions are classified using physiological knowledge related to facial expression. Then, to overcome the limitations of static recognition, an algorithm combining static and dynamic expression information is proposed; within a framework of multiple expression-specific Markov chains and particle filtering, it uses the physiological model of expression to recognize facial motion and expression simultaneously.
     (4) A compression algorithm for MPEG-4 facial animation parameters (FAP) is proposed. The algorithm groups FAPs using facial action basis functions (FBF) and reduces the bit rate through combined intra-frame and inter-frame coding without introducing coding delay.
     (5) An MPEG-4 based 3D facial expression animation synthesis algorithm is proposed. It combines a parametric model with a muscle model to generate facial animation and, driven by an FAP stream, produces highly realistic 3D facial expression animation. The co-articulation mechanism is also modeled, and this model generates the facial viseme actions corresponding to English phonemes. Given the phoneme information parsed from text, additional expression information, and duration information, the animation between visemes is interpolated with non-uniform rational B-spline (NURBS) functions to obtain expressive facial animation synchronized with English speech.
     (6) Building on the above research, a video encoding/decoding demonstration system that integrates facial expression motion parameter tracking/extraction, expression recognition, parameter transmission, and realistic speech-synchronized facial animation synthesis is designed and implemented for the first time internationally. At the decoder, the system synthesizes realistic facial animation from the decoded parameters.
The human-machine emotional interface (facial expression parameter tracking and extraction, facial expression recognition, parameter transmission, and highly realistic speech-synchronized facial animation) is a hot research topic in computer vision and computer graphics, with many applications in human-computer interaction, video coding, entertainment, virtual reality, and related areas. Although great progress has been made in these areas over the past thirty years, many problems remain. How to quickly obtain accurate facial motion and expression parameters from face video at the transmitter, how to transmit these parameters efficiently using knowledge of the human face, how to synthesize highly realistic speech-synchronized facial animation from these parameters at the receiver, and how to achieve a high expression recognition rate are all challenging problems. They involve motion analysis in computer vision, facial expression recognition, source and channel coding, kinematic and dynamic modeling and representation of the individualized face, modeling of the co-articulation mechanism, text-driven facial animation, and other issues.
     Focusing on model-based face video encoding and decoding at very low bit rates, this thesis studies the related human-machine emotional interface problems and pays particular attention to facial expression parameter tracking and extraction, parameterized video coding, and highly realistic speech-synchronized facial animation.
     The innovations and main contributions of this thesis are as follows:
     (1) A face adaptation algorithm based on a single image is proposed. First, the first frame containing a face is detected in the input video. On this frame, an improved SVM (support vector machine) locates the face, and an Adaboost+Camshift+AAM (active appearance model) pipeline locates the facial feature points. The encoder then obtains the FDP (facial definition parameters) by adapting a simple generic triangular face model, and the decoder finally adapts a detailed generic triangular model using these FDPs (a minimal detection sketch is given after this list).
     (2) A 3D facial expression motion tracking algorithm based on online model matching and updating is proposed. The algorithm builds the online appearance model from an adaptive statistical observation model, and applies deterministic and stochastic search to the observed scene simultaneously by combining an adaptive state transition model with an improved particle filter. Multiple measurements are fused to reduce the influence of illumination and person dependence. The tracker thus recovers both the global rigid motion parameters and the local non-rigid expression parameters (see the particle filter sketch after this list).
     (3) Facial expression recognition is studied in depth. A static facial expression recognition algorithm is proposed first: after the facial actions are retrieved with a particle filter, the expression is classified using physiological knowledge of facial expression. To overcome the limitations of static recognition, an algorithm combining static and dynamic expression information is then proposed, in which facial actions and facial expression are retrieved simultaneously within a stochastic framework based on multi-class expression Markov chains, particle filtering, and facial expression knowledge (a toy recognition sketch follows the list).
     (4) An algorithm for compressing MPEG-4 facial animation parameters (FAP) is proposed. Facial action basis functions (FBF) are used to group the FAPs, and the bit rate is lowered by combining intra-frame and inter-frame coding without introducing any coding delay (see the coding sketch after this list).
     (5) A 3D facial expression animation algorithm based on MPEG-4 is proposed. The algorithm combines a parametric model with a muscle model and, driven by an FAP stream, produces highly realistic facial expression animation. It also generates facial viseme actions that account for the co-articulation effect in speech. Given the phonemes obtained from text analysis, their durations, and additional expression information, the animation between visemes is interpolated with NURBS to obtain expressive facial animation synchronized with speech (see the interpolation sketch after this list).
     (6) Building on the above research, a demonstration system integrating facial expression parameter tracking and extraction, facial expression recognition, parameter transmission, and highly realistic speech-synchronized facial animation is constructed for the first time internationally. At the decoder, the system produces realistic facial animation from the decoded parameters.
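     The sketch below illustrates only the very first stage of contribution (1): finding the first video frame that contains a face. It is a minimal, hypothetical example in which the stock OpenCV Haar cascade (an AdaBoost-trained detector) stands in for the thesis' improved SVM detector; the AAM feature-point fitting and FDP model adaptation are omitted.

    import cv2

    # Stock OpenCV Haar cascade (AdaBoost-trained); a stand-in for the improved
    # SVM face detector described in the thesis.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def locate_first_face(video_path):
        """Scan a video and return (frame, bounding box) of the first detected face."""
        cap = cv2.VideoCapture(video_path)
        try:
            while True:
                ok, frame = cap.read()
                if not ok:                      # end of video, no face found
                    return None, None
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
                if len(faces):
                    return frame, tuple(faces[0])   # (x, y, w, h)
        finally:
            cap.release()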
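     For contribution (2), the following is a generic predict-weight-resample cycle of a particle filter over pose/expression parameters. It is a sketch under simplifying assumptions (a random-walk transition and a user-supplied likelihood function), not the adaptive observation model or deterministic search used in the thesis.

    import numpy as np

    def particle_filter_step(particles, weights, likelihood, noise_std=0.02):
        """One predict-weight-resample cycle over (N, D) parameter hypotheses.

        likelihood: callable returning p(observation | particle) for one particle.
        """
        # Predict: random-walk transition (stand-in for the adaptive transition model).
        particles = particles + np.random.normal(0.0, noise_std, particles.shape)
        # Update: re-weight each hypothesis by how well it explains the current frame.
        weights = weights * np.array([likelihood(p) for p in particles])
        weights = weights / (weights.sum() + 1e-12)
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
            idx = np.random.choice(len(weights), size=len(weights), p=weights)
            particles = particles[idx]
            weights = np.full(len(weights), 1.0 / len(weights))
        return particles, weights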
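     The toy sketch below relates to contribution (3): per-frame (static) evidence is fused with a simple expression transition prior, a crude stand-in for the multi-class expression Markov chains. The prototype vectors and transition matrix are illustrative placeholders, not the thesis' physiological model.

    import numpy as np

    EXPRESSIONS = ["neutral", "happiness", "surprise", "sadness"]
    # Hypothetical prototypes in a tiny facial-action feature space
    # (mouth-corner raise, brow raise, eye opening).
    PROTOTYPES = np.array([
        [ 0.0,  0.0,  0.0],   # neutral
        [ 0.8,  0.1,  0.1],   # happiness
        [ 0.1,  0.9,  0.8],   # surprise
        [-0.5, -0.3, -0.1],   # sadness
    ])
    # Toy transition prior between expressions (each row sums to 1).
    TRANSITION = np.full((4, 4), 0.1) + 0.6 * np.eye(4)

    def recognize_sequence(feature_frames):
        """Fuse static per-frame evidence with a Markov transition prior."""
        belief = np.full(len(EXPRESSIONS), 1.0 / len(EXPRESSIONS))
        labels = []
        for f in feature_frames:
            static = np.exp(-np.linalg.norm(PROTOTYPES - f, axis=1) ** 2)  # static evidence
            belief = static * (TRANSITION.T @ belief)                      # dynamic prior
            belief = belief / belief.sum()
            labels.append(EXPRESSIONS[int(np.argmax(belief))])
        return labels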
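     For contribution (4), the sketch below shows one way delay-free intra/inter-frame coding of per-frame parameter vectors can work: intra frames quantize the parameters directly, while inter frames quantize the residual against the previously reconstructed frame, so only past data is needed. The quantization step, intra period, and the FBF grouping itself are assumptions, not values from the thesis.

    import numpy as np

    def encode_fap_stream(frames, qstep=0.01, intra_period=30):
        """Quantize a sequence of FAP vectors into intra ('I') and inter ('P') symbols."""
        symbols, prev = [], None
        for i, fap in enumerate(frames):
            if prev is None or i % intra_period == 0:
                q = np.round(fap / qstep).astype(int)             # intra frame
                symbols.append(("I", q))
                prev = q * qstep                                  # decoder-side reconstruction
            else:
                q = np.round((fap - prev) / qstep).astype(int)    # inter residual
                symbols.append(("P", q))
                prev = prev + q * qstep
        return symbols   # entropy coding of the symbols would follow in practice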
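     Finally, for contribution (5), the sketch below interpolates FAP trajectories between viseme key poses. A plain cubic B-spline from SciPy stands in for the non-uniform rational B-spline used in the thesis, and the key times and poses are assumed inputs from the text-to-phoneme stage.

    import numpy as np
    from scipy.interpolate import make_interp_spline

    def interpolate_visemes(key_times, key_faps, frame_times, degree=3):
        """Smooth per-channel trajectories between viseme key poses.

        key_times : increasing 1-D array of key-pose times (needs more than `degree` keys)
        key_faps  : (num_keys, num_channels) FAP values at the key poses
        """
        spline = make_interp_spline(key_times, key_faps, k=degree, axis=0)
        return spline(frame_times)   # (num_frames, num_channels) interpolated FAPs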
