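Performance Analysis and Optimized Design of a Sample-Based Personalized Facial Cartoon Model System

English title: Performance Analysis of a Sample-Based Cartoon Generation System and System Optimal Design
Author: Hu Ling (胡玲)
Thesis level: Master's
Discipline: Signal and Information Processing
Keywords: Sample-based Learning ; Sketch ; Statistical Learning ; Non-Parametric Sampling
Degree year: 2005
Advisor: Wang Qiao (王桥)
Discipline code: 081002
Degree-granting institution: Southeast University (东南大学)
Thesis submission date: 2005-03-21
Chair of the defense committee: Zhu Xiuchang (朱秀昌)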

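Abstract

This thesis first introduces a sample-based facial sketch generation system, which automatically generates a facial line drawing in a particular style from an input image. The system defines a flexible template for drawing facial sketches; on/off switches that control whether each line appears give the template its expressive flexibility. Each line is controlled by a set of corresponding feature points, which implicitly enforces the smoothness and continuity that a line drawing requires. A prior model of the sketch is learned from training samples and makes full use of the structural information of the face. Between the sketch and the original image, the system uses a local likelihood model tied to facial structure. During generation, it directly uses non-parametric probability forms constructed from the samples. A heuristic search algorithm is also proposed that can efficiently solve for the sketch template parameters in their variable-dimension space.
     Building on sketch rendering, the thesis then introduces an image-based personalized cartoon avatar generation system. The system provides a convenient interactive interface that lets ordinary users quickly and easily generate cartoon avatars with a variety of expressions and animation effects. However, the system is implemented entirely on the client side and cannot meet the requirements of practical deployment in terms of algorithm security, system performance, and intellectual property protection.
     The main work of this thesis therefore focuses on integrating the existing personalized cartoon generation system with the MSN Messenger system. The goal of the integration is to offer users a paid service, one characterized by a large user base and heavy traffic. Because the architecture of the existing system cannot meet these practical requirements, the thesis first partitions the system by function into several independent modules and then runs comprehensive performance tests on each module. Based on an analysis and discussion of the test results, and after weighing the factors that may affect system performance, an optimized Client-Server architecture is implemented.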

English Abstract

This thesis first introduces a sample-based facial sketch system. The system automatically generates a sketch from an input image by learning from sample sketches drawn by an artist in a particular style. Based on this sketch system, a cartoon system is implemented that generates a personalized cartoon face from an input image. However, the system uses a client-only architecture, which raises algorithm security issues.
     The main work of this thesis is to integrate the existing cartoon system into the MSN Messenger system. The aim is to provide a paid service: users should not be able to obtain the generated pictures before paying, and should not be able to crack the client and generate sketches on a standalone computer. Given these security considerations and common performance requirements, a more efficient Client-Server architecture is implemented, based on detailed test results for the existing cartoon system.

