Research, Improvement, and Implementation of Lip Detection Algorithms
Abstract
Lip-reading (also called speech-reading) means "reading", fully or partially, what a speaker says by observing changes in the shape of the mouth. Lip-reading research aims to supplement the information in the audio channel with information from the visual channel, improving a computer system's ability to understand speech. The technique originated as a skill that people with hearing impairments use to learn and understand what others say, and it can also be used to obtain information in special situations (for example, intelligence gathering). Today it is widely applied in speech recognition, identity recognition, intelligent human-computer interfaces, and multimedia systems.
Lip detection is the first stage of a lip-reading system and comprises two parts: first, face detection, which locates the speaker's face in the surrounding scene; second, lip detection within the face image that has already been located. Because face detection methods are already fairly mature, this thesis concentrates on lip detection algorithms that operate on color face images.
Working from color face images, this thesis studies in detail the R, G, and B distributions of lip color and skin color across different ethnic groups and proposes a lip detection algorithm based on the chromaticity difference between lip and skin. The algorithm locates the lips using the relationships among the R, G, and B components. It is simple, efficient, and fairly robust, and it applies to both white and East Asian faces. The algorithm is compared with two classic methods, the Chromatic Feature Extraction algorithm and the Red Exclusion algorithm, and the experiments show that it improves on them in several respects.
Finally, the proposed algorithm is implemented in a hardware description language. The results show that its processing speed and hardware cost both meet the requirements of embedded applications.
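The abstract does not state the chromaticity formula, so the following is only a minimal sketch of the general idea of separating lip pixels from skin pixels by a color difference. The feature (R - G) / (R + G + B), the threshold 0.2, the sample pixel values, and the names `chroma_diff` and `lip_bounding_box` are all assumptions made for illustration and are not taken from the thesis.

```python
# Illustrative sketch only: the feature and threshold are assumptions,
# not the algorithm proposed in the thesis.

def chroma_diff(pixel):
    """Hypothetical feature: red-green difference normalized by brightness."""
    r, g, b = pixel
    total = r + g + b
    return (r - g) / total if total else 0.0


def lip_bounding_box(image, threshold=0.2):
    """Return (top, left, bottom, right) of pixels above threshold, or None.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    rows, cols = [], []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if chroma_diff(pixel) > threshold:
                rows.append(y)
                cols.append(x)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4x5 "face": made-up skin-like pixels with a lip-like strip on row 2.
SKIN = (220, 170, 150)  # feature is about 0.09, below the threshold
LIP = (180, 70, 90)     # feature is about 0.32, above the threshold
image = [[SKIN] * 5 for _ in range(4)]
for x in (1, 2, 3):
    image[2][x] = LIP

box = lip_bounding_box(image)
print(box)
```

On this toy image the box covers row 2, columns 1 to 3. A fixed threshold is used here only for brevity; a data-driven threshold could be substituted without changing the structure of the sketch.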
