Research on a Mixed-Feature-Parameter Speech Recognition Method Based on the HMM Model
Abstract
Speech recognition is an interdisciplinary field connected to the research areas of many other disciplines, and advances in those areas have been an important factor in driving its development. Speech recognition is now widely applied: it plays an important role in household appliances, intelligent toys, voice queries against databases in business systems, voice control in industrial production, and automatic dialing in telephone and telecom systems, and it is very likely to become the user interface of the next generation of operating systems. Despite these achievements, the diversity and complexity of speech signals mean that recognition performance still needs to improve, so developing efficient recognition models and algorithms remains an important research topic.
     This thesis studies pre-processing, feature extraction, and pattern recognition in speech recognition in detail. Its main contents are as follows:
     1. Existing speech recognition system models are studied, and the theory of pre-processing, feature extraction, and pattern recognition is discussed. Parameters and signal processing methods suited to the experimental system are then selected, and a speech recognition system is built.
     2. Building on the study of feature extraction theory, a speech recognition method based on mixed feature parameters is proposed and the composition of the mixed parameter is examined. Experiments analyze how the ratio of orders in which the component parameters are combined affects the recognition rate of the system.
     3. The complete speech recognition system is simulated in Matlab, and three systems are built, based on LPCC parameters, MFCC parameters, and the mixed parameters respectively. On this basis, the effect of the choice of feature parameter on the recognition rate is analyzed.
     Analysis of the simulation data leads to the following conclusions: the mixed feature parameters describe the speech signal well, and a speech recognition system built on them achieves a higher recognition rate than systems built on LPCC parameters or MFCC parameters alone.
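     The abstract does not state how the mixed parameter is assembled, so the following is only a minimal sketch under one plausible assumption: that each frame's mixed vector is formed by concatenating the leading LPCC coefficients with the leading MFCC coefficients, and that the "order ratio" in item 2 is the split between the two. The function name `mix_features`, its arguments, and the toy coefficient values are hypothetical and do not come from the thesis.

```python
def mix_features(lpcc_frames, mfcc_frames, n_lpcc, n_mfcc):
    """Build one mixed feature vector per frame (assumed scheme, not the thesis's).

    lpcc_frames, mfcc_frames: per-frame coefficient lists for the same frames.
    n_lpcc, n_mfcc: how many leading coefficients to keep from each set;
    their ratio plays the role of the order ratio.
    """
    if len(lpcc_frames) != len(mfcc_frames):
        raise ValueError("LPCC and MFCC must cover the same number of frames")
    mixed = []
    for lpcc, mfcc in zip(lpcc_frames, mfcc_frames):
        if n_lpcc > len(lpcc) or n_mfcc > len(mfcc):
            raise ValueError("requested order exceeds the available coefficients")
        # Concatenate the leading coefficients of each type into one vector.
        mixed.append(list(lpcc[:n_lpcc]) + list(mfcc[:n_mfcc]))
    return mixed


# Toy placeholder values for two frames (not real speech features).
lpcc_frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
mfcc_frames = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Keep 2 LPCC and 1 MFCC coefficient per frame, i.e. an assumed 2:1 order ratio.
mixed = mix_features(lpcc_frames, mfcc_frames, n_lpcc=2, n_mfcc=1)
```

     Varying `n_lpcc` and `n_mfcc` while holding their sum fixed is one way to run the kind of order-ratio comparison the abstract mentions; the resulting vectors would then feed the HMM training and recognition stages.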