Research on a Speech Recognition System Based on the Ant Colony Algorithm
Abstract
Research on speech recognition began in the 1950s and has since developed into a complete theoretical framework. The field has now reached the commercialization stage: its basic theory is well established and a wide range of products has appeared. In practice, however, speech recognition involves many factors that must be considered simultaneously. As an interdisciplinary field, it also draws on signal processing, pattern recognition, artificial intelligence, computer science, linguistics, and cognitive science. Speech recognition is therefore still far from its ideal goal, and a number of technical difficulties remain to be overcome.
     This work describes the main stages of speech recognition in detail. The input speech signal must first be pre-processed so that the system has a reasonably clean object to work on. For feature extraction, the work introduces the feature parameters commonly used in practice, such as linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). For the recognition stage, it introduces recognition techniques based on vector quantization (VQ) and on dynamic time warping (DTW). On this basis, the basic principles of the ant colony algorithm are introduced.
     The ant colony algorithm is a recently developed bionic optimization algorithm that simulates the collective intelligent behavior of ant colonies. It is robust, has a good distributed computing mechanism, and is easy to combine with other methods. As a new global search method for complex optimization problems, it has been applied successfully to the traveling salesman problem (TSP), scheduling problems, and assignment problems, which demonstrates its advantages on complex optimization problems.
     By combining the optimization mechanism of the ant colony algorithm with the traditional DTW algorithm, this work proposes a new dynamic time programming algorithm based on the ant colony algorithm. The algorithm searches for the globally optimal matching path between the feature parameter sequences of two speech signals, and that path is then used to measure the similarity between the signals. This further improves the recognition performance of the system.
     Finally, each module of the new speech recognition system is tested in simulation and the simulation results are reported. The experimental results show that the speech recognition system based on the ant colony algorithm performs better than the system based on the traditional DTW algorithm.
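The idea of pairing DTW with an ant colony search can be illustrated with a small sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: the Ant System style transition and pheromone rules, the parameter names and values (`alpha`, `beta`, `rho`, `n_ants`, `n_iters`), the step set, and the scalar `local_cost` that stands in for a distance between feature frames such as MFCC vectors. It is not the algorithm proposed in this work.

```python
import random

# Allowed moves on the alignment grid (assumed: the three basic DTW steps).
STEPS = ((1, 0), (0, 1), (1, 1))


def local_cost(a, b):
    """Stand-in frame distance: absolute difference of two scalar features."""
    return abs(a - b)


def dtw_distance(x, y):
    """Classic DTW by dynamic programming over the full grid."""
    n, m = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = local_cost(x[i - 1], y[j - 1]) + best_prev
    return d[n][m]


def ant_path(x, y, tau, alpha, beta, rng):
    """One ant walks from (0, 0) to (n-1, m-1) and returns (path, cost)."""
    n, m = len(x), len(y)
    i = j = 0
    path = [(0, 0)]
    cost = local_cost(x[0], y[0])
    while (i, j) != (n - 1, m - 1):
        moves = [(i + di, j + dj) for di, dj in STEPS
                 if i + di < n and j + dj < m]
        # Weight = pheromone^alpha * heuristic^beta, heuristic = 1 / (1 + cost).
        weights = [tau[p][q] ** alpha
                   * (1.0 / (1.0 + local_cost(x[p], y[q]))) ** beta
                   for p, q in moves]
        i, j = rng.choices(moves, weights=weights)[0]
        path.append((i, j))
        cost += local_cost(x[i], y[j])
    return path, cost


def aco_dtw_distance(x, y, n_ants=20, n_iters=30,
                     alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Ant-colony search for a low-cost warping path; returns (cost, path)."""
    rng = random.Random(seed)
    n, m = len(x), len(y)
    tau = [[1.0] * m for _ in range(n)]  # uniform initial pheromone
    best_cost, best_path = float("inf"), None
    for _ in range(n_iters):
        walks = [ant_path(x, y, tau, alpha, beta, rng) for _ in range(n_ants)]
        # Evaporate pheromone on every cell.
        for row in tau:
            for q in range(m):
                row[q] *= 1.0 - rho
        # Each ant deposits pheromone along its path; cheaper paths deposit more.
        for path, cost in walks:
            deposit = 1.0 / (1.0 + cost)
            for p, q in path:
                tau[p][q] += deposit
            if cost < best_cost:
                best_cost, best_path = cost, path
    return best_cost, best_path


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
    y = [1.0, 1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0]
    print("DTW:", dtw_distance(x, y))
    cost, path = aco_dtw_distance(x, y)
    print("ACO:", cost, "path length:", len(path))
```

With this cost and step set, the dynamic-programming value is a lower bound on the cost of any path an ant can walk, which makes it a convenient check on the ant search. Fixing `seed` keeps runs repeatable.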
