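Research and Realization on Speech Recognition Algorithm Based on DSP Platform

Author: Tang Yao (唐尧)
Degree level: Master's
Discipline: Traffic Information Engineering and Control
Keywords: speech recognition (ASR); digital signal processor (DSP); Mel-frequency cepstral coefficients (MFCC); continuous hidden Markov model (CHMM); blind signal separation (BSS)
Degree year: 2007
Advisor: Zhou Jiemin (周洁敏)
Discipline code: 082302
Degree-granting institution: Nanjing University of Aeronautics and Astronautics (南京航空航天大学)
Thesis submission date: 2007-03-01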

Abstract

Making automatic speech recognition systems (ASRS) practical has been a main direction of speech recognition research over the past decade. In embedded systems, the technology is currently used mainly for voice command control, which lets work that once required manual operation be done conveniently by voice. For the user, speech is the most natural means of human-computer interaction, and the miniaturization of equipment also calls for omitting the keyboard to save space. This thesis presents a computer-aided design scheme for an ASRS and describes its concrete implementation.
     The recognition method adopted here is small-vocabulary, isolated-word speech recognition. Algorithms designed from the literature are used to acquire the speech signal, extract features, compute probabilities, and build and process the mathematical models that yield the recognition result.
     Functionally, the ASRS consists of hardware and the corresponding algorithm software. The hardware forms the platform of the system: a microphone captures the speech signal, a high-performance A/D converter chip receives it, and the samples are transferred to the memory of the digital signal processor (DSP), where they are processed according to timing generated by a complex programmable logic device (CPLD) to produce the final recognition result. The software consists mainly of the speech acquisition algorithm, the pre-processing algorithm, the forward-backward algorithm, the training algorithm, and the Viterbi algorithm. The algorithms were first simulated successfully in Matlab; the code was then ported to CCS, TI's DSP development environment, where hardware emulation was carried out on the DSK board.
     To meet the noise-robustness requirements of real application environments, the thesis also explores a speech-noise separation algorithm based on blind signal separation (BSS), which was simulated successfully in Matlab.
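
The recognition step named in the abstract (scoring an utterance against each word model with the Viterbi algorithm and choosing the best match) can be illustrated with a minimal sketch. It is not the thesis code, which was written in Matlab and ported to the DSP. It assumes one-dimensional features with a single Gaussian per state instead of MFCC vectors with mixture densities, and the names `make_model`, `viterbi_log_score`, `recognize`, and the toy word models "yes" and "no" are all hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def make_model(pi, a, means, variances):
    """Pack one word model with initial and transition probabilities in the log domain."""
    log_pi = [safe_log(p) for p in pi]
    log_a = [[safe_log(p) for p in row] for row in a]
    return (log_pi, log_a, means, variances)


def viterbi_log_score(obs, log_pi, log_a, means, variances):
    """Log probability of the single best state path through one model."""
    n = len(log_pi)
    # Initialisation: start in state i and emit the first observation.
    delta = [log_pi[i] + log_gauss(obs[0], means[i], variances[i]) for i in range(n)]
    # Recursion: best predecessor for each state, then emit the next observation.
    for x in obs[1:]:
        delta = [
            max(delta[i] + log_a[i][j] for i in range(n))
            + log_gauss(x, means[j], variances[j])
            for j in range(n)
        ]
    # Termination: best score over all final states.
    return max(delta)


def recognize(obs, models):
    """Return the word whose model gives the highest Viterbi score."""
    return max(models, key=lambda word: viterbi_log_score(obs, *models[word]))


# Toy two-state left-to-right models (assumed values, for illustration only).
PI = [1.0, 0.0]
A = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "yes": make_model(PI, A, means=[0.0, 1.0], variances=[0.25, 0.25]),
    "no": make_model(PI, A, means=[1.0, 0.0], variances=[0.25, 0.25]),
}

if __name__ == "__main__":
    print(recognize([0.1, -0.1, 0.9, 1.1], MODELS))
    print(recognize([1.0, 0.9, 0.1, 0.0], MODELS))
```

Working in the log domain replaces products of small probabilities with sums. This avoids numerical underflow on long utterances and is a common choice on fixed-point processors, though the abstract does not say which numeric representation the thesis uses.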

