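Research on ARM-Based Embedded Speech Recognition (基于ARM的嵌入式语音识别的研究)

English title: The Study of Embedded Speech Recognition System Based on ARM
Author: Guo Wei (郭威)
Degree level: Master's
Discipline: Computer Application Technology
Keywords: speech enhancement; microphone array; embedded operating system; beamforming; angle diversity
Degree year: 2010
Advisor: Tan Yunfu (谭云福)
Discipline code: 081203
Degree-granting institution: Yanshan University (燕山大学)
Thesis submission date: 2010-12-01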

Abstract

Current speech recognition systems generally run on a PC or a server, which makes them bulky, power-hungry, hard to carry, and of limited practical use. In addition, noise, reverberation, and other real-world conditions make the speech enhancement stage of a typical system too complex to run well on an embedded system. To address these problems, this thesis reviews traditional speech enhancement techniques, studies an embedded speech recognition system, and examines the theory and key technologies involved. The main work is as follows.
     First, the thesis introduces the development and current state of research on embedded systems and on speech signal enhancement for speech recognition, identifies the problems of existing enhancement techniques, and sets out the main research content. After introducing several common microphone array topologies, it analyzes the performance of the various microphone array speech enhancement schemes.
     Second, it studies an efficient, real-time scheme for enhancing speech corrupted by interfering noise in a reverberant environment. The scheme uses a microphone array as the front-end pickup device and applies multipath angle-diversity reception to the signals sampled at the microphones. By analyzing the phase relationships among these signals, it forms multiple beams, delays the coherent signals, and combines them with weights to raise the signal-to-noise ratio and thereby enhance the captured speech. Adjusting the weight matrix filters out noise and signal components outside the speech band, which further reduces any noise that may be introduced.
     Third, it designs the hardware platform in detail and introduces the characteristics of embedded operating systems and the background needed to port them. On an S3C2440-based hardware platform, it describes the writing of the system boot program (BootLoader) and the porting of Windows CE 6.0, and then presents the overall design of the system software and the detailed study of the key speech enhancement algorithm.
     Finally, it reports extensive integrated simulation experiments on the system, summarizes the system's capabilities, and analyzes the remaining problems, providing direction and experience for further research.
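
The delay-and-weighted-combination step mentioned in the abstract can be illustrated with a generic delay-and-sum beamformer. The sketch below is not the thesis's algorithm: it omits the multi-beam angle-diversity processing and the band-limiting weight matrix, and it assumes integer sample delays that are already known. All function names, the test signal, the delays, and the noise level are hypothetical choices made for illustration.

```python
import math
import random


def delay_and_sum(channels, delays, weights=None):
    """Align each channel by its integer sample delay and form a weighted sum.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples (non-negative integers).
    weights:  per-channel combining weights; defaults to uniform 1/M.
    """
    m = len(channels)
    if weights is None:
        weights = [1.0 / m] * m
    n = len(channels[0])
    out = [0.0] * n
    for x, d, w in zip(channels, delays, weights):
        # Source sample t shows up at index t + d on this channel.
        for t in range(n - d):
            out[t] += w * x[t + d]
    return out


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def simulate(num_mics=4, n=2000, noise_std=0.5, seed=0):
    """Build delayed, noisy copies of a sine wave (illustrative data only)."""
    rng = random.Random(seed)
    delays = [3 * i for i in range(num_mics)]  # assumed arrival delays
    source = [math.sin(2 * math.pi * 0.01 * t) for t in range(n)]
    channels = []
    for d in delays:
        ch = []
        for t in range(n):
            s = source[t - d] if t >= d else 0.0
            ch.append(s + rng.gauss(0.0, noise_std))
        channels.append(ch)
    return source, channels, delays


source, channels, delays = simulate()
enhanced = delay_and_sum(channels, delays)

# Compare residual noise power on the samples where every channel contributes.
valid = len(source) - max(delays)
err_single = power([channels[0][t] - source[t] for t in range(valid)])
err_beam = power([enhanced[t] - source[t] for t in range(valid)])
print("single-mic residual power:", err_single)
print("beamformed residual power:", err_beam)
```

Because the speech component is coherent across the aligned channels while the simulated noise is independent from microphone to microphone, averaging M channels leaves the signal unchanged and cuts the noise power by roughly a factor of M. That is the signal-to-noise gain that weighted combining relies on.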

