Aircraft Cockpit Voice Recognition Based on MFCC and Wavelet Packet Transform and Fuzzy SVM

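English title: Aircraft Cockpit Voice Recognition Based on MFCC and Wavelet Packet Transform and Fuzzy SVM
Author: 姜龙生 (Jiang Longsheng)
Degree level: Master's
Discipline: Pattern Recognition and Intelligent Systems
Chinese keywords: 舱音记录器 ; MFCC ; 小波包变换 ; 特征融合 ; 不均衡样本 ; 模糊支持向量机 ; MATLAB混合编程
English keywords: cockpit voice recorder ; MFCC ; wavelet packet transform ; feature fusion ; imbalanced samples ; fuzzy support vector machine ; MATLAB mixed programming
Degree year: 2011
Advisor: 王从庆 (Wang Congqing)
Discipline code: 081104
Degree-granting institution: Nanjing University of Aeronautics and Astronautics
Thesis submission date: 2011-01-01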

Abstract

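Air accidents occur around the world every year, and the evidence carrier that investigators must recover is the black box, which generally comprises the Flight Data Recorder (FDR) and the Cockpit Voice Recorder (CVR). The CVR records objective sounds that reflect the state of the aircraft and its equipment, as well as subjective information that reflects the pilots' perceptions, descriptions, and emotional state. Specifically, it captures speech, aviation noise, and various background sounds. It is an important basis for accident investigation and provides key evidence for reconstructing the course of an accident and determining its cause.
     CVR recordings are large in volume, complex, wide in frequency range, and non-stationary. To address these characteristics, this thesis combines the Fourier transform, the wavelet packet transform, and a fuzzy support vector machine to classify and recognize cockpit sounds. The main work is as follows:
     First, the “Aircraft Cockpit Sound Sample Library” built by the Aviation Safety Technology Center of the Civil Aviation Administration of China (CAAC) was organized and classified. Adobe Audition was used to denoise and segment the recordings, yielding individual, isolated cockpit sounds such as alarms, switches, and knobs.
     Second, the Fourier transform and the wavelet packet transform were applied to each isolated cockpit sound, and Mel Frequency Cepstrum Coefficients (MFCC) and Wavelet Packet Coefficients (WPC) were extracted in turn. A distance-based separability criterion was used to compress and fuse the MFCC and WPC, and the resulting vector serves as the final feature vector for each cockpit sound.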
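As a rough illustration of the wavelet packet step, the sketch below decomposes a signal into 2^levels sub-bands and summarizes each band by its share of the total energy. The Haar filter, the function names, and the use of band energies as a compact stand-in for the raw coefficients are all assumptions made for illustration; the abstract does not state which wavelet basis or decomposition depth the thesis uses.

```python
import math

def haar_split(signal):
    """One Haar analysis step: return (approximation, detail) at half length.

    A trailing sample is dropped when the input length is odd.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_packet_nodes(signal, levels):
    """Full wavelet packet tree: split every node, not only the approximation.

    Returns the 2**levels leaf nodes in natural (not frequency) order.
    The signal should have at least 2**levels samples.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def band_energies(signal, levels):
    """Fraction of total energy held by each leaf node of the packet tree."""
    nodes = wavelet_packet_nodes(signal, levels)
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies
```

For example, `band_energies(frame, 3)` yields eight numbers per frame, which could be concatenated with the MFCC vector before any feature selection.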
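     Third, because support vector machines perform poorly on noisy outlier samples and on classes with imbalanced sample counts, this thesis designs a fuzzy support vector machine for imbalanced samples. Two memberships are computed: one for each class and one for each cockpit sound within its class. The fuzzy support vector machine then classifies the cockpit sound signals. Experiments show that this method clearly outperforms both the conventional support vector machine and the standard fuzzy support vector machine.
     Finally, cockpit sound recognition software was developed using mixed MATLAB and VC++ programming. The software combines the convenient, powerful application interface development capabilities of VC++ with MATLAB's signal processing and plotting capabilities, so that cockpit sounds can be classified intuitively, quickly, and accurately.
     This research is significant for effectively recognizing non-speech background sounds in CVR recordings and for determining the causes of aircraft accidents.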
Air disasters occur around the world every year. A necessary piece of evidence in the analysis of an air disaster is the black box, which includes the Flight Data Recorder (FDR) and the Cockpit Voice Recorder (CVR). The CVR records objective sounds that reflect the condition of the aircraft and its equipment, as well as subjective information that reflects the perception and emotions of the pilots, such as voices, aviation noise, and background sound. The CVR is an important source of evidence in the analysis of air disasters and supports the reconstruction of the accident.
     The voice signals in the CVR are complex and non-stationary, and they cover a wide frequency range. This thesis studies the classification of cockpit voice using the Fourier transform, the wavelet packet transform, and fuzzy SVM. The major work is summarized as follows:
     First, based on the “aircraft cabin sound sample library” of the Center of Aviation Safety Technology, CAAC, this thesis reduces noise and segments the cockpit voice with Adobe Audition. Alarm, switch, knob, and other independent samples are successfully separated from the mixed signals.
     Second, the Fourier transform and the wavelet packet transform are applied to the independent cockpit voice samples, and Mel Frequency Cepstrum Coefficients (MFCC) and Wavelet Packet Coefficients (WPC) are extracted as the initial features. The final features are determined by a geometric distance separability criterion.
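The abstract does not give the exact form of the distance separability criterion. One common choice, assumed here purely for illustration, scores each feature dimension by the ratio of between-class scatter to within-class scatter and keeps the highest-scoring dimensions of the concatenated MFCC and WPC vector. The function names and the `keep` parameter are hypothetical.

```python
from statistics import mean

def separability_scores(samples, labels):
    """Per-dimension ratio of between-class scatter to within-class scatter."""
    dims = len(samples[0])
    classes = sorted(set(labels))
    scores = []
    for d in range(dims):
        overall = mean(s[d] for s in samples)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [s[d] for s, y in zip(samples, labels) if y == c]
            mu = mean(values)
            between += len(values) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in values)
        # A dimension with zero within-class spread is treated as perfectly separable.
        scores.append(between / within if within > 0 else float("inf"))
    return scores

def fuse_features(mfcc, wpc, labels, keep):
    """Concatenate MFCC and WPC vectors and keep the `keep` most separable dimensions.

    Returns the reduced feature rows and the indices of the kept dimensions.
    """
    fused = [list(m) + list(w) for m, w in zip(mfcc, wpc)]
    scores = separability_scores(fused, labels)
    ranked = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
    selected = sorted(ranked[:keep])
    return [[row[d] for d in selected] for row in fused], selected
```

Scoring dimensions independently ignores correlations between features; a criterion built on full scatter matrices would account for them at higher cost.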
     Third, the support vector machine (SVM) algorithm is sensitive to outliers and noise in the data, and on imbalanced samples it produces suboptimal classification models. Fuzzy SVM (FSVM) is a variant of SVM that was proposed to handle outliers and noise. However, like standard SVM, FSVM can also suffer on imbalanced samples. This thesis presents a method that improves FSVM for learning from imbalanced samples in the presence of outliers and noise. Training samples are assigned two different fuzzy-membership values, and these values are incorporated into the SVM learning algorithm. The experimental results show that the proposed method clearly outperforms both conventional SVM and FSVM.
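The abstract does not specify how the two membership values are computed. The sketch below shows one plausible scheme under stated assumptions: a class-level weight equal to the smallest class size divided by the class size (so minority classes are not down-weighted), multiplied by a sample-level weight that decays linearly with distance from the class center. The formulas, function names, and the `delta` parameter are illustrative and are not taken from the thesis.

```python
import math
from collections import Counter

def class_center(points):
    """Arithmetic mean of a list of equal-length vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def fuzzy_memberships(samples, labels, delta=1e-6):
    """Return one membership in [0, 1] per training sample.

    membership = class_weight * sample_weight, where
      class_weight  = smallest class size / this class's size (assumed form)
      sample_weight = 1 - distance to class center / (class radius + delta)
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    memberships = [0.0] * len(samples)
    for cls, n in counts.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        center = class_center([samples[i] for i in idx])
        dists = {i: math.dist(samples[i], center) for i in idx}
        radius = max(dists.values())
        class_weight = smallest / n
        for i in idx:
            # Samples far from the center (likely outliers) get weights near zero.
            sample_weight = 1.0 - dists[i] / (radius + delta)
            memberships[i] = class_weight * sample_weight
    return memberships
```

Memberships of this kind scale each sample's penalty in the SVM objective. With scikit-learn, for instance, they could be passed as `sample_weight` to `SVC.fit`, though the thesis itself reports a MATLAB and VC++ implementation.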
     Finally, software that classifies cockpit voice is developed with MATLAB and VC++. It takes advantage of both tools to classify cockpit voice intuitively, quickly, and accurately.
     This study is significant for identifying the content of CVR background sound and determining the causes of air disasters.
