Abstract
To address the sharp performance degradation of speaker verification systems in noisy environments, this paper proposes a method for extracting a robust auditory feature, the Gammachirp Feature Coefficient (GCFC). The method draws on the perceptual characteristics of human hearing and uses a Gammachirp filter bank to simulate the cochlear auditory model. Simulation experiments are conducted under the Gaussian Mixture Model-Universal Background Model (GMM-UBM) framework to study the system's noise robustness and adaptability in different noise environments. The results show that, in speaker verification, the proposed GCFC features outperform Mel cepstral coefficients and Gammatone-filter-based auditory features (GFCC) in noise robustness, noise adaptability, and overall verification performance.
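The abstract does not define the filter, so the following is only a minimal sketch of a single Gammachirp channel, assuming the form commonly given in the auditory-modeling literature: g(t) = t^(n-1) exp(-2πb·ERB(f_c)·t) cos(2πf_c·t + c·ln t). The function names and the values of `n`, `b`, and `c` are illustrative assumptions, not the settings used in this paper, and the remaining GCFC steps (filter-bank layout, compression, cepstral transform) are not shown.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation)."""
    return 24.7 + 0.108 * fc_hz


def gammachirp_ir(fc_hz, fs_hz, duration_s=0.025, n=4, b=1.019, c=-2.0):
    """Sampled, peak-normalized gammachirp impulse response for one channel.

    n, b, and c are assumed illustrative defaults, not the paper's values.
    Setting c = 0 removes the chirp term, which gives a gammatone response.
    """
    num_samples = round(duration_s * fs_hz)
    ir = []
    # Start at k = 1 so that ln(t) is defined (t > 0).
    for k in range(1, num_samples + 1):
        t = k / fs_hz
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb(fc_hz) * t)
        phase = 2.0 * math.pi * fc_hz * t + c * math.log(t)
        ir.append(envelope * math.cos(phase))
    peak = max(abs(x) for x in ir)
    return [x / peak for x in ir]


# Example: a 1 kHz channel at a 16 kHz sampling rate.
channel = gammachirp_ir(1000.0, 16000)
```

Under these assumptions, a filter bank would be built by calling `gammachirp_ir` once per center frequency and convolving each response with the speech frame. Comparing the result with `c=0.0` shows how the chirp term separates the Gammachirp from the Gammatone filter.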