Research on Feature Transformation and Robustness Techniques in Speaker Identification
Abstract
To improve the performance of speaker identification systems and their robustness in practical applications, this dissertation investigates three topics: feature transformation for Gaussian mixture models, weighted feature compensation transformation, and adaptive histogram equalization. The main contributions are:
     1. A multi-step clustering algorithm for diagonal-covariance Gaussian mixture models with embedded transformation. To simplify computation, the covariance matrices in a Gaussian mixture model (GMM) are usually replaced directly by diagonal matrices, which introduces a loss in the likelihood computation. To compensate for this loss, a multi-step clustering algorithm based on an embedded-transformation diagonal-covariance GMM is proposed. The method builds the model with an embedded linear transformation and diagonal covariance matrices, and integrates a multi-step clustering procedure so that the GMM can find its most suitable number of mixtures. Compared with the ordinary clustering expectation-maximization (EM) algorithm, the multi-step clustering algorithm requires markedly fewer EM estimations; compared with the clustered-EM GMM on the same speech corpus, average computation time drops by about 50% and the identification error rate falls by 1.4% on average. On both a self-built corpus and a public corpus, the new method directly reaches the optimal identification error rate of the transformation-embedded GMM, balancing recognition accuracy against recognition time.
     2. A noise-robust algorithm based on GMM weighted feature compensation transformation. To address the limitations of feature-weighting algorithms and the characteristics of the normalized compensation transformation, a noise-robust algorithm based on GMM weighted feature compensation transformation is proposed. On one hand, feature values are weighted by their contribution according to the frame signal-to-noise ratio (SNR); on the other hand, the model's output likelihood scores are transformed according to the acoustic characteristics of speaker recognition, compensating for the weighting factor's limitations in certain environments. For stationary and non-stationary noise at various SNRs on the self-built corpus, the algorithm improves the average recognition rate by 2.74% and 2.82% over the feature-weighting algorithm, and by 3.56% and 1.34% over the normalized compensation transformation. On another public dataset, the improvements are 3.02% and 2.56% over the feature-weighting algorithm, and 3.9% and 1.14% over the normalized compensation transformation.
     3. An adaptive histogram equalization method based on a statistical model. Considering the statistical properties of speaker features and the shortcomings of histogram equalization as applied to speaker recognition, an adaptive histogram equalization method for speaker identification is proposed. The method first constructs the cumulative histogram with relatively wide intervals, then adaptively decides, from the frequency increment of the feature values within each interval, whether and how finely that interval should be subdivided. This not only reduces computation but also yields transformed feature values whose distribution better matches the actual feature space, further improving the recognition rate and robustness of speaker identification in noisy environments. On the same test set, for two classic noise types (White and Babble), adaptive histogram equalization improves the average recognition rate by 3% and 2.9% over ordinary histogram equalization; on another public comparison test set, the method shows similar gains.
This dissertation studies transformation-based Gaussian mixture models, weighted feature compensation transformation, and adaptive histogram equalization in order to improve the performance of speaker identification and its robustness in practical application environments. The main contributions include:
     1. A multi-step clustering algorithm for transformation-based, diagonal-covariance Gaussian mixture models (GMMs) is proposed. To simplify computation, Gaussian mixture density functions usually use diagonal covariance matrices; however, this reduces the likelihood of the data, which can consequently affect the classification decision. To compensate for the lost likelihood, a multi-step clustering algorithm is proposed in which an embedded linear transformation integrates the transformation and the diagonal-covariance Gaussian mixture into a unified framework. A multi-step clustering procedure is also integrated into the GMM estimation process to search for the appropriate number of mixtures. Compared with the conventional clustering expectation-maximization (EM) algorithm, the number of EM estimations is markedly reduced: on the same database, the proposed method saves about 50% of computation time and lowers the error rate by 1.4% on average. Compared with the transformation-embedded GMM, experiments on two databases indicate that the proposed method directly reaches the saturation point with the right mixture number.
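As an illustration of the mixture-number search described above, the following sketch fits diagonal-covariance GMMs of increasing order and stops once the model-selection score stops improving. The BIC stopping rule, the candidate orders, and the use of `sklearn` in place of a custom embedded-transformation EM are all simplifying assumptions of this sketch, not the dissertation's actual criterion:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def multistep_diag_gmm(X, k_candidates=(1, 2, 4, 8)):
    """Fit diagonal-covariance GMMs with increasing mixture number and
    keep the model whose BIC is lowest, stopping at the first K where
    BIC worsens (a stand-in for the dissertation's multi-step rule)."""
    best, best_bic = None, np.inf
    for k in k_candidates:
        gmm = GaussianMixture(n_components=k, covariance_type="diag",
                              random_state=0).fit(X)
        bic = gmm.bic(X)
        if bic >= best_bic:
            break                      # diminishing return: earlier K wins
        best, best_bic = gmm, bic
    return best

# toy demo: two well-separated clusters of 4-dimensional "features"
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(5, 1, (200, 4))])
model = multistep_diag_gmm(X)
print(model.n_components)
```

Because each candidate fit starts from scratch here, this is costlier than the dissertation's approach, which reuses earlier estimates to cut the number of EM runs.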
     2. A weighted feature compensation transformation method based on GMMs for robust speaker identification is presented. In this method, feature scores are weighted according to the frame SNR, while the frame likelihood probabilities are transformed based on the acoustic characteristics of the speaker recognition system. In stationary and non-stationary noise environments at various SNRs, the proposed method raises the average recognition rate by 2.74% and 2.82% over the feature-weighting algorithm, and by 3.56% and 1.34% over the normalized compensation transformation, on the same database. On another open database, the improvements are 3.02% and 2.56% over the feature-weighting algorithm, and 3.9% and 1.14% over the normalized compensation transformation.
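A minimal sketch of the frame-SNR weighting idea: per-frame log-likelihoods are averaged with weights derived from the frame SNR, so heavily corrupted frames contribute less to the utterance score. The linear SNR-to-weight mapping and its 0–20 dB range are illustrative assumptions, not the dissertation's actual weighting or score-transformation functions:

```python
import numpy as np

def frame_snr_weights(frame_snr_db, floor=0.0, ceil=20.0):
    """Map per-frame SNR (dB) linearly onto [0, 1]; frames below `floor`
    get weight 0, frames above `ceil` get weight 1 (assumed mapping)."""
    w = (np.asarray(frame_snr_db, float) - floor) / (ceil - floor)
    return np.clip(w, 0.0, 1.0)

def weighted_utterance_score(frame_loglik, frame_snr_db):
    """Weighted average of per-frame log-likelihoods: a stand-in for the
    weighted-feature compensation used in the dissertation."""
    w = frame_snr_weights(frame_snr_db)
    return float(np.sum(w * frame_loglik) / np.sum(w))

loglik = np.array([-10.0, -12.0, -50.0])   # third frame heavily corrupted
snr    = np.array([18.0, 15.0, 2.0])       # ...and it has low SNR
score = weighted_utterance_score(loglik, snr)
print(score)
```

Down-weighting the low-SNR frame pulls the utterance score toward the clean frames, which is the intuition behind weighting features by frame SNR.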
     3. Based on the statistical characteristics of speaker features and the particularities of histogram equalization applied to speaker recognition, an adaptive histogram equalization (AHEQ) method for speaker identification is presented. The method first builds the cumulative histogram function with wide intervals, and then uses the frequency increment of the feature values within each interval to decide adaptively whether, and how finely, that interval should be subdivided. This approach not only reduces computation but also makes the distribution of the transformed feature values agree better with the actual feature space, further improving the recognition rate and robustness of the speaker identification system in noisy environments. On the same database, with two classic noise types (White and Babble), AHEQ raises the average recognition rate by 3% and 2.9% over the ordinary histogram equalization method. On another public comparison test set, the method shows similar improvements.
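The coarse-then-refine histogram construction can be sketched as follows: a coarse cumulative histogram is built first, only bins carrying a large probability-mass increment are split, and samples are then mapped through the piecewise-linear empirical CDF. The split threshold, the single refinement pass, and the uniform (rather than feature-space) reference distribution are simplifying assumptions of this sketch:

```python
import numpy as np

def adaptive_hist_eq(x, coarse_bins=8, split_thresh=0.25):
    """Equalize samples x through an adaptively refined histogram:
    wide bins by default, with a bin split in two only when its
    probability-mass increment exceeds split_thresh."""
    x = np.asarray(x, float)
    edges = np.linspace(x.min(), x.max(), coarse_bins + 1)
    counts, _ = np.histogram(x, edges)
    p = counts / counts.sum()
    refined = [edges[0]]
    for i in range(coarse_bins):
        if p[i] > split_thresh:                # heavy bin: refine it
            refined.append(0.5 * (edges[i] + edges[i + 1]))
        refined.append(edges[i + 1])
    edges = np.asarray(refined)
    # piecewise-linear empirical CDF on the refined grid
    counts, _ = np.histogram(x, edges)
    cdf = np.concatenate([[0.0], np.cumsum(counts) / len(x)])
    return np.interp(x, edges, cdf)            # equalized to ~Uniform(0, 1)

rng = np.random.default_rng(1)
feat = rng.exponential(1.0, 5000)              # skewed stand-in feature
eq = adaptive_hist_eq(feat)
```

Refining only the heavy bins keeps the histogram small where the data are sparse, which is the source of the computational saving claimed above; a speech front end would typically map to a Gaussian reference per cepstral dimension instead of the uniform reference used here.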