Research on Speech Feature Parameter Extraction Methods for Speaker Recognition
Abstract
This thesis studies speech feature extraction techniques for speaker recognition systems, with in-depth investigation of speech enhancement and endpoint detection in additive noise, pitch feature extraction, auditory feature parameter extraction, and dimensionality reduction. The main contributions are as follows:
     1. A speech enhancement algorithm based on expanded spectral subtraction is proposed, which estimates the background noise more accurately than traditional methods. Combining the speech absence probability with a dynamic threshold, a new endpoint detection algorithm is proposed. Experiments show that it detects speech onsets accurately even at low SNR.
     2. A pitch period estimation method based on the reciprocal-weighted autocorrelation of the CAMDF, the RCAF (Reverse CAMDF Autocorrelation Function) algorithm, is proposed. Simulation results show that RCAF reduces the influence of spurious peaks caused by formants and noise on the peak search, extracts the pitch period accurately, and is more robust to noise than traditional algorithms.
     3. The human auditory model is studied in depth. Gammatone and Gammachirp filters are used to model the cochlea, and digital-filter implementations are designed. These filter banks fit the human hearing threshold curve closely and simulate human auditory characteristics well.
     4. Two speech feature parameters based on human auditory characteristics are proposed: Gammatone filter coefficients (GTF) and Gammachirp filter coefficients (GCF). In text-independent speaker identification experiments they outperform traditional feature parameters. To address the high dimensionality of the auditory features, dimensionality reduction based on principal component analysis and the discrete cosine transform is explored: a PCA-based speaker recognition algorithm is given, and auditory cepstral features are obtained via the DCT. Simulations on both clean and noisy speech show that the reduced auditory features remain robust to noise and achieve the best recognition rate under noisy conditions.
As a biometric identification technology, speaker recognition identifies a person from his or her voice, which carries physiological and behavioral characteristics specific to each individual. One significant use of speaker recognition is to determine whether a speaker has the right to access secure or confidential systems. A spoken password has advantages that a password typed on a keyboard does not: it cannot be forgotten and cannot easily be stolen. Speaker recognition is therefore a very promising area of research.
     Most speaker recognition systems are designed for ideal environments and easily achieve high accuracy in controlled, quiet laboratory conditions. When such a system is used in a real-life situation, however, there is bound to be a mismatch between training and testing: background noise causes recognition accuracy to drop sharply. This is the major obstacle to the commercial use of speaker recognition systems, so improving their robustness is both significant and necessary. This thesis focuses on improving the recognition rate and robustness of speaker recognition systems in several respects. The main innovations of the dissertation are as follows.
     1. An endpoint detection algorithm is proposed that combines expanded spectral subtraction with a dynamic threshold based on the speech absence probability (SAP). The algorithm employs expanded spectral subtraction with a noise-compensation structure, which can estimate the noise even during speech presence. An endpoint detection method based on SAP soft decision is given, which improves the robustness and precision of endpoint detection. Experiments show that good performance is obtained even at an SNR of -10 dB, which traditional double-threshold methods cannot achieve at the same SNR.
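The expanded spectral subtraction above builds on classical magnitude spectral subtraction. As a point of reference, a minimal sketch of the classical form is shown below; the over-subtraction factor `alpha` and spectral floor `beta` are illustrative assumptions, and the thesis's noise-compensation structure and SAP-based soft decision are not reproduced here.

```python
import numpy as np

def spectral_subtract(frame, noise_mag, alpha=2.0, beta=0.01):
    """Classical magnitude spectral subtraction for one windowed frame.

    frame     : time-domain samples of one analysis frame
    noise_mag : estimated noise magnitude spectrum (len(frame)//2 + 1 bins)
    alpha     : over-subtraction factor (illustrative assumption)
    beta      : spectral floor, limits "musical noise" artifacts
    """
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    clean = mag - alpha * noise_mag               # subtract noise estimate
    clean = np.maximum(clean, beta * mag)         # apply spectral floor
    return np.fft.irfft(clean * np.exp(1j * phase), n=len(frame))
```

The noisy phase is reused unchanged, which is standard practice since the ear is relatively insensitive to phase distortion.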
     2. Pitch detection under noisy conditions is one of the most difficult problems in speech signal processing. A new pitch detection method for noisy speech at low SNR is proposed, based on the Reverse CAMDF Autocorrelation Function (RCAF) and a tentative smoothness measure for peak searching. The algorithm can estimate the noise during speech presence using the expanded spectral subtraction with the noise-compensation structure. RCAF improves the robustness and precision of pitch detection. Experiments show that RCAF achieves higher efficiency and better detection accuracy at an SNR of -10 dB, a performance that the traditional AMDF, CAMDF, and AWAC methods cannot match at the same SNR.
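The core of the RCAF idea, weighting the autocorrelation by the reciprocal of the circular AMDF so that true pitch peaks are sharpened while formant- and noise-induced peaks are suppressed, can be sketched as follows. The exact weighting, framing, and the thesis's smoothness-based peak search are assumptions in this sketch.

```python
import numpy as np

def rcaf_pitch(frame, fs, fmin=60.0, fmax=400.0, eps=1e-6):
    """Pitch (Hz) from one frame via autocorrelation divided by the
    circular AMDF (reciprocal weighting), a sketch of the RCAF idea."""
    x = frame - np.mean(frame)                    # remove DC offset
    lags = np.arange(int(fs / fmax), int(fs / fmin) + 1)
    rcaf = np.empty(len(lags))
    for i, k in enumerate(lags):
        shifted = np.roll(x, -k)                  # circular shift by lag k
        r = np.dot(x, shifted)                    # autocorrelation term
        camdf = np.sum(np.abs(shifted - x))       # circular AMDF term
        rcaf[i] = r / (camdf + eps)               # reciprocal weighting
    return fs / lags[np.argmax(rcaf)]             # lag of the sharpest peak
```

At the true period the CAMDF is near zero while the autocorrelation is near its maximum, so their ratio produces a much sharper peak than either function alone.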
     3. The auditory filter plays an important role in understanding the mechanism of hearing, in auditory modeling, and in speaker recognition. Digital implementations of linear Gammatone and Gammachirp filters are regularly part of auditory models and can be used for sound processing in cochlear implants. This thesis studies the Gammatone and Gammachirp auditory filters, including their definitions, amplitude-frequency responses, and performance in simulating the filtering characteristics of the basilar membrane. The two filters are also compared, and their relation and differences are explained. For two infinite impulse response (IIR) designs, it is evaluated how closely the digital impulse, magnitude, and phase responses match those of the corresponding analog Gammatone and Gammachirp filters. The Gammachirp filter is implemented as an IIR filter with a small number of coefficients; the results show that the combination of a Gammatone filter and an IIR asymmetric compensation filter approximates the Gammachirp filter excellently.
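For reference, the analog gammatone impulse response that such digital designs approximate is g(t) = t^(n-1) e^(-2πb·ERB(fc)·t) cos(2πfc·t), with order n = 4 and b ≈ 1.019 in the standard Patterson formulation. A direct FIR sampling of this response (not the more efficient IIR design the thesis implements) can be sketched as:

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth (Glasberg & Moore) in Hz."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4, b=1.019):
    """Sampled impulse response of a gammatone filter at centre
    frequency fc, peak-normalised (FIR sketch, not the IIR design)."""
    t = np.arange(int(duration * fs)) / fs
    g = (t ** (order - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
         * np.cos(2 * np.pi * fc * t))
    return g / np.max(np.abs(g))                  # normalise to unit peak
```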
     4. An auditory-based feature extraction algorithm is developed to improve the performance of speaker identification by exploiting human auditory characteristics. The sub-band energies of the extracted auditory features are calculated using Gammatone and Gammachirp filter banks instead of the commonly used triangular filter bank, with the center frequencies and bandwidths determined according to the equivalent rectangular bandwidth (ERB). The proposed method is compared with two commonly used features, LPCC and MFCC, in a text-independent speaker identification system. Simulation results show that the two proposed features outperform the widely used MFCC and LPCC and are more robust in noisy environments at low SNR.
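Placing the filter-bank channels "according to the ERB" conventionally means spacing the centre frequencies uniformly on the ERB-rate scale. A minimal sketch, assuming the Glasberg & Moore ERB-rate formula:

```python
import numpy as np

def erb_rate(f):
    """ERB-rate scale (Glasberg & Moore): number of ERBs below f Hz."""
    return 21.4 * np.log10(0.00437 * f + 1.0)

def erb_space(low, high, n):
    """n centre frequencies equally spaced on the ERB-rate scale
    between low and high Hz, as used to place gammatone/gammachirp
    channels in an auditory filter bank."""
    e = np.linspace(erb_rate(low), erb_rate(high), n)
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437   # invert the ERB-rate map
```

Compared with the Mel spacing of the triangular filter bank, ERB spacing packs channels more densely at low frequencies, mirroring the frequency resolution of the basilar membrane.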
     5. To address the high dimensionality of the human auditory features, two methods are used to extract low-dimensional speaker features and reduce computational complexity: principal component analysis (PCA), a technique of multivariate statistical analysis, and the discrete cosine transform (DCT). First- and second-order delta cepstra and the shifted delta cepstrum are then derived from these auditory features. Compared with standard Mel-frequency cepstral coefficients, the auditory features yield a higher recognition rate in a speaker recognition system, and the feature set has better classification and robustness characteristics than traditional speech features.
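The two reduction steps above can be sketched as follows: PCA projects feature vectors onto the leading eigenvectors of their covariance matrix, while the DCT of the log sub-band energies yields the cepstral coefficients (the same decorrelating step used for MFCC). The dimensions chosen here are illustrative, not the thesis's settings.

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature vectors (rows of X) onto the k principal
    components with the largest variance."""
    Xc = X - X.mean(axis=0)                       # centre the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)              # eigh: ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]     # k largest eigenvectors
    return Xc @ top

def dct_cepstrum(log_energies, k):
    """DCT-II of log sub-band energies, keeping k coefficients:
    the auditory cepstral features described above."""
    n = len(log_energies)
    basis = np.cos(np.pi * np.outer(np.arange(k), np.arange(n) + 0.5) / n)
    return basis @ log_energies
```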
