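In speaker recognition, methods based on the Support Vector Machine (SVM) are a current research focus. Compared with other pattern recognition methods, the SVM differs in two main ways: first, it uses a nonlinear kernel function to represent the inner product in feature space; second, it achieves structural risk minimization through the optimal separating hyperplane with the maximum classification margin. These properties have led to the wide application of SVM methods.
     To shorten the training time of a speaker recognition system, the samples need to be reduced before SVM training. The thesis summarizes the theoretical results in this area and proposes a new reduction method, Support Cluster Abstracting (SCA), explaining its theoretical basis and giving concrete implementation steps. SCA is compared experimentally with traditional methods: experiments demonstrate how accurately the algorithm describes the boundary of linearly separable samples, and the algorithm's reduction rate and recognition rate are examined on linearly non-separable samples, namely speech samples.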
In the field of speaker recognition, methods based on the Support Vector Machine (SVM) are a current research focus. Unlike conventional pattern recognition techniques, the SVM has two distinctive characteristics. First, it expresses the inner product in feature space through a nonlinear kernel function. Second, it implements the structural risk minimization principle by means of the optimal separating hyperplane with maximum margin. These properties make the SVM widely applicable.
     In this thesis, we investigate the fundamental theory and implementation of speaker recognition. We begin with a thorough review of feature parameters, followed by an investigation of linear prediction cepstral coefficients (LPCC) and mel-frequency cepstral coefficients (MFCC). We combine LPCC and MFCC features into several feature vectors and test how accurately each captures speaker-specific characteristics. We also investigate the impact of various feature parameters on recognition rate and noise robustness.
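The mel scale behind MFCC and one possible way of combining the two feature types can be sketched as follows. This is a minimal illustration, not the thesis's procedure: the constants 2595 and 700 are the commonly used form of the mel mapping, and `combine_features` assumes plain per-frame concatenation, which the abstract does not confirm. All function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def combine_features(lpcc, mfcc):
    """Assumed combination scheme: concatenate one frame's LPCC and MFCC vectors."""
    return list(lpcc) + list(mfcc)


# Usage: a frame with 3 LPCC and 2 MFCC coefficients (made-up values).
frame = combine_features([0.1, 0.2, 0.3], [1.5, -0.4])
print(len(frame), round(hz_to_mel(1000.0), 1))
```

Concatenation keeps both descriptions of the spectrum in one vector, at the cost of a higher feature dimension for the classifier.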
     Because the kernel function is central to SVM theory, and classification accuracy depends strongly on the choice of kernel and its parameters, we review the basic theory of kernel functions. We present simulations and analysis of the polynomial, radial basis function (RBF), and sigmoid kernels, and report the recognition rate and stability of each under clean and noisy speech conditions.
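The three kernels named above can be written out in a few lines. The sketch below uses their standard textbook forms; the parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions chosen for illustration, not the settings used in the thesis.

```python
import math


def dot(x, y):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    """K(x, y) = tanh(gamma * <x, y> + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)


def gram_matrix(samples, kernel):
    """Kernel value for every pair of samples, as a list of rows."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Usage: compare the kernels on two made-up feature vectors.
u, v = [1.0, 2.0], [3.0, 4.0]
for k in (polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(k.__name__, k(u, v))
```

Each kernel replaces the inner product in the SVM decision function, so changing the kernel or its parameters changes the shape of the decision boundary without changing the training procedure.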
     The size of the sample set before SVM training strongly affects both recognition rate and training time, so we propose reducing the sample set before training. We present a new reduction algorithm, Support Cluster Abstracting (SCA), review its theoretical basis, and give its implementation steps. Finally, we present simulations and analysis comparing SCA with other methods. On linearly separable samples, we test how accurately the algorithm describes the class boundary; on linearly non-separable samples (speech samples), we measure the reduction rate and recognition rate.
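The idea of reducing a training set while keeping the samples near the class boundary can be illustrated with a simple heuristic. The sketch below is not SCA, whose steps the abstract does not give; it is a generic nearest-opposite-class filter, and `reduce_by_boundary_distance`, `reduction_rate`, and `keep_ratio` are hypothetical names introduced here. It assumes at least two classes are present.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def reduce_by_boundary_distance(samples, labels, keep_ratio=0.5):
    """Return sorted indices of the samples kept after reduction.

    For each class, keep the fraction `keep_ratio` of its samples that lie
    closest to any sample of another class; these are the likeliest
    support-vector candidates. Generic heuristic, NOT the SCA algorithm.
    """
    scored = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Distance from sample i to its nearest neighbour in another class.
        d = min(euclidean(x, z) for z, t in zip(samples, labels) if t != y)
        scored.append((d, i))
    kept = []
    for cls in sorted(set(labels)):
        members = sorted((d, i) for d, i in scored if labels[i] == cls)
        n_keep = max(1, math.ceil(keep_ratio * len(members)))
        kept.extend(i for _, i in members[:n_keep])
    return sorted(kept)


def reduction_rate(n_original, n_kept):
    """Fraction of the original samples that were discarded."""
    return 1.0 - n_kept / n_original


# Usage: two well-separated classes on a line (made-up data).
samples = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0],
           [5.0, 0.0], [6.0, 0.0], [7.0, 0.0], [8.0, 0.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
kept = reduce_by_boundary_distance(samples, labels, keep_ratio=0.5)
print(kept, reduction_rate(len(samples), len(kept)))
```

The pairwise distance search is quadratic in the number of samples, which is one reason clustering-based reductions are attractive for large speech corpora.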
     The SCA parameters determine whether the reduced sample set retains all the support vectors while relieving the burden of SVM training as far as possible. In this thesis, we set the SCA parameters experimentally: the fan-out coefficient k, the number of clusters C, and the approximation degree factor a. The simulation results show that, once these parameters are set, SCA achieves a higher recognition rate at a higher reduction rate than the other reduction algorithms. The experimental results agree with the theoretical predictions. The thesis also investigates how the capabilities of various speaker recognition models differ.
