Abstract
A single feature parameter cannot fully characterize a speaker in speaker recognition. To address this, this paper fuses the conventional Mel frequency cepstral coefficients (MFCC), whose noise robustness is only moderate, with the more robust Gammatone frequency cepstral coefficients (GFCC), and combines them with their dynamic (differential) features to form a new mixed feature parameter. Because the high dimensionality of the mixed parameter introduces redundancy, the relationship between each component of each feature parameter and the recognition result is studied, and the components that contribute least are discarded to reduce the dimensionality. The mixed parameter is then applied to a speaker recognition system based on the Gaussian mixture model (GMM). Simulation experiments show that the mixed feature parameter achieves better recognition performance and better noise robustness.
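The fusion and dimensionality-reduction steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-frame MFCC and GFCC matrices have already been extracted, that the dynamic features are first-order deltas computed with the common regression formula, and that a per-component contribution score is supplied by the caller. The abstract does not say how contribution is measured, so the `contribution` argument, the function names `delta`, `fuse`, and `select_components`, and the `width` and `keep` parameters are all hypothetical.

```python
from typing import List, Sequence

Frames = List[List[float]]  # one row per frame, one column per coefficient


def delta(frames: Frames, width: int = 2) -> Frames:
    """First-order dynamic (delta) features via the common regression formula.

    Assumption: edge frames are handled by repeating the first/last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Frames = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, n_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def fuse(mfcc: Frames, gfcc: Frames, width: int = 2) -> Frames:
    """Concatenate MFCC, GFCC, and their deltas frame by frame."""
    if len(mfcc) != len(gfcc):
        raise ValueError("MFCC and GFCC must have the same number of frames")
    d_mfcc, d_gfcc = delta(mfcc, width), delta(gfcc, width)
    return [m + g + dm + dg for m, g, dm, dg in zip(mfcc, gfcc, d_mfcc, d_gfcc)]


def select_components(frames: Frames, contribution: Sequence[float], keep: int) -> Frames:
    """Keep the `keep` columns with the highest contribution score.

    `contribution` is a hypothetical per-component score supplied by the caller;
    the surviving columns stay in their original order.
    """
    if frames and len(contribution) != len(frames[0]):
        raise ValueError("one contribution score is needed per component")
    ranked = sorted(range(len(contribution)), key=lambda k: contribution[k], reverse=True)
    kept = sorted(ranked[:keep])
    return [[row[k] for k in kept] for row in frames]


# Toy input: 3 frames, 2 MFCC components and 1 GFCC component per frame.
mfcc = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
gfcc = [[0.0], [2.0], [4.0]]
fused = fuse(mfcc, gfcc, width=1)
reduced = select_components(fused, [0.9, 0.1, 0.8, 0.2, 0.0, 0.7], keep=3)
```

In a full system the reduced frame vectors would then be used to train one Gaussian mixture model per speaker; that stage is left out here to keep the sketch short.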