Speaker Recognition Based on Combination of MFCC and GFCC Feature Parameters (基于MFCC与GFCC混合特征参数的说话人识别)
  • English title: Speaker Recognition Based on Combination of MFCC and GFCC Feature Parameters
  • Authors: ZHOU Ping (周萍); SHEN Hao (沈昊); ZHENG Kai-peng (郑凯鹏)
  • Institution: College of Electric Engineering and Automation, Guilin University of Electronic Technology (桂林电子科技大学电子工程与自动化学院)
  • Keywords: speaker recognition; combination of feature parameters; Mel frequency cepstral coefficients (MFCC); Gammatone filter
  • Journal: Journal of Applied Sciences (应用科学学报), journal code YYKX
  • Publication date: 2019-01-30
  • Publisher: Journal of Applied Sciences (应用科学学报)
  • Year: 2019
  • Issue: v.37
  • Funding: National Natural Science Foundation of China (No. 61462017); Guangxi Natural Science Foundation (No. 2014GXNSFAA118353); Guangxi Key Laboratory of Automatic Detecting Technology and Instruments Foundation (No. YQ15110)
  • Language: Chinese
  • Page: YYKX201901003
  • Page count: 9
  • CN: 01
  • ISSN: 31-1404/N
  • Classification number: 28-36
Abstract
A single feature parameter does not fully characterize a speaker in speaker recognition. To address this, the paper fuses conventional Mel frequency cepstral coefficients (MFCC), whose noise robustness is only moderate, with the more robust Gammatone frequency cepstral coefficients (GFCC), and combines both with their dynamic (differential) characteristics to form a new mixed feature parameter. Because the resulting high dimensionality introduces redundancy, the relationship between each component of each feature parameter and the recognition result is studied, and the components that contribute least are discarded to reduce the feature dimension. The mixed parameter is then applied to a speaker recognition system based on the Gaussian mixture model (GMM). Simulation experiments show that the mixed feature parameter achieves better recognition performance and better noise robustness.
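The fusion and dimension-reduction steps described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the function names (`delta`, `mix_features`, `fisher_ratio`, `select_components`) are hypothetical, the MFCC and GFCC matrices are synthetic random stand-ins rather than features extracted from speech, and the Fisher ratio is swapped in as a plausible per-component contribution score because the abstract does not say which measure the authors used. The GMM training and scoring stage is not shown.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """First-order dynamic (delta) coefficients of a (frames, dims) matrix.

    Common regression formula, with edge frames repeated as padding:
    d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..width.
    """
    num_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def mix_features(mfcc: np.ndarray, gfcc: np.ndarray) -> np.ndarray:
    """Stack MFCC, delta-MFCC, GFCC and delta-GFCC column-wise."""
    return np.hstack([mfcc, delta(mfcc), gfcc, delta(gfcc)])


def fisher_ratio(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Between-speaker variance over within-speaker variance, per dimension.

    Assumption: used here only as a stand-in contribution score.
    """
    speakers = np.unique(labels)
    means = np.array([features[labels == s].mean(axis=0) for s in speakers])
    within = np.array([features[labels == s].var(axis=0) for s in speakers])
    return means.var(axis=0) / (within.mean(axis=0) + 1e-12)


def select_components(features: np.ndarray, labels: np.ndarray, keep: int) -> np.ndarray:
    """Sorted indices of the `keep` dimensions with the highest score."""
    scores = fisher_ratio(features, labels)
    return np.sort(np.argsort(scores)[::-1][:keep])


# Synthetic demo: 3 speakers, 200 frames each, 4 MFCC and 4 GFCC dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 200)
mfcc = rng.normal(size=(labels.size, 4))
gfcc = rng.normal(size=(labels.size, 4))
mfcc[:, 0] += 3.0 * labels  # make MFCC component 0 speaker-dependent
gfcc[:, 1] += 3.0 * labels  # make GFCC component 1 speaker-dependent

mixed = mix_features(mfcc, gfcc)          # 16 columns: 4 + 4 + 4 + 4
kept = select_components(mixed, labels, keep=2)
reduced = mixed[:, kept]                  # lower-dimensional mixed feature
print(mixed.shape, kept, reduced.shape)
```

In a full system the reduced matrix would be used to train one GMM per enrolled speaker, for example with scikit-learn's `GaussianMixture`, and a test utterance would be assigned to the speaker whose model gives the highest average log-likelihood. That pairing is a common arrangement, not a detail confirmed by the abstract.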
