Voice Activity Detection Algorithm Based on Compressed Sensing and MFCC

Original title (Chinese): 基于压缩感知和MFCC的语音端点检测算法
Authors: YANG Hai-yan (杨海燕); WU Lei (吴雷); ZHOU Ping (周萍)
Keywords: continuous speech; endpoint detection; compressed sensing; Mel-frequency cepstral coefficients (MFCC)
Journal code: IKJS
Journal: Measurement & Control Technology
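Affiliations: Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education, Guilin University of Electronic Technology; School of Information and Communication, Guilin University of Electronic Technology; School of Electronic Engineering and Automation, Guilin University of Electronic Technology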
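Publication date: 2019-05-18
Publisher: 测控技术 (Measurement & Control Technology)
Year: 2019
Volume and issue: v.38; No.327
Funding: National Natural Science Foundation of China (61462017); Guangxi Natural Science Foundation (2014GXNSFAA118353); Guangxi Hundred-Billion-Yuan Industry Industry-University-Research-Application Cooperation Project; Director's Fund of the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (C77KYOOORX15)
Language: Chinese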
Record ID: IKJS201905021
Page count: 6
Issue number: 05
CN: 11-1764/TB
Pages: 96-101

Abstract

In continuous speech recognition systems, the traditional double-threshold speech detection method produces false detections in strong-noise environments. To address this problem, an endpoint detection algorithm combining compressed sensing theory with Mel-frequency cepstral coefficients (MFCC) is proposed. The algorithm uses a Hadamard random observation matrix and an improved OMP reconstruction algorithm to compressively sense and reconstruct the speech signal, exploiting the approximate sparsity of speech signals on the discrete cosine basis. The MFCCs of the reconstructed signal are then extracted to detect the endpoints of the speech signal. Simulation results show that the proposed algorithm is robust and meets the requirement of effective endpoint detection for continuous speech signals in strong-noise environments.
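
The compressed sensing stage summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain OMP rather than the improved OMP of the paper, a synthetic frame that is exactly sparse on the DCT basis rather than real (only approximately sparse) speech, and a partial Hadamard matrix built from randomly chosen rows and randomly permuted columns, which is an assumed construction. The function names, frame length `n`, measurement count `m`, and sparsity `k` are hypothetical choices for illustration, and the MFCC-based endpoint decision is omitted.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def dct_basis(n):
    """Orthonormal DCT-II synthesis matrix Psi, so a frame is x = Psi @ s."""
    t = np.arange(n)[:, None]   # sample index (rows)
    k = np.arange(n)[None, :]   # frequency index (columns)
    Psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    Psi[:, 0] /= np.sqrt(2.0)
    return Psi

def omp(A, y, max_atoms, tol=1e-9):
    """Plain OMP: pick the best-correlated atom, then refit by least squares."""
    norms = np.maximum(np.linalg.norm(A, axis=0), 1e-12)  # avoid divide-by-zero
    residual = y.copy()
    support = []
    s_hat = np.zeros(A.shape[1])
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol * np.linalg.norm(y):
            break
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        s_hat[:] = 0.0
        s_hat[support] = coef
    return s_hat

def run_demo(n=256, m=128, k=6, seed=0):
    """Sense and reconstruct one synthetic frame; return the relative error."""
    rng = np.random.default_rng(seed)
    Psi = dct_basis(n)

    # Synthetic frame with k nonzero DCT coefficients (assumption: exact sparsity).
    s = np.zeros(n)
    s[rng.choice(n, k, replace=False)] = (
        rng.uniform(1.0, 2.0, k) * rng.choice([-1.0, 1.0], k)
    )
    x = Psi @ s

    # Partial Hadamard observation matrix. Row 0 (all ones) is the only
    # Hadamard row that can see the DC term, so it is always kept.
    rows = np.concatenate(([0], 1 + rng.choice(n - 1, m - 1, replace=False)))
    cols = rng.permutation(n)
    Phi = hadamard(n)[rows][:, cols] / np.sqrt(m)

    y = Phi @ x                                  # m compressed measurements
    s_hat = omp(Phi @ Psi, y, max_atoms=m // 4)  # recover DCT coefficients
    x_hat = Psi @ s_hat                          # reconstructed frame
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)

print("relative reconstruction error:", run_demo())
```

In a full pipeline along the lines of the abstract, each reconstructed frame `x_hat` would then be passed to an MFCC extractor, and the endpoint decision would be made on those coefficients. The column permutation is included because unpermuted Hadamard rows resemble cosines and are therefore strongly correlated with the DCT basis.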
