Auditory perception speech signal endpoint feature detection based on temporal structure

Original title: 基于时序结构的听觉感知语音信号端点特征检测
Authors: HAN Tian (韩天); ZHANG Hong-guo (张宏国); ZHENG Zhong (郑重); CUI Yang (崔扬); YU Xiao-yang (于晓洋)
Affiliations: College of Software and Microelectronics, Harbin University of Science and Technology; College of Electronics and Information Engineering, Harbin Institute of Technology; College of Measurement-control Technology and Communication Engineering, Harbin University of Science and Technology
Keywords: information processing technology; temporal structure; auditory perception; speech signal; endpoint feature; detection
Journal code: JLGY
Journal: Journal of Jilin University (Engineering and Technology Edition)
Publication date: 2018-06-22 10:35
Publisher: Journal of Jilin University (Engineering and Technology Edition) (吉林大学学报(工学版))
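Year: 2019
Volume: v.49; No.201
Funding: 2014 Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12541144)
Language: Chinese
Record ID: JLGY201901038
Page count: 6
Issue: 01
CN: 22-1341/T
Pages: 318-323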

Abstract

Traditional endpoint detection methods perform well at high signal-to-noise ratio (SNR) but degrade sharply at low SNR. To address this, a new temporal-structure-based method is proposed for detecting endpoint features of auditory perception speech signals. The signal is acquired through a finite-length-window time series structure and analyzed as a time series to obtain the general form of the auditory perception speech information; on this basis, the short-time energy feature of the signal under the temporal structure is extracted. The noisy signal is then processed with a discrete wavelet transform to obtain noisy wavelet coefficients, which are thresholded: coefficients that do not exceed the threshold are treated as noise, and the signal is reconstructed from the coefficients above the threshold, completing the denoising. Finally, a dual-threshold, three-state transition decision scheme performs the endpoint feature detection. Experimental results show that the proposed method maintains high detection accuracy at low SNR.
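
The pipeline outlined in the abstract (short-time energy over finite-length windows, wavelet threshold denoising, dual-threshold three-state decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the wavelet, the threshold rule, or the state-machine parameters, so the Haar wavelet, hard thresholding of detail coefficients only, the rectangular window, the thresholds expressed as fractions of peak energy, and the names `frame_len`, `hop`, `min_speech`, and `max_gap` are all assumptions made for illustration.

```python
import math
import random


def short_time_energy(signal, frame_len=256, hop=128):
    """Sum of squared samples in each finite-length (rectangular) window."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def haar_dwt(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def wavelet_denoise(x, threshold, levels=3):
    """Hard thresholding (assumed rule): detail coefficients whose magnitude
    does not exceed the threshold are treated as noise and set to zero."""
    if levels == 0 or len(x) < 2 or len(x) % 2:
        return list(x)
    approx, detail = haar_dwt(x)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_idwt(wavelet_denoise(approx, threshold, levels - 1), detail)


SILENCE, TRANSITION, SPEECH = 0, 1, 2


def detect_endpoints(energy, low, high, min_speech=3, max_gap=2):
    """Dual-threshold, three-state detector over per-frame energies.
    Returns (start_frame, end_frame) pairs with inclusive end frames."""
    segments = []
    state, start, gap = SILENCE, 0, 0
    for i, e in enumerate(energy):
        if state == SILENCE:
            if e > high:
                state, start, gap = SPEECH, i, 0
            elif e > low:
                state, start = TRANSITION, i
        elif state == TRANSITION:
            if e > high:
                state, gap = SPEECH, 0
            elif e <= low:
                state = SILENCE  # false alarm: never crossed the high threshold
        else:  # SPEECH
            if e > low:
                gap = 0
            else:
                gap += 1
                if gap > max_gap:  # pause too long: close the segment
                    end = i - gap
                    if end - start + 1 >= min_speech:
                        segments.append((start, end))
                    state = SILENCE
    if state == SPEECH:  # signal ended while still in speech
        end = len(energy) - 1 - gap
        if end - start + 1 >= min_speech:
            segments.append((start, end))
    return segments


def demo(seed=0):
    """Synthetic test: Gaussian noise with a tone burst at samples 3000..4999."""
    rng = random.Random(seed)
    noisy = [rng.gauss(0.0, 0.05) for _ in range(8192)]
    for t in range(3000, 5000):
        noisy[t] += math.sin(2 * math.pi * 0.01 * t)
    energy = short_time_energy(wavelet_denoise(noisy, threshold=0.15))
    peak = max(energy)
    return detect_endpoints(energy, low=0.05 * peak, high=0.25 * peak)
```

Calling `demo()` returns a list of frame-index pairs; multiplying a frame index by `hop` gives the sample position of that frame's start. The low threshold admits weak onsets into the transition state, while the high threshold confirms speech, which is the usual motivation for using two thresholds rather than one.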

