Single-channel speech separation using empirical mode decomposition and multi pitch information with estimation of number of speakers

Authors: M. K. Prasanna Kumar; R. Kumaraswamy
Keywords: BSS; SCSS; EMD; IMF; SIFT; Multi pitch information; LP analysis; Excitation source
Journal: International Journal of Speech Technology
Publication date: March 2017
Volume: 20
Issue: 1
Pages: 109-125
Journal category: Engineering
Journal subjects: Signal, Image and Speech Processing; Social Sciences, general; Artificial Intelligence (incl. Robotics)
Publisher: Springer US
ISSN: 1572-8110

Abstract

Speech separation is an essential front-end component of voice-processing systems such as speaker recognition, speech recognition, and hearing aids. Applying speech separation at the front end of such a system improves its performance. In this paper, we propose a system for single-channel speech separation that combines empirical mode decomposition (EMD) with multi-pitch information. The proposed method is completely unsupervised and requires no knowledge of the underlying speakers. We apply EMD to short frames of the mixed speech to better estimate speech-specific information, which is derived through multi-pitch tracking. To track multi-pitch information in the mixed signal, we apply simplified inverse filter tracking (SIFT) and histogram-based pitch estimation to the excitation source information, and we also estimate the number of speakers present in the mixed signal.
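
As a rough illustration of the excitation-source idea mentioned in the abstract, the sketch below estimates a single pitch value from the linear prediction (LP) residual of one frame. It is not the authors' implementation: it handles one speaker only, and it leaves out EMD, the low-pass and decimation stage of classic SIFT, the pitch histogram, and the speaker count. The function names, the LP order of 10, the 60-400 Hz search range, and the synthetic test signal are all assumptions chosen for the example.

```python
import math


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns a[0..order] with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error energy
    return a


def lp_residual(frame, order=10):
    """Inverse-filter the frame with its own LP coefficients."""
    n = len(frame)
    # Hamming window is applied only when estimating the coefficients.
    win = [frame[i] * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i in range(n)]
    a = levinson_durbin(autocorr(win, order), order)
    return [sum(a[j] * frame[i - j] for j in range(order + 1) if i - j >= 0)
            for i in range(n)]


def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0, order=10):
    """Pitch in Hz from the strongest autocorrelation peak of the LP residual."""
    residual = lp_residual(frame, order)
    lo, hi = int(fs / fmax), int(fs / fmin)
    r = autocorr(residual, hi)
    best_lag = max(range(lo, hi + 1), key=lambda k: r[k])
    return fs / best_lag


def synth_voiced_frame(fs, f0, n):
    """Hypothetical test signal: impulse train through a two-pole resonator."""
    period = int(round(fs / f0))
    y = []
    for i in range(n):
        x = 1.0 if i % period == 0 else 0.0
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y.append(x + 1.3 * y1 - 0.8 * y2)
    return y
```

Calling `estimate_pitch(synth_voiced_frame(8000, 100.0, 400), 8000)` on the synthetic frame should return a value close to 100 Hz. A multi-speaker version in the spirit of the abstract would presumably collect several candidate peaks per frame and accumulate them in a histogram across frames, but that step is not shown here.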
