Real-time blind source separation system with applications to distant speech recognition
Abstract
A real-time blind source separation (BSS) system based on DUET (Degenerate Unmixing Estimation Technique) was developed and implemented to assess its potential as the front end of a distant speech recognition (DSR) engine. The system uses only two closely spaced standard omnidirectional microphones and a computer sound card, and it was developed for low-reverberation environments with several human speakers and different noise sources.

A novel multi-source real-time audio streaming module was developed, with arbitrary statistics, movement tracking, continuity cues such as position and cross-correlation, a spurious peak classifier stage based on kurtosis, and spectral subtraction post-processing.
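
The abstract does not say how kurtosis enters the spurious peak classifier, so the following is only a sketch of the statistic itself, not of the classifier. It computes the sample excess kurtosis (fourth central moment over squared variance, minus 3), which is near zero for Gaussian data and large for sparse, peaky data. The function name `excess_kurtosis` and the example inputs are assumptions made for illustration.

```python
def excess_kurtosis(samples):
    """Sample excess kurtosis: m4 / m2**2 - 3 (zero for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    # Second and fourth central moments (biased, population-style estimates).
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 * m2) - 3.0


# Hypothetical inputs: a mostly silent frame with one burst, and a square-like frame.
sparse = [0.0] * 99 + [10.0]
flat = [-1.0, 1.0] * 50
print(excess_kurtosis(sparse), excess_kurtosis(flat))
```

A sparse input gives a strongly positive value and a two-level input gives a negative one; any decision threshold built on this would have to be tuned on real data.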

Two intrinsic sources of error in the binaural attenuation and delay estimators were identified: FFT spectral leakage and sibilants, both of which violate assumptions that DUET takes for granted. A comprehensive study of time windows was carried out, and new window types were proposed to minimize these violations.
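
To make the estimators concrete, here is a minimal sketch of the standard DUET idea: at each frequency bin, the ratio of the two microphone spectra gives a relative attenuation (its magnitude) and a relative delay (its negated phase divided by the bin frequency). This is not the thesis implementation. The helper names `dft_bin`, `duet_estimates`, and `tone_pair`, the frame length, and the test tone are all assumptions chosen for illustration, and a single-bin DFT replaces the FFT only to keep the example dependency-free.

```python
import cmath
import math


def dft_bin(frame, k, window=None):
    """DFT coefficient of `frame` at bin k, with an optional time window."""
    n = len(frame)
    if window is None:
        window = [1.0] * n  # rectangular window
    return sum(w * x * cmath.exp(-2j * math.pi * k * i / n)
               for i, (x, w) in enumerate(zip(frame, window)))


def duet_estimates(frame1, frame2, k, window=None):
    """Attenuation and delay (in samples) of channel 2 relative to channel 1.

    Evaluated at bin k (k >= 1). Assumes channel 1 has energy at that bin,
    otherwise the spectral ratio is undefined.
    """
    n = len(frame1)
    omega = 2 * math.pi * k / n  # bin frequency in rad/sample
    ratio = dft_bin(frame2, k, window) / dft_bin(frame1, k, window)
    return abs(ratio), -cmath.phase(ratio) / omega


def tone_pair(n, cycles, a, d):
    """Hypothetical test signal: a tone and its attenuated, delayed copy."""
    w0 = 2 * math.pi * cycles / n
    x1 = [math.cos(w0 * i) for i in range(n)]
    x2 = [a * math.cos(w0 * (i - d)) for i in range(n)]
    return x1, x2


# Tone centred on bin 8: no leakage, so the estimates match a and d.
x1, x2 = tone_pair(256, 8, a=0.8, d=0.5)
print(duet_estimates(x1, x2, 8))

# Tone halfway between bins 8 and 9: leakage biases the delay estimate.
y1, y2 = tone_pair(256, 8.5, a=0.8, d=0.5)
print(duet_estimates(y1, y2, 8))
```

With the bin-centred tone the estimates recover the attenuation and delay used to build the signal. When the tone falls between bins, energy leaks across bins and the delay estimate at bin 8 moves away from the true value, which is one simple way to see the leakage effect described above. Passing a different `window` list lets the effect of the window choice be compared.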

The implemented system correctly identifies the clusters in the binaural estimators’ space for the case of a real room with two human speakers, up to distances of 2 m from the two microphones, although for distances greater than 1 m the separation quality quickly degrades.
