Bioinspired sparse spectro-temporal representation of speech for robust classification

Authors: C. Martínez^{a,c} (cmartinez@fich.unl.edu.ar); J. Goddard^b; D. Milone^{a,d}; H. Rufiner^{a,c,d}
Keywords: Robust phoneme recognition; Approximated auditory cortical representation; Sparse coding
Journal: Computer Speech & Language
Year: 2012
Journal code: 124_08852308
Category: cp
Publication date: October 2012
Volume: 26
Issue: 5
Pages: 336-348
File size: 1005 K

Abstract

This work presents a first approach to robust phoneme recognition by means of a biologically inspired feature extraction method. The proposed technique provides an approximation to the speech signal representation at the auditory cortical level. It is based on an optimal dictionary of atoms, estimated from auditory spectrograms, and on the Matching Pursuit algorithm, which approximates the cortical activations. This yields a sparse coding with intrinsic noise robustness, which can therefore be exploited when the system is used in adverse environments. The recognition task consisted of classifying a set of five easily confused English phonemes, in both clean and noisy conditions. Multilayer perceptrons were trained as classifiers, and their performance was compared against other classic and robust parameterizations: the auditory spectrogram, probabilistic optimum filtering applied to Mel-frequency cepstral coefficients, and perceptual linear prediction coefficients. Results showed a significant improvement in the recognition rate of clean and noisy phonemes by the cortical representation over these other parameterizations.
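
The sparse coding step named in the abstract can be illustrated with a minimal sketch of generic Matching Pursuit. This is not the authors' implementation: the dictionary below is random rather than estimated from auditory spectrograms, and the names `matching_pursuit`, `random_unit_atom`, and `n_iter`, as well as the dimensions, are assumptions chosen only for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy Matching Pursuit over a list of unit-norm atoms.

    Each iteration picks the atom most correlated with the current
    residual, adds that correlation to the atom's coefficient, and
    subtracts the atom's contribution from the residual.
    Returns (coefficients, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        correlations = [dot(atom, residual) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(correlations[k]))
        c = correlations[best]
        if c == 0.0:  # residual is orthogonal to every atom
            break
        coeffs[best] += c
        residual = [r - c * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


def random_unit_atom(rng, dim):
    """Hypothetical stand-in for a learned atom: a random unit vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Assumed toy setup: 96 random atoms of dimension 32 (overcomplete).
rng = random.Random(0)
dim, n_atoms = 32, 96
atoms = [random_unit_atom(rng, dim) for _ in range(n_atoms)]

# A signal built from two atoms plus a small amount of noise.
signal = [2.0 * a - 1.5 * b + rng.gauss(0.0, 0.05)
          for a, b in zip(atoms[3], atoms[40])]

activations, residual = matching_pursuit(signal, atoms, n_iter=6)
```

Because at most `n_iter` coefficients become non-zero, the resulting activation vector is sparse by construction. In a setting like the one the abstract describes, such a vector would play the role of the feature passed to a classifier, although the exact feature layout used in the paper is not stated here.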
