Adjusting dysarthric speech signals to be more intelligible
Abstract
This paper presents a system that transforms the speech signals of speakers with physical speech disabilities into a form that listeners can understand more easily. The transformations consist of the correction of pronunciation errors (by removing repeated sounds, inserting deleted sounds, and devoicing phonemes that should be unvoiced), the adjustment of the tempo of speech by phase vocoding, and the adjustment of the frequency characteristics of speech by anchor-based morphing of the spectrum. They are motivated by observations of disabled articulation, including improper glottal voicing, lessened tongue movement, and lessened energy produced by the lungs. The system is a substantial step towards fully automated speech transformation that does not require expert or clinical intervention.
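
To make the tempo-adjustment step concrete, the following is a minimal phase-vocoder sketch in Python with NumPy. It is not the paper's implementation: the function names (`stft`, `istft`, `time_stretch`), the Hann window, the frame size, the hop size, and the `rate` parameter are all assumptions chosen for illustration. The idea is to resample the sequence of short-time spectra in time while accumulating phase, so that duration changes and pitch stays roughly the same.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window (illustrative settings)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, n_fft // 2 + 1)

def istft(spec, n_fft=1024, hop=256):
    """Overlap-add resynthesis with a constant gain correction for the window overlap."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (spec.shape[0] - 1))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
    # In steady state the overlapped squared windows sum to about sum(w^2) / hop.
    return out / (np.sum(window ** 2) / hop)

def time_stretch(x, rate, n_fft=1024, hop=256):
    """Change duration by a factor of about 1 / rate (rate < 1 slows speech down)."""
    spec = stft(x, n_fft, hop)
    steps = np.arange(0, spec.shape[0] - 1, rate)    # fractional analysis frame positions
    omega = 2 * np.pi * hop * np.arange(spec.shape[1]) / n_fft  # nominal phase advance per hop
    phase = np.angle(spec[0])
    out = np.empty((len(steps), spec.shape[1]), dtype=complex)
    for k, t in enumerate(steps):
        i = int(t)
        frac = t - i
        # Interpolate magnitudes between the two neighbouring analysis frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Deviation of the measured phase advance from the nominal one, wrapped to [-pi, pi].
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase = phase + omega + dphi                 # accumulate phase for the next output frame
    return istft(out, n_fft, hop)
```

For example, `time_stretch(x, 0.5)` returns a signal roughly twice as long as `x`, and `time_stretch(x, 2.0)` one roughly half as long. The first and last frames are attenuated by the window, which a production implementation would handle with padding.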

Among human listeners, the module that corrects pronunciation errors raised word recognition rates to as much as 191% of the rate for the original speech (from 21.6% to 41.2%). Several types of modified dysarthric speech signals were also supplied to a standard automatic speech recognition system. In that experiment, the proportion of words correctly recognized rose to as much as 121% of the rate for the original speech (from 72.7% to 87.9%), across various parameterizations of the recognizer. This represents a significant advance towards human-to-human assistive communication software and human-computer interaction.
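
The percentages above express the new recognition rate as a fraction of the original rate, not the size of the increase. The short Python check below reproduces both figures from the rates quoted in the abstract; the helper name `relative_rate` is only illustrative.

```python
def relative_rate(before, after):
    """Return the modified rate as a percentage of the original rate."""
    return 100.0 * after / before

human = relative_rate(21.6, 41.2)   # human listeners, pronunciation-correction module
asr = relative_rate(72.7, 87.9)     # automatic speech recognition system

print(f"human listeners: {human:.0f}% of the original rate")
print(f"ASR system:      {asr:.0f}% of the original rate")
```

Read this way, the absolute gains are about 19.6 and 15.2 percentage points respectively.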
