Adjusting dysarthric speech signals to be more intelligible

Author: Frank Rudzicz
Keywords: Speech transformation; Dysarthria; Intelligibility
Journal: Computer Speech & Language
Publication date: September 2013
Volume: 27
Issue: 6
Pages: 1163-1177

Abstract

This paper presents a system that transforms the speech signals of speakers with physical speech disabilities into a form that listeners can understand more easily. The transformations correct pronunciation errors by removing repeated sounds, inserting deleted sounds, devoicing phonemes that should be unvoiced, adjusting the tempo of speech by phase vocoding, and adjusting the frequency characteristics of speech by anchor-based morphing of the spectrum. They are motivated by observations of disabled articulation, including improper glottal voicing, lessened tongue movement, and lessened energy produced by the lungs. The system is a substantial step towards fully automated speech transformation that needs no expert or clinical intervention.
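To make the tempo adjustment by phase vocoding concrete, the following is a minimal NumPy sketch of a generic phase-vocoder time stretch. It is not the paper's implementation: the function names (`stft`, `istft`, `time_stretch`), the Hann window, and the frame parameters `n_fft=1024` and `hop=256` are all assumptions chosen for illustration.

```python
import numpy as np


def stft(x, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window; rows are frames."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(S, n_fft=1024, hop=256):
    """Inverse STFT by windowed overlap-add with squared-window normalization."""
    window = np.hanning(n_fft)
    n_frames = S.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(S, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)


def time_stretch(x, rate, n_fft=1024, hop=256):
    """Change tempo without changing pitch.

    rate > 1 shortens the signal (faster); rate < 1 lengthens it (slower).
    """
    S = stft(x, n_fft, hop)
    n_frames, n_bins = S.shape
    # Fractional positions in the analysis frames that we will resample at.
    steps = np.arange(0, n_frames - 1, rate)
    # Phase advance per hop expected at each bin's centre frequency.
    omega = 2 * np.pi * hop * np.arange(n_bins) / n_fft
    phase = np.angle(S[0])
    out = np.empty((len(steps), n_bins), dtype=complex)
    for k, t in enumerate(steps):
        i = int(np.floor(t))
        frac = t - i
        # Interpolate magnitudes between the two neighbouring frames.
        mag = (1 - frac) * np.abs(S[i]) + frac * np.abs(S[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Deviation of the measured phase advance from the expected one,
        # wrapped to [-pi, pi], gives each bin's instantaneous frequency.
        dphi = np.angle(S[i + 1]) - np.angle(S[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        # Accumulate phase so consecutive synthesis frames stay coherent.
        phase += omega + dphi
    return istft(out, n_fft, hop)
```

Calling `time_stretch(x, 0.5)` on a mono signal `x` roughly doubles its duration while leaving its pitch in place; the choice of rate for any particular speaker is outside the scope of this sketch.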

Among human listeners, recognition rates with the module that corrects pronunciation errors rose to as much as 191% of the rate for the original speech (from 21.6% to 41.2%). Several types of modified dysarthric speech signals were also supplied to a standard automatic speech recognition system. In that experiment, the proportion of words correctly recognized rose to as much as 121% of the rate for the original speech (from 72.7% to 87.9%), across various parameterizations of the recognizer. This represents a significant advance towards human-to-human assistive communication software and human-computer interaction.
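The percentages quoted above are ratios of the new rate to the original rate, not the size of the gain. A short arithmetic sketch makes the distinction explicit; the helper names `relative_rate` and `relative_gain` are hypothetical and introduced only for this illustration.

```python
def relative_rate(before, after):
    """New rate expressed as a percentage of the original rate."""
    return 100.0 * after / before


def relative_gain(before, after):
    """Increase expressed as a percentage of the original rate."""
    return 100.0 * (after - before) / before


# Human listeners: 21.6% -> 41.2% of words recognized.
print(round(relative_rate(21.6, 41.2)))  # 191
print(round(relative_gain(21.6, 41.2)))  # 91

# Automatic speech recognition: 72.7% -> 87.9% of words recognized.
print(round(relative_rate(72.7, 87.9)))  # 121
print(round(relative_gain(72.7, 87.9)))  # 21
```

In absolute terms the gains are 19.6 percentage points for human listeners and 15.2 percentage points for the recognizer.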
