Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis

Authors: John Dines^a ; ^{john.dines@idiap.ch} ; Hui Liang^a ; Lakshmi Saheer^a ; Matthew Gibson^b ; William Byrne^b ; Keiichiro Oura^c ; Keiichi Tokuda^c ; Junichi Yamagishi^d ; Simon King^d ; Mirjam Wester^d ; Teemu Hirsimäki^e ; Reima Karhila^e ; Mikko Kurimo^e
Keywords: Speech-to-speech translation ; Cross-lingual speaker adaptation ; HMM-based speech synthesis ; Speaker adaptation ; Voice conversion
Journal: Computer Speech & Language
Publication year: 2013
Publication date: February 2013
Volume: 27
Issue: 2
Pages: 420-437
Full-text size: 1285 K

Abstract

In this paper we present results of unsupervised cross-lingual speaker adaptation applied to text-to-speech synthesis. The application of our research is the personalisation of speech-to-speech translation, in which we employ an HMM statistical framework for both speech recognition and synthesis. This framework provides a logical mechanism to adapt synthesised speech output to the voice of the user by way of speech recognition. In this work we present results of several different unsupervised and cross-lingual adaptation approaches, as well as an end-to-end speaker-adaptive speech-to-speech translation system. Our experiments show that we can successfully apply speaker adaptation in both unsupervised and cross-lingual scenarios, and our proposed algorithms seem to generalise well for several language pairs. We also discuss important future directions, including the need for better evaluation metrics.
