A method for generation of Mandarin F₀ contours based on tone nucleus model and superpositional model

Authors: Qinghua Sun^a (qinghua@gavo.t.u-tokyo.ac.jp); Keikichi Hirose^b (hirose@gavo.t.u-tokyo.ac.jp); Nobuaki Minematsu^b (mine@gavo.t.u-tokyo.ac.jp)
Keywords: Mandarin; Tone; Fundamental frequency contour; Tone nucleus; Phrase component; Tone component
Journal: Speech Communication
Year: 2012
Journal code: 112_01676393
Category: cp
Publication date: October 2012
Volume: 54
Issue: 8
Pages: 932-945
File size: 1466 K

Abstract

A new method was proposed for synthesizing sentence fundamental frequency (F₀) contours of Mandarin speech. The method represents a sentence's logarithmic F₀ contour as a superposition of tone components on phrase components, as in the generation process model (F₀ model). However, the method does not depend fully on the model: tone components are generated in a corpus-based way by concatenating F₀ patterns predicted for the constituent syllables. Furthermore, the prediction is done only for the stable part of each syllable's tone component, known as the tone nucleus. The entire tone components were obtained by concatenating the predicted patterns. Because tone coarticulation has only a minor effect on tone nuclei, better prediction is possible than with conventional methods that handle full-syllable F₀ contours, especially when the training corpus is small. While tone components are highly language-specific, phrase components are assumed to be more language-universal: a control scheme for phrase components developed for one language may be applicable, by analogy, to other languages. Also, phrase components cover a wider range of speech (phrase, clause, etc.) and are tightly related to higher-level linguistic information (syntax); therefore, concatenating short F₀ contour fragments predicted by a corpus-based method would not be appropriate for them. Taking these points into consideration, rules similar to those for Japanese were constructed to control phrase commands, from which phrase components were generated with simple mathematical calculations in the framework of the generation process model. Phrase and tone components are tightly related and therefore cannot be generated independently. To ensure that the correct relation holds in the synthesized F₀ contour, a two-step scheme was developed in which information from the generated phrase components was used in predicting the tone components.
A listening test was conducted on speech synthesized using F₀ contours generated by the developed method. The synthetic speech sounded highly natural, showing the validity of the method. Furthermore, a word-emphasis experiment showed that the proposed method allows flexible F₀ control.
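
The superposition described in the abstract can be illustrated with a minimal Python sketch; it is not the authors' implementation. The phrase component uses the standard impulse response of the generation process model, α²·t·exp(−αt) for t ≥ 0. Everything else is an assumption made for illustration and is not taken from the paper: the value of `ALPHA`, the command timings and magnitudes, the tone-nucleus anchor values, and the use of linear interpolation to join neighbouring nuclei.

```python
import math

ALPHA = 3.0  # assumed natural angular frequency of the phrase control mechanism (1/s)


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism; zero before the command."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def phrase_component(times, commands, alpha=ALPHA):
    """Sum the responses to phrase commands given as (onset_time, magnitude) pairs."""
    return [sum(a * phrase_response(t - t0, alpha) for t0, a in commands)
            for t in times]


def interpolate_tone(times, nuclei):
    """Join tone-nucleus values into a full tone component (hypothetical helper).

    nuclei: (time, value) pairs with strictly increasing times. Values are
    linearly interpolated between anchors and held constant outside them.
    """
    out = []
    for t in times:
        if t <= nuclei[0][0]:
            out.append(nuclei[0][1])
        elif t >= nuclei[-1][0]:
            out.append(nuclei[-1][1])
        else:
            for (t1, v1), (t2, v2) in zip(nuclei, nuclei[1:]):
                if t1 <= t <= t2:
                    w = (t - t1) / (t2 - t1)
                    out.append(v1 + w * (v2 - v1))
                    break
    return out


def f0_contour(times, base_f0, commands, nuclei):
    """Superpose tone components on phrase components in the log-F0 domain."""
    phrase = phrase_component(times, commands)
    tone = interpolate_tone(times, nuclei)
    return [math.exp(math.log(base_f0) + p + c) for p, c in zip(phrase, tone)]


# Illustrative values only: two phrase commands and four nucleus anchors.
times = [i * 0.01 for i in range(200)]
commands = [(0.0, 0.5), (1.0, 0.3)]
nuclei = [(0.2, 0.3), (0.4, 0.3), (0.6, -0.2), (0.8, 0.1)]
f0 = f0_contour(times, 120.0, commands, nuclei)
```

Because the two components are added in the logarithmic domain, changing a phrase command rescales the whole contour multiplicatively while leaving the local tone shapes intact.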
