Speech Synchronized Tongue Animation by Combining Physiology Modeling and X-ray Image Fitting
详细信息    查看全文
文摘
This paper proposes a speech synchronized tongue animation system from text or speech. Firstly, an anatomically accurate physiological tongue model is built, and then produces tremendous tongue deformation samples according to the randomly input muscle activation samples. Secondly, these input and output samples are used to train a neural network for establishing the relationship between the muscle activation and tongue contour deformation. Thirdly, the neural network is used to estimate the non-rigid tongue movement parameters, namely tongue muscle activations, from a collected X-ray tongue movement image database of Mandarin Chinese phonemes after removing the rigid tongue movement, and then the estimation results are used for constructing the tongue physeme (the sequences of the tongue muscle activations and the rigid movement) database corresponding to the Mandarin Chinese phoneme database. Finally, the physemes corresponding to the phonemes extracted from input text or speech are blended to drive the physiological tongue model for producing the speech synchronized tongue animation according to the durations of phonemes. Simulation results demonstrate that the synthesized tongue animations are visually realistic and approximate the tongue medical data well.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700