Phonetic Aspects of High Level of Naturalness in Speech Synthesis


Keywords: Phonetics; Voice source; Source-filter interaction; Formants; Formant synthesis
Journal title: Lecture Notes in Computer Science
Publication year: 2016
Volume: 9811
Issue: 1
Pages: 531-538
Full-text size: 1,357 KB
Author affiliations: Vera Evdokimova (16)
Pavel Skrelin (16)
Andrey Barabanov (16)
Karina Evgrafova (16)

16. Saint Petersburg State University, 7/9 Universitetskaya Nab., St. Petersburg, 199034, Russia
Series title: Speech and Computer
ISBN: 978-3-319-43958-7
Publication category: Computer Science
Publication subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349
Volume order: 9811

Abstract

The paper is concerned with the phonetic aspects of synthesizing Russian vowels with the use of a voice source signal. An original method of recording the glottal wave synchronously with the output speech signal was employed to obtain the experimental material. Several types of perceptual experiments were carried out. Comparing the recorded signals allowed us to analyze the structure of the speech signal at different stages of its generation. The source-filter interaction is analyzed by filtering the speech signal. The transfer functions of the articulation for the Russian vowels were obtained. The transfer functions and voice source signals of different vowels were used to generate new signals, and the resulting signals were analyzed. We examined how the fundamental frequency, the voice quality, and the type of phoneme influence the source-filter interaction. The paper presents the results of the perceptual experiments, the acoustic analysis, and the signal generation.
