Phonetic Aspects of High Level of Naturalness in Speech Synthesis
  • Keywords: Phonetics; Voice source; Source-filter interaction; Formants; Formant synthesis
  • Journal title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9811
  • Issue: 1
  • Pages: 531-538
  • Full-text size: 1,357 KB
  • Author affiliations: Vera Evdokimova (16)
    Pavel Skrelin (16)
    Andrey Barabanov (16)
    Karina Evgrafova (16)

    16. Saint Petersburg State University, 7/9 Universitetskaya Nab., St. Petersburg, 199034, Russia
  • Series title: Speech and Computer
  • ISBN: 978-3-319-43958-7
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
  • Volume sort order: 9811
Abstract
This paper addresses phonetic aspects of synthesizing Russian vowels using a voice source signal. The experimental material was obtained with an original method of recording the glottal wave synchronously with the output speech signal. Several types of perceptual experiments were carried out. Comparing the recorded signals allowed us to analyze the structure of the speech signal at different stages of its generation. Source-filter interaction was analyzed by filtering the speech signal. The transfer functions of articulation for the Russian vowels were obtained. The transfer functions and voice source signals of different vowels were then combined to generate new signals, and the resulting signals were analyzed. We examined how fundamental frequency, voice quality, and phoneme type influence source-filter interaction. The paper presents the results of the perceptual experiments, acoustic analysis, and signal generation.
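
The abstract does not say how the transfer functions were computed or how the new signals were generated. As a rough illustration only, the sketch below assumes a purely linear, time-invariant source-filter model, which is a simplification the paper itself questions by studying source-filter interaction. The function names `estimate_transfer_function` and `apply_transfer_function`, the regularization constant `eps`, and the use of a plain FFT ratio are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def estimate_transfer_function(source, output, eps=1e-8):
    """Estimate H(f) from a synchronously recorded source/output pair.

    Uses the regularized spectral ratio H = Y * conj(X) / (|X|^2 + eps),
    where X and Y are the spectra of the voice source and the output
    speech signal. Both signals must have the same length.
    """
    source = np.asarray(source, dtype=float)
    output = np.asarray(output, dtype=float)
    if source.shape != output.shape:
        raise ValueError("source and output must have the same length")
    X = np.fft.rfft(source)
    Y = np.fft.rfft(output)
    # eps keeps the division stable where the source has little energy.
    return Y * np.conj(X) / (np.abs(X) ** 2 + eps)


def apply_transfer_function(source, H):
    """Generate a new signal by filtering a voice source with H(f).

    The source must have the same length as the signals used to
    estimate H, so that the frequency bins line up.
    """
    source = np.asarray(source, dtype=float)
    X = np.fft.rfft(source)
    if X.shape != np.shape(H):
        raise ValueError("H does not match the source length")
    return np.fft.irfft(X * H, n=len(source))
```

Under these assumptions, a hybrid stimulus would be produced by estimating `H` from one vowel's source and output recordings and then calling `apply_transfer_function` with another vowel's source signal of the same length. Any difference between such a hybrid and a natural recording is one possible indicator of source-filter interaction that a linear model does not capture.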