Classification of a Sequence of Objects with the Fuzzy Decoding Method
  • Authors: Andrey V. Savchenko (25)
    Lyudmila V. Savchenko (26)
  • Keywords: Classification; sequence of objects; fuzzy sets; phoneme recognition; fuzzy decoding method; Kullback-Leibler discrimination; Mel-frequency cepstral coefficients
  • Publication: Lecture Notes in Computer Science
  • Year: 2014
  • Volume: 8536
  • Issue: 1
  • Pages: 309-318
  • Author affiliations: Andrey V. Savchenko (25)
    Lyudmila V. Savchenko (26)

    25. National Research University Higher School of Economics, Nizhny Novgorod, Russia
    26. Nizhny Novgorod State Linguistic University, Russia
  • ISSN: 1611-3349
Abstract
This paper addresses the recognition of a sequence of objects (e.g., video-based image recognition, phoneme recognition). The fuzzy phonetic decoding method is generalized by assuming that the distribution of the classified object is of exponential type. In the preliminary phase, each model object is associated with a fuzzy set of model classes whose grades of membership are the confusion probabilities estimated from the Kullback-Leibler divergence between model distributions. During classification, each object (e.g., frame) in the sequence is first put in correspondence with a fuzzy set whose grades are the posterior probabilities. Next, this fuzzy set is intersected with the fuzzy set corresponding to the nearest neighbor. Finally, the arithmetic mean of these fuzzy intersections is taken as the decision for the whole sequence. We further propose not to restrict the method to the Kullback-Leibler discrimination, but to estimate the grades of membership of models and query objects from an arbitrary distance with an appropriate scale factor. Experimental results are presented for the recognition of isolated Russian vowel phonemes and words with state-of-the-art similarity measures. They show that a correct choice of the scale parameter can significantly increase recognition accuracy.
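
The steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that grades of membership are obtained as a normalized exponential of the scaled distance, that the fuzzy intersection is the element-wise minimum, and that the squared Euclidean distance stands in for the "arbitrary distance". All function names (`sq_euclidean`, `grades`, `model_fuzzy_sets`, `classify_sequence`) and the toy data are hypothetical.

```python
import math


def sq_euclidean(a, b):
    """Squared Euclidean distance; a stand-in for any distance (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grades(distances, scale):
    """Map distances to grades of membership that sum to 1.

    Assumed form: exp(-scale * distance), normalized. The minimum distance
    is subtracted first for numerical stability; this does not change the result.
    """
    d_min = min(distances)
    weights = [math.exp(-scale * (d - d_min)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def model_fuzzy_sets(models, scale, dist=sq_euclidean):
    """Preliminary phase: one fuzzy set per model, built from model-to-model distances."""
    return [grades([dist(m, other) for other in models], scale) for m in models]


def classify_sequence(frames, models, scale, dist=sq_euclidean):
    """Return (index of the winning class, mean fuzzy intersection per class)."""
    model_sets = model_fuzzy_sets(models, scale, dist)
    mean = [0.0] * len(models)
    for frame in frames:
        d = [dist(frame, m) for m in models]
        frame_set = grades(d, scale)  # fuzzy set of the query frame
        nearest = min(range(len(models)), key=d.__getitem__)
        # Fuzzy intersection (element-wise minimum) with the nearest model's set.
        inter = [min(a, b) for a, b in zip(frame_set, model_sets[nearest])]
        # Accumulate the arithmetic mean over the sequence.
        mean = [acc + v / len(frames) for acc, v in zip(mean, inter)]
    best = max(range(len(models)), key=mean.__getitem__)
    return best, mean


# Toy data (hypothetical): three class models and a three-frame sequence.
models = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
frames = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.1]]
best, mean = classify_sequence(frames, models, scale=1.0)
print(best, [round(g, 3) for g in mean])
```

In this sketch the `scale` argument plays the role of the scale factor mentioned in the abstract: a large value makes each fuzzy set close to a hard nearest-neighbor decision, while a small value spreads membership over all classes.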
