Classification of a Sequence of Objects with the Fuzzy Decoding Method

Authors: Andrey V. Savchenko (25)
Lyudmila V. Savchenko (26)
Keywords: classification; sequence of objects; fuzzy sets; phoneme recognition; fuzzy decoding method; Kullback-Leibler discrimination; Mel-frequency cepstral coefficients
Publication title: Lecture Notes in Computer Science
Year: 2014
Volume: 8536
Issue: 1
Pages: 309-318
Author affiliations: Andrey V. Savchenko (25)
Lyudmila V. Savchenko (26)

25. National Research University Higher School of Economics, Nizhny Novgorod, Russia
26. Nizhny Novgorod State Linguistic University, Russia
ISSN: 1611-3349

Abstract

This paper explores the problem of recognizing a sequence of objects (e.g., video-based image recognition, phoneme recognition). A generalization of the fuzzy phonetic decoding method is proposed under the assumption that the distribution of the classified object is of exponential type. In its preliminary phase, each model object is associated with a fuzzy set of model classes whose grades of membership are defined as the confusion probabilities estimated from the Kullback-Leibler divergence between model distributions. During classification, each object (e.g., a frame) in the classified sequence is first put in correspondence with a fuzzy set whose grades are defined as the posterior probabilities. Next, this fuzzy set is intersected with the fuzzy set corresponding to the nearest neighbor. Finally, the arithmetic mean of these fuzzy intersections is used as the decision for the whole sequence. We further propose not to restrict the method to the Kullback-Leibler discrimination, but to estimate the grades of membership of models and query objects from an arbitrary distance with an appropriate scale factor. Experimental results are presented for the recognition of isolated Russian vowel phonemes and words with state-of-the-art similarity measures. They show that a correct choice of the scale parameter can significantly increase recognition accuracy.
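
The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: objects and models are represented as discrete probability distributions, grades of membership are computed as a normalized exp(-scale * distance), and the fuzzy intersection is taken as the element-wise minimum (Zadeh's operator). The names `kl_divergence`, `membership_grades`, and `fuzzy_decode` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def membership_grades(distances, scale):
    """Map distances to all models onto grades that sum to one.

    Assumption: grade_j is proportional to exp(-scale * distance_j).
    """
    d_min = min(distances)  # shift by the minimum for numerical stability
    weights = [math.exp(-scale * (d - d_min)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def fuzzy_decode(sequence, models, scale=1.0, distance=kl_divergence):
    """Classify a sequence of objects; returns (class index, mean grades)."""
    n = len(models)
    # Preliminary phase: a fuzzy set of model classes for every model object.
    model_sets = [
        membership_grades([distance(m, other) for other in models], scale)
        for m in models
    ]
    mean_grades = [0.0] * n
    for obj in sequence:
        dists = [distance(obj, m) for m in models]
        # Fuzzy set of the query object (stand-in for posterior probabilities).
        query_set = membership_grades(dists, scale)
        # Nearest-neighbor model and its precomputed fuzzy set.
        nearest = min(range(n), key=dists.__getitem__)
        # Assumption: fuzzy intersection as the element-wise minimum.
        intersection = [min(a, b) for a, b in zip(query_set, model_sets[nearest])]
        for j in range(n):
            mean_grades[j] += intersection[j] / len(sequence)
    decision = max(range(n), key=mean_grades.__getitem__)
    return decision, mean_grades


# Toy data (invented for illustration): three model classes, three frames.
models = [[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.2, 0.1, 0.7]]
sequence = [[0.6, 0.3, 0.1], [0.65, 0.2, 0.15], [0.3, 0.5, 0.2]]
label, grades = fuzzy_decode(sequence, models, scale=5.0)
print(label, grades)
```

The `distance` argument can be replaced with any other dissimilarity function, and `scale` plays the role of the scale factor whose choice the abstract reports as important for accuracy.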
