Abstract
This paper explores the recognition of a sequence of objects (e.g., video-based image recognition, phoneme recognition). We generalize the fuzzy phonetic decoding method by assuming that the distribution of the classified object is of exponential type. In the preliminary phase, each model object is associated with a fuzzy set of model classes whose grades of membership are defined as the confusion probabilities, estimated with the Kullback-Leibler divergence between model distributions. First, each object (e.g., frame) in the classified sequence is put in correspondence with a fuzzy set whose grades are defined as the posterior probabilities. Next, this fuzzy set is intersected with the fuzzy set corresponding to the nearest neighbor. Finally, the arithmetic mean of these fuzzy intersections is taken as the decision for the whole sequence. We propose not to restrict the method to the Kullback-Leibler discrimination, but to estimate the grades of membership of models and query objects from an arbitrary distance with an appropriate scale factor. Experimental results are presented for the recognition of isolated Russian vowel phonemes and words with state-of-the-art similarity measures. They show that a correct choice of the scale parameter can significantly increase recognition accuracy.
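
The steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: grades of membership are taken to be `exp(-scale * distance)` normalized to sum to one, the fuzzy intersection is taken to be the element-wise minimum, each class is represented by a single model, and the squared Euclidean distance stands in for the arbitrary distance. All function names (`grades_from_distances`, `model_fuzzy_sets`, `classify_sequence`) and the toy data are hypothetical.

```python
import math


def squared_euclidean(a, b):
    """Placeholder distance; any dissimilarity measure could be passed instead."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grades_from_distances(distances, scale):
    """Assumed mapping from distances to grades of membership:
    exp(-scale * d), normalized so that the grades sum to one."""
    d_min = min(distances)  # shift by the minimum for numerical stability
    weights = [math.exp(-scale * (d - d_min)) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def model_fuzzy_sets(models, dist, scale):
    """Preliminary phase: one fuzzy set per model, with grades estimated
    from the distances between that model and every model."""
    return [
        grades_from_distances([dist(m, other) for other in models], scale)
        for m in models
    ]


def classify_sequence(frames, models, dist=squared_euclidean, scale=1.0):
    """Return (index of the winning class, mean fuzzy intersection)."""
    if not frames:
        raise ValueError("frames must not be empty")
    model_sets = model_fuzzy_sets(models, dist, scale)
    accumulated = [0.0] * len(models)
    for frame in frames:
        distances = [dist(frame, m) for m in models]
        # Fuzzy set of the frame: assumed posterior-like grades.
        frame_set = grades_from_distances(distances, scale)
        # Fuzzy set of the nearest-neighbor model.
        nearest = distances.index(min(distances))
        # Fuzzy intersection, assumed here to be the element-wise minimum.
        intersection = [min(f, m) for f, m in zip(frame_set, model_sets[nearest])]
        accumulated = [a + i for a, i in zip(accumulated, intersection)]
    # Arithmetic mean of the intersections over the whole sequence.
    mean = [a / len(frames) for a in accumulated]
    return mean.index(max(mean)), mean


# Toy usage: three two-dimensional models and a three-frame query sequence.
toy_models = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
toy_frames = [(0.2, -0.1), (0.5, 0.3), (4.0, 4.0)]
label, mean_grades = classify_sequence(toy_frames, toy_models, scale=1.0)
print(label, mean_grades)
```

In this sketch the `scale` argument controls how sharply the grades concentrate on the closest model: a large value approaches a hard nearest-neighbor vote per frame, while a small value spreads membership across classes.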