Clustering and maximum likelihood search for efficient statistical classification with medium-sized databases
详细信息    查看全文
文摘
This paper addresses the problem of insufficient performance of statistical classification with the medium-sized database (thousands of classes). Each object is represented as a sequence of independent segments. Each segment is defined as a random sample of independent features with the distribution of multivariate exponential type. To increase the speed of the optimal Kullback–Leibler minimum information discrimination principle, we apply the clustering of the training set and an approximate nearest neighbor search of the input object in a set of cluster medoids. By using the asymptotic properties of the Kullback–Leibler divergence, we propose the maximal likelihood search procedure. In this method the medoid to check is selected from the cluster with the maximal joint density (likelihood) of the distances to the previously checked medoids. Experimental results in image recognition with artificially generated dataset and Essex facial database prove that the proposed approach is much more effective, than an exhaustive search and the known approximate nearest neighbor methods from FLANN and NonMetricSpace libraries.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700