Clustering and maximum likelihood search for efficient statistical classification with medium-sized databases


Author: Andrey V. Savchenko
Keywords: Statistical classification; Approximate nearest neighbor method; Image recognition; Kullback–Leibler discrimination; Exponential family
Journal: Optimization Letters
Publication year: 2017
Publication date: February 2017
Volume: 11
Issue: 2
Pages: 329-341
Journal category: Mathematics and Statistics
Journal subjects: Optimization; Operation Research/Decision Theory; Computational Intelligence; Numerical and Computational Physics, Simulation
Publisher: Springer Berlin Heidelberg
ISSN: 1862-4480

Abstract

This paper addresses the insufficient performance of statistical classification with medium-sized databases (thousands of classes). Each object is represented as a sequence of independent segments. Each segment is defined as a random sample of independent features whose distribution is of multivariate exponential type. To speed up classification by the optimal Kullback–Leibler minimum information discrimination principle, we cluster the training set and perform an approximate nearest neighbor search for the input object in the set of cluster medoids. Using the asymptotic properties of the Kullback–Leibler divergence, we propose a maximal likelihood search procedure. In this method, the next medoid to check is selected from the cluster with the maximal joint density (likelihood) of the distances to the previously checked medoids. Experimental results in image recognition with an artificially generated dataset and the Essex facial database show that the proposed approach is much more effective than an exhaustive search and the known approximate nearest neighbor methods from the FLANN and NonMetricSpace libraries.
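
The baseline idea in the abstract, nearest neighbor classification by minimum Kullback–Leibler divergence, sped up by searching cluster medoids first, can be illustrated with a minimal sketch. This is not the paper's maximal likelihood search procedure, whose details the abstract does not give; it is a plain two-stage search that finds the nearest medoid and then scans only that cluster. It assumes each object is a single discrete histogram and that clusters and medoids are already computed. All function names and the toy data are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def nearest(query, candidates):
    """Position in `candidates` of the histogram minimizing KL(query || candidate)."""
    return min(range(len(candidates)),
               key=lambda i: kl_divergence(query, candidates[i]))


def classify_exhaustive(query, train, labels):
    """Baseline: compare the query against every training object."""
    return labels[nearest(query, train)]


def classify_via_medoids(query, train, labels, clusters, medoids):
    """Two-stage search (an assumed simplification, not the paper's method).

    clusters: list of lists of training indices
    medoids:  one training index per cluster, in the same order as clusters
    """
    # Stage 1: pick the cluster whose medoid is closest to the query.
    c = nearest(query, [train[m] for m in medoids])
    # Stage 2: exhaustive search restricted to that cluster's members.
    members = clusters[c]
    best = members[nearest(query, [train[i] for i in members])]
    return labels[best]


# Toy data: two classes, one cluster per class (illustrative values only).
train = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
labels = ["a", "a", "b", "b"]
clusters = [[0, 1], [2, 3]]
medoids = [0, 2]

query = [0.65, 0.25, 0.10]
print(classify_exhaustive(query, train, labels))
print(classify_via_medoids(query, train, labels, clusters, medoids))
```

With K clusters of roughly equal size over N training objects, the two-stage search evaluates about K + N/K divergences instead of N, at the cost of possibly missing the true nearest neighbor when it lies in a different cluster than the nearest medoid.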
