Sparse modeling of high-dimensional data for learning and vision.

Details

Author: Yang, Jianchao
Degree: Doctor
Year: 2011
Advisor: Huang, Thomas S.
Institution: University of Illinois
Department: Electrical & Computer Eng
ISBN: 9781267277114
CBH: 3503886
Country: USA
Language: English
FileSize: 3469538
Pages: 135

Abstract

Sparse representations account for most or all of the information of a signal by a linear combination of a few elementary signals called atoms, and have increasingly become recognized as providing high performance for applications as diverse as noise reduction, compression, inpainting, compressive sensing, pattern classification, and blind source separation. In this dissertation, we learn the sparse representations of high-dimensional signals for various learning and vision tasks, including image classification, single image super-resolution, compressive sensing, and graph learning. Based on the bag-of-features (BoF) image representation in a spatial pyramid, we first transform each local image descriptor into a sparse representation, and then these sparse representations are summarized into a fixed-length feature vector over different spatial locations across different spatial scales by max pooling. The proposed generic image feature representation properly handles the large in-class variance problem in image classification, and experiments on object recognition, scene classification, face recognition, gender recognition, and handwritten digit recognition all lead to state-of-the-art performance on the benchmark datasets. We cast the image super-resolution problem as one of recovering a high-resolution image patch for each low-resolution image patch based on recent sparse signal recovery theories, which state that, under mild conditions, a high-resolution signal can be recovered from its low-resolution version if the signal has a sparse representation in terms of some dictionary. We jointly learn the dictionaries for high- and low-resolution image patches and enforce them to have common sparse representations for better recovery. Furthermore, we employ image features and enforce patch overlapping constraints to improve prediction accuracy. Experiments show that the algorithm leads to surprisingly good results.
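The feature pipeline described above (sparse coding of local descriptors followed by max pooling over a spatial pyramid) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the function names `ista_sparse_code` and `spatial_pyramid_max_pool`, the use of ISTA as the L1 solver, pooling over absolute coefficient values, the pyramid levels (1, 2, 4), and the value of `lam`.

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    D: (d, k) dictionary whose columns are atoms; x: (d,) descriptor.
    ISTA is an illustrative solver choice, not necessarily the one used in the thesis.
    """
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L      # gradient step on the quadratic term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def spatial_pyramid_max_pool(codes, xy, levels=(1, 2, 4)):
    """Max-pool absolute sparse codes over a spatial pyramid.

    codes: (n, k) sparse codes; xy: (n, 2) descriptor positions in [0, 1).
    Returns a vector of length k * sum(g*g for g in levels).
    """
    n, k = codes.shape
    mags = np.abs(codes)                   # assumption: pool coefficient magnitudes
    pooled = []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)   # grid cell per descriptor
        idx = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            mask = idx == c
            # Empty cells contribute a zero vector so the length stays fixed.
            pooled.append(mags[mask].max(axis=0) if mask.any() else np.zeros(k))
    return np.concatenate(pooled)

# Toy usage with random data standing in for real descriptors and a learned dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
descriptors = rng.standard_normal((50, 16))
positions = rng.random((50, 2))
codes = np.stack([ista_sparse_code(D, x) for x in descriptors])
feature = spatial_pyramid_max_pool(codes, positions)
```

In this sketch the output length depends only on the dictionary size and the pyramid layout, not on the number of descriptors, which is what makes the pooled vector usable as a fixed-length image feature.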
Graph construction is critical for those graph-oriented algorithms designed for the purposes of data clustering, subspace learning, and semi-supervised learning. We model the graph construction problem, including neighbor selection and weight assignment, by finding the sparse representation of a data sample with respect to all other data samples. Since natural signals are high-dimensional signals of a low intrinsic dimension, projecting a signal onto the nearest and lowest dimensional linear subspace is more likely to find its kindred neighbors, and therefore improves the graph quality by avoiding many spurious connections. The proposed graph is informative, sparse, robust to noise, and adaptive to the neighborhood selection; it exhibits exceptionally high performance in various graph-based applications. To this end, we propose a generic dictionary training algorithm that learns more meaningful sparse representations for the above tasks. The dictionary learning algorithm is formulated as a bilevel optimization problem, which we prove can be solved using stochastic gradient descent. Applications of the generic dictionary training algorithm in supervised dictionary training for image classification, super-resolution, and compressive sensing demonstrate its effectiveness in sparse modeling of natural signals.
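The graph construction idea above, in which each sample is sparsely represented by all other samples and the coefficients determine both neighbors and edge weights, can be sketched roughly as follows. This is a hedged illustration only: the names `lasso_ista` and `sparse_graph`, the ISTA solver, the row normalization, the use of absolute coefficients as weights, and `lam=0.05` are all assumptions, not details given in the abstract.

```python
import numpy as np

def lasso_ista(D, x, lam, n_iter=300):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 (illustrative solver)."""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

def sparse_graph(X, lam=0.05):
    """Build a directed weight matrix from sparse self-representation.

    X: (n, d) array, one sample per row.
    W[i, j] is the magnitude of sample j's coefficient when sample i is
    sparsely reconstructed from all other samples; W[i, i] is always 0.
    """
    n = X.shape[0]
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)   # exclude the sample itself
        D = Xn[others].T                      # (d, n-1): other samples as atoms
        W[i, others] = np.abs(lasso_ista(D, Xn[i], lam))
    return W

# Toy usage: two groups of samples lying in two orthogonal 2-D subspaces of R^4.
angles = np.array([0.1, 0.5, 0.9])
zeros = np.zeros_like(angles)
group_a = np.stack([np.cos(angles), np.sin(angles), zeros, zeros], axis=1)
group_b = np.stack([zeros, zeros, np.cos(angles), np.sin(angles)], axis=1)
W = sparse_graph(np.vstack([group_a, group_b]))
```

In this toy setup the nonzero weights connect only samples from the same subspace, which mirrors the claim that sparse representation avoids spurious connections. If an undirected graph is needed, one common option is to symmetrize with `(W + W.T) / 2`; that step is likewise an assumption here.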
