基于仿射子空间稀疏表示的半监督分类

英文篇名：Semi-supervised classification based on affine subspace sparse representation
作者：古楠楠 ; 樊明宇 ; 王迪 ; 贾立好 ; 杜亮
英文作者：GU NanNan;FAN MingYu;WANG Di;JIA Li Hao;DU Liang;School of Statistics, Capital University of Economics and Business;College of Mathematics & Information Science, Wenzhou University;Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences;School of Computer and Information Technology, Shanxi University;
关键词：分类算法 ; 数据稀疏分布 ; 图方法 ; 信号分类 ; 最小二乘方法
英文关键词：classification algorithms,sparse data distributions,graphic methods,classification of signals,least squares method
中文刊名：PZKX
英文刊名：Scientia Sinica(Informationis)
机构：首都经济贸易大学统计学院;温州大学数学与信息科学学院;中国科学院自动化研究所精密感知与控制研究中心;山西大学计算机与信息技术学院;
出版日期：2015-08-20
出版单位：中国科学:信息科学
年：2015
期：v.45
基金：国家自然科学基金项目(批准号:11426159,61473212,61203241,61472285,61305035);; 浙江省自然科学基金项目(批准号:LY15F030011,LQ13F030009);; 浙江省科技厅项目(批准号:2014C1062);; 北京市自然科学基金项目(批准号:4154087);; 首都经济贸易大学校级科研项目(批准号:2015XJQ011)资助
语种：中文;
页：PZKX201508003
页数：16
CN：08
ISSN：11-5846/TP
分类号：37-52

摘要

基于图的半监督分类是近年来机器学习与数据挖掘领域的研究热点之一.该类方法一般通过构造图来挖掘数据中所蕴含的本质结构,并进一步利用图的结构信息帮助对无标签样本进行分类.一般来说,基于图的半监督分类方法的效果高度依赖于其构造的图.本文提出了一种基于仿射子空间稀疏表示的图构造方法,该稀疏编码方法在最小化输入信号重构误差时考虑了3个约束条件:(1)输入信号能够被字典矩阵的仿射组合近似表示;(2)线性表示系数的非负性约束;(3)线性表示系数的稀疏性约束.根据这3个约束,我们构造了基于l0-范数的稀疏编码的约束优化问题,提出相应近似求解方法,并进而构造了数据的l0-图.最后,在正则化学习理论框架下,通过引进度量l0-图中结构保持误差的正则项,提出了一种新的半监督学习方法.该方法具有显性的多类分类函数,同时也继承了由数据稀疏编码所得l0-图中蕴含的强判别信息,因此对外样本具有快速和准确的分类能力.一系列人工数据与现实采集的数据集上的实验结果验证了所提半监督分类方法的有效性.
Graph-based semi-supervised classification is one of the hottest research areas in machine learning and data mining. These methods usually model an entire dataset as a graph, then utilize the structure information extracted by the graph to help with the classification of unlabeled data. Generally speaking, the performance of graph-based semi-supervised classification methods highly depends on the constructed graphs. In this paper, we propose a new kind of graph construction method based on affine subspace sparse representation. The proposed sparse coding method minimizes the construction error of the input signal, considering three constraints:(1) the input signal being approximately reconstructed by the affine combination of the dictionary;(2) the nonnegativity constraint of the reconstruction coefficients;(3) the sparsity constraint of the reconstruction coefficients. Based on the constraints, we present the l0-norm constrained optimization problem for sparse coding; then, we propose the algorithm to solve the problem and further construct the l0-graph of data. Finally, under the manifold regularization framework, we propose a new kind of semi-supervised classification method by introducing the regularization term that measures the structure preserving error of the l0-graph. The proposed semi-supervised classification method has an explicit multiclass classification function and inherits the strong discriminative information from sparse representation. As a result, it has efficient and effective classification ability. Experimental results on artificial and real-world datasets are provided to show the effectiveness of the proposed method.

引文

1 Chen K,Wang S H.Semi-supervised learning via regularized boosting working on multiple semi-supervised assumptions.IEEE Trans Pattern Anal Mach Intell,2011,33:129–143
    2 Xiang S M,Nie F P,Zhang C S.Semi-supervised classification via local spline regression.IEEE Trans Pattern Anal Mach Intell,2010,32:2039–2053
    3 Wang Y Y,Chen S C,Zhou Z H.New semi-supervised classification method based on modified cluster assumption.IEEE Trans Neural Netw Learn Syst,2012,23:689–702
    4 Chapelle O,Scholkoph B.Semi-Supervised Learning(Adaptive Computation and Machine Learning).Cambridge:The MIT Press,2006
    5 Zhu X J.Semi-supervised Learning Literature Survey.Technical Report 1530.Madison:University of Wisconsin,2006
    6 Nigam K,Mc Callum A,Thrun S,et al.Text classification from labeled and unlabeled documents using EM.Mach Learn,2000,39:103–134
    7 Fujino A,Ueda N,Saito K.Semi-supervised learning for a hybrid generative/discriminative classifier based on the maximum entropy principle.IEEE Trans Pattern Anal Mach Intell,2008,30:424–437
    8 Maulik U,Chakraborty D.A self-trained ensemble with semisupervised SVM:an application to pixel classification of remote sensing imagery.Pattern Recognit,2011,44:615–623
    9 Dopido I,Li J,Marpu P R,et al.Semisupervised self-learning for hyperspectral image classification.IEEE Trans Geosci Rem Sens,2013,51:4032–4044
    10 Blum A,Mitchell T.Combining labeled and unlabeled data with co-training.In:Proceedings of 11th Annual Conference on Computational Learning Theroy,COLT’98,Madison,1998.92–100
    11 Zhang M L,Zhou Z H.Co Trade:confident co-training with data editing.IEEE Trans Syst Man Cybern Part B-Cybern,2011,41:1612–1626
    12 Joachims T.Transductive inference for text classification using support vector machines.In:Proceedings of 16th International Conference on Machine Learning,Bled,1999.200–209
    13 Qi Z Q,Tian Y J,Shi Y.Laplacian twin support vector machine for semi-supervised classification.Neural Netw,2012,35 :46–53
    14 Kang B Y,Ko S,Kim D W.SICAGO:semi-supervised cluster analysis using semantic distance between gene pairs in gene ontology.Bioinformatics,2010,26:1384–1385
    15 Luo Y,Tao D C,Geng B,et al.Manifold regularized multitask learning for semi-supervised multilabel image classification.IEEE Trans Image Process,2013,22:523–536
    16 Badrinarayanan V,Budvytis I,Cipolla R.Semi-supervised video segmentation using tree structured graphical models.IEEE Trans Pattern Anal Mach Intell,2013,35:2751–2764
    17 Zhu X,Ghahramani Z,Lafferty J.Semi-supervised learning using gaussian fields and harmonic functions.In:Proceedings of the 20th International Conference on Machine Learning,2003.912–919
    18 Zhou D,Bousquet O,Lal T,et al.Learning with local and global consistency.In:Proceedings of the Neural Information Processing Systems Conference,Vancouver,2004.321–328
    19 Wang F,Zhang C S.Label propagation through linear neighborhoods.IEEE Trans Knowl Data Eng,2008,20:55–67
    20 Belkin M,Sindhwani V,Niyogi P.Manifold regularization:a geometric framework for learning from labeled and unlabeled examples.J Mach Learn Res,2006,7:2399–2434
    21 Bruckstein A M,Donoho D L,Elad M.From sparse solutions of systems of equations to sparse modeling of signals and images.SIAM Rev,2009,51:34–81
    22 Wright J,Yang A,Ganesh A,et al.Robust face recognition via sparse representation.IEEE Trans Pattern Anal Mach Intell,2009,31:210–227
    23 Cheng B,Yang J C,Yan S C,et al.Learning with?1-graph for image analysis.IEEE Trans Image Process,2010,19:858 –866
    24 Yan S,Wang H.Semi-supervised learning by sparse representation.In:Proceedings of SIAM International Conference on Data Mining,SDM’09,Sparks,2009.792–801
    25 Pati Y C,Rezaiifar R,Krishnaprasad P S.Orthogonal matching pursuit:recursive function approximation with applications to wavelet decomposition.In:Proceedings of The Twenty-Seventh Asilomar Conference on Signals,Systems and Computers,Pacific Grove,1993.40–44
    26 Blumensath T,Davies M E.Iterative hard thresholding for compressive sensing.Appl Comput Harmon Anal,2009,27 :265–274
    27 Donoho D.For most large underdetermined systems of linear equations the minimal?1-norm solution is also the sparsest solution.Comm Pure Appl Math,2006,59:797–829
    28 Monteiro R,Adler I.Interior path following primal-dual algorithms.Part I:Linear programming.Math Program,1989,44:27–41
    29 Chen S S,Donoho D L,Saunders M A.Atomic decomposition by basis pursuit.SIAM Rev,2001,43:129–159
    30 Malioutov D,Cetin M,Willsky A.Homotopy continuation for sparse signal representation.In:Proceedings of IEEE International Conference on Acoustics,Speech,and Signal Processing,ICASSP’05,Pennsylvania,2005.733–736
    31 Beck A,Teboulle M.A fast iterative shrinkage-thresholding algorithm for linear inverse problems.SIAM J Imaging Sci,2009,2:183–202
    32 Yang J F,Zhang Y.Alternating direction algorithms for?1-problems in compressive sensing.SIAM J Sci Comput,2011,33:250–278
    33 Yang A Y,Zhou Z H,Balasubramanian A G,et al.Fast?1-minimization algorithms for robust face recognition.IEEE Trans Image Process,2013,22:3234–3246
    34 Hull J J.A database for handwritten text recognition research.IEEE Trans Pattern Anal Mach Intell,1998,16:550 –554
    35 Samaria F S,Harter A C.Parameterisation of a stochastic model for human face identification.In:Proceedings of the Second IEEE Workshop on Applications of Computer Vision,WACV 1994,Sarasota,1994.138–142
    36 Lu Z S,Zhang Y.Sparse approximation via penalty decomposition methods.SIAM J Optim,2013,23:2448–2478
    37 Candes E,Romberg J.?1-MAGIC:Recovery of Sparse Signals via Convex Programming.Technical Report.2005
    38 Xu Z B,Dai M W,Meng D Y.Fast and efficient strategies for model selection of support vector machines.IEEE Trans Syst Man Cybern Part B-Cybern,2009,39:1292–1307
    1)http://www.cad.zju.edu.cn/home/dengcai/Data/data.html.
    2 )http://mldata.org/repository/tags/data/IDA Benchmark Repository/.
    3 )http://archive.ics.uci.edu/ml/datasets.html.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700