Semi-supervised Classification Algorithm Based on Graph and Entropy Regularization

Original title: 基于图和熵正则化的半监督分类算法
Author: 刘小兰 (Liu Xiaolan)
Degree level: Doctoral
Discipline: Computer Application Technology
Keywords: Semi-supervised learning; Graph-based method; Sparse representation; Fractional quadratic program; Entropy regularization
Degree year: 2011
Advisor: 郝志峰 (Hao Zhifeng)
Discipline code: 081203
Degree-granting institution: 华南理工大学 (South China University of Technology)
Submission date: 2011-04-08

Abstract

Semi-supervised learning (SSL) uses a large amount of unlabeled data to learn the intrinsic geometric structure of the data and, on that basis, uses a small amount of labeled data to carry out tasks such as dimensionality reduction, classification, and regression. Because SSL reduces the cost of manual labeling and improves learning performance, and because it applies widely to web page retrieval, text classification, biometric identification, and medical diagnosis, it has attracted the attention of the machine learning community since the 1990s and is now one of the most active research topics in the field.
     After analyzing the state of the art and the open problems of SSL, this thesis studies several key issues in semi-supervised classification based on graphs and entropy regularization. The main contents and contributions are as follows:
     1. Graph construction. Constructing the data graph is the first step in designing a graph-based SSL algorithm. Most traditional graph construction methods depend on parameters and are sensitive to them. In addition, the recently proposed L1-norm minimization construction models based on sparse representation cannot guarantee nonnegative solutions, so their solutions cannot be used directly as edge weights. To address these shortcomings, two L1-norm minimization construction models based on nonnegative sparse representation, L1_IMP and L1_IMPv, are proposed. Both add nonnegativity constraints to the existing L1-norm minimization models, so that the sparse solutions both reflect how closely pairs of samples are related and can be used directly as edge weights. The new methods also determine the adjacency structure of the graph and compute the edge weights in a single step. Combined with label propagation algorithms, experiments on UCI and face datasets show that L1_IMP and L1_IMPv achieve better classification accuracy than traditional graph construction methods in most cases.
     2. Graph-based SSL with dissimilarity. Dissimilarity, or negative similarity, arises frequently in applications such as collaborative filtering. Because most existing graph-based SSL algorithms cannot handle dissimilarity or negative similarity, a graph-based SSL model based on negative similarity, SMLP, is proposed. The objective of SMLP is the ratio of two quantities: the inconsistency between the class labels and the positive similarities, and the consistency between the class labels and the negative similarities. SMLP also allows labeled samples to be relabeled. A global optimization method is used to solve SMLP and yields an ε-global optimal solution in $O(n^3 \log \varepsilon^{-1})$ time. Experiments on UCI datasets and a collaborative filtering problem verify the effectiveness of SMLP.
     3. Graph-based SSL for data with noisy labels. Soft labels are used to handle mislabeled data in the SSL setting. First, the hard class labels of the labeled samples are converted to soft labels by existing label-softening methods; compared with hard labels, soft labels better accommodate the supervisor's uncertainty about the class of a pattern. The soft labels are then embedded into the existing graph-based SSL algorithm LGC. Experiments on UCI and object recognition datasets with overlapping classes show that LGC with soft labels is more robust to label errors than LGC with hard labels.
     4. SSL based on entropy regularization. A semi-supervised classification model, MinEnt, is proposed based on conditional Havrda-Charvat's structural α-entropy regularization. The basic idea of MinEnt is that a good clustering criterion is also a good description of the unlabeled data. MinEnt uses the conditional Havrda-Charvat's structural α-entropy clustering criterion to describe the relation between the unlabeled samples and their classes, uses the log-likelihood function for the labeled samples, and is solved with a quasi-Newton method. The proposed algorithm is discriminative, which reduces its dependence on model selection, and it is inductive, so it can predict the label of any point in the sample space, including out-of-sample points. Experiments on several UCI datasets demonstrate the effectiveness of the proposed algorithm.
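
As a rough illustration of contribution 3, the sketch below runs the standard closed-form LGC propagation, F = (I - αS)^(-1) Y with S = D^(-1/2) W D^(-1/2), on soft labels instead of hard ones. The toy graph, the soft-label values, the choice α = 0.99, and the function name lgc_propagate are all assumptions made for illustration; the label-softening methods and experimental settings used in the thesis are not reproduced here.

```python
import numpy as np

def lgc_propagate(W, Y, alpha=0.99):
    """Closed-form LGC: F = (I - alpha * S)^{-1} Y, S = D^{-1/2} W D^{-1/2}.

    W: (n, n) symmetric nonnegative weight matrix.
    Y: (n, c) label matrix; rows of unlabeled samples are all zero.
       Rows may hold soft labels (class-membership degrees).
    """
    W = np.asarray(W, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d = W.sum(axis=1)
    # Guard against isolated nodes (degree zero).
    d_inv_sqrt = np.zeros_like(d)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(d[nonzero])
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, Y)

# Hypothetical toy graph: two chains 0-1-2 and 3-4-5 joined by a weak edge 2-3.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.1), (3, 4, 1.0), (4, 5, 1.0)]:
    W[i, j] = W[j, i] = w

# Hand-picked soft labels for the two labeled samples (nodes 0 and 5);
# each row gives membership degrees for class 0 and class 1.
Y_soft = np.zeros((6, 2))
Y_soft[0] = [0.9, 0.1]
Y_soft[5] = [0.1, 0.9]

F = lgc_propagate(W, Y_soft)
pred = F.argmax(axis=1)
print(pred)  # [0 0 0 1 1 1]
```

With hard labels the two labeled rows would be [1, 0] and [0, 1]; the soft version lets each labeled sample spread some of its mass to the other class, which is one simple way to express doubt about a possibly noisy label.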

