Research on Semi-Supervised Learning for High-Dimensional Data
Abstract
With the rapid development of information technology, organizations in every field are collecting more and more data, and effectively mining useful information from these data can greatly advance those fields. Machine learning, one of the foundations of data mining and knowledge discovery, is a hot research direction in computer science now and for the foreseeable future. Traditional machine learning mainly addresses supervised problems: it requires the training samples to be fully labeled, and the data it handles are usually of modest dimensionality. With the advance and wide deployment of data-collection techniques, however, the samples gathered today not only have many highly correlated features but also include very few labeled instances. Traditional machine learning methods cannot learn effectively from such samples, so learning paradigms that can jointly exploit a large number of unlabeled samples and a small number of labeled ones are urgently needed.
     Semi-supervised learning can exploit both labeled and unlabeled samples to improve a learner's generalization ability, and in recent years it has become one of the hot directions in machine learning. Many current semi-supervised methods, especially graph-based ones, concentrate on how to balance the use of labeled and unlabeled samples while neglecting a more fundamental and more important question: how to construct, on these samples, a graph that faithfully reflects the similarity relationships among them, particularly for high-dimensional samples. As the dimensionality grows, large numbers of noisy and redundant features are introduced, so many commonly used distance metrics no longer characterize inter-sample similarity well, and no well-structured graph can be defined from such distances; yet it is precisely through the graph that graph-based semi-supervised learning combines labeled and unlabeled samples. How to construct a graph on high-dimensional samples is therefore the key to the effectiveness of graph-based semi-supervised learning on high-dimensional data, and indeed to the success of other graph-based learning methods.
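     To make the graph dependence concrete, here is a minimal sketch of the standard graph-based semi-supervised pipeline this paragraph alludes to: build a Gaussian-weighted kNN graph, then infer the unlabeled points with the harmonic-function solution of Zhu et al. This is generic background, not the thesis's own method, and the parameter values are illustrative assumptions.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def harmonic_label_propagation(X, y, k=10, sigma=1.0):
        # y[i]: class index for labeled samples, -1 for unlabeled.
        A = kneighbors_graph(X, k, mode='distance', include_self=False).toarray()
        W = np.where(A > 0, np.exp(-A**2 / (2 * sigma**2)), 0.0)
        W = np.maximum(W, W.T)                       # symmetrize the kNN graph
        L = np.diag(W.sum(axis=1)) - W               # unnormalized graph Laplacian
        lab, unl = np.where(y != -1)[0], np.where(y == -1)[0]
        classes = np.unique(y[lab])
        Y_l = (y[lab][:, None] == classes[None, :]).astype(float)   # one-hot labels
        # Harmonic solution of Zhu et al.: F_u = -L_uu^{-1} L_ul Y_l
        F_u = np.linalg.solve(L[np.ix_(unl, unl)], -L[np.ix_(unl, lab)] @ Y_l)
        pred = y.copy()
        pred[unl] = classes[F_u.argmax(axis=1)]
        return pred

     Every prediction here rests on the adjacency W built in the first two lines; on high-dimensional, noisy features that adjacency degrades, which is exactly the step the thesis targets.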
     Targeting the problems of graph-based semi-supervised learning on high-dimensional data, this thesis takes graph construction on high-dimensional data as its starting point, higher semi-supervised learning accuracy as its goal, and metric learning and ensemble learning as its basic tools. It investigates graph-based semi-supervised dimensionality reduction, semi-supervised classification, and semi-supervised multi-label classification in depth, proposes several graph-construction schemes, and integrates them into graph-based semi-supervised learning. The main contributions of the thesis are:
     1. An Enhanced Locality Preserving Projections method (ELPP) and its semi-supervised version (SELPP) are proposed. To remedy the poor reduction quality and parameter sensitivity of the classical Locality Preserving Projections (LPP) on high-dimensional data, ELPP constructs its graph with a robust path-based similarity measure and plugs that graph into the LPP objective. Experiments show that ELPP not only achieves higher classification accuracy than the original LPP but is also robust to noisy features and to various input parameters, underscoring the importance of graph structure in graph-embedding methods. SELPP inherits all the advantages of ELPP and can further exploit must-link constraints to improve the quality of the reduced representation; experiments show that it outperforms other related semi-supervised dimensionality-reduction methods. (A sketch of path-based similarity follows this list.)
     2. A mixture-graph construction strategy is proposed and applied to side-information-based semi-supervised dimensionality reduction, yielding the Mixture-Graph-based Semi-Supervised Dimensionality Reduction method (MGSSDR). MGSSDR exploits not only must-link constraints and unlabeled data but also cannot-link constraints for dimensionality reduction. Analysis shows that its time complexity is lower than ELPP's, its post-reduction classification accuracy is higher than that of other related methods, and it is fairly robust to noisy features and to the neighborhood-size parameter. The mixture-graph strategy can also be applied to other graph-based learning methods. (One plausible mixture-graph recipe is sketched after this list.)
     3. A semi-supervised classification method based on random-subspace dimensionality reduction (SSC-RSDR) is proposed. SSC-RSDR first performs graph-based semi-supervised dimensionality reduction in several randomly generated subspaces, then builds a kNN graph in each reduced subspace, trains a semi-supervised nonlinear classifier on each graph, and finally fuses these classifiers into an ensemble. Experiments show that SSC-RSDR attains higher classification accuracy than other related methods and is robust to many input parameters; it strikes a good balance between the accuracy and the diversity of the base classifiers and overcomes the mixture graph's dependence on the subspace size. Its graph-construction strategy can likewise be used in other graph-based semi-supervised learning methods. (The pipeline is sketched after this list.)
     4. A semi-supervised ensemble classification method in subspaces (SSEC) is proposed. SSEC builds kNN graphs on several disjoint partitions of the feature space, trains a semi-supervised linear classifier (SSLC) on each graph, and combines the resulting classifiers by majority voting. Theoretical analysis shows that SSEC's time complexity is lower than SSC-RSDR's. Experiments on high-dimensional face-image datasets show that SSEC avoids the random-subspace risk of discarding important features; without any elaborate graph-optimization procedure, its classification accuracy exceeds that of several graph-optimization-based semi-supervised classifiers, and it is robust to multiple input parameters. An SSLC trained in a subspace is more accurate than one trained in the original space, confirming both that high-dimensional data indeed contain many redundant features and that ensemble classification in subspaces is reasonable. (The partition step is sketched after this list.)
     5. A directed bi-relation graph is proposed, which avoids the label-overwriting problem to which the undirected bi-relation graph is prone. On top of it, a transductive multi-label classification method (TMC) and a transductive multi-label ensemble classification method (TMEC) are proposed and applied to protein function prediction from multiple heterogeneous data sources. Experiments show that the directed bi-relation graph outperforms its undirected counterpart and that classifier-ensemble methods suit the protein-function-prediction task better than multiple-kernel-fusion methods. (One plausible construction is sketched after this list.)
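     For contribution 1, the path-based (minimax) similarity that ELPP's graph rests on can be sketched as follows: the distance between two points is the smallest possible "largest hop" over all paths connecting them, so points joined by a chain of close neighbors count as similar even when far apart in Euclidean terms. This is the plain minimax recursion only; the robust variant used in the thesis additionally down-weights outliers, which is omitted here.

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def minimax_path_distance(X):
        # Path-based distance: d(i,j) = min over paths from i to j of the largest
        # edge on the path, via Floyd-Warshall on the (max, min) semiring.
        P = squareform(pdist(X))             # start from pairwise Euclidean distances
        for k in range(len(P)):
            P = np.minimum(P, np.maximum(P[:, k:k+1], P[k:k+1, :]))
        return P

    def path_based_affinity(X, sigma=1.0):
        # Turn the minimax distance into an affinity suitable for graph construction.
        P = minimax_path_distance(X)
        return np.exp(-P**2 / (2 * sigma**2))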
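     For contribution 2, the abstract does not spell out the mixture-graph recipe. One plausible reading, assumed purely for illustration, averages kNN adjacencies built in several random feature subspaces and then lets the pairwise side information override the mixed edges; k, m, and ratio are illustrative.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def mixture_graph(X, must_link, cannot_link, k=10, m=20, ratio=0.5, seed=0):
        # Assumed recipe: average 0/1 kNN adjacencies over m random feature
        # subspaces, then let pairwise side information override the mixture.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = np.zeros((n, n))
        for _ in range(m):
            feats = rng.choice(d, size=max(1, int(ratio * d)), replace=False)
            A = kneighbors_graph(X[:, feats], k).toarray()
            W += np.maximum(A, A.T)          # symmetrized adjacency of this subspace
        W /= m                               # edge weight = fraction of subspaces agreeing
        for i, j in must_link:
            W[i, j] = W[j, i] = 1.0          # same-class pairs become strong edges
        for i, j in cannot_link:
            W[i, j] = W[j, i] = 0.0          # different-class pairs are disconnected
        return W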
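     For contribution 3, here is a compressed sketch of the SSC-RSDR pipeline with stand-ins where the abstract leaves details open: PCA substitutes for the thesis's graph-based semi-supervised reduction, and scikit-learn's LabelSpreading (itself a kNN-graph method) plays the semi-supervised nonlinear base classifier. All parameter values are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.semi_supervised import LabelSpreading

    def ssc_rsdr_sketch(X, y, m=15, ratio=0.5, n_components=10, k=7, seed=0):
        # y: class index for labeled points, -1 for unlabeled (sklearn convention).
        rng = np.random.default_rng(seed)
        n, d = X.shape
        votes = []
        for _ in range(m):
            feats = rng.choice(d, size=max(1, int(ratio * d)), replace=False)
            Z = PCA(n_components=min(n_components, len(feats))).fit_transform(X[:, feats])
            clf = LabelSpreading(kernel='knn', n_neighbors=k).fit(Z, y)
            votes.append(clf.transduction_)
        votes = np.array(votes, dtype=int)   # (m, n): each row is one base classifier
        # Majority vote across the m base classifiers.
        return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)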
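     For contribution 4, the structural difference from SSC-RSDR is subspace generation: the features are shuffled and split into disjoint, equal-sized blocks, so every feature is used exactly once and none can be lost by sampling. LabelSpreading again stands in for the thesis's linear SSLC in this sketch.

    import numpy as np
    from sklearn.semi_supervised import LabelSpreading

    def ssec_sketch(X, y, n_blocks=10, k=7, seed=0):
        # Disjoint feature partition -> per-block semi-supervised learner -> vote.
        rng = np.random.default_rng(seed)
        perm = rng.permutation(X.shape[1])       # shuffle features, then split so
        blocks = np.array_split(perm, n_blocks)  # each lands in exactly one block
        votes = []
        for feats in blocks:
            clf = LabelSpreading(kernel='knn', n_neighbors=k).fit(X[:, feats], y)
            votes.append(clf.transduction_)
        votes = np.array(votes, dtype=int)
        return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)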
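     For contribution 5, the abstract does not define the directed bi-relation graph precisely; the following assumes one plausible construction in which annotation edges run only from proteins to function labels, so propagation can score labels without label-side mass flowing back and overwriting instance-side scores. A random walk with restart then ranks candidate functions for a seed protein.

    import numpy as np

    def directed_birelation_graph(W_pp, Y, W_ff):
        # W_pp: (n,n) protein-protein similarity; Y: (n,c) known 0/1 annotations;
        # W_ff: (c,c) label-label correlation. With M[i,j] read as edge j -> i,
        # protein columns feed both proteins and labels, while label columns feed
        # only other labels: annotation mass flows proteins -> labels, never back.
        n, c = Y.shape
        M = np.block([[W_pp, np.zeros((n, c))],
                      [Y.T,  W_ff]]).astype(float)
        col = M.sum(axis=0, keepdims=True)
        return np.divide(M, col, out=np.zeros_like(M), where=col > 0)  # column-stochastic

    def rwr_label_scores(P, protein_idx, n, c, alpha=0.9, iters=100):
        # Random walk with restart seeded at one protein; the last c entries of
        # the stationary vector rank the c candidate function labels.
        r = np.zeros(n + c)
        r[protein_idx] = 1.0
        s = r.copy()
        for _ in range(iters):
            s = alpha * (P @ s) + (1 - alpha) * r
        return s[n:]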