The Study of Manifold Learning Algorithms and Their Applications

Original Chinese title: 流形学习算法及其应用研究
Author: Lei Yingke (雷迎科)
Degree level: Doctoral
Discipline: Pattern Recognition and Intelligent Systems
Keywords: Manifold learning; Dimensionality reduction; Orthogonal local spline discriminant projection; Local multidimensional scaling regression embedding; Fast isometric feature mapping; Face recognition; Data visualization; Protein-protein interactions; False positives; False negatives
Degree year: 2011
Supervisor: Huang Deshuang (黄德双)
Discipline code: 081104
Degree-granting institution: University of Science and Technology of China
Thesis submission date: 2011-05-01

Abstract

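Manifold learning, an emerging class of nonlinear dimensionality reduction methods, aims to obtain low-dimensional compact representations of high-dimensional observed data and to uncover the inherent laws and intrinsic structure of the underlying phenomena. It has become a research focus in data mining, pattern recognition, and machine learning. Thanks to their nonlinear nature, geometric intuitiveness, and computational feasibility, manifold learning methods have achieved satisfactory results on many standard toy datasets and real-world datasets. They nevertheless share some general problems, such as generalization (out-of-sample) learning, supervised learning, and large-scale manifold learning. Starting from these problems, this dissertation carries out a series of studies on algorithm design and on applications to image data and protein-protein interaction data. It first gives a detailed comparative analysis of typical manifold learning methods, and then focuses on generalization and supervised learning on manifolds, characterization of the local geometric structure of a manifold, construction of a global regularized linear regression model, and manifold learning for large-scale data. Three effective manifold learning algorithms are proposed and compared with related work both theoretically and experimentally, which verifies their effectiveness.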
     The main contributions of the dissertation are summarized as follows:
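     (1) Building on an in-depth study of the local spline embedding (LSE) algorithm, an orthogonal local spline discriminant projection (O-LSDP) algorithm is proposed by introducing an explicit linear mapping, constructing translation and rescaling models, and orthogonalizing the feature subspace. O-LSDP addresses the two main problems of the original LSE algorithm, namely out-of-sample learning and unsupervised learning, so that the algorithm can be applied to pattern classification with significantly improved recognition performance. Comparative experiments on standard face databases verify the effectiveness and feasibility of the algorithm.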
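     (2) Under the conceptual framework of compatible mapping, a local multidimensional scaling regression embedding (LMDSRE) algorithm is proposed. LMDSRE first uses local multidimensional scaling (LMDS) to construct local coordinates for the neighborhood of each sample point, which represent the local geometric structure of the low-dimensional manifold. It then fits regularized linear regression models and aligns all the locally isometric coordinates to build a unique global low-dimensional coordinate system. As a new manifold learning method, the algorithm is locally isometric and can be used for nonlinear dimensionality reduction and data visualization. Experiments on six standard synthetic datasets and three real-world datasets verify its effectiveness.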
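     (3) To address the high computational complexity of the ISOMAP algorithm, a fast isometric feature mapping algorithm (Fast-ISOMAP) is proposed. Fast-ISOMAP first uses a minimum set cover (MSC) strategy to select p landmark points from the dataset (p ≪ n), so that the shortest-path distance matrix is built as a p×n matrix D_{p×n} instead of the original n×n matrix D_{n×n}. It then applies the Landmark MDS algorithm to embed all samples into the low-dimensional feature space. Compared with the original ISOMAP algorithm, Fast-ISOMAP greatly improves computational efficiency without significantly changing the embedding quality, which makes it suitable for large-scale manifold learning problems. Experiments on standard datasets verify the effectiveness of the algorithm.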
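     (4) A robust new method based on fast manifold embedding is proposed for assessing and predicting the reliability of protein interaction data. The protein interaction data are first modeled as a low-dimensional manifold. The fast isometric feature mapping method is then used to map them into a low-dimensional metric space, which turns the biological problem of assessing and predicting interaction reliability into the mathematical problem of measuring similarity between data points in the low-dimensional embedding space. Finally, a weighted CD-Dist reliability index is constructed from the similarity of protein pairs in the embedding space and used to assess and predict reliability. Experiments on three yeast protein interaction datasets of different sizes, produced by different high-throughput experimental techniques, show that the high-reliability interactions obtained by the fast manifold embedding method have higher functional consistency and cellular-component consistency. To the best of our knowledge, the method proposed in this chapter is the first to use manifold learning theory to address the assessment and prediction of protein interaction reliability. It overcomes the need of existing methods for additional prior information and their sensitivity to the sparsity of the protein interaction network, and offers a new way to detect false-positive and false-negative "noise" in protein interaction networks.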
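The landmark selection step of Fast-ISOMAP in item (3) can be illustrated with a greedy approximation to minimum set cover. The sketch below is not taken from the dissertation. It assumes that each point "covers" itself and its k nearest Euclidean neighbours, and the names `knn_cover` and `greedy_set_cover`, as well as the toy spiral data, are hypothetical.

```python
import math

def knn_cover(points, k):
    """Map each point index to the set holding itself and its k nearest neighbours."""
    n = len(points)
    cover = {}
    for i in range(n):
        # Sort all indices by Euclidean distance to point i (point i itself comes first).
        order = sorted(range(n), key=lambda j: math.dist(points[i], points[j]))
        cover[i] = set(order[:k + 1]) | {i}
    return cover

def greedy_set_cover(cover):
    """Greedy set cover: repeatedly pick the set that covers the most uncovered points."""
    uncovered = set(cover)
    landmarks = []
    while uncovered:
        best = max(cover, key=lambda i: len(cover[i] & uncovered))
        landmarks.append(best)
        uncovered -= cover[best]
    return landmarks

# Toy data: 40 points on a planar spiral (illustrative only).
points = [(t * math.cos(t), t * math.sin(t)) for t in (0.25 * i for i in range(1, 41))]
landmarks = greedy_set_cover(knn_cover(points, k=4))
print(len(landmarks), "landmarks chosen out of", len(points))
```

The greedy rule is a standard approximation, since exact minimum set cover is NP-hard. Whether the dissertation uses this particular variant is an assumption.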
Manifold learning is an emerging class of nonlinear dimensionality reduction methods for finding low-dimensional compact representations of high-dimensional observation data and exploring the inherent laws and intrinsic structure of the data. It has become an active topic in data mining, pattern recognition, machine learning, and related fields. Owing to their nonlinear nature, geometric intuition, and computational feasibility, manifold learning methods yield impressive results on many artificial and real-world benchmark datasets. However, the original methods still share some common problems, such as out-of-sample extension, supervised learning, and large-scale manifold learning. To overcome these problems, this dissertation carries out a series of studies on algorithm design and on applications to image data and protein-protein interaction (PPI) data. First, classical manifold learning methods are analyzed and compared in detail. Second, five problems are investigated: out-of-sample extension, supervised learning, characterization of local manifold geometry, construction of a globally regularized linear regression model, and large-scale manifold learning. Finally, three manifold learning algorithms are proposed and compared with related work both theoretically and experimentally. The results demonstrate the effectiveness of the proposed algorithms.
     The main work of this dissertation can be summarized as follows:
     (1) Based on an analysis of the local spline embedding (LSE) method, we propose an efficient feature extraction algorithm called orthogonal local spline discriminant projection (O-LSDP). By introducing an explicit linear mapping, constructing different translation and rescaling models for different classes, and orthogonalizing the feature subspace, O-LSDP circumvents the two major shortcomings of the original LSE algorithm, namely the out-of-sample problem and the lack of supervision. O-LSDP inherits the advantage of LSE, which uses the local tangent space as a representation of the local geometry so as to preserve local structure, and additionally exploits class information and an orthogonal subspace to significantly improve discriminant power. Extensive experiments on standard face databases verify the feasibility and effectiveness of the proposed algorithm.
     (2) A new manifold learning algorithm called local multidimensional scaling regression embedding (LMDSRE) is developed under the conceptual framework of compatible mapping. LMDSRE uses local multidimensional scaling (LMDS) to construct local coordinates for each data point and its neighbors, which represent the local geometric structure of the low-dimensional manifold. Regularized linear regression models are then fitted to map each set of local coordinates to a single low-dimensional global coordinate system. As a new manifold learning method, LMDSRE is locally isometric and can be used for nonlinear dimensionality reduction and data visualization. Experiments on six toy datasets and three real-world datasets illustrate the validity of the method.
     (3) To address the high computational complexity of the ISOMAP algorithm, we design a new fast isometric feature mapping (Fast-ISOMAP) method. Fast-ISOMAP uses a minimum set cover strategy to designate p of the n data points as landmark points. When constructing the shortest-path distance matrix, it computes only the p×n matrix D_{p×n} of distances from each landmark point to all data points, instead of the full n×n matrix D_{n×n}. Landmark MDS is then applied to embed all data points into a low-dimensional feature subspace. Experiments show that Fast-ISOMAP greatly improves on the computational efficiency of the original ISOMAP without significantly changing its embedding performance, so it can be used for large-scale manifold learning problems. Experimental results on many artificial benchmark datasets show the effectiveness of the proposed algorithm.
     (4) We develop a robust computational technique, based on a fast manifold embedding algorithm, for assessing the reliability of protein interactions and predicting new ones. First, we fit a low-dimensional manifold model to a PPI network and use the proposed Fast-ISOMAP to transform the network into a low-dimensional metric space, which recasts the problem of assessing and predicting protein interactions as one of measuring similarity between points in that space. Then a reliability index (RI), a likelihood indicating that two proteins interact, is assigned to each protein pair in the PPI network based on the similarity between the corresponding points in the embedding space. The approach is evaluated using the functional homogeneity and localization coherence of protein interactions from three PPI datasets of different scales, derived from different high-throughput techniques: yeast two-hybrid (Y2H), tandem affinity purification (TAP), and mass spectrometry (MS). Experimental results demonstrate that the interactions ranked highest by our method have high functional homogeneity and localization coherence. The method overcomes two disadvantages of most existing methods, namely that they require additional prior information and are sensitive to the sparseness of the PPI network. Moreover, to our knowledge, it is the first work to use manifold learning theory to assess and predict protein interactions. The proposed algorithm is therefore a promising way to detect both false-positive and false-negative interactions in PPI networks.
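To make the p×n distance matrix of item (3) concrete, the following sketch runs Dijkstra's algorithm from the landmark points only, so that p shortest-path searches replace n. It is an illustration under stated assumptions rather than the dissertation's implementation: the neighbourhood graph is a symmetric k-nearest-neighbour graph with Euclidean edge lengths, the landmarks are picked by hand, and all function names are hypothetical.

```python
import heapq
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph: adj[i] maps neighbour index -> edge length."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            w = math.dist(points[i], points[j])
            adj[i][j] = w
            adj[j][i] = w  # symmetrize so the graph is undirected
    return adj

def dijkstra(adj, source):
    """Shortest-path distances from source to every node (inf if unreachable)."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def landmark_geodesics(points, landmarks, k):
    """Return the p x n matrix of graph distances from each landmark to all points."""
    adj = knn_graph(points, k)
    return [dijkstra(adj, s) for s in landmarks]

# Toy data: 30 points along a circular arc; landmarks chosen by hand (assumption).
points = [(math.cos(0.1 * i), math.sin(0.1 * i)) for i in range(30)]
landmarks = [0, 10, 20, 29]
D = landmark_geodesics(points, landmarks, k=2)
print(len(D), "x", len(D[0]), "landmark distance matrix")
```

Landmark MDS would then operate on the squared entries of this matrix to place all n points in the embedding space. That step is omitted here.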

