Several Applications of Sparse Optimization in Machine Learning
Abstract
In recent years, exploiting the sparsity of solutions and other intrinsic structure has become a common concern in many computational and engineering fields. Sparsity means more than "having few nonzero components"; it more broadly means "having a simple structure". In this thesis, we model the sparse structures of various problems in machine learning and, where necessary, improve the corresponding classical sparse optimization algorithms to solve them. The main work can be summarized as follows.
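For orientation, two textbook formulations illustrate this spectrum: basis pursuit captures "few nonzero components", while nuclear norm minimization captures the broader sense of "simple structure" (low rank). These standard forms are given here only as context for the models below:

\min_{x}\ \|x\|_1 \quad \text{s.t.}\quad Ax = b, \qquad\qquad \min_{X}\ \|X\|_* \quad \text{s.t.}\quad \mathcal{A}(X) = b,

where \|X\|_* denotes the nuclear norm, the sum of the singular values of X.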
     1. In Chapter 2, we present the sparse optimization models and algorithms proposed in this thesis for various practical problems in machine learning. All the models share a common abstract structure: minimizing a loss functional over a hypothesis space with a certain simple or special structure. The proposed box-constrained Lasso model and block PCA model both possess this structure. In this chapter we present a Homotopy algorithm for solving the Lasso model with box constraints and a Splitting algorithm for solving the block PCA model.
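A plausible form of the box-constrained Lasso model, assuming the standard Lasso objective augmented with componentwise bounds (the exact formulation in the thesis may differ):

\min_{x \in \mathbb{R}^n}\ \frac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1 \quad \text{s.t.}\quad l \le x \le u,

where \lambda > 0 is the regularization parameter and the bounds l \le u encode prior knowledge of the range of the solution.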
     2. In Chapter 3, we analyze the convergence of the Homotopy algorithm for the box-constrained Lasso model and evaluate its numerical performance. We show that convergence is not as apparent as it may look. Under a non-degenerate index assumption and some other mild conditions, we prove that the Homotopy algorithm terminates at an optimal solution after finitely many iterations. Moreover, we discuss the issues of degeneracy and cycling. Although many algorithms exist for solving this model, numerical experiments demonstrate a particular advantage of the Homotopy algorithm: it is especially effective when the underlying solution is very sparse or when the entire regularization path is required. This is the key technique used in Chapter 4 to measure the prediction capability of collaborative filtering data.
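The value of computing an entire regularization path can be seen on the plain (unconstrained) Lasso, whose solution path is piecewise linear in the regularization parameter and can be traced exactly by a homotopy/LARS procedure. A minimal sketch using scikit-learn, which covers only the unconstrained case, not the box-constrained variant studied in the thesis:

import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))        # design matrix, more features than samples
x_true = np.zeros(100)
x_true[:5] = rng.standard_normal(5)       # very sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(50)

# alphas are the breakpoints of the piecewise-linear path;
# coefs[:, k] is the Lasso solution at regularization level alphas[k]
alphas, active, coefs = lars_path(A, b, method="lasso")
print("path breakpoints:", len(alphas))
print("first variables to enter the active set:", active[:5])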
     3. Chapter 4 is devoted to the prediction capability of rating data in collaborative filtering. Most work on collaborative filtering has focused on improving algorithm performance. In this chapter we point out that, limited by the information carried by the known ratings themselves, some of the unknown ratings in a rating matrix cannot be predicted accurately. We propose a new metric, "relatedness", to measure how likely a user's rating on an item can be accurately predicted. The relatedness of a user-item pair is determined by a community consisting of the users and items related to the pair. As an application of the relatedness metric, we develop the Data-Oriented Combination (DOC) method for recommender systems.
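A hypothetical sketch of one community-based relatedness score: the density of known ratings inside the sub-matrix spanned by the users who rated the item and the items rated by the user. This proxy is our own illustration of the idea, not the metric defined in the thesis:

import numpy as np

def relatedness(R, u, i):
    """R: rating matrix with np.nan marking unknown entries."""
    co_users = np.where(~np.isnan(R[:, i]))[0]   # users who rated item i
    co_items = np.where(~np.isnan(R[u, :]))[0]   # items rated by user u
    if len(co_users) == 0 or len(co_items) == 0:
        return 0.0                               # no community, no evidence
    block = R[np.ix_(co_users, co_items)]        # the community sub-matrix
    return float(np.mean(~np.isnan(block)))      # density of known ratings

R = np.array([[5.0,    np.nan, 3.0],
              [4.0,    2.0,    np.nan],
              [np.nan, 1.0,    4.0]])
print(relatedness(R, u=0, i=1))                  # 0.5 on this toy matrix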
     4. Chapter 5 is devoted to inferring gene regulatory networks (GRNs) from time-course gene expression data. Due to the computational complexity, most approaches to GRN reconstruction can only identify a single network of low connectivity for a given set of genes. We propose the network and community identification (NCI) method, which identifies multiple sub-networks from gene expression data by incorporating community structure information into GRN inference. Within it, the block PCA model, solved by the Splitting algorithm of Chapter 2, searches for community structure in GRNs efficiently.
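The Splitting algorithm belongs to the operator-splitting (ADMM) family. For a generic two-block problem \min_{x,z} f(x) + g(z) subject to x = z, one iteration of the scaled form reads as follows (a generic template for context; the thesis's scheme for block PCA uses its own splitting):

x^{k+1} = \arg\min_x\ f(x) + \tfrac{\rho}{2}\|x - z^k + y^k\|_2^2,
z^{k+1} = \arg\min_z\ g(z) + \tfrac{\rho}{2}\|x^{k+1} - z + y^k\|_2^2,
y^{k+1} = y^k + x^{k+1} - z^{k+1},

where \rho > 0 is the penalty parameter and y is the scaled dual variable.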
     5. Chapter 6 is devoted to the peptide identification problem, a key step in protein identification. Sequence database searching has been the dominant method for peptide identification, but many of the peptide-spectrum matches (PSMs) output by search engines are incorrect. Most existing methods, built on supervised or semi-supervised learning, make full use of the decoy PSM samples and their labels, while the information carried by the target PSM samples themselves has not been fully exploited. We develop a novel scoring method named FC-Ranker, which assigns to each target PSM a nonnegative weight reflecting the likelihood that the match is correct. In particular, FC-Ranker updates these weights iteratively using a fuzzy SVM classification model and a proposed fuzzy Silhouette index. In terms of ROC performance and the number of target PSMs identified at the same FDR level, FC-Ranker outperforms popular post-database-search methods.
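A schematic reconstruction of the alternating loop just described, with our own simplified stand-ins: an ordinary weighted SVM in place of the fuzzy SVM, and a hard-clustering silhouette in place of the fuzzy Silhouette index:

import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import silhouette_samples

def fc_ranker_sketch(X_target, X_decoy, n_iter=5):
    X = np.vstack([X_target, X_decoy])
    y = np.r_[np.ones(len(X_target)), np.zeros(len(X_decoy))]
    w_target = np.full(len(X_target), 0.5)       # initial fuzzy weights
    for _ in range(n_iter):
        # decoys are known wrong matches, so their weight stays 1
        w = np.r_[w_target, np.ones(len(X_decoy))]
        clf = SVC(kernel="rbf", gamma="scale").fit(X, y, sample_weight=w)
        s = clf.decision_function(X_target)
        # hard 2-clustering of the scores; its silhouette is a stand-in
        # for the thesis's fuzzy Silhouette index
        labels = (s > np.median(s)).astype(int)
        sil = silhouette_samples(s.reshape(-1, 1), labels)
        rank = (s - s.min()) / (np.ptp(s) + 1e-12)
        w_target = np.clip(0.5 * rank + 0.25 * (sil + 1.0), 0.0, 1.0)
    return w_target

rng = np.random.default_rng(1)
X_t = rng.standard_normal((40, 3)) + np.r_[1.0, 0, 0]   # mixed-quality targets
X_d = rng.standard_normal((40, 3)) - np.r_[1.0, 0, 0]   # decoys
print(fc_ranker_sketch(X_t, X_d)[:5])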
