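Kernel-Based Learning Algorithms and Applications

English title: Kernel Based Learning Algorithm and Application
Author: Jian Ling (渐令)
Thesis level: Doctoral
Discipline: Operations Research and Cybernetics
Chinese keywords: 核学习算法 ; 多核学习 ; 高炉冶炼过程 ; 串联质谱 ; 多肽鉴定
English keywords: Kernel Learning Algorithm ; Multiple Kernel Learning ; Blast Furnace Ironmaking Process ; Tandem Mass Spectrometry ; Peptide Identification
Degree year: 2012
Advisor: Xia Zunquan (夏尊铨)
Discipline code: 070105
Degree-granting institution: Dalian University of Technology
Thesis submission date: 2012-03-01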

Abstract

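The kernel trick is a powerful tool for solving nonlinear problems, and the theory and algorithms of kernel-based learning are an active research topic in machine learning. This thesis studies the design of kernel learning algorithms and several of their applications to the blast furnace ironmaking process and to protein identification.
     On the algorithm design side, a binary coding support vector machine (Support Vector Machines: SVM) algorithm is designed that converts an N-class problem into ⌈log₂N⌉ binary sub-problems. Compared with the traditional one-against-one method, which needs O(N²) sub-classifiers, and the one-against-all method, which needs O(N), binary coding SVM markedly reduces the number of sub-classifiers required. The multiple kernel learning (Multiple Kernel Learning: MKL) problem for the least squares support vector machine (Least Square SVM: LS-SVM) is reformulated as a semidefinite programming (Semidefinite Programming: SDP) problem, so that the kernel coefficients and the regularization parameter are optimized within a unified MKL framework, which moves toward automatic selection of the kernel and the regularization parameter. Compared with SVM MKL, LS-SVM MKL greatly reduces computational complexity while maintaining accuracy, and numerical experiments on UCI benchmark data sets verify the effectiveness of the proposed LS-SVM MKL algorithm.
     Prediction and trend classification of furnace temperature in the blast furnace ironmaking process is one of the application problems studied. Taking the silicon content of blast furnace hot metal ([Si]), an important indicator of the thermal state inside the furnace, as the research object, a sliding window (Sliding Windows: SW) mechanism is introduced into the smooth support vector regression (Smooth Support Vector Regression: SSVR) model to build the SW-SSVR model. By continuously updating the learning samples, the model can track changes in the system in a timely manner. SW-SSVR is applied to numerical prediction of [Si]; numerical experiments show that it has a high prediction success rate and a short computation time, which makes it suitable for online use. The [Si] trend prediction problem is cast as a 4-class problem, namely sharp rise, slight rise, slight fall and sharp fall, and binary coding SVM is applied to [Si] trend prediction for two blast furnaces in China; this model lets the blast furnace foreman decide the magnitude of the adjustment as well as the direction of furnace temperature control. MKL is used to integrate the heterogeneous data arising in the ironmaking process, which improves prediction accuracy, and MKL is applied to feature reduction over the collected blast furnace variables, which improves the interpretability of the black-box model.
     Peptide identification based on tandem mass spectrometry (MS/MS) is the other application problem studied. Proteomics is a frontier topic of the post-genomic era, and high-throughput experimental techniques such as tandem mass spectrometry and protein chips have greatly advanced it. Identifying peptide sequences by tandem mass spectrometry, and thereby identifying proteins, is a commonly used approach in current proteomics research. Because of the complexity of protein samples and biological experiments, mass spectra are rich in noise, and the peptide matches returned by database search contain a large number of negative identifications. Many algorithms have been proposed to refine peptide identification, but they still cannot perfectly separate positive from negative identifications. In view of this, the thesis applies the De-Noise algorithm, based on MKL SVM, to turn peptide identification from MS/MS data into a special classification problem: the positive-class samples are heavily contaminated and not trustworthy, whereas the negative-class samples are fully trustworthy. De-Noise first performs denoising based on distance relations, then trains an SVM classifier on the denoised sample set and carries out two refinement passes, and finally integrates the enzymatic cleavage information of the peptides to give the identification result. On SEQUEST search results for three protein data sets, Yeast (LCQ mass spectrometer), UPS1 (LTQ mass spectrometer) and Ta108 (Orbit mass spectrometer), De-Noise is compared with PeptideProphet and Percolator; at a given expected false discovery rate (False Discovery Rate: FDR), De-Noise significantly improves the sensitivity and specificity of peptide identification.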
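The comparison at a given FDR can be illustrated with a small sketch. The abstract does not state how the FDR is estimated, so the code below assumes the common target-decoy estimate (decoys accepted divided by targets accepted at a score cutoff). The function name `accepted_targets_at_fdr` and the score lists are hypothetical and are not taken from the thesis.

```python
def accepted_targets_at_fdr(target_scores, decoy_scores, max_fdr):
    """Return how many target matches can be accepted at the given FDR.

    Assumption: FDR at a score cutoff is estimated as
    (#decoys at or above cutoff) / (#targets at or above cutoff).
    """
    # Pool both lists and walk down from the best score.
    pooled = [(s, False) for s in target_scores] + [(s, True) for s in decoy_scores]
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    n_target = 0
    n_decoy = 0
    best = 0
    for _score, is_decoy in pooled:
        if is_decoy:
            n_decoy += 1
        else:
            n_target += 1
        # Remember the largest accepted set whose estimated FDR is within bounds.
        if n_target > 0 and n_decoy / n_target <= max_fdr:
            best = n_target
    return best
```

Under this estimate, a method with higher sensitivity is one that returns a larger count for the same `max_fdr`.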
The kernel trick is a powerful tool for solving nonlinear problems, and kernel-based learning theory and algorithms are a research focus in machine learning. This thesis focuses on the design of kernel-based learning algorithms and their application to the blast furnace ironmaking process and to protein identification problems.
     The main contributions to kernel-based learning algorithm design are as follows. First, a novel binary coding SVM algorithm is proposed that treats an N-class classification task as multiple binary classification problems and requires only ⌈log₂N⌉ binary classifiers, far fewer than the conventional one-against-one method, O(N²), and one-against-all method, O(N). Second, multiple kernel learning (MKL) for LS-SVM is formulated as a semidefinite program, which yields the global optimal solution; the regularization parameter is optimized together with the kernel coefficients in a unified framework, which leads to an automatic model selection process. The computational complexity of LS-SVM MKL is much lower than that of SVM MKL at comparable accuracy, which makes LS-SVM MKL suitable for large-scale data sets. Extensive validation experiments are performed.
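The classifier counts above can be made concrete with a short sketch. It assumes the simplest possible code assignment, the natural binary representation of the class index, which may differ from the assignment used in the thesis; the helper names `n_bits_for`, `encode`, `decode` and `classifier_counts` are hypothetical.

```python
def n_bits_for(n_classes):
    """Number of binary sub-problems, ceil(log2 N), computed with integers."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    return (n_classes - 1).bit_length()


def encode(label, n_bits):
    """Bit j is the target of binary sub-problem j (least significant bit first)."""
    return [(label >> j) & 1 for j in range(n_bits)]


def decode(bits):
    """Combine the outputs of the binary classifiers back into a class index."""
    return sum(bit << j for j, bit in enumerate(bits))


def classifier_counts(n_classes):
    """Sub-classifiers needed by each multi-class strategy."""
    return {
        "one_against_one": n_classes * (n_classes - 1) // 2,
        "one_against_all": n_classes,
        "binary_coding": n_bits_for(n_classes),
    }
```

When N is not a power of two, some bit patterns correspond to no class, so a practical decoder needs a rule for such outputs; that rule is not described in the abstract.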
     As one application, this thesis studies prediction and trend classification models for the temperature in the blast furnace (BF) ironmaking process. Focusing on the silicon content in hot metal ([Si]), a chief indicator of furnace temperature, it exploits the nonlinear approximation ability of SVM to construct data-based models for [Si] prediction. First, the sliding window scheme is incorporated into smooth support vector regression to construct the sliding-window smooth support vector regression (SW-SSVR) model, which updates its learning samples and tracks state changes of the system in time. Applied to [Si] prediction, SW-SSVR shows a high percentage of successful trend predictions, competitive computational speed, and suitability for timely online service. Second, through the proposed binary coding SVM algorithm, a four-class problem (sharp descent, slight descent, slight ascent and sharp ascent of [Si]) is reduced to two binary classification problems; the four-class results can in turn guide blast furnace operators in determining the control span together with the control direction in advance. Third, for predicting the [Si] change trend, MKL is employed to integrate heterogeneous data, which improves prediction accuracy; MKL is also used for feature reduction, which helps explain which variables matter in black-box modeling.
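The sliding-window idea can be sketched independently of the regressor. In the code below the model refitted on each window is a plain least-squares fit of y[t] on y[t-1], used only as a stand-in because SSVR itself is not specified in the abstract; `fit_ar1` and `sliding_window_forecast` are hypothetical names.

```python
from collections import deque


def fit_ar1(window):
    """Least-squares fit of y[t] = a * y[t-1] + b (stand-in model, not SSVR)."""
    xs, ys = window[:-1], window[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # constant window: predict the mean
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def sliding_window_forecast(series, window_size):
    """One-step-ahead forecasts; the model is refitted on the latest window only."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    window = deque(maxlen=window_size)  # oldest sample drops out automatically
    predictions = []
    for value in series:
        if len(window) == window_size:
            a, b = fit_ar1(list(window))
            predictions.append(a * window[-1] + b)  # forecast made before seeing value
        window.append(value)
    return predictions
```

Here `predictions[i]` is the forecast for `series[window_size + i]`. Because only the most recent `window_size` samples are used, the cost of each refit stays bounded, which is the property that makes this scheme attractive for online use.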
     Peptide identification by tandem mass spectrometry (MS/MS) is the other application studied in this thesis. Proteomics has become a hot subject in the post-genomic era, and peptide identification by MS/MS is widely used for high-throughput identification of proteins in complex biological samples. A flexible algorithm based on MKL SVM, named De-Noise, is proposed to transform the peptide identification problem into a special binary classification problem. De-Noise starts with a pre-processing step in which some noisy target peptide-spectrum matches (PSMs) are eliminated from the target PSM data set to provide a more reliable training set; noisy PSMs are determined by computing their distance to the centroid of the decoy PSMs. Once the noisy target PSMs have been discarded, two rounds of refinement are carried out to distinguish correct PSMs from incorrect ones. Finally, proteolytic information is integrated to validate the PSMs. The De-Noise algorithm is tested on three data sets from multiple mass spectrometry platforms, Yeast (LCQ), UPS1 (LTQ) and Ta108 (Orbit), and compared with PeptideProphet and Percolator. De-Noise shows superior sensitivity and specificity on all data sets searched, so it can validate database search results effectively.
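The pre-processing step can be sketched as follows. The abstract says only that noisy target PSMs are identified by their distance to the decoy centroid; the Euclidean metric, the fixed `radius` cutoff and the function names below are assumptions made for illustration, and the feature vectors stand for whatever PSM scores are used.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def drop_noisy_targets(targets, decoys, radius):
    """Keep target PSMs farther than `radius` from the decoy centroid.

    Assumption: a target that lies close to the decoys looks like an
    incorrect match and is treated as noise.
    """
    center = centroid(decoys)
    return [t for t in targets if dist(t, center) > radius]
```

The surviving targets, together with the decoys as negative examples, would then form the training set for the classifier and the two refinement rounds described above.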
