A Study of Metaheuristic Methods for De Novo Protein Structure Prediction
Abstract
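Proteins carry out specific life functions by virtue of their specific structures. With genomic data growing rapidly while experimental structure determination remains costly and slow, protein structure prediction has become especially important. De novo protein structure prediction does not rely on known structural templates; within the field of protein structure prediction it is a technically demanding research topic of far-reaching practical significance.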
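     From a computational standpoint, protein structure prediction is essentially a combinatorial optimization problem, and the unprecedented size of its search space together with its numerous, intricate constraints poses a major challenge for computer science. Building on a survey of de novo protein structure prediction and parallel metaheuristics, this dissertation focuses on three aspects: the search space, the search strategy, and the clustering scheme. The main research content is as follows: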
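     1. Search space for structure prediction. The fragment structures used in backbone prediction and how they are generated are studied, as are the structures of side-chain rotamers and how they are generated. On this basis, a four-level model based on dynamic Bayesian networks is proposed for generating side-chain rotamers. The model has two main characteristics. First, it accounts for the correlations and dependencies among the backbone information and the four side-chain torsion angles, giving it an explicit inference hierarchy that better matches the biological properties of protein molecules. Second, it reduces the number of unknown variables at each level and thereby lowers model complexity, which, for a fixed training set, helps alleviate data sparsity and improve model accuracy. Experiments show that the four-level model yields high-quality results. In addition, a method for evaluating rotamer libraries using extreme conformations and random conformations is proposed; experiments on the CASP9 FM (free modeling) data set verify its effectiveness.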
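     2. Parallel metaheuristic search strategy. Taking ACO (ant colony optimization) as an example, the working principles of metaheuristics are analyzed in depth, and a parallel metaheuristic strategy characterized by task decomposition and experience feedback is proposed. To address the difficulties that the optimization objective in de novo protein structure prediction is hard to quantify accurately and that solutions are complex to construct, a parallel metaheuristic search framework is proposed that combines different energy functions and search strategies. A task allocation strategy is also designed in detail for GPCR prediction. Algorithms for protein backbone and side-chain prediction are designed on the basis of the ACO mechanism. For backbone prediction, the within-colony search scheme, the solution construction method, the local search strategy, and the parallel allocation mechanism are designed and implemented in detail. Finally, experiments are carried out on the data set of 16 small proteins used in a paper published in Science and on the CASP8 FM data set; the results show that the proposed method is highly competitive.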
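     3. Protein structure clustering. Two aspects are studied. First, a cluster-center selection algorithm for protein structure clustering is proposed. Based on an in-depth study of two commonly used protein structure clustering algorithms, the QT algorithm and the AP algorithm, it uses statistical information to improve the ability to find the optimal conformation and overcomes the original algorithms' dependence on specific parameter settings. Second, energy information is used to optimize the distribution of the structural similarity matrix, improving how well the matrix represents the native state of the protein and laying a sound foundation for the clustering algorithm. Finally, experiments on two authoritative data sets show that the proposed methods can effectively improve clustering performance on specific data sets and thus select candidate structures closer to the native conformation.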
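     The main contributions of this dissertation are: a four-level inference model for generating side-chain rotamer libraries, which fully accounts for the correlations and dependencies between backbone and side chains and is designed to reduce model complexity and alleviate data sparsity; a parallel metaheuristic scheme suited to de novo protein structure prediction, which achieves clear gains in backbone prediction; and a cluster-center selection algorithm together with a similarity-distribution optimization scheme for protein structure clustering, which improve the accuracy of finding the optimal conformation. Experiments show that this work advances de novo protein structure prediction and offers a useful reference for subsequent research.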
Proteins perform specific life functions by virtue of their particular structures. Protein structure prediction is especially important given the explosion of gene data and the high cost and low efficiency of experimental structure determination. De novo protein structure prediction, which uses no structural template, is a research topic of high technical difficulty and great practical significance.
     From a computational point of view, protein structure prediction is essentially a combinatorial optimization problem, and its enormous search space and complicated constraints make it a major challenge in computer science. This dissertation surveys de novo protein structure prediction and parallel metaheuristics, and its main content covers the search space, the search scheme, and the clustering scheme.
     1. Research on the search space of structure prediction. The fragment structures and their building methods for both backbone and side chains are summarized. A four-level model based on dynamic Bayesian networks is proposed for building rotamer libraries. The model considers the relation between the backbone and the four side-chain torsion angles, so it has a clear inference hierarchy and accords with the biological characteristics of protein molecules. It holds only one unknown parameter at each level, which reduces model complexity and, for the same amount of training data, mitigates the problem of sparse data to a certain extent. Experimental results show that the model produces high-quality results. Moreover, a method for assessing rotamer libraries with extreme and random conformations is proposed. Experiments on the CASP9 FM targets show that this method is effective.
     2. Research on parallel metaheuristics. Based on metaheuristics such as ACO, a parallel metaheuristic strategy is proposed whose main characteristics are task decomposition and experience feedback. A parallel metaheuristic search framework that fuses different energy functions and search strategies is proposed to address two problems: the optimization target is hard to quantify, and solutions are complex to construct. A task distribution strategy is designed for the prediction of GPCRs. Furthermore, algorithms for backbone and side-chain prediction are designed. For backbone prediction, the search scheme within one ant colony, the solution construction, the local search, and the parallel distribution strategy are implemented. Experiments on the data set of 16 small proteins used in a paper in Science and on the FM targets of CASP8 show that the proposed method is highly competitive.
     3. Research on protein structure clustering. It includes two aspects. First, an exemplar selection algorithm for clustering protein structures is proposed, building on the quality threshold (QT) and affinity propagation (AP) algorithms widely used in protein structure prediction. It uses statistical information to improve the ability to find the best conformation and does not depend on empirically chosen parameters. Second, a scheme for optimizing the similarity matrix based on energy is proposed, which provides a good basis for clustering. Experiments on authoritative data sets show that the proposed methods improve clustering performance and find decoys closer to the native structure.
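For readers unfamiliar with quality threshold (QT) clustering, one of the two baseline algorithms named in item 3, the following is a minimal Python sketch of the generic QT procedure on a precomputed distance matrix. It is not the exemplar selection algorithm proposed in the dissertation. The function names `qt_cluster` and `medoid`, the one-dimensional toy coordinates, and the threshold value are illustrative assumptions, and the medoid serves only as a simple stand-in for choosing a cluster representative. In a structure prediction setting the matrix entries would typically be pairwise structural distances between decoys.

```python
def qt_cluster(dist, threshold):
    """Generic quality-threshold clustering on a symmetric distance matrix."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        # Grow one candidate cluster from every remaining point.
        for seed in sorted(remaining):
            cluster = [seed]
            candidates = remaining - {seed}
            while candidates:
                # Pick the point whose farthest cluster member is nearest,
                # i.e. the point that keeps the cluster diameter smallest.
                nxt = min(sorted(candidates),
                          key=lambda j: max(dist[j][i] for i in cluster))
                if max(dist[nxt][i] for i in cluster) > threshold:
                    break  # adding any point would exceed the diameter limit
                cluster.append(nxt)
                candidates.remove(nxt)
            if len(cluster) > len(best):
                best = cluster
        # Keep the largest candidate cluster and repeat on what is left.
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


def medoid(dist, members):
    """Member with the smallest total distance to the other members
    (an assumed stand-in for a cluster-center rule)."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


# Toy data (assumed): six points on a line instead of real decoys.
coords = [0.0, 0.5, 1.0, 5.0, 5.4, 9.0]
dist = [[abs(a - b) for b in coords] for a in coords]

clusters = qt_cluster(dist, threshold=1.2)
center = medoid(dist, clusters[0])
print(clusters, center)
```

The threshold is the kind of hand-set parameter that item 3 says the proposed algorithm avoids depending on; here it directly determines how many clusters come out.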
     The major contributions of this dissertation are: a four-level model for building rotamer libraries, which considers the relation between the backbone and the four torsion angles and mitigates the problem of sparse data to a certain extent; a parallel metaheuristic search strategy suited to de novo protein structure prediction, which achieves considerable results in backbone prediction; and an exemplar selection algorithm for clustering protein structures together with a scheme for optimizing the similarity distribution, which improve the accuracy of optimal structure selection. Experiments show that this work benefits de novo protein structure prediction and is a valuable reference for future related research.
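As a rough illustration of the ACO mechanism that the search strategy in item 2 builds on, the sketch below shows the three generic steps of one colony: ants construct solutions by sampling from pheromone trails, trails evaporate, and the best solution found so far is reinforced. It is a serial toy under stated assumptions, not the dissertation's algorithm: the problem (pick one of several options at each position), the `energy` function, the `cost` table, and all parameter values are hypothetical, and no real energy function, local search, or parallel task allocation is modeled.

```python
import random


def energy(solution, cost, clash_penalty=1.0):
    """Toy energy (assumed): per-position cost plus a penalty whenever
    two neighbouring positions pick the same option."""
    total = sum(cost[i][c] for i, c in enumerate(solution))
    total += clash_penalty * sum(
        1 for a, b in zip(solution, solution[1:]) if a == b)
    return total


def aco_search(cost, n_ants=10, n_iters=50, rho=0.1, seed=0):
    """Minimal single-colony ACO loop over a position-by-option grid."""
    rng = random.Random(seed)
    n_pos, n_opt = len(cost), len(cost[0])
    tau = [[1.0] * n_opt for _ in range(n_pos)]  # pheromone trails
    best, best_e = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Construction: sample each position in proportion to pheromone.
            sol = [rng.choices(range(n_opt), weights=tau[i])[0]
                   for i in range(n_pos)]
            e = energy(sol, cost)
            if e < best_e:
                best, best_e = sol, e
        for i in range(n_pos):
            # Evaporation weakens every trail ...
            for k in range(n_opt):
                tau[i][k] *= 1.0 - rho
            # ... and feedback reinforces the best-so-far choices.
            tau[i][best[i]] += 1.0 / (1.0 + best_e)
    return best, best_e


# Hypothetical cost table: 4 positions, 3 options each.
cost = [[0.2, 1.0, 0.7],
        [0.9, 0.1, 0.5],
        [0.3, 0.8, 0.05],
        [0.6, 0.4, 0.9]]

best, best_e = aco_search(cost)
print(best, best_e)
```

The reinforcement step is where "experience feedback" enters in this toy: information from good solutions biases later construction. A parallel variant would distribute ants or whole colonies across workers, which is deliberately left out here.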
