蛋白质构象预测算法的研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
蛋白质的天然构象是由其氨基酸序列确定的,而蛋白质的生物学功能在很大程度上又依赖于其构象,因此蛋白质构象预测是蛋白质研究中发展已久但仍具有挑战性的问题,是后基因组时代生命科学中重大的研究课题之一。
     研究发现,蛋白质的天然构象形式完全包含在组成其分子的氨基酸序列的信息之中,这一观点奠定了蛋白质构象预测的理论计算基础。事实上,目前已有许多基于HP模型求解蛋白质构象预测问题的算法,而且取得了一定的成果,其中最具代表性的有蒙特卡罗算法、遗传算法、近似算法、基于重要性抽样的SISPER算法以及基于裁减复制策略的PERM算法等,但从求解效率来看,此类算法还存在较大的提升空间。本文分别将改进的蚁群优化算法和模拟退火算法应用于两种蛋白质构象简化HP模型的预测问题,使问题的求解效率得到了进一步提高。
     本文采用改进的蚁群优化算法求解基于格点模型的蛋白质构象预测问题。针对以往算法中存在容易产生非法构象和算法运行时间长的问题,本文提出用“克隆”的方法处理非法构象;用“单点变异并向前重构”的方法用于局部搜索阶段来缩短算法运行的时间。通过实验证明,这两点改进在算法中的应用是正确有效的。
     基于非格点模型的蛋白质构象预测问题,可以视为一个连续函数优化问题,因此本文采用改进的模拟退火算法对其进行求解。针对模拟退火算法的特点提出了三点改进方法:增加记忆功能,限制接受退化解向量以及邻域的再次搜索。数值实验结果表明,算法结构简单,达到最优解的效率高,是一种较好的启发式连续全局优化算法。
     从上述两种算法得到的构象模型可以看出,HP模型虽然简单,但能够反映出蛋白质折叠构象的一些简单性质,即在蛋白质天然构象中,疏水氨基酸残基总是被极性氨基酸残基所包围,形成一个疏水核心。由此表明两种改进算法用于蛋白质构象预测是可行有效的。
     最后,本文实现了一个简单的蛋白质折叠构象的图形模拟系统,进一步验证了论文中给出的两种算法的可行性和有效性。
The protein natural structure is decided by its amino-acid sequence, and its biological functions are dependent on this structure extensively. So prediction of protein structure is a long historic task, and still a challenge in the research of protein, which is becoming an important research domain in the life science on the post-genome era.
     Researches have shown that protein’s natural structure is decided by their amino-acid sequences, which is theoretical computation base of the protein’s structure prediction. Many algorithms have been proposed for the protein structure prediction problem in simple HP model, and have get some fruits, such as Monte Carlo algorithm, the genetic algorithm, approximate algorithm, SISPER and PERM. Shown from the efficiency of solving this problem, those algorithms need to advance. In this paper, we apply improved Ant Colony Optimization Algorithm (ACO) and improved Simulated Annealing Algorithm (SA), to the prediction of protein structure in two kinds of simple HP model respectively, advancing the efficiency of solving this problem.
     For the protein structure prediction problem based on lattice model, we apply improved ACO to solve it. In old algorithm, the infeasible structures are frequently encountered and required long CPU time. For those problems, propose a“Clone”method to deal with the infeasible structures, and propose a“Point mutation and Reconstruction”method to reduce CPU time in local search phase. Shown from the empirical results, two methods are accurate and feasible.
     For the protein structure prediction problem based on off-lattice model, it can be treated as a continuous function optimization problem, we apply improved SA to solve it. For the characteristic of SA, propose three improvements in this paper: increasing memorial function; restrained accepting exasperate solution; search in neighborhood. Numerical results illustrate that this algorithm has simple configuration and high efficiency to get optimization solution, and is very suitable for continuous global optimization.
     Shown from the model of the structure which is get from the algorithm, although HP model is simple, it shows the structure’s characters. It’s that in the natural structure, the hydrophobic residues are always surrounded by the polar residues, forming a hydrophobic core. It illustrate that two improved algorithms are feasible and effective for the protein structure prediction.
     In the end, we realize the simple graphics simulation system for protein folding structures, farther it demonstrates the feasibility and validity of the algorithms in this paper.
引文
1陈源,李朝军.虚拟细胞[J].细胞生物学杂志. 2004, 26 (3): 231~234
    2 N. Ishii, M. Robert, Y. Nakayama, A. Kanai, M. Tomita. Toward large-scale modeling of the microbial cell for computer simulation [J]. Journal of Biotechnology. 2004, 113(1-3):281~294
    3 K. Takahashi, N. Ishikawa, Y. Sadamoto. E-Cell 2: Multi-platform E-Cell simulation system [J]. Bioinformatics. 2003, 19(13):1727~1729
    4 K. Takahashi, K. Yugi, K. Hashimoto. Computational challenges in cell simulation [J]. IEEE Intelligent Systems. 2002, 17(5):64~71
    5赵明生,尚彤,孙冬泳.电子细胞的研究现状与展望[J].电子学报. 2001, 29: 1740~1743
    6杨冬,欧阳红生,王云龙,逄大欣.虚拟细胞研究进展及应用价值[J].细胞与分子免疫学杂志. 2005, 21: 65~67
    7赵南明,周海梦.生物物理学.高等教育出版社. 2000: 3~5
    8 M. Tomita. Whole cell simulation: a grade challenge of the 21th century [J]. Trends Biotechnol. 2001, 19(6):205~210
    9 A. Pettinen, T. Aho, O. Smolander, T. Manninen. Simulation tools for biochemical networks: evaluation of performance and usability [J]. Bioinformatics. 2005, 21(3): 357~363
    10 P. Ortoleva. Model of a Living Cell [J]. College of Arts & Sciences Alumni Association. 2001, 46:1~2
    11 E. L. Weitzke, P. J. Ortoleva. Simulating cellular dynamics through a coupled transcription, translation, metabolic model [J]. Computational Biology and Chemistry. 2003, 27: 469~480
    12 C. B. Anfinsen. Principles that Govern the Folding of Protein Chains [J]. Science. 1973, 181: 223~227
    13 R. Helling, H. Li, R. Mélin. The Designability of Protein Structures [J]. Journal of Molecular Graphics and Modeling. 2001, 19: 157~167
    14 M. S. Johnson, N. Srinivasan, R. Sowdhamini. Knowledge-based protein modeling [J]. Critical Reviews in Biochemistry and Molecular Biology. 1994, 29(1): 1~68
    15 S. Siebert, R. Backofen. MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons [J]. Bioinformatics. 2005, 21(16): 3352~3359
    16 E. Ising. Beitrag Zur Theorie des Ferromagnetismus [J]. Z. Physik. 1925, 31: 253~258
    17 W. Heisenberg. Beitrag Zur Theorie des Ferromagnetismus [J]. Z. Physik. 1928, 49: 619~636
    18方茜,朱正佑,王翼飞.应用于蛋白质折叠模拟的三种新蒙特卡罗方法的比较[J].上海大学学报(自然科学版). 2004, 10 (5): 526~531, 536
    19 F. Liang, W. H. Wong. Evolutionary Monte Carlo for Protein Folding Simulations [J]. The Journal of Chemical Physics. 2001, 115(7): 3374~3380
    20 R. Unger, J. Moult. Genetic Algorithms for Protein Folding Simulations [J]. J. Mol. Biol. 1993, 231(1): 75~81
    21 R. Unger, J. Moult. Genetic Algorithms for Three Dimensional Protein Folding Simulations [J]. In Proc. 5th Intl. Conf. On Genetic Algorithms. 1993, 581~588
    22 B. Berger, T. Leighton. Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete [J]. Journal of Computational Biology. 1998, 5(1): 27~40
    23 G. Chikenji, M. Kikuchi, Y. Iba. Multi-Self-Overlap Ensemble for Protein Folding: Ground State Search and Thermodynamics [J]. Physical Review Letters. 1999, 83(9): 1886~1889
    24 S. Kirkpatrick, C. D. Gelatt, M. P. Vecchi. Optimization by Simulated Annealing [J]. Science. 1983, 220: 671~680
    25 K. A. Dill. Theory for the Folding and Stability of Globular Proteins [J]. Biochemistry.1985, 24: 1501~1509
    26牛晓辉,李娜娜.免疫算法在蛋白质折叠模拟中的应用[J].数学杂志. 2004, 24(3): 313~316
    27 L. Ingber. Simulated Annealing: Practice versus Theory [J]. Mathematical and Computer Modeling. 1993, 18: 29~57
    28周爱儒.生物化学(第6版).人民卫生出版社. 2004, 5: 7~18
    29 T. Narumi, R. Susukita, T. Ebisuzaki, G. McNiven, B. Elmegreen. Molecular dynamics machine: Special-purpose computer for molecular dynamicssimulations [J]. Molecular Simulation. 1999, 21: 401~415
    30阎隆飞,孙之荣.蛋白质分子结构.清华大学出版社. 1999: 20~30
    31 H. S. Chan. Protein folding - matching the speed and locality [J]. Nature. 1998, 392(6678): 761~763
    32 A. Gutin, A. Sali, V. Abkevich, M. Karplus, E. I. Shakhnovich. Temperature Dependence Of The Folding Rate In A Simple Protein Model: Search For A "Glass" Transition [J]. Journal of Chemical Physics. 1998, 108 (15): 6466~6483
    33 H. Li, R. Helling, C. Tang, N. Wingreen. Emergence of Preferred Structures in a Simple Model of Protein Folding [J]. Science. 1996, 273(5275): 666~669
    34 K. A. Dill. Dominant forces in protein folding [J]. Biochemistry. 1990, 29: 7133~7155
    35 K. F. Lau, K. A. Dill. A Lattice Statistical Mechanics Model of the Conformation and Sequence Space of Proteins [J]. Macromolecules.1982, 22: 3968~3997
    36 K. A. Dill, H. S. Chan. From Levinthal to pathways to funnels [J]. Nature Structural & Molecular Biology. 1997, 4: 10~19
    37 F. H. Stillinger, H. Gordon, C. L. Hirshfeld. Toy model for protein folding [J]. Physical Review E. 1993, 48: 1469~1477
    38 F. H. Stillinger. Collective aspects of protein folding illustrated by a toy model [J]. Physical Review E. 1992, 52(32): 2872~2877
    39 B. Berger, J. Kleinberg, T. Leighton. Reconstructing a three-dimensional model with arbitrary errors [J]. In Proceedings of the 28th ACM Symposium on Theory of Computing. 1996: 449~458
    40 E. Bonabeau, M. Dorigo, G. Theraulaz. Inspiration for optimization from social insect behaviour [J]. Nature. 2000, 406(6): 39~42
    41 A. Colorni, M. Dorigo. Heuristics from nature for hard combinatorial optimization problems [J]. International Trans Operational Research. 1996, 3(1): 1~21
    42汪镭,吴启迪.蚁群算法在连续空间寻优问题求解中的应用[J].控制与决策. 2003, 18(1):45~48
    43唐泳,马永开.用改进的蚁群算法求解函数优化问题[J].计算机应用研究. 2004, 18(9):89~91
    44杨勇,宋晓峰.蚁群算法求解连续空间优化问题[J].控制与决策. 2003, 18(5):573~576
    45 M. Dorigo, E. Bonabeau. Ant algorithm and stigmergy [J]. Future Generation Computer Systems. 2000, 16(8): 851~871
    46 M. Dorigo, L. M. Gambardella. Ant Colony System: A Cooperative Learning Approach to Traveling Salesman Problem [J]. IEEE Trans. On Evolutionary Computation, 1997, 1(1): 53~66
    47 A. Shmygelska, R. Hernandez, H. H. Hoos. An Ant Colony Optimization Algorithm for the 2D HP Protein Folding Problem [J]. In Proc of the 3rd Intl Workshop on Ant Algorithms, ANTS LNCS 2463 Springer Verlag. 2002: 40~52
    48 A. Shmygelska, H. H. Hoos. An Improved Ant Colony Optimization Algorithm for the 2D HP Protein Folding Problem [J]. In Proc of the 16th Canadian Conference on Artificial Intelligence, LNCS 2671 Springer Verlag. 2003:400~417
    49 A. Shmygelska. Search for folding nuclei in native protein structures [J]. Bioinformatics. 2005, 21: 394~402
    50 A. Shmygelska, H. H. Hoos. An ant colony optimization algorithm for the 2D and 3D hydrophobic polar protein folding problem [J]. BMC Bioinformatics. 2005, 6: 1~22
    51 N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller. Equations of State Calculation by Fast Computing Machines [J]. The Journal of Chemical Physics. 1953, 21: 1087~1092
    52 A. Corana, M. Marchesi, C. Martini, S. Ridella. Minimizing multimodal functions of continuous variables with the“simulated annealing”algorithm [J]. ACM Transactions on Mathematical Software. 1987, 13(3): 262~280
    53 A. Corana, C. Martini, S. Ridella. Corrigenda:“Minimizing multimodal functions of continuous variables with the‘simulated annealing’algorithm”[J]. ACM Transactions on Mathematical Software. 1989, 15(3): 287

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700