用户名: 密码: 验证码:
基于改进磁滞优化算法的三维蛋白质折叠问题研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
蛋白质折叠问题是当前生物信息技术研究的核心问题之一,它所要解决的是蛋白质一级结构中的氨基酸序列如何折叠成三维空间结构的问题。研究蛋白质的折叠机理对完成生物信息传递的整个过程,进而揭示生命活动的规律具有重要意义。近年来,智能计算在蛋白质折叠问题的建模、预测和优化中发挥着越来越重要的作用。本文将一种新型物理优化算法——磁滞优化算法(HO),应用于三维蛋白质结构的预测问题中;并在此工作的基础上,深入的研究了磁滞优化算法和蛋白质折叠的特性,提出了有效的改进方法。
     首先,本文首次将磁滞优化算法扩展于求解三维蛋白质折叠问题。磁滞优化算法受启发于物理学中的交流去磁过程,通过模拟交变衰减的外磁场的去磁过程来求解优化问题。它由Zar a nd等人于2002年针对自旋玻璃模型的求解提出。通过分析仿真结果得出结论:磁滞优化算法由各种“构造块”组成,其应用于求解蛋白质折叠问题的关键在于如何合理定义各“构造块”使磁滞优化算法的原理、蛋白质结构及目标函数(最低能量态)三者之间建立相互联系,各“构造块”能够协同作用以达到好的求解效果。
     进而,基于之前的研究工作,提出了成功的算法改进意见,仿真结果在与其它智能算法的应用结果比较中,不仅取得较好的结果,而且计算速度也更快;特别是对于长度为64的序列找到了最低能量值-57的构象,而在其它的算法求解中只找到最低能量-56的构象。
     最后,对应用磁滞优化算法求解三维蛋白质折叠问题进行了总结,并提出了今后研究工作的展望。
Protein folding problem is one of the core research areas in bioinformatics, and describes how an amino acid sequence folds into a specific spatial protein. Due to the critical positions of its studies in the process of transmitting bio-information and the living organisms discovered, the intelligent computation has been playing more and more important role in modeling, prediction and optimization for protein folding systems. In this thesis, a novel optimization algorithm, so-called Hysteretic Optimization (HO) is applied to dealing with 3D protein folding problem with lattice model. An effective improved method to cope with the 3D protein folding problem is proposed, according to the characteristics of HO and 3D-protein folding problem.
     This study involves the first application to solving three-dimensional protein folding problem with HO, which is inspired by ac demagnetization (ACD) procedure in magnetic systems proposed in 2002 by Zarand and his co-workers. In this study, the solutions start from benchmark data inputs, problem formulation, to the algorithm development and implementations. We conclude HO consists of a number of building blocks, and the vital factor of employing HO to protein folding problem is to make proper definitions on the key ingredients of HO in order to establish the relationships connecting with the HO principle, protein folding structure, and the energy of permutation as well.
     Then based on the numerous previous publications, a proposed modified HO algorithm is developed and successfully implemented for studing 3D protein folding problems. And the benchmark based numerous simulation results show the efficiency of the proposed HO method. Especially, for the protein formed by 64 amino acids, we first find a conformation with energy-57 smaller than the minimal value -56 before. Finally, the applications of hysteresis optimization algorithm in 3D protein folding problem is summarized, and the future research work is proposed in concluding remarks..
引文
[1]范庐君.基于SVM的蛋白质折叠子识别研究[D].硕士学位论文,上海大学,2009.
    [2]顾万君.基因密码子使用和蛋白质结构的生物信息学分析[D].博士学位论文,浙江大学,2004.
    [3]查娟.基于磁滞优化和极值优化算法的蛋白质折叠问题研究[D].硕士学位论文,浙江大学,2011.
    [4]Anfinsen, C. B., Haber, E., Sela, M., et al. The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain[J]. Proceedings of the National Academy of Sciences of the United States of America,1961,47(9):1309.
    [5]林晓丽.基于AB非格模型与遗传退火算法的蛋白质折叠结构预测[D].硕士学位论文,武汉科技大学,2007.
    [6]王琼.浅谈疯牛病之病因及其检测手段[J].化学教育,2002,23(9):1-2.
    [7]严玉霖,陈玲,高洪.疯牛病致病机理研究进展[J].动物医学进展,2003,24(6):49-52.
    [8]Reiss, C., Lesnik, T., Parvez, H., et al. Conformational toxicity and sporadic conformational diseases[J]. Toxicology,2000,153(1-3):115-121.
    [9]李一凡,张勇.蛋白质巯基亚硝基化与帕金森病[J].生命的化学,2006,26(6):543-546.
    [10]Anfinsen, C. B.. Principles that govern the folding of protein chains[J]. Science, 1973,181(4096):223-230.
    [11]邹承鲁.第二遗传密码:新生肽链及蛋白质折叠的研究[M].长沙:湖南科学技术出版社,1997.
    [12]UniProtKB/Swiss-Prot.Protein knowledgebase release 54.6 statistics[EB/OL], http://expasy.org/sprot/relnotes/relstat.html,2012,2.
    [13]PDB Statistics:Growth of Released Structures Per Year[EB/OL], http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100, 2012.2
    [14]江凡.蛋白质空间结构的实验技术和理论方法[J].物理,2007,36(4): 271-279.
    [15]Berger, B., Leighton, T. Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete[J]. Journal of Computational Biology,1998,5(1):27-40.
    [16]Wang, C. F., Wu, H.. Distribution of antibody in tissues[J]. Nature,1939,143: 565.
    [17]Huang K.. Lectures on statistical physics and protein folding[M]. Singapore: World Scientific Pub Co Inc,2005.
    [18]Funding for Global Protein Database Will Create One Reliable Resource http://www.genome.gov/page.cfm?pageID=10005283,2012,2
    [19]Bairoch, A., Apweiler, R. The SWISS-PROT protein sequence data bank and its new supplement TREMBL[J]. Nucleic Acids Research,1996,24(1):21.
    [20]Boeckmann, B., Bairoch, A., Apweiler, R., et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003[J]. Nucleic Acids Research, 2003,31(1):365-70.
    [21]Wu, C. H., Yeh, L. S. L., Huang, H., et al. The protein information resource[J]. Nucleic Acids Research,2003,31(1):345-347.
    [22]王海瑜.基于多分类器组合的蛋白质结构预测研究[D].硕士学位论文,西北工业大学,2004.
    [23]汪婷.基于遗传禁忌算法的蛋白质三维折叠结构预测[D].硕士学位论文,武汉科技大学,2010.
    [24]李婷婷.面向蛋白质折叠结构问题的粒子群优化算法的改进研究[D].硕士学位论文,武汉科技大学,2008.
    [25]Dill, K. A. Theory for the folding and stability of globular proteins [J]. Biochemistry,1985,24(6):1501-1509.
    [26]Stillinger, F. H., Head-Gordon, T., Hirshfeld, C. L.. Toy model for protein folding[J]. Physical Review E,1993,48(2):1469-1477.
    [27]黄文奇,黄勤波,石赫.求解蛋白质结构预测问题的二维连续模型及其相应的拟物算法[J].计算机研究与发展,2004,41(11):1959-1965.
    [28]陈凤飞.蛋白质结构预测的三角化模型和算法[D].硕士学位论文,华中科技大学,2007.
    [29]Dubchak, I., Muchnik, I., Mayor, C., et al. Recognition of a protein fold in the context of the SCOP classification[J]. Proteins:Structure, Function, and Bioinformatics,1999,35(4):401-407.
    [30]Ding, C. H. Q., Dubchak, I.. Multi-class protein fold recognition using support vector machines and neural networks[J]. Bioinformatics,2001,17(4):349-358.
    [31]Bonomi, E., Lutton, J. L.. The N-city travelling salesman problem:Statistical mechanics and the Metropolis algorithm[J]. SIAM review,1984,26:551-568.
    [32]Rossier, Y., Troyon, M., Liebling, T. M.. Probabilistic exchange algorithms and Euclidean traveling salesman problems[J]. OR Spectrum,1986,8(3):151-164.
    [33]Knox, J. E.. The application of tabu search to the symmetric traveling salesman problem[D]. Ph. D. Thesis, University of Colorado,1988.
    [34]Malek, M. (1988), "Search methods for traveling salesman problems", Working Paper, University of Texas, Austin, TX.
    [35]Fiechter, C.-N. (1990), "A parallel tabu search algorithm for large scale traveling salesman problems", Working Paper 90/1, Departement de Mathematiques, Ecole Polytechnique Federate de Lausanne.
    [36]Goldberg, D. E., Lingle Jr, R.. AllelesLociand the Traveling Salesman Problem[C]. Proc. of 1st International Conf. on genetic algorithms and their applications,1985:154-159.
    [37]Hopfield, J. J., Tank, D. W.. "Neural" computation of decisions in optimization problems[J]. Biological cybernetics,1985,52(3):141-152.
    [38]Dorigo, M., Gambardella, L. M.. Ant colony system:A cooperative learning approach to the traveling salesman problem[J]. IEEE Transactions on Evolutionary Computation,1997,1(1):53-66.
    [39]Robuste, F., Daganzo, C. F., Souleyrette, R. R.. Implementing vehicle routing models[J]. Transportation Research Part B:Methodological,1990,24(4):263-286.
    [40]Laporte G., Osman L. H.. Metastrategy in combinatorial optimization:a bibliography[J]. Annals of Operations Research,1996,63:513-628.
    [41]Osman, I. H.. Metastrategy simulated annealing and tabu search algorithms for the vehicle routing problem[J]. Annals of operations research,1993,41(4):421-451.
    [42]Taillard, E.. Parallel iterative search methods for vehicle routing problems[J]. Networks,1993,23(8):661-673.
    [43]Gendreau, M., Hertz, A., Laporte, G. A tabu search heuristic for the vehicle routing problem[J]. Management Science,1994,1276-1290.
    [44]Xu, J., Kelly, J. P.. A network flow-based tabu search heuristic for the vehicle routing problem[J]. Transportation Science,1996,30(4):379-393.
    [45]Rego, C.. A subpath ejection method for the vehicle routing problem[J]. Management Science,1998,44(10):1447-1459.
    [46]Bullnheimer, B., Hartl, R. F., Strauss, C.. Applying the Ant System to the vehicle routing problem, in:Meta-Heuristics:Advances and Trends in Local Search Paradigms for Optimization, eds. S.Voss, S. Martello, I.H. Osman and C. Roucairol, Kluwer, Boston,1998.
    [47]Bullnheimer, B., Hartl, R. F., Strauss, C.. An improved ant System algorithm for thevehicle Routing Problem[J]. Annals of operations research,1999,89(0):319-328.
    [48]Thangiah, S. R., Nygard, K. E., Juell, P. L.. Gideon:A genetic algorithm system for vehicle routing with time windows[C]. IEEE,1991:322-328.
    [49]Thangiah, S. R., Osman, I. H., Sun, T.. Hybrid genetic algorithm, simulated annealing and tabu search methods for vehicle routing problems with time windows[J]. Technical Report SRU CpSc-TR-94-27, Computer Science Department, Slippery Rock University,1994.
    [50]Boettcher, S.. Extremal optimization for Sherrington-Kirkpatrick spin glasses[J]. The European Physical Journal B-Condensed Matter and Complex Systems,2005, 46(4):501-505.
    [51]曾国强.改进的极值优化算法及其在组合优化问题中的应用研究[D].博士学位论文,浙江大学,2011.
    [52]Unger, R., Moult,-J..-A genetic algorithm for 3D protein folding simulations[C]. Proc.5th Int. Conf. on Genetic Algorithms,1993:581-588.
    [53]Liu, J., Wang, L., He, L., et al. Analysis of toy model for protein folding based on particle swarm optimization algorithm[J]. Advances in Natural Computation, 2005,3612:444-444.
    [54]Agostini, F. P., Soares-Pinto, D. D. O., Moret, M. A., et al. Generalized simulated annealing applied to protein folding studies[J]. Journal of computational chemistry,2006,27(11):1142-1155.
    [55]Hsu, H. P., Mehra, V., Nadler, W., et al.. Growth algorithms for lattice heteropolymers at low temperatures[J]. The Journal of chemical physics,2003, 118(1):444-452.
    [56]吕志鹏.蛋白质结构预测的现实求解方法——高效启发式优化算法[D].博 士学位论文,华中科技大学,2007.
    [57]Montelione, G. T., Zheng, D., Huang, Y. J.. Protein NMR spectroscopy in structural genomics[J]. Nature Structural Biology,2000,7(11):982-985.
    [58]primary protein structure, http://protein-pdb.com/2011/10/04/primary-protein-structure/,2012.2
    [59]Boczko, E. M., Brooks, C. L.. First-Principles Calculation Of The Folding Free-Energy Of A 3-Helix Bundle Protein[J]. Science,1995,269:393-396.
    [60]张红娟,基于非格点模型的蛋白质结构预测研究[D].硕士学位论文,大连理工大学,2005.
    [61]王向红,章林溪,赵得禄.蛋白质分子的HNP格点模型[J].高分子学报,2004,(2):273-276.
    [62]Zarand, G., Pazmandi, F., Pal, K. F.. Using hysteresis for optimization [J]. Physical review letters,2002,89(15):150201.
    [63]Palaeomagnetism, http://www.egglescliffe.org.uk/physics/chur/palaeomag/palaeomag.html,2012.2
    [64]A Revolution in Current Transformer Testing, http://www.utilityproducts.com/articles/print/volume-8/issue-4/product-focus/test-m anagement/a-revolution-in-current-transformer-testing.html.2012.2
    [65]Pal, K. F.. Hysteretic optimization for the traveling salesman problem[J]. Physica A:Statistical Mechanics and its Applications,2003,329(1):287-297.
    [66]Zha, J., Zeng, G. Q., Lu, Y. Z.. Hysteretic optimization for protein polding on the lattice[C]. IEEE,2010.
    [67]Rosenbluth, M. N., Rosenbluth, A. W.. Monte Carlo calculation of the average extension of molecular chains[J]. The Journal of chemical physics,1955,23: 356-359.
    [68]Wall F. T., Erpenbeck J. J.. The Chain-Enrichment Procedure [J]. Journal of Chemical Physics,1959,30:364-369.
    [69]程念华,黄文奇.求解蛋白质折叠问题的改进PERM算法[D].硕士学位论文,华中科技大学,2005.
    [70]黄文奇,崔茂林.求解蛋白质折叠构型预测问题的PERM改进算法[J].微计算机应用,2004,25(3):268-273.
    [71]LU H, YANG G. Extremal Optimization for protein folding simulations on the lattice [J]. Computers & Mathematics with Applications,2009,57(11-12): 1855-1861.
    [72]陈矛,黄文奇,吕志鹏.求解HP模型蛋白质折叠问题的改进PERM算法[J].计算机研究与发展,2007,44(9):1456-1461.
    [73]Hoque, T., Chetty, M., Dooley, L. S.. A guided genetic algorithm for protein folding prediction using 3D hydrophobic-hydrophilic model[C]. IEEE Congress on Evolutionary Computation,2006:2339-2346.
    [74]Shmygelska, A., Hoos, H.. An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem[J]. BMC bioinformatics,2005,6(1): 30.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700