Research on Protein Contact Map Prediction
Abstract
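A fundamental and still unsolved problem in bioinformatics is predicting the three-dimensional structure of a protein from its amino acid sequence. Obtaining a protein's three-dimensional structure by full molecular modeling remains very difficult, so an intermediate step, predicting the contact map between protein residues, has emerged and developed rapidly. A protein contact map carries important information about protein folding and spatial structure, so solving the contact map prediction problem matters greatly for protein fold recognition. Once the contact map is available, reconstructing the tertiary structure becomes comparatively simple, and methods that rebuild the tertiary structure from contact map information are maturing. In addition, among protein structure comparison methods, contact map overlap is the only approach that requires no precomputation on the protein structures. Because the structure of a protein determines its function, solving the contact map prediction problem is of great significance for both spatial structure prediction and function prediction. Computational intelligence is a bio-inspired computing approach that simulates and studies intelligent behavior from the lowest biological level. It extends traditional computing models, has an outstanding ability to reason and learn in uncertain and imprecise environments, and is an effective computational tool for building intelligent systems. With the Human Genome Project and the completion of further genome sequencing projects, computational intelligence has been widely applied in bioinformatics. Building on a comprehensive analysis of the state of research, the current focal points, and the development trends in protein structure prediction, this dissertation concentrates on applying artificial neural networks and artificial immune systems to protein contact map prediction. Its main contributions and results are as follows: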
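     (1) A comprehensive survey of the background, current state, significance, and related concepts of protein contact map prediction.
     (2) A survey of bioinformatics in the post-genome era and of protein structure and the principles and methods of its prediction; an exposition of the relevant theory of computational intelligence; an introduction to the basic theory of artificial neural networks and artificial immune systems; and a comprehensive summary of the applications of computational intelligence in bioinformatics.
     (3) A protein contact map prediction method based on a recurrent neural network with deviation (bias) units.
     (4) A protein contact map prediction method based on a transiently chaotic neural network.
     (5) A protein contact map prediction model based on the clonal selection algorithm.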
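     The results of this dissertation enrich applied research on computational intelligence theory, have theoretical significance and practical value for research on recurrent neural networks, chaotic neural networks, and the clonal selection algorithm, and provide useful methods and tools for protein contact map prediction.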
Computational intelligence (CI) is a nature-inspired computing methodology that simulates and studies intelligent behavior from the lowest biological level. CI extends traditional styles of computation and provides a new approach to solving complex problems. It can reason and learn in uncertain and imprecise environments, which makes it a powerful computational tool for building more intelligent systems. Its main methods include fuzzy systems, artificial neural networks, genetic algorithms, and artificial immune systems. CI has advanced rapidly in recent years and has found applications in many fields, such as pattern recognition, machine learning, knowledge discovery, and data mining. One major area of application is a recently emerged branch of science: bioinformatics. With the accomplishment of the Human Genome Project (HGP) and the completion of further genomes, CI will play a larger role in computational biology and bioinformatics. CI has been used to analyze genomic sequences, protein structures and folds, and gene expression data. It has also been used for fast sequence comparison and database search, automated gene identification, and efficient modelling and storage of heterogeneous data. Because this work entails processing huge amounts of incomplete or ambiguous biological data, the learning ability of neural networks, the uncertainty-handling capacity of fuzzy sets, and the search potential of genetic algorithms are used synergistically. CI offers several possibilities in bioinformatics, particularly by generating low-cost, low-precision, good solutions.
     Proteins, macromolecules encoded by DNA whose chemical unit is the amino acid, are of great importance to human biological activity. Amino acids combine into a continuous long chain with a spatial structure, and a protein comes into being. Proteins are the basic components of life and are responsible for carrying out the functions of the cell. Genome sequencing results indicate that the human body contains about one hundred thousand different kinds of proteins, each with a unique function and purpose, and that a protein realizes its function through the interaction between its structure and other molecules. Knowing the structure of a protein is therefore the key to understanding its function. It is no exaggeration to say that protein structure prediction is one of the most important research areas of bioinformatics in the twenty-first century. In the post-genome era, the sharp increase in biological information calls for computer-based batch processing methods, which led to the birth of bioinformatics. Currently, the main research fields of bioinformatics are gene regulation and the study of protein structure and function, and protein structure prediction is the preliminary step of the latter. Within it, secondary structure prediction has matured, whereas 3D structure prediction is still at an early stage and needs further investigation.
     Current protein structure prediction methods can be broadly classified into ab initio prediction based on the minimum-energy principle and methods that learn from information about related proteins. Each has its strengths and shortcomings. The energy minimization method is more adaptive and highly independent, but the energy function is hard to formulate. Even if a comparatively precise energy function is constructed, the enormous computational scale caused by the numerous parameters, together with the tiny energy differences between conformations, which are only on the order of 1 kcal/mol, makes prediction difficult. Prediction using related information is more precise, especially for homologous proteins, but it is severely restricted by the database of known protein structures and is less general.
     The accuracy of methods that predict three-dimensional structures directly from amino acid sequences is not yet high enough, so intermediate steps, such as residue contact prediction and residue spatial distance prediction, have been put forward and have developed rapidly in recent years. Contacts between protein residues constrain protein folding and characterize different protein structures. Solving the contact prediction problem may therefore be very useful in protein fold recognition and de novo design. It is much easier to obtain the major features of the three-dimensional (3D) structure of a protein if the residue contacts are known for its sequence, and methods that reconstruct a protein structure from its contact map have been developed. Similarity based on contact map overlap is the only approach to structural comparison that does not require a pre-calculated set of residue equivalences, since finding those equivalences is itself one of the goals of the method.
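     As an illustration of the residue contacts discussed above, the following minimal sketch derives a binary contact map from residue coordinates by thresholding pairwise distances. The function name `contact_map`, the 8 angstrom cutoff, the minimum sequence separation, and the toy coordinates are assumptions made for this example; they are not taken from the dissertation.

```python
import math

def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a symmetric binary contact map as a list of lists.

    coords: one (x, y, z) tuple per residue, e.g. C-alpha positions.
    threshold: distance cutoff in angstroms (illustrative value).
    min_separation: smallest sequence distance |k - l| that is considered.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for k in range(n):
        for l in range(k + min_separation, n):
            # Two residues are "in contact" when their distance is below the cutoff.
            if math.dist(coords[k], coords[l]) < threshold:
                cmap[k][l] = cmap[l][k] = 1
    return cmap

# Toy straight chain with 3.8 angstrom spacing between consecutive residues.
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

     In practice the coordinates would come from a structure file, and a larger `min_separation` is often used so that trivially close sequence neighbours are not counted as contacts.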
     There are a variety of measures of residue contact in the literature. Some use the distance between Cα atoms, while others prefer the distance between Cβ atoms. Contact maps are two-dimensional, binary representations of protein structures. For a protein with N residues, the contact map entry for each pair of amino acids k and l (1 ≤ k, l ≤ N) has the value C(k,l) = 1 if a suitably defined distance d(k,l) is below a chosen threshold, and C(k,l) = 0 otherwise.
     Based on an analysis of the current state of research, the research focuses, and the development trends in protein structure prediction, this dissertation focuses on the application of artificial neural networks and artificial immune systems to the prediction of protein contact maps. The main contributions of this dissertation are summarized as follows:
     (1) This dissertation surveys protein structure prediction and the prediction of protein contact maps, including their background, the state of research, and their significance.
     (2) The relevant computational intelligence theories are expounded, including the Deviation Units Recurrent Neural Network, the Transiently Chaotic Neural Network, Artificial Immune Systems, and the Clonal Selection Algorithm. A survey of computational intelligence in bioinformatics is also given, covering protein structure prediction, prediction of protein contact maps, multiple sequence comparison, and gene expression data.
     (3) To address the slow learning speed of the BP neural network, a Deviation Units Recurrent Neural Network model is presented, based on the Jordan and Elman neural networks. Its weight-adjustment method is developed from the BP algorithm. Simulations on fault diagnosis are performed with this network model. The experimental results show that this model converges faster than the traditional BP network and is practical.
     In this dissertation, we capture two features of the amino acids: predicted secondary structure and hydrophobicity. The predicted secondary structures for each protein are obtained by using DSSP. We use 3 neurons to denote the 6 possible secondary structure pairs, since an amino acid residue has three possible secondary structures: α-helix, β-sheet, and coil. Hydrophobicity is a measure of the nonpolarity of the side chains. As the nonpolarity (hydrophobicity) of a side chain increases, the side chain avoids contact with water and is buried within the nonpolar core of the protein. This is seen as the essential driving force in protein folding. This quantity is used to encode residue-specific information for the network. Since the hydrophobicity of a residue affects the non-covalent bonding with its surroundings, it can be a contributing factor in deciding whether that residue is in contact with others. A major characteristic of the neural network in this thesis is that it has ten conjunction units, which take into account the influence of neighboring pairings and global sequence correlation. Another important characteristic is that the network uses a novel binary input encoding. The method assigns protein contacts with an average accuracy of 0.26 and an improvement over a random predictor by a factor greater than 8.
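     The pair features described above can be illustrated with a small sketch. It rests on stated assumptions: the abstract does not spell out how 3 neurons represent the 6 unordered secondary-structure pairs, so the multi-hot scheme below (one neuron per structure class, switched on if either residue has that class) is only one possibility that yields 6 distinct codes. The Kyte-Doolittle hydrophobicity scale, its rescaling to [0, 1], and all function names are likewise choices made for illustration, not the dissertation's encoding.

```python
from itertools import combinations_with_replacement

SS_CLASSES = ("H", "E", "C")  # helix, sheet, coil

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode_ss_pair(ss_a, ss_b):
    """Three binary inputs: one per class, set if either residue has that class."""
    return [1 if c in (ss_a, ss_b) else 0 for c in SS_CLASSES]

def scaled_hydrophobicity(residue):
    """Hydropathy rescaled linearly to the range [0, 1]."""
    lo, hi = min(KYTE_DOOLITTLE.values()), max(KYTE_DOOLITTLE.values())
    return (KYTE_DOOLITTLE[residue] - lo) / (hi - lo)

def encode_residue_pair(res_a, ss_a, res_b, ss_b):
    """Input vector for one residue pair: 3 structure bits plus 2 hydrophobicities."""
    return encode_ss_pair(ss_a, ss_b) + [
        scaled_hydrophobicity(res_a),
        scaled_hydrophobicity(res_b),
    ]

# The six unordered structure pairs map to six different 3-bit codes.
all_codes = {
    tuple(encode_ss_pair(a, b))
    for a, b in combinations_with_replacement(SS_CLASSES, 2)
}
example = encode_residue_pair("I", "H", "R", "E")
print(sorted(all_codes))
print(example)
```

     The order of the two residues does not change the structure bits, which matches the idea of encoding unordered pairs; the hydrophobicity inputs remain ordered in this sketch.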
     (4) An algorithm based on a chaotic neural network is proposed to solve the protein contact map problem. The proposed transiently chaotic neural network (TCNN) has merits such as transient chaos and stable convergence, which overcome the tendency of conventional Hopfield neural networks (HNN) to get stuck in local minima. It reaches a stable convergent state after a short sequence of reversed bifurcations. Numerical simulation of the protein contact map problem shows that the TCNN has a greater ability to find globally optimal or near-optimal solutions, and searches more efficiently, than the HNN. The method assigns protein contacts with an average accuracy of 0.27 and an improvement over a random predictor by a factor greater than 9.
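     To make the idea of transient chaos concrete, the sketch below iterates a single transiently chaotic neuron in the form commonly attributed to Chen and Aihara: a sigmoid output, a damped internal state, and a self-feedback term whose strength z decays geometrically, so that the early irregular search fades into ordinary convergent dynamics. The parameter values, the initial state, and the function names are illustrative assumptions; the dissertation's network for contact maps, its energy function, and its parameters are not reproduced here.

```python
import math

def sigmoid(y, eps):
    """Numerically safe logistic output x = 1 / (1 + exp(-y / eps))."""
    u = y / eps
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def simulate_tcnn_neuron(steps=2000, k=0.9, eps=0.004, i0=0.65,
                         z0=0.08, beta=0.01, bias=0.0, y0=0.283):
    """Iterate one neuron; all default values are assumed for illustration.

    y(t+1) = k * y(t) + bias - z(t) * (x(t) - i0)
    z(t+1) = (1 - beta) * z(t)
    """
    y, z = y0, z0
    xs, ys, zs = [], [], []
    for _ in range(steps):
        x = sigmoid(y, eps)
        xs.append(x)
        ys.append(y)
        zs.append(z)
        y = k * y + bias - z * (x - i0)  # damped state with decaying self-feedback
        z = (1.0 - beta) * z             # annealing of the chaotic term
    return xs, ys, zs

xs, ys, zs = simulate_tcnn_neuron()
print("first outputs:", [round(x, 3) for x in xs[:8]])
print("final state y:", ys[-1], "final feedback z:", zs[-1])
```

     In a full network, `bias` would be replaced by the weighted input from the other neurons, with weights derived from the energy function of the optimization problem.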
     (5) This dissertation proposes a protein contact map prediction method that employs protein folding rules and the clonal selection algorithm. By inducing independent constraint rules from the characteristics of contact maps, it removes the dependence on existing protein structure databases and achieves satisfactory precision.
     The immune algorithm is an emerging algorithm that simulates the immune system of an organism on a computer. One kind of immune algorithm, the clonal selection algorithm, is widely used because of its adaptability, implicit parallelism, and diversity. The clonal selection algorithm is derived by simulating the model of antibody production. In the immune system, each antibody is cloned at a rate based on its affinity to the invading antigen and then mutates at a high frequency to generate a better-adapted antibody, which finally leads to the optimum solution. Thus the fitness in the clonal selection algorithm represents the affinity between antibody and antigen. In this dissertation a fitness function is constructed from protein folding restrictions, such as the hydrophobicity rule for amino acids, the secondary structure folding rules of proteins, the number of contacts in the contact map, the degree of each vertex, and other special rules.
     Each intermediate solution generated by the clonal selection algorithm is penalized according to the restrictions above: the more it breaks the rules, the less feasible it is in the real world and the larger the penalty it receives. It therefore has a higher probability of mutation, so that the next iteration produces a new solution more consistent with the biological characteristics of proteins, which in effect optimizes the prediction.
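     The loop described above can be sketched in simplified form. Candidate contact maps are stored as sets of residue pairs, better candidates receive more clones, and candidates with a larger penalty mutate at a higher rate. The penalty here counts only two of the listed rules (total number of contacts and vertex degree) with made-up targets; every name, constant, and the selection scheme are assumptions for illustration, and the dissertation's actual fitness function is not reproduced.

```python
import random

def penalty(contacts, n, target_contacts, max_degree):
    """Rule violations: distance from the expected contact count plus excess degree."""
    degree = [0] * n
    for k, l in contacts:
        degree[k] += 1
        degree[l] += 1
    excess = sum(max(0, d - max_degree) for d in degree)
    return abs(len(contacts) - target_contacts) + excess

def mutate(contacts, pairs, rate, rng):
    """Toggle each candidate pair independently with probability `rate`."""
    flipped = {p for p in pairs if rng.random() < rate}
    return frozenset(contacts ^ flipped)

def clonal_selection(n=12, target_contacts=15, max_degree=4,
                     pop_size=10, generations=60, seed=0):
    rng = random.Random(seed)
    pairs = [(k, l) for k in range(n) for l in range(k + 2, n)]

    def score(contacts):
        return penalty(contacts, n, target_contacts, max_degree)

    population = [frozenset(p for p in pairs if rng.random() < 0.5)
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        worst = max(score(population[-1]), 1)
        pool = list(population)  # parents stay in the pool (elitism)
        for rank, antibody in enumerate(population):
            n_clones = max(1, (pop_size - rank) // 2)    # higher affinity, more clones
            rate = 0.01 + 0.2 * score(antibody) / worst  # larger penalty, more mutation
            pool.extend(mutate(antibody, pairs, rate, rng) for _ in range(n_clones))
        population = sorted(pool, key=score)[:pop_size]
    best = population[0]
    return best, score(best), history

best, best_penalty, history = clonal_selection()
print("penalty at start:", history[0], "penalty at end:", best_penalty)
```

     Because the parents are kept in the selection pool, the best penalty never gets worse from one generation to the next; a real predictor would add sequence-dependent rules such as hydrophobicity and secondary structure to the penalty.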
     Tests on the prediction of 200 non-homologous proteins in 5 groups of different lengths show that this algorithm has good adaptability and high efficiency: the average precision and coverage of each group are higher than 40% and 35%, respectively. Moreover, the differences in precision and coverage between groups are less than 4%. Although the test results differ considerably across thresholds from 6 to 10 angstroms, their mean precision is still greater than 35%. The execution time of a contact map prediction is no more than 2 minutes, with a mean of about 100 seconds.
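     The evaluation figures quoted above can be read against the following sketch, which assumes the usual definitions: precision is the fraction of predicted contacts that are real, and coverage is the fraction of real contacts that are predicted. The improvement-over-random factor reported for the neural network models is assumed here to be the precision divided by the density of real contacts among all candidate pairs. The abstract does not state its exact definitions, and the function name and toy data are made up for illustration.

```python
def evaluate_contacts(predicted, actual, n_candidate_pairs):
    """Return (precision, coverage, improvement over a random predictor)."""
    predicted, actual = set(predicted), set(actual)
    correct = len(predicted & actual)
    precision = correct / len(predicted) if predicted else 0.0
    coverage = correct / len(actual) if actual else 0.0
    # A random predictor is right with probability (real contacts / candidate pairs).
    random_precision = len(actual) / n_candidate_pairs
    improvement = precision / random_precision if random_precision else 0.0
    return precision, coverage, improvement

# Toy data: 4 real contacts, 5 predictions, 2 of them correct, 40 candidate pairs.
actual_contacts = {(0, 4), (1, 5), (2, 7), (3, 9)}
predicted_contacts = {(0, 4), (1, 5), (2, 8), (4, 9), (5, 9)}
precision, coverage, improvement = evaluate_contacts(
    predicted_contacts, actual_contacts, n_candidate_pairs=40
)
print(precision, coverage, improvement)
```

     Under these assumed definitions, a precision of 0.4 on a protein where one candidate pair in ten is a real contact corresponds to a fourfold improvement over random guessing.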
     This dissertation is supported by the National Natural Science Foundation of China project “Research on relevant combinatorial theory and algorithm in bioinformatics” (No. 60433020), the Science & Technology Development Project of Jilin Province “Research on prediction of protein structure and function based on evolving algorithm” (No. 20020608), and the Innovative Foundation Project of Jilin University “Research on the method of protein structure prediction” (No. 450011022211). The achievements of this dissertation advance applied research on computational intelligence theory, are of significance for recurrent neural networks, transiently chaotic neural networks, and the clonal selection algorithm, and provide effective methods and means for research on protein contact map prediction.
引文
[1] Fersht AR. Structure and mechanism in protein science. NewYork: WH Freeman; 1992.
    [2] Anfinsen C G. P rincip les that govern fo lding chains [J ]. Science, 1973, 181: 223-230.
    [3] Bairoch A , Apweiler R. The SWISS-PROT protein sequence data bank and its supp lement TrEMBL [J ]. Nucleic Acids Research, 1997, 25: 31- 36.
    [4] Chou K C. P rediction of p ro tein structural class and subcellular locations [J ]. Current P ro tein and Pep tide Science, 2000,1: 171- 208.
    [5] Sondek J , Sho rtleD. A ccomodation of single amino acid insertions by the native state of staphylococcal nuclease [J ]. Proteins: Structure, Function and Genetics, 1990, 7: 299- 305.
    [6] Baker D, Sail A. Protein structure prediction and structural genomics. Science. 2001;294:93–96.
    [7] 来鲁华等编著,蛋白质的结构预测与分子设计,北京大学出版社,2001。
    [8] 孙啸,陆祖宏,谢建明,生物信息学基础,清华大学出版社,2005。
    [9] Alessandro Vullo Paolo Frasconi. A Bi-Recursive Neural Network Architecture for the Prediction of Protein Coarse Contact Maps . CSB:Proceedings of the IEEE Computer Society Conference on Bioinformatics . 2002 ,ISBN:0-7695-1653-X. 187-196.
    [10] Mirny, L., Domany, E., Protein fold recognition and dynamics in the space of contact maps. Proteins: Struct.Funct. Genet. 1996,26., 391–410.
    [11] Thomas, D.J., Casari, G., Sander, C., The prediction of protein contacts from multiple sequence alignments. Protein Eng. 1996.9, 941–948.
    [12] Lund, O., Frimand, K., Gorodkin, J., Bohr, H., Bohr, J., Hansen, J., Brunak, S., Protein distance constraints predicted by neural network and probability density functions. Protein Eng. 1997. 10, 1241–1248.
    [13] Olmea, O., Valencia, A. (1997) Improving contact predictions by combining of correlated mutations and other sources of sequence information. Folding & Design, 2, s25-s32.9.
    [14] Fariselli P, Casadio R. Neural network based prediction of residue contacts in protein.Protein Eng 1999;12:15–21.
    [15] Park, K., Vendruscolo, M., Domany, E.Towards an energy function for the contact map representation of proteins. Proteins: Structure, Function and Genetics.2000.40, 237-248.
    [16] Fariselli P, Olmea O, Valencia A, Casadio R. Prediction of contact maps with neural networks and correlated mutations. Protein Eng. 2001 ,14(11):835-43.
    [17] Singer MS, Vriend G, Bywater RP. Prediction of protein residue contacts with a PDB-derived likelihood matrix. Protein Eng 2002;15:721–725.
    [18] Pollastri G, Baldi P.Prediction of contact maps by GIOHMM and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics. 2002;18,Suppl 1:S62-70.
    [19] Alessandro Vullo , Paolo Frasconi. Prediction of protein coarse contact maps, J Bioinform Comput Biol.2003 ,(2):411-31.
    [20] Ying Zhao and George Karypis, Prediction of Contact Maps Using Support Vector Machines,3rd IEEE International Conference on Bioinformatics and Bioengineering (BIBE), 2003.pp. 26–33.
    [21] Nicholas Hamilton, Kevin Burrage,Mark A. Ragan and Thomas Huber.Protein Contact Prediction Using Patterns of Correlation. PROTEINS: Structure, Function, and Bioinformatics,2004, 56:679–684.
    [22] MacCallum RM. Striped sheets and protein contact prediction. Bioinformatics. 2004 Aug 4;20 Suppl 1:I224-I231.
    [23] Guang-Zheng Zhang and De-Shuang Huang1,Prediction of inter-residue contacts map based on genetic algorithm optimized radial basis function neural network and binary input encoding scheme,Journal of Computer-Aided Molecular Design, Volume 18, Number 12, December 2004, pp. 797-810.
    [24] Gupta N, Mangal N, Biswas S. Evolution and Similarity Evaluation of Protein Structures in Contact Map Space. PROTEINS: Structure, Proteins. 2005 May 1,59(2):196-204.
    [25] Punta M, Rost B ,PROFcon: novel prediction of long-range contacts.Bioinformatics. 2005,2960-8.
    [26] Alessandro Vullo, Ian Walsh, and Gianluca Pollastri,A two-stage approach for improved prediction of residue contact maps. BMC Bioinformatics, 2006, 7: 180.
    [27] Venduscolo, M., Kussell, E., Domany, ERecovery of protein structures throughcontact maps. Folding and Design. 1997,2(5):295-306.
    [28] 张阳德,生物信息学,科学出版社,2004。
    [29] 陈润生,生物信息学,生物物理学报,1999,5-12。
    [30] 樊龙江,生物信息学札记(第二版),2001 年 6 月,页码 119-124。
    [31] http://math.hzau.edu.cn/ht/news_view.asp .
    [32] http://www.infobio.org .
    [33] 许东,龙星计划研讨班,2005。
    [34] 王镜岩等主编,生物化学(上册),高等教育出版社,2002 年 9 月,第三版,页码 123-241
    [35] 齐建勋,肖奕,基于小波方法的蛋白质非规则二级结构预测,科学通报,2002 年,47卷,6 期,页码 425-430。
    [36] 张海霞,蛋白质二级结构预测方法研究,中国学位论文文摘数据库,大连理工大学硕士学位论文,20040601,页码 6-15。
    [37] 李晓琴,罗辽复,蛋白质结构型的定义和识别,生物化学与生物物理进展,2002,29 (1),页码 124-127。
    [38] 王志新,蛋白质结构预测的现状与展望,1998 年,18 卷,6 期,《生命的化学》,页码19-22。
    [39] 邹承鲁,新生肽链及蛋白质折叠的研究,湖南科学技术出版社,1997。
    [40] http://hpdb.hbu.edu.cn/about/group.asp.
    [41] CYNTH IA G, PER J. Developing Bioinformatics Computer Skills[M ]. New York: O’Reilly Publishing Company, 2001: 232 -237.
    [42] Burke D , Deane CM , Nagarajaram H A , et al . Aniterative structure - assisted approach to 1999 [ J ] . Proteins : structure , function and gentics , 1999 , Suppl 3 : 55 - 60.
    [43] Peitsch M C. Protein modeling by E-mail. From amino acid sequence to protein structure : a free one-hour service [ J ] .Biotechnology , 1995 , 13 : 658 - 600.
    [44] RCSB Protein Data Bank. An Information Portal to Biological Macromolecular Structures[ EB /OL ]. [ 2006 - 02 - 14 ]. http: / /www. pdb. org/pdb / static. do? p = general_information /pdb_statistics/ index. html&tb = false.
    [45] PHILIPEB, HELGEW. Structural Bioinformatics[M ]. New York: JohnWiley & Sons Inc, 2003: 509 - 512, 525 - 528.
    [46] Simons K T , Bonneau R , Ruczinski I , et al . Ab inition protein structure prediction of CASP Ⅲtargets using Rosetta [J ] .Protei ns : structure , f unction and genetics , 1999 , Suppl 3 :171 - 176.
    [47] Osguthorp KJ . Improved ab initio prediction with a simplified flexible geometry model [ J ] . Proteins : structure , functionand genetics . 1999 , Suppl 3 : 186 - 183.
    [48] 殷志祥,蛋白质结构预测方法的研究进展,计算机工程与应用,2004.20 54-57 。
    [49] 何冰,蛋白质三维结构预测的研究进展,武警医学院学,Vol.10,No.2,P181-183,2001。
    [50] 宁正元, 林世强,蛋白质结构的预测及其应用,福建农林大学学报(自然科学版) 第35卷第3期,2006,308-313。
    [51] The NIH molecularmodeling. PDB at a Glance[ EB /OL ]. [ 2006 - 02 - 14 ]. http: / / cmm. info. nih. gov/modeling/pdb_at_a_glance. Html.
    [52] Zhu H , Braun W. Sequence specificity , statistical potentials , and three-dimensional structure prediction with self-correcting distance geometry calculations of beta-sheet formation in proteins [J ] . Protei n Sci , 1999 , 8 (2) : 326 - 342.
    [53] Avbelj F, Fele L. Prediction of the three-dimensional structure of proteins using the electrostatic screening model and hierarchic condensation. Protei ns , 1998 , 31 (1) : 74 - 96.
    [54] Li H , Tejero R , Monleon D , et al . Homology modeling using simulated annealing of restrained molecular dynamics and conformational search calculations with CONGEN : application in predicting the three-dimensional structure of murine homeodomain Msx21 [ J ] . Protein Sci , 1997 , 6 ( 5 ) : 956 -970.
    [55] Thiele R,Zimmer R,Lengauer T. Protein threading by recursive dynamic programming [J ] . J Mol biol, 1999 , 290 :757 - 779.
    [56] Ogata K, Ohya M , Umeyama H. Amino acid similarity matrix for homology modeling derived from structural alignment and optimized by the Monte Carlo method [J ] . J Mol Graph Model . 1998 , 16 (4 - 6) : 178 - 189 ,
    [57] PuntaM,RostB,PROFcon: novel prediction of long-range contacts.Bioinformatics. 2005 :2960-8.
    [58] MacCallum RM. Striped sheets and protein contact prediction. Bioinformatics. 2004 Aug 4;20 Suppl 1:I224-I231.
    [59] Pollastri G, Baldi P.Prediction of contact maps by GIOHMM and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics. 2002,18 Suppl 1:S62-70.
    [60] Gobel,U., Sander,C., Scheneider,R. and Valencia,A. Correlated mutations and residue contacts in proteins. Proteins, 1994,18, 309–317.
    [61] Shi J,Blundell TL,Mizuguchi K.FUGUE:sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties[J].J Mol Biol,2001,310(1):243-257.
    [62] 杨启文 计算智能及其应用.西安电子科技大学博士学位文.2001:8-12。
    [63] 莫宏伟,金鸿章,王科俊.基于生物体系的计算智能研究.信息技术.2002, 2:25-28
    [64] 董聪,郭晓华.计算智能中若干热点问题的研究与进展.控制理论与应用.2000,17(5):69 1-698。
    [65] 吴佑寿,世界计算智能大会简介.电子科技导报.1998,10 :37 -38。
    [66] 蔡自兴,徐光枯.人工智能及其应用(第三版).清华大学出版社,2004:124-126。
    [67] 谷吉海, 计算智能方法在航天器故障诊断中的应用研究, 哈尔滨工业大学博士学位论文 2005。
    [68] 周春光,梁艳春,计算智能 ,吉林大学出版社,2001 年 11 月。
    [69] 史天运,贾利民,计算智能理论及其在RITS 中的应用,交通运输系统工程与信息,Vo l.2 No.1 ,2002,P10-15。
    [70] 史天运, 贾利民. 智能自动化系统体系结构及支撑技术[J ]. 工控电子(专家论坛) , 1999, 6。
    [71] Fiesler E, Beale R. Handbook of neural computation. New York: Oxford, 1997. 65~70.
    [72] Perelson AS, Oster GF, Theoretical Studies of Clonal Selection: Minimal Antibody Repertoire Size and Reliability of Self-Non-Self Discrimination, Journal of Theoretical Biology, 1979, Dec 21, 81 (4), pp:645-670.
    [73] 徐雪松,基于人工免疫系统的函数优化及其在复杂系统中的应用研究,浙江大学博士学位论文, 2004,页码 2-6。
    [74] 莫宏伟,人工免疫系统原理与应用,哈尔滨工业大学出版社,2002.11。
    [75] 肖人彬, 王磊. 人工免疫系统: 原理、模型、分析及展望[J].计算机学报, 2002, 25(12): 1281–1293。
    [76] Bumet FM.The Clonal Selection Theory of Acquired Immunity .Cambridge University, Press ,1959.
    [77] Jerne N K,To wards a Network Theory of the Immune System, Annual Immunoogy, 1974,vol.125.
    [78] AS.Perelson, Immune Network Theory,Immunologicalre view,1986.10.
    [79] 孙勇智,人工免疫系统模型、算法及其应用研究,浙江大学博士学位论文,2004, 20040401,第四章,页码 53-61。
    [80] 史定华,王翼飞,倪红春,遗传算法在蛋白质结构预测中的应用,上海大学学报, 2001, Vol.7, No.3, pp:225-230。
    [81] 王磊,免疫进化理论的研究,西安电子科技大学博士学位论文,2004。
    [82] Qian N,Sejnowski T J,.Predicting the secondary structure of globular protein using neural network models, J. Mol Biol. 1988:202:865---884.
    [83] Bohr H, Bohr J, Brunak S .Protein secondary structure and homology by neural network. FEBS letter,1988,241: 223-228.
    [84] Holley L H, Karplus M, Protein secondary structure prediction with a neural network.Proc. Nat. Acad. Sci.USA, 1989,86:152-156.
    [85] Kneller D G,Cohen F E,Langridge R, Improvements in protein secondary structure prediction by an enhanced neural network.J.Mol.Biol.,1990, 214:171-182.
    [86] Stolorz P,Lapedes A,Yuan X,Predicting protein secondary structure using neural net and statistical methods. J.Mol. Biol.,1992,225:363-377.
    [87] Rost B,Sander C,Improved prediction of protein secondary structure by use of sequence profiles and neural network.Proc.Nat. Acad. Sci. USA.1993,90: 7558-7562.
    [88] Riis S K,Krogh A,Improving prediction of protein secondary structure using structured neural networks and multiple sequence alignment. J.Comput.Biol.1996,3:163-183.
    [89] Jones D T.Protein secondary structure prediction based on position specific scoring matrics.J.Mol.Biol.1999,292:195-202.
    [90] 方慧生(Fang HS) ,相秉仁(Xiang BR) ,安登魁(An DK) . 改进 Madaline 学习算法预测蛋白质二级结构[J ] . 中国药科大学学报( J China Pharm Univ) ,1996 ,27 (6) :366 – 369。
    [91] 扬国慧, 周春光 ,胡成全等,一种改进的 Bayesian 网络模型用于蛋白质二级结构预测.自然科学进展,2003 年,第 6 期,667-670。
    [92] Lundegaard C,Petersen T N, Nielsen M el al.Springer-verlog heideberg ,2004,117-122.
    [93] IKSOO C, MAREK C, RUXANDRA ID. Protein threading by learning [ J ]. PNAS, 2001, 98: 143-50.
    [94] KUANG L, ALEX C W M, W ILL IAM R T. Threading Using Neural network ( TUNE) : the measure of protein sequence structure compatibility[ J ]. Bioinformatics, 2002, 18: 1350 - 1357.
    [95] Xu,Y.and Xu,D. et al.,Protein domain decomposition using a graph-theoreticapproach. Bioinformatics, 2000,16, 1091–1104.
    [96] Guo.J.T., Xu,D. et al., Improving the performance of DomainParser for structural domain partition using neural network, Nucl.Acids Res. ,2003,31,944-952 1091–1104.
    [97] Jones DT , Tress M,Bryson K, et al . Successful recognition of protein folds using threading methods biased by sequence similarity and predicted secondary structure[J ] . Proteins ,1999 ,S(3) :104 - 111.
    [98] Muskal S.M., Kim S.H., Predicting protein secondary structure content: A tandem neural network approach, J. Mol. Biol. 225 (1992) 713– 727.
    [99] Chandonia J. M., Karplus M., Neural networks for secondary structure and structural class predictions. Protein Science, 1995, 4(2): 275-285.
    [100] Chandonia J. M., Karplus M., The importance oflarger datasets for protein secondary structure prediction with neural network, ProteinScience,1996, 5(4): 768-774.
    [101] Cai Y. D., Zhou G. P., Prediction of protein structural class by neural network, Biochimie 82(2002) 783-785.
    [102] Cai Y. D., Liu X. J., Xu X. B., Chou K. C.Artificial neural network method for predicting protein Secondary structure content, Computers& Chemistry,26(2002), 347-350.
    [103] Cezary Czaplewski . Adam Liwo. Jaros aw Pillardy. Stanisaw Odziej and llarold A. Scheraga. Improved conformational space annealing method to treat β- structure with the UNRES force - field and to enhance scalability of parallel implementation[J ] . Polymer ,2004 ,45 (2) :677- 686.
    [104] Martin Gruebele. Protein folding :the free energy surface[J].Current Opinion in Structural Biology ,2002 ,12(2) :161 - 168.
    [105] Kremer SS.Genetic algorithm and protein folding. http://www.techfak.unibielefeld. de/bcd/Curric/ProtEn/contents.html.
    [106] 孙之荣,韩浩,遗传算法在蛋白质变异理论研究中的应用[J] .科学通报,1997 , (17) :1871 – 1873。
    [107] Joo YoungLee , Harold A , Scheraga , Rackovsky S. Newoptimization method for conformational energy calculations on polypeptides : conformational space annealing[J ] . J Comput Chem,1997 ,18 (9) :1222 -1232.
    [108] 史晓红,王燕,刘文斌,殷志祥,现代优化计算方法在蛋白质结构预测中的应用,数学的实践与认识,2006 ,Vol.36,No.10,86-92。
    [109] Tilman Brodmeier, Erno Pretsch. Application of genetic algorithms in mo lecular modeling[J ]. J Compute Chem,1994, 15: 588—595.
    [110] 倪红春, 王翼飞, 史定华. 遗传算法在蛋白质结构预测中的应用[J ]. 上海大学学报(自然科学版) , 2001, (3) :225—230。
    [111] 解伟, 王翼飞. 蛋白质折叠的三维计算机模拟[J ]. 上海大学学报(自然科学版) , 2000, 6 (2) : 145—149。
    [112] M .W ayama,K .Ta kahashi,an dT .Shimizu.An approach to amino acid Sequence alignment using a genetic algorithm,Genome Informatics.1995,6 :122—123.
    [113] K .K aradimitriou and D .H .K raft.Genetic Algorithms and the MultipleSequence Alignment Problem in Biology.Proceedings of the Second Annual Molecular Biology and Biotechnology Conference,BatonR ouge,LA.1996.
    [114] D .W .M ount.Bioinformatics:Sequence and Genome Analysis,北京:科学出版社,2002.
    [115] C.Notredame and D.G.Higgins.SAGA:sequence alignment by genetic algorithm,Nuc. Acids Res.1996,24(8):1515-1524.
    [116] C.Zhang and A.K.C.Wong.A genetic algorithm for multiple molecular Sequence alignment,Comput.Appl.Biosci.1997,13(6):565-581.
    [117] C .Notredame,E.A.O'Brien,and D.G.Higgins.R AGA:RNA sequence alignment by genetic algorithm ,Nucleic Acids Res.1997,25 :4570—4580.
    [118] C .Notredame,L.Holme,and D.G.H iggins.COFFEE:A new objective Function for multiple Sequence alignment.Bioinformatics.1998,14 :407--422 .
    [119] K .Ch ellapilla,and G.B. Fo gel.Multiple sequence alignment using evolutionary pogramming,Congress on Evolutionary Computation.1999:445452.
    [120] J.T.H orng,C .M .L in,B .H.Y ang and C .Y .K ao.A Genetic Algorithm for Multiple Sequence Alignment.A vailable:http://rsdb.csie.ncu.edu.tw/ tools/m sa.htm.
    [121] T .Yokoyamaa ndT.Watanabe.A Web Server for Multiple Sequence Alignment Using Genetic Algorithm.Genome informatics.2001,12 :38 2383.
    [122] L .A.A nbarasu,P .N rayanasamy,an dV .Sundararajan.multiple sequence alignment using parallel adaptive genetic algorithm.Lecture Notesin Computer Science.20 01:130-137.
    [123] Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Smitrovsky, E.,Lander, E.S., Golub, T.R.: Interpreting patterns of gene expression with self-organizing maps: Methods and applications to hematopoietic differentiation. Proceedings of National Academy of Sciences USA 96 (1999) 2907–2912.
    [124] Wengang Zhou , Chunguang Zhou, Guixia Liu and Huiying lv, An improved quantum-inspired evolutionary algorithm for clustering gene expression data.The International Conference on Computationnal Mathods.December 15-17,2004,Singapore.
    [125] Wengang Zhou , Guixia Liu, Yanxin Huang, Yan Wang, Dongbing Han,Chunguang Zhou, A Novel Computational Based Method for Discovery of Sequence Motif from Coexpressed Genes. International Conference 2359-2368,Published by International Journal of Information Technology.
    [126] Wengang Zhou , Chunguang Zhou, Guixia Liu and Yanxin Huang, Identification of Transcription Factor Binding Sites Using Hybrid Particle Swarm Optimization.The tenth International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, 2005, Regina , LNAI,Vol. 3642, pp 438-445.
    [127] Wengang Zhou , Chunguang Zhou, Guixia Liu and Hong Zhu,Feature selection for microarray data analysis using mutual information and rough set theor, Computer Science ,Volume 204/ 2006 ,p.492—499.
    [128] Horton P , Kanehisa M. An assessment of neural network and statistical approaches for prediction of E. coli promoter sites[J ] . Nucleic Acids Research ,1992 ,20 (16) :4331 - 4338.
    [129] Oppon E. Synergistic use of promoter prediction algorithms : a choice for small training dataset ? [D] . South Africa : Western Cape University ,2000.
    [130] 李萍, 过涛, 李衍达, 基于小波分析的膜蛋白跨膜区段序列分析和预测. 生物物理学报, 2000 , 16 (3) : 576 – 85。
    [131] 吴晓明,王波, 程敬之,基于小波分析的蛋白质结构研究.西安交通大学学报, 2002 , 36 (4) : 412 – 17。
    [132] 秦红珊,杨新歧, 曹文斗, 从非同源蛋白质的一级序列预测其结构类,生物物理学报, 2002 , 18 (2) : 213 – 22。
    [133] Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The Protein Data Bank. Nucleic Acids Research, 28 , 235-242.
    [134] Biopolymers. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.1983 Dec;22(12):2577-637.
    [135] Sander C, Schneider R. Database of homology-derived protein structures. Proteins, Structure, Function & Genetics, 1991,9 :56-68.
    [136] http://bioinfo.tg.fh-giessen.de/pdbselect/
    [137] Hobohm U., M.Scharf, R.Schneider, C.Sander. Selection of representative data sets.Prot.Sci, 1992, 409-417.
    [138] Kabsch W., C. Sander.: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22(1983) 2577-2637.
    [139] 焦李成,神经网络系统理论[M ],西安: 西安电子科技大学出版社, 1996。
    [140] 史忠植,知识发现, 北京:清华大学出版社,2002-01。
    [141] 时小虎,梁艳春,徐旭.改进的Elman网模型与递归反传控制神经网络[J]. 软件学报, 2003, 14(6):1110-1119。
    [142] KALINLI A,KARABOGA D. Training recurrent neural networks by using parallel tabu search algorithm based on crossover operation[J]. Engineering Applications of Artificial Intelligence,2004,17(5):529-542.
    [143] PHAM D T, KARABOGA D. Training elman and jordan networks for system identification using genetic algorithms,[J].Artificial Intelligencein Engineering, 1999,13(2):107-117.
    [144] Elman J . Finding Structurein Time [ J ] . Cognitive Science , 1 9 9 0 ,14 : 179 - 211.
    [145] 丛爽,高雪鹏,几种递归神经网络及其在系统辨识中的应用[J ] .系统工程与电子技术. 2003 , 25 (2) : 194 – 197。
    [146] 戴谊, 丛爽,递归神经网络学习速率研究,系统工程与电子技术, 2005,Vol. 27 No.5,942-947。
    [147] 刘晋钢, 韩燮, 李华玲. BP 神经网络改进算法的应用[J ]. 华北工学院学报, 2002, 6: 449- 451。
    [148] Pham D T, Karaboga D. Training Elman and Jordan networks for system identification using genetic algorithms[J]. Artificial Intelligence in Engineering, 1999, 13: 107-117.
    [149] Unnikrishnan K P. Alopex: A correlation-based learning algorithm for feedforward and recurrent neural networks[J]. Neural Computation, 1994, 6: 469-490.
    [150] 闻新,周露,李翔,张宝伟,MATLAB 神经网络仿真与应用,科学出版社,2003。
    [151] 丛爽. 面向Matlab 工具箱的神经网络理论[M],合肥: 中国科学技术大学出版社, 1998。
    [152] 刘健, 练继建,基于带有偏差单元的递归神经网络的李家峡拱坝安全监控模型,水利水电技术,第35 卷,2004年第8 期,108-110。
    [153] 郝柏林. 从抛物线谈起─混沌动力学引论[M],上海:上海科学技术出版社,1993。
    [154] 唐巍. 混沌理论及其应用研究[J ] . 电力系统自动化,2000 , (7) :67 – 70。
    [155] 王凌,郑大钟,李清生,混沌优化方法的研究进展,计算技术与自动化,2001,Vol.20,No.1,1-5。
    [156] Kaneko K. Clustering, coding, switching, hierarchical ordering and control in a network of chaotic elements[J]. Physica D, 1990, (41): 137-172.
    [157] Ishii S, Fukumizu K, Watanabe S. A network of chaotic elements for information processing[J]. Neural Networks, 1996, (9): 25-40.
    [158] Aihara K, Takabe T, Toyoda M. Chaotic neural network[J]. Physics Letters A, 1990, 144(6-7): 333-340.
    [159] Inoue M, Nagayoshi A. A chaos neuro-computer[J]. Phys Lett A, 1991, 158(8): 373-376.
    [160] Inoue M, Nagayoshi A. Solving an optimization problem with a chaos neural network[J]. Prog Theor Phys, 1992, 88(4): 769-773.
    [161] HOPFIELD, J.J., and TANK, D.W. Neural computations of decisions in optimization problems, Biol. Cybern., 1986, 52: 141-152.
    [162] Nasrabadi N M, Li W. Object recognition by a Hopfield neural network[J]. IEEE Trans SMC, 1991, 21(6): 1523-1534.
    [163] Aihara K, Takabe T, Toyoda M. Chaotic neural networks[J]. Physics Letters A, 1990, 144(6-7): 333-340.
    [164] Chen L, Aihara K. Chaotic simulated annealing by a neural network model with transient chaos[J]. Neural Networks, 1995, 8(6): 915-930.
    [165] 徐耀群,郑皓,宋庆泽,史心东,一种暂态混沌神经网络及其应用,哈尔滨商业大学学报(自然科学版)2006 ,Vol.22, No.1,39-42。
    [166] L Chen. Optimization by chaotic simulated annealing. 中日青年国际学术讨论论文集, vol 3, 日本, 神奈川, 1995, 57-59。
    [167] 胡世余,计算智能在高速多媒体网络路由算法中的应用研究,上海交通大学博士学位论文,2004。
    [168] 卢本卓,王存新,王宝翰,用于真实蛋白质结构预测的一种新的优化方法,化学物理学报, Chinese Journal of Chemical Physics, 2003, Vol.16, No.2, 117-121。
    [169] 靳利霞,唐焕文 蛋白质空间结构预测的一种优化模型及算法,2000 年,14 卷,2 期,应用数学与计算数学学报,页码 33-41。
    [170] David Baker, Andrej Sali, Protein Structure Prediction and Structural Genomics, Science, 2001, 294(5540), pp:93-97.
    [171] Silvia Crivelli, A Physical Approach to Protein Structure Prediction, Biophysical Journal, 2002, 82 (1), 36-49.
    [172] Chao Zhang and Sung-Hou Kim, Environment-Dependent Residue Contact Energies for Proteins, Proceedings of the National Academy of Sciences (Biophysics), Vol. 97, Issue 6, March 14, 2000, 2550-2555.
    [173] Osvaldo Graña, David Baker, et al, Toward an Energy Function for the Contact Map Representation of Proteins, Proteins: Structure, Function, and Genetics, Volume 40, Issue 2, 2000, 237-248.
    [174] 解伟,王翼飞,蛋白质折叠的计算机模拟,上海大学学报:自然科学版, 2000 年,6卷,2 期,页码 145-149。
    [175] Rune Linding, Robert B. Russell, Victor Neduva and Toby J.Gibson, GlobPlot, Exploring Protein Sequences for Globularity and Disorder, Nucleic Acids Research, 2003, Vol.31, No.13,3701-3708.
    [176] MacCallum, Robert M., Striped Sheets and Protein Contact Prediction, Bioinformatics, Volume 20, Supplement 1, 4 August 2004, Oxford University Press, DOI: 10.1093/bioinformatics/bth913,i224-i231 (1).
    [177] 张立震,唐焕文,一种基于子序列分布的蛋白质结构类预测方法,计算机与应用化学, 2003, 23(2) 1-6。
    [178] 孙之荣,α 螺旋和 β 折叠连接短肽的构象分析——蛋白质超二级结构模块研究,生物物理学报, 1994, 10 (2), pp:289-296。
    [179] 孙之荣,蛋白质中频繁发生的超二级结构模式,科学通报, 1994, 39 (24), 2260-2263。
    [180] 李晓琴,罗辽复,氨基酸组成聚类蛋白质结构型和结构型的预测,生物物理学报, 1998, 14 (4),730-736。
