Exponentially growing bioinformatics data has given rise to a new multidisciplinary research area, bioinformatics. One of the major research problems in bioinformatics is predicting protein structure from protein sequence. This interdisciplinary field draws on mathematics, computer science, information science, physics, systems science, and management science as well as biology. This dissertation presents several new and improved models for the problem of protein structure prediction.
     Graph theory plays a key role in protein structure prediction. This dissertation proposes a method based on the shortest path in a graph. Each residue is represented by three vertices, one for each of its possible secondary-structure states, and each edge of the graph is assigned a weight by a weight function. The shortest path through the graph then corresponds to the predicted secondary structure. The method was tested on several groups of proteins, and the results show that it is feasible. Finally, the selection of parameters is discussed.
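As a rough illustration of the shortest-path idea, the sketch below builds a layered graph with one layer per residue and three vertices per layer, then finds the minimum-weight path by dynamic programming. The state labels `H`, `E`, `C`, the function names, and the toy weight function are all assumptions made for this example; the abstract does not specify the dissertation's actual weight function or parameters.

```python
STATES = ("H", "E", "C")  # assumed labels: helix, strand, coil


def shortest_path_states(n, edge_weight):
    """Minimum-weight path through a layered graph with n residues.

    edge_weight(i, s, t) is the weight of the edge from state s at
    residue i to state t at residue i + 1 (hypothetical signature).
    Returns (total_weight, list_of_states).
    """
    if n == 0:
        return 0.0, []
    cost = {s: 0.0 for s in STATES}  # best cost of reaching each vertex
    back = []                        # back-pointers, one dict per layer
    for i in range(1, n):
        new_cost, pointers = {}, {}
        for t in STATES:
            prev = min(STATES, key=lambda s: cost[s] + edge_weight(i - 1, s, t))
            new_cost[t] = cost[prev] + edge_weight(i - 1, prev, t)
            pointers[t] = prev
        cost = new_cost
        back.append(pointers)
    last = min(STATES, key=lambda s: cost[s])
    path = [last]
    for pointers in reversed(back):  # walk the back-pointers to the start
        path.append(pointers[path[-1]])
    path.reverse()
    return cost[last], path


# Toy weight function: penalise every vertex that disagrees with a
# made-up "preferred" state string. Purely illustrative.
PREFERRED = "HHHEE"


def toy_weight(i, s, t):
    return (s != PREFERRED[i]) + (t != PREFERRED[i + 1])


total, states = shortest_path_states(len(PREFERRED), toy_weight)
```

Because every layer has only three vertices, the search is linear in the sequence length; a general-purpose algorithm such as Dijkstra's would give the same path on this graph.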
     DNA computing is a new model of computation. This dissertation applies DNA computing to protein structure prediction. Each possible conformation of a residue in an amino acid sequence is represented as a node in a graph. Each node is given a weight based on the degree of interaction between its side-chain atoms and the local main-chain atoms. The protein structure prediction problem is thereby mapped to finding the maximal sets of completely connected nodes (cliques) in the graph, and a DNA computing model is then used to find the maximal cliques.
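The clique formulation can be tried out on a conventional computer before any DNA encoding is considered. The sketch below is not a DNA computing procedure: it swaps in the classical Bron-Kerbosch enumeration of maximal cliques and then picks the clique with the largest total node weight. The node names, the compatibility edges, and the weights are hypothetical values chosen only for illustration.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph.

    adj maps each node to the set of its neighbours.
    Basic Bron-Kerbosch recursion without pivoting.
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended further
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical example: node "2a" means "residue 2 in conformation a".
# An edge joins two conformations of different residues that are compatible.
adj = {
    "1a": {"2a", "3a"},
    "1b": {"2b"},
    "2a": {"1a", "3a"},
    "2b": {"1b"},
    "3a": {"1a", "2a"},
}
weight = {"1a": 2.0, "1b": 1.0, "2a": 1.5, "2b": 0.5, "3a": 1.0}

cliques = maximal_cliques(adj)
best = max(cliques, key=lambda c: sum(weight[v] for v in c))
```

Enumerating cliques this way takes exponential time in the worst case, which is the kind of combinatorial search that DNA computing approaches aim to carry out in parallel.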
     The probabilistic graphical model is an effective protein structure prediction model. By introducing a hidden state variable, a hidden conditional random field (HCRF) is built and applied to the problem of protein structure prediction. A method for constructing the model is given, together with algorithms for training and decoding it, and the model is used to predict the secondary structure of a well-known protein dataset (CB513). Finally, the results are compared with those of several other methods.
     An important problem in protein structure prediction is correctly locating the disulfide bonds in a protein. Knowing the locations of the disulfide bonds greatly reduces the search in the conformational space of the protein structure, so correctly predicting disulfide bonding from the residue sequence may also help in predicting the 3D structure. Here the learning vector quantization (LVQ) artificial neural network method is applied to predict disulfide bonding. The local sequence environment of a cysteine is of great significance to disulfide bonding, so disulfide bonding can be predicted from the primary structure. The method was used to predict disulfide bonding in protein structures, and good results were obtained.
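For readers unfamiliar with LVQ, the following is a minimal sketch of the basic LVQ1 update rule: the prototype nearest to a training sample moves toward the sample when their labels agree and away from it when they disagree. The two-dimensional feature vectors, the labels `bonded` and `free`, the learning rate, and the function names are placeholders assumed for this example; the abstract does not describe how the dissertation encodes the sequence window around each cysteine or how its network is configured.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - xi) ** 2 for p, xi in zip(prototypes[k][0], x)),
    )


def train_lvq1(samples, prototypes, rate=0.1, epochs=20):
    """Train prototypes with the LVQ1 rule; returns new (vector, label) pairs."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for _ in range(epochs):
        for x, label in samples:
            vec, proto_label = protos[nearest(protos, x)]
            sign = 1.0 if proto_label == label else -1.0  # attract or repel
            for j in range(len(vec)):
                vec[j] += sign * rate * (x[j] - vec[j])
    return protos


def predict(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]


# Made-up 2-D features standing in for an encoded cysteine neighbourhood.
samples = [
    ([0.9, 0.8], "bonded"),
    ([1.0, 1.0], "bonded"),
    ([0.1, 0.2], "free"),
    ([0.0, 0.0], "free"),
]
initial = [([0.6, 0.6], "bonded"), ([0.4, 0.4], "free")]
model = train_lvq1(samples, initial)
```

With a trained model, `predict(model, [0.8, 0.9])` returns the label of the nearest prototype for a new feature vector.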
     The HP model is a simplified model for protein structure prediction. The 20 kinds of amino acid residues are classified into four groups, so that a protein sequence is converted into a new sequence over a four-letter alphabet. A protein structure prediction model is then constructed by searching for the lowest-energy conformation of the new sequence. A simulated annealing algorithm is applied to this model, and it reaches lower energies than the HP model. The model can be extended to protein structure prediction in 3D.
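To make the energy search concrete, the sketch below runs simulated annealing on the classic two-letter HP model on a 2D square lattice, where each non-bonded H-H contact contributes an energy of -1. This is a simplification chosen for brevity: the four-group alphabet and its energy function are not given in the abstract, so they are not reproduced here. The move set, cooling schedule, and all parameter values are assumptions.

```python
import math
import random

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def coords(path):
    """Lattice coordinates of a walk, or None if it revisits a site."""
    pos = [(0, 0)]
    for m in path:
        dx, dy = MOVES[m]
        nxt = (pos[-1][0] + dx, pos[-1][1] + dy)
        if nxt in pos:
            return None
        pos.append(nxt)
    return pos


def energy(seq, path):
    """-1 per non-bonded H-H lattice contact; None for an invalid walk."""
    pos = coords(path)
    if pos is None:
        return None
    index = {p: i for i, p in enumerate(pos)}
    e = 0
    for i, (x, y) in enumerate(pos):
        if seq[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # right and up: each pair counted once
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                e -= 1
    return e


def anneal(seq, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing over move strings; returns (best_path, best_energy)."""
    rng = random.Random(seed)
    path = ["R"] * (len(seq) - 1)  # start from a straight chain (energy 0)
    cur = energy(seq, path)
    best, best_e = path[:], cur
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / steps)  # geometric cooling
        cand = path[:]
        cand[rng.randrange(len(cand))] = rng.choice("UDLR")
        e = energy(seq, cand)
        if e is None:  # self-intersecting walk: reject
            continue
        if e <= cur or rng.random() < math.exp((cur - e) / t):
            path, cur = cand, e
            if cur < best_e:
                best, best_e = path[:], cur
    return best, best_e


seq = "HPHPPHHPHH"  # arbitrary example sequence
best_path, best_energy = anneal(seq)
```

Switching to a four-letter alphabet would mainly mean replacing the H-H test in `energy` with a lookup in a contact-energy table indexed by the two residue groups.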
