Prediction of β-Hairpins, β(γ)-Turns and Four Types of Simple Super-Secondary Structures in Proteins
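Abstract
Because a protein's function is closely tied to its structure, studying protein structure is an important route to functional information. With the successful progress of the Human Genome Project, protein sequence data have accumulated far faster than protein structure data. Determining protein structure experimentally, however, is costly and time-consuming, and experiments still run into technical difficulties that cannot currently be resolved. There is therefore strong interest in predicting protein structure directly from sequence by theoretical and computational methods, which is also an important topic in bioinformatics.
     At present, predicting a protein's tertiary structure directly from sequence remains very difficult. Local structures carry relatively strong sequence signals, occur abundantly and frequently in tertiary structures, and play an important role in protein folding, recognition and stability. Predicting local structure can therefore simplify the structure prediction problem and is an important intermediate step toward tertiary structure prediction.
     This dissertation mainly studies the prediction of super-secondary structures among protein local structures, with a focus on β-hairpin motifs; it also studies the prediction of β-turns and γ-turns, which belong to the partially regular secondary structures.
     1. A new prediction algorithm is proposed: a support vector machine algorithm based on the increment of diversity. It was applied for the first time to predict β-hairpin motifs in the super-secondary structure database ArchDB40, with good results.
     2. Sequence information is represented by a vector composed of increments of diversity and sequence scoring values; these are fed as a vector into a support vector machine, which searches for the optimal hyperplane in the vector space. This yields a new combined-vector prediction algorithm. The algorithm was applied for the first time to β-hairpin motif prediction. Results on the β-hairpin dataset from the ArchDB40 super-secondary structure database and on the existing β-hairpin dataset from the literature (Kumar and Bhasin, Nucleic Acids Research, 2005, 33:154-159) show that it achieves higher prediction success rates than previous methods. Compared with the published results on the existing dataset, prediction accuracy on the independent test set improves by 4% and sensitivity for β-hairpins by 6%.
     In addition, the algorithm was applied for the first time to classify the four types of simple super-secondary structures in ArchDB40, giving good classification results both in 5-fold cross-validation on the training set and on the independent test set.
     3. Building on the increment of diversity and sequence scoring values, predicted secondary structure information is added to the combined vector, and all three are fed into the support vector machine. β-turns and γ-turns were predicted on two widely used datasets containing 426 and 320 protein sequences, respectively. For β-turns, 7-fold cross-validation gives a prediction accuracy of 79.8% and a correlation coefficient of 0.47; for γ-turns, 5-fold cross-validation gives a correlation coefficient of 0.18. These are the best prediction results reported to date.
     4. A new database of 2208 non-redundant protein chains was built, with structure resolution better than 2.5 Å and sequence similarity below 40%. It yields 6799 α-α motifs, 6711 α-β motifs, 6072 β-α motifs and 8163 β-β motifs. The minimum increment of diversity algorithm was applied for the first time to predict the four types of simple super-secondary structures. With a fixed sequence pattern length of 8 amino acid residues, the "822-type" pattern gives an average prediction accuracy of 78% in 3-fold cross-validation and 76.8% in the jack-knife test; with a fixed length of 10 residues, the "1041-type" pattern gives 83% in 3-fold cross-validation and 79.8% in the jack-knife test.
     5. Dipeptide composition parameters and hydropathy (hydrophobic/hydrophilic) feature parameters were introduced into the simple super-secondary structure classification, β-hairpin prediction, β-turn prediction and γ-turn prediction, which improved the prediction results.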
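     The increment of diversity used in items 1, 2 and 4 can be illustrated with a small sketch. It assumes the usual definitions from the measure-of-diversity literature, D(X) = N ln N − Σ nᵢ ln nᵢ and ID(X, S) = D(X + S) − D(X) − D(S). The function names, the use of plain amino acid composition as the feature counts, and the toy sequences are illustrative assumptions, not the dissertation's actual feature set or code.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def diversity(counts):
    """Measure of diversity D(X) = N ln N - sum_i n_i ln n_i (zero counts add 0)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)

def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); smaller values mean X resembles S."""
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)

def composition(seq):
    """Amino acid counts of a sequence (an assumed, simplified feature choice)."""
    c = Counter(seq)
    return [c.get(a, 0) for a in AMINO_ACIDS]

def id_features(seq, class_sources):
    """One ID value per class source; such a vector could serve as SVM input."""
    x = composition(seq)
    return [increment_of_diversity(x, s) for s in class_sources]

def min_id_class(seq, class_sources):
    """Minimum-ID rule: assign the class whose source gives the smallest ID."""
    feats = id_features(seq, class_sources)
    return min(range(len(feats)), key=feats.__getitem__)
```

     In this sketch, `min_id_class` corresponds to a minimum increment of diversity classifier, while the vector returned by `id_features` is the kind of input that could be combined with other scores and passed to a support vector machine.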
Knowledge of a protein's structure is important for understanding its function. With the success of the Human Genome Project, a widening gap has appeared between the rapidly increasing number of known protein sequences and the slow accumulation of known protein structures. Determining protein structure by purely experimental approaches is time-consuming and expensive. Theoretical and computational methods for predicting protein structures have therefore become increasingly important.
     At present, directly predicting a protein's three-dimensional (3D) structure from its sequence is a difficult task. Local structural motifs, however, carry strong sequence signals, are common in 3D structures, and govern the stability and folding of proteins. Predicting local structure may therefore help to simplify the structure prediction problem and is a key step toward predicting 3D structure.
     In this dissertation, we investigated the prediction of super-secondary structures of proteins, especially β-hairpin motifs. In addition, β-turns and γ-turns among the secondary structures of proteins were also studied.
     1. Based on the least increment of diversity algorithm, a new algorithm combining the increment of diversity with a support vector machine (ID_SVM) is proposed to predict the β-hairpins in the ArchDB40 dataset, and better results are obtained.
     2. A composite vector of increments of diversity and scoring values is used to express the sequence information and is input to a support vector machine (SVM), which finds the optimal hyperplane in the vector space to separate β-hairpins from non-β-hairpins. On this basis, a new algorithm combining the increment of diversity and scoring values with a support vector machine (ID_PCSF_SVM) is proposed for predicting β-hairpin motifs in the ArchDB40 dataset and the EVA dataset (Kumar and Bhasin, Nucleic Acids Research, 2005, 33:154-159, http://cubic.bioc.columbia.edu/eva/index.html). It achieves higher prediction success rates than previous algorithms: the overall prediction accuracy is improved by 4%, and the sensitivity for β-hairpins is increased by 6%.
     We also applied our method to predict the super-secondary structures in the ArchDB40 dataset, and better results are obtained both in 5-fold cross-validation on the training set and on the independent test set.
     3. The increment of diversity, scoring values and predicted secondary structure information are together selected as input parameters of the SVM. A new algorithm is proposed for predicting β-turns in a set of 426 proteins and γ-turns in a set of 320 proteins. For β-turns, the overall prediction accuracy and Matthews correlation coefficient (MCC) in 7-fold cross-validation are 79.8% and 0.47, respectively. For γ-turns, the MCC in 5-fold cross-validation is 0.18.
     4. A database is constructed containing 2208 protein chains with resolution better than 2.5 Å and sequence identity lower than 40%. These chains contain 6799 α-α, 6711 α-β, 6072 β-α and 8163 β-β motifs. Based on the increment of diversity algorithm, the four types of super-secondary structures are predicted. For the "822-type" fixed-length pattern with 8 amino acids, the average prediction accuracy is 78% in the 3-fold cross-validation test and 76.7% in the jack-knife test. For the "1041-type" fixed-length pattern with 10 amino acids, the prediction accuracies are 83% and 79.8%, respectively.
     5. By using information on dipeptide composition and amino acid hydropathy distribution, the prediction results for super-secondary structures, β-hairpins, β-turns and γ-turns are improved.
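     The accuracy, sensitivity and Matthews correlation coefficient quoted in items 2 and 3 are standard binary-classification measures computed from a confusion matrix. The following is a minimal sketch of those formulas; the function names are hypothetical and the counts in the usage note are made up for illustration, not taken from the dissertation's data.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all residues (or motifs) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    """Fraction of true positives (e.g. real β-hairpins) that are recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

     For example, with the made-up counts tp=40, tn=40, fp=10, fn=10, `accuracy` gives 0.8 and `mcc` gives 0.6. Because turn residues are usually much rarer than non-turn residues, the MCC tends to be a more informative summary than accuracy alone for β-turn and γ-turn prediction.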
