Prediction of the Association between Single Amino Acid Polymorphisms and Disease, and a Study of the Underlying Mechanisms
Abstract
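Single amino acid polymorphisms (SAPs) are closely related to human genetic diseases and play an important role in pharmacogenomics. Identifying disease-causing SAP sites makes it possible to examine how drug efficacy, toxicity and metabolism differ for specific genetic populations, and helps to establish the optimal treatment for individual patients. Predicting disease-causing SAP sites has therefore become a key means of understanding pathogenic mechanisms at the molecular level, and is one of the active areas of current genome-wide research.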
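     This dissertation applies a range of mathematical methods in a series of bioinformatics experiments, taking the association between SAPs and human genetic disease as its main object of study and the exploration of pathogenic mechanisms as its main goal. First, we explore new sequence descriptors and seek to build a concise, accurate and reliable model for predicting the pathogenicity of SAP sites. We then apply the model in practice to predict the pathogenicity of new SAP sites, which saves experimental cost and time while providing theoretical support and a screened set of candidate samples for experimental validation. Next, based on the model and the key descriptors selected, we offer a theoretical reference that goes some way towards explaining the mechanism linking SAP sites to disease. We then approach the problem from the perspective of post-translational modification (PTM), statistically analysing the pathogenicity of PTM sites disrupted by SAPs and extending the mechanistic explanation to the different PTM types. Finally, we focus on palmitoylation as one specific PTM type and analyse the pathogenicity that follows when palmitoylation sites are disrupted by SAPs, providing a more detailed and specific reference for discussing the pathogenic mechanisms of SAPs.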
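     Chapter 1 outlines the background, significance and available data resources of SAP research, together with the principles and methods used to predict the disease association of SAPs. It then describes the main research methods and procedures used in this dissertation.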
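     Chapter 2 aims to build a concise and efficient model for predicting the association between SAPs and disease. Guided by the requirements of simple input, a simple procedure and high prediction accuracy, we used the random forest method to build SubSeqPred, a mathematical model whose target is the identification of disease-associated single amino acid substitution sites. The model makes full use of the physicochemical properties of the amino acid before and after substitution and takes only 44 protein sequence descriptors as input, which avoids complex calculations such as homology and conservation, and it achieved fairly satisfactory results. The model was then applied to the unclassified single amino acid substitution sites in the SwissProt database to annotate their disease association. In addition, we built a new online prediction server based on the model (also named SubSeqPred), which predicts disease association from only a protein sequence and the substitution site information.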
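     Chapter 3 uses PTMs as the entry point for examining the pathogenic mechanisms of disease-associated SAP sites. We collected a large number of experimentally verified PTM samples from databases, matched them against human disease-associated SAP sites, cancer somatic SAP sites and neutral SAP sites, and compiled statistics on the conservation of the corresponding sites and on the change in amino acid properties caused by the substitution. We found that about 4.5% of the amino acid substitutions in the disease-associated SAP data affect protein function by disrupting a post-translational modification. On the other hand, about 2% of neutral substitutions also affect post-translational modification. This result indicates that disruption of post-translational modification is not the principal cause of human genetic disease. Nevertheless, we found 238 modification sites whose mutation definitely causes human disease and 1,289 modification sites located in the neighbourhood of mutations associated with genetic disease. These sites can serve as candidate targets for further experiments on pathogenic mechanisms.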
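     Building on the two studies above, Chapter 4 presents an in-depth study of the relationship between the disruption of palmitoylation and the pathogenicity of SAPs. We first used protein sequence descriptors and the random forest method to build a concise and effective model for identifying palmitoylation sites, and then applied it to all human single amino acid substitution sites, finding that a number of disease-associated substitution sites were predicted to be palmitoylation sites. From a search of the literature, we can largely confirm that the pathogenicity of five of these sites should be related to the disruption of palmitoylation. This demonstrates the practical value of the model and also provides a useful reference for explaining the pathogenic mechanisms of these SAPs.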
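     Chapters 5 and 6 describe two further bioinformatics studies concerned with mathematical modelling: a qualitative model for the prediction and identification of T-cell epitopes, and a quantitative model for predicting the binding affinity between proteins and drug-molecule ligands. Both studies gave accurate and reliable predictions, and laid a fairly solid mathematical foundation for the SAP modelling work.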
Single amino acid polymorphisms (SAPs) are ubiquitous in eukaryotic genomes and are closely related to human genetic diseases, and hence play a major role in pharmacogenomics. Identifying SAPs in specific genes relevant to drug efficacy, toxicity and metabolism will help to establish optimal therapeutic strategies for individual patients. The study of SAPs is therefore believed to be critical for a better understanding of the causes of disease at the molecular level, and has become one of the most active areas in genome-wide studies.
     This dissertation focuses on disease-associated SAPs, using various mathematical and bioinformatics approaches to investigate the molecular causes of human genetic disease. First, new sets of sequence features were explored and a concise, accurate and reliable model for identifying deleterious SAP sites was built. The model was applied to real samples and used to predict the function of new SAPs. While saving cost and time, the results provide theoretical support and candidate research targets for later experimental validation. Next, from the perspective of post-translational modification (PTM), statistics were compiled on the disease association of PTM sites disrupted by SAPs. This study covered all major PTM types, in order to interpret comprehensively the relation between the disruption of PTM sites and disease mechanisms. Finally, we focused on one specific type of PTM, palmitoylation. The relation between disrupted palmitoylation sites and human disease was studied carefully, yielding further insight into the molecular causes of human genetic disease.
     Chapter 1 gives a brief introduction to the background, data resources and prediction methods of deleterious SAP identification, and then presents the study procedure and mathematical methods used in this dissertation.
     In Chapter 2, a concise and promising method for identifying deleterious amino acid polymorphisms, called SubSeqPred, was developed. The method is based on 44 features extracted solely from the protein sequence and achieved surprisingly good predictive ability without resorting to homology or evolutionary information, which is frequently used in similar methods and is usually more complex and time-consuming to compute. The method was then used to classify 2,127 unclassified single amino acid substitutions in the SwissProt database as disease-associated or not, providing annotation to support later experimental validation. In addition, a web server for the method was developed, also called SubSeqPred, which requires only the protein sequence and substitution site information as input.
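To make the idea of sequence-only substitution features concrete, the following is a minimal sketch in Python. It is not the SubSeqPred feature set: the abstract does not list the 44 descriptors, so the choice of the Kyte-Doolittle hydropathy scale, the window size, and the function name `substitution_features` are all assumptions made for illustration. Feature vectors of this kind could then be passed to any classifier, such as a random forest.

```python
# Kyte-Doolittle hydropathy index. Using this particular scale is an
# assumption for illustration; the thesis descriptors are not given here.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def substitution_features(sequence, position, mutant, window=3):
    """Encode one substitution from sequence alone (hypothetical helper).

    position is 1-based; mutant is the one-letter code of the new residue.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position outside the sequence")
    wild = sequence[position - 1]
    if wild not in KD_HYDROPATHY or mutant not in KD_HYDROPATHY:
        raise ValueError("non-standard residue")

    # Residues flanking the site, excluding the substituted residue itself.
    start = max(0, position - 1 - window)
    end = min(len(sequence), position + window)
    flanking = sequence[start:position - 1] + sequence[position:end]
    scored = [KD_HYDROPATHY[a] for a in flanking if a in KD_HYDROPATHY]
    context = sum(scored) / len(scored) if scored else 0.0

    return {
        "wild_hydropathy": KD_HYDROPATHY[wild],
        "mutant_hydropathy": KD_HYDROPATHY[mutant],
        "delta_hydropathy": KD_HYDROPATHY[mutant] - KD_HYDROPATHY[wild],
        "context_hydropathy": context,
    }


if __name__ == "__main__":
    # Toy sequence, not taken from any database.
    print(substitution_features("MKTAYIAKQR", 4, "D"))
```

The input mirrors what the server is described as requiring: a protein sequence plus the substitution site and residue.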
     In Chapter 3, the relation between human genetic disease and the disruption of all main types of PTM by SAPs is assessed. Experimentally verified PTM sites were searched against amino acid substitution databases to investigate whether, and in which ways, PTMs are affected by inherited and somatic disease SAPs. We found that about 4.5% of deleterious amino acid substitutions (3.9% of unique sites) may affect protein function through disruption of PTMs. On the other hand, about 2% of neutral polymorphisms may also affect PTMs. These numbers indicate that disruption of PTMs is not the major cause of human genetic disease. Nevertheless, we found 238 post-translationally modified sites in human proteins whose mutation was causative of disease. In total, 1,289 modification sites were found in close proximity to inherited disease mutations, and these represent candidates for further experimental verification.
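The matching step described above can be sketched as a simple position lookup. The code below is an illustrative Python sketch, not the thesis pipeline: the function name `ptm_disruption_stats`, the tuple layout, and the protein identifiers are hypothetical, and the toy data are invented. It reports both the fraction of substitutions and the fraction of unique sites that coincide with a PTM site, with an optional `flank` to count substitutions in the neighbourhood of a site.

```python
def ptm_disruption_stats(substitutions, ptm_sites, flank=0):
    """Fraction of substitutions (and of unique sites) at or near a PTM site.

    substitutions: list of (protein_id, position, mutant_residue)
    ptm_sites:     list of (protein_id, position)
    flank:         0 counts exact hits only; n > 0 also counts positions
                   within n residues of a PTM site
    """
    if not substitutions:
        raise ValueError("no substitutions given")

    # Index PTM positions by protein for fast lookup.
    ptm_by_protein = {}
    for protein, position in ptm_sites:
        ptm_by_protein.setdefault(protein, set()).add(position)

    def near_ptm(protein, position):
        return any(abs(position - site) <= flank
                   for site in ptm_by_protein.get(protein, ()))

    hits = [s for s in substitutions if near_ptm(s[0], s[1])]
    unique_sites = {(p, pos) for p, pos, _ in substitutions}
    unique_hits = {(p, pos) for p, pos, _ in hits}

    return {
        "substitution_fraction": len(hits) / len(substitutions),
        "site_fraction": len(unique_hits) / len(unique_sites),
    }


if __name__ == "__main__":
    # Invented toy data: two substitutions share the site P1:10.
    subs = [("P1", 10, "A"), ("P1", 10, "G"), ("P1", 15, "D"),
            ("P2", 7, "L"), ("P3", 3, "K")]
    ptms = [("P1", 10), ("P2", 9)]
    print(ptm_disruption_stats(subs, ptms))
    print(ptm_disruption_stats(subs, ptms, flank=2))
```

Counting substitutions and unique sites separately matters because several different substitutions can fall on the same residue, which is why the two percentages reported above differ.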
     In Chapter 4, building on the work above, we carried out an in-depth study of the relation between the disruption of palmitoylation sites by SAPs and disease. First, protein sequence features and the random forest method were used to build a simple and effective model for identifying palmitoylation sites. The model was then applied to all human single amino acid substitution sites, and a number of disease-related substitutions were predicted to fall on palmitoylation sites. A literature search indicated that the pathogenicity of five of these sites is likely related to the disruption of palmitoylation, which both demonstrates the practical value of the model and offers useful insight into the pathogenic mechanism of these disease-related substitution sites.
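As a rough illustration of the screening step, the Python sketch below selects substitutions that remove a cysteine and extracts the surrounding sequence window that a site classifier would score. It assumes S-palmitoylation on cysteine, the most common form, which the abstract does not state explicitly. The function name `cysteine_loss_candidates`, the window size and the toy sequence are hypothetical, and the thesis's random forest model itself is not reproduced.

```python
def cysteine_loss_candidates(sequence, substitutions, flank=5):
    """Return substitutions that replace a cysteine, with their windows.

    sequence:      protein sequence (one-letter codes)
    substitutions: list of (position, mutant_residue), position 1-based
    flank:         residues kept on each side; '-' pads past the termini
    """
    candidates = []
    for position, mutant in substitutions:
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} outside the sequence")
        wild = sequence[position - 1]
        # Only a cysteine lost to another residue can remove the site.
        if wild != "C" or mutant == "C":
            continue
        center = position - 1
        window = "".join(
            sequence[i] if 0 <= i < len(sequence) else "-"
            for i in range(center - flank, center + flank + 1)
        )
        candidates.append((position, mutant, window))
    return candidates


if __name__ == "__main__":
    # Invented toy sequence and substitutions.
    seq = "MGCVQCKDKEA"
    subs = [(3, "S"), (4, "A"), (6, "C"), (6, "Y")]
    for candidate in cysteine_loss_candidates(seq, subs, flank=3):
        print(candidate)
```

Each returned window would then be scored by a palmitoylation site predictor, and only windows predicted as modified would be followed up in the literature.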
     Chapters 5 and 6 briefly introduce two other bioinformatics studies in the related area of drug discovery, which served as the mathematical modelling basis for the SAP work: the identification of T-cell epitopes, and a quantitative study on predicting protein-drug (ligand) binding affinity.
引文
[1]Mooney, S., Bioinformatics approaches and resources for single nucleotide polymorphism. Brief. Bioinform.,2005.6(1):p.44-56.
    [2]Capriotti, E., Calabrese, R., and Casadio, R., Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics,2006. 22(22):p.2729-2734.
    [3]Ng, P.C. and Henikoff, S., Predicting the Effects of Amino Acid Substitutions on Protein Function. Annu. Rev. Genom. Hum. G.,2006.7:p.61-80.
    [4]Karchin, R., Next generation tools for the annotation of human SNPs. Brief. Bioinform.,2009.10(1):p.35-52.
    [5]Noble, W.S., What is a support vector machine? Nat. Biotech.,2006.24(12):p. 1565-1567.
    [6]Yuan, H.-Y., et al., FASTSNP:an always up-to-date and extendable service for SNP function analysis and prioritization. NuclAcids Res.,2006.34(suppl 2):p.W635-W641.
    [7]Johnson, A.D., Single-nucleotide polymorphism bioinformatics:a comprehensive review of. Circ. Cardiovasc. Genet.,2009.2(5):p.530-6.
    [8]Sherry, S.T., Ward,M.H.,Kholodov,M.,Baker, J.,Phan,L., Smigielski,E.M., and Sirotkin, K., dbSNP:the NCBI database of genetic variation. Nucl. Acids Res.,2001.29(1):p.308-311.
    [9]http://www.ncbi.nlm.nih.gov.
    [10]Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., Haussler, and David, The Human Genome Browser at UCSC. Genome Res.,2002.12(6):p.996-1006.
    [11]Cochrane, G., et al., Petabyte-scale innovations at the European Nucleotide Archive. Nucl. Acids Res.,2009.37(Database issue):p. D19-25.
    [12]Frazer, K.A., et al., A second generation human haplotype map of over 3.1 million SNPs. Nature,2007.449(7164):p.851-61.
    [13]Hirakawa, M., Tanaka, T., Hashimoto, Y., Kuroda, M., Takagi, T., and Nakamura, Y., JSNP:a database of common gene variations in the Japanese population. Nucl. Acids Res.,2002.30(1):p.158-162.
    [14]Park, J., Hwang, S., Lee, Y.S., Kim, S.C., and Lee, D., SNP@Ethnos:a database of ethnically variant single-nucleotide polymorphisms. Nucl. Acids Res.,2007.35(Database issue):p. D711-5.
    [15]Osier, M.V., Cheung, K.H., Kidd, J., Pakstis, A.J., Miller, P.I., and Kidd, K.K., ALFRED:an allele frequency database for diverse populations and DNA. Nucl. Acids Res.,2001.29(1):p.317-9.
    [16]Ionita-Laza, I., Lange, C., and Laird, N., Estimating the number of unseen variants in the human genome. P. Natl. Acad. Sci. USA,2009.106(13):p. 5008-5013.
    [17]Amberger, J., Bocchini, C., Scott, A., and Hamosh, A., McKusick's Online Mendelian Inheritance in Man (OMIM). Nucl. Acids Res.,2009.37(Database issue):p. D793-796.
    [18]Stenson, P.D., et al., Human Gene Mutation Database (HGMD):2003 update. Hum. Mutat.,2003.21(6):p.577-581.
    [19]Brandon, M.C., Lott, M.T., Nguyen, K.C., Spolim, S., Navathe, S.B., Baldi, P., and Wallace, D.C., MITOMAP:a human mitochondrial genome database:2004 update. Nucl. Acids Res.,2005.33(suppl 1):p. D611-D613.
    [20]Bandelt, H.J., Salas, A., Taylor, R.W., and Yao, Y.G., Exaggerated status of "novel" and "pathogenic" mtDNA sequence variants due to inadequate database searches. Hum. Mutat.,2009.30(2):p.191-196.
    [21]Becker, K.G., Barnes, K.C., Bright, T.J., and Wang, S.A., The genetic association database. Nat. Genet.,2004.36(5):p.431-2.
    [22]Mailman, M.D., et al., The NCBI dbGaP database of genotypes and phenotypes. Nat. genet.,2007.39(10):p.1181-6.
    [23]Wang, Z. and Moult, J., SNPs, protein structure, and disease. Hum. Mutat., 2001.17(4):p.263-270.
    [24]Ramensky, V., Bork, P., and Sunyaev, S., Human non-synonymous SNPs: server and survey. Nucl. Acids Res.,2002.30(17):p.3894-3900.
    [25]Sunyaev, S., Ramensky, V., Koch, I., Lathe Iii, W., Kondrashov, A.S., and Bork, P., Prediction of deleterious human alleles. Hum. Mol. Genet.,2001. 10(6):p.591-597.
    [26]Chasman, D. and Adams, R.M., Predicting the functional consequences of non-synonymous single nucleotide. J. Mol. Biol.,2001.307(2):p.683-706.
    [27]Ng, P.C. and Henikoff, S., Predicting Deleterious Amino Acid Substitutions. Genome Res.,2001.11(5):p.863-874.
    [28]Ng, P.C. and Henikoff, S., Accounting for Human Polymorphisms Predicted to Affect Protein Function. Genome Res.,2002.12(3):p.436-446.
    [29]Ng, P.C. and Henikoff, S., SIFT:predicting amino acid changes that affect protein function. Nucl. Acids Res,2003.31(13):p.3812-3814.
    [30]Ferrer-Costa, C., Orozco, M., and de la Cruz, X., Characterization of disease-associated single amino acid polymorphisms in terms of sequence and structure properties. J. Mol. Biol.,2002.315(4):p.771-786.
    [31]Ferrer-Costa, C., Orozco, M., and Cruz, X.d.1., Sequence-based prediction of pathological mutations. Proteins.,2004.57(4):p.811-819.
    [32]Ferrer-Costa, C., Orozco, M., and Cruz, X.d.1., Use of bioinformatics tools for the annotation of disease-associated mutations in animal models. Proteins., 2005.61(4):p.878-887.
    [33]Saunders, C.T. and Baker, D., Evaluation of Structural and Evolutionary Contributions to Deleterious Mutation Prediction. J. Mol. Biol.,2002.322(4): p.891-901.
    [34]Terp, B.N., Cooper, D.N., Christensen, I.T., Jφrgensen, F.S., Bross, P., Gregersen, N., and Krawczak, M., Assessing the relative importance of the biophysical properties of amino acid substitutions associated with human genetic disease. Hum. Mutat.,2002.20(2):p.98-109.
    [35]Mooney, S.D., Klein, T.E., Altman, R.B., Trifiro, M.A., and Gottlieb, B., A functional analysis of disease-associated mutations in the androgen receptor gene. Nucl. Acids Res.,2003.31(8):p. e42.
    [36]Mooney, S.D. and Altman, R.B., MutDB:annotating human variation with functionally relevant data. Bioinformatics,2003.19(14):p.1858-1860.
    [37]Stitziel, N.O., Tseng, Y.Y., Pervouchine, D., Goddeau, D., Kasif, S., and Liang, J., Structural Location of Disease-associated Single-nucleotide Polymorphisms. J. Mol. Biol.,2003.327(5):p.1021-1030.
    [38]Stitziel, N.O., Binkowski, T.A., Tseng, Y.Y., Kasif, S., and Liang, J., topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucl. Acids Res.,2004.32(suppl 1):p. D520-D522.
    [39]Krishnan, V.G. and Westhead, D.R., A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics,2003.19(17):p.2199-2209.
    [40]Thomas, P.D., et al., PANTHER:A Library of Protein Families and Subfamilies Indexed by Function. Genome Res.,2003.13(9):p.2129-2141.
    [41]del Sol Mesa, A., Pazos, F., and Valencia, A., Automatic Methods for Predicting Functionally Important Residues. J. Mol. Biol.,2003.326(4):p. 1289-1302.
    [42]Fleming, M.A., Potter, J.D., Ramirez, C.J., Ostrander, G.K., and Ostrander, E.A., Understanding missense mutations in the BRCA1 gene:An evolutionary approach. P. Natl. Acad. Sci. USA,2003.100(3):p.1151-1156.
    [43]Koref, M.S., Gangeswaran, R., Koref, I.S., Shanahan, N., and Hancock, J., A phylogenetic approach to assessing the significance of missense mutations in disease genes. Hum. Mutat.,2003.22(1):p.51-58.
    [44]Herrgard, S., Cammer, S.A., Hoffman, B.T., Knutson, S., Gallina, M., Speir, J.A., Fetrow, J.S., and Baxter, S.M., Prediction of deleterious functional effects of amino acid mutations using a library of structure-based function descriptors. Proteins.,2003.53(4):p.806-816.
    [45]Cai, Z., Tsung, E.F., Marinescu, V.D., Ramoni, M.F., Riva, A., and Kohane, I.S., Bayesian approach to discovering pathogenic SNPs in conserved protein domains. Hum. Mutat.,2004.24(2):p.178-184.
    [46]Lau, A.Y. and Chasman, D.I., Functional classification of proteins and protein variants. P. Natl. Acad Sci. USA,2004.101(17):p.6576-6581.
    [47]Stone, E.A. and Sidow, A., Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res.,2005.15(7):p.978-986.
    [48]Yue, P., Li, Z., and Moult, J., Loss of Protein Structure Stability as a Major Causative Factor in Monogenic Disease. J. Mol. Biol.,2005.353(2):p. 459-473.
    [49]Yue, P. and Moult, J., Identification and Analysis of Deleterious Human SNPs. J. Mol. Biol.,2006.356(5):p.1263-1274.
    [50]Bao, L. and Cui, Y., Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics,2005.21(10):p.2185-2190.
    [51]Bao, L. and Cui, Y., Functional impacts of non-synonymous single nucleotide polymorphisms:Selective constraint and structural environments. FEBS Lett., 2006.580(5):p.1231-1234.
    [52]Barenboim, M., Jamison, D.C., and Vaisman, I.I., Statistical geometry approach to the study of functional effects of human nonsynonymous SNPs. Hum. Mutat.,2005.26(5):p.471-476.
    [53]Barenboim, M., Masso, M., Vaisman, I.I., and Jamison, D.C., Statistical geometry based prediction of nonsynonymous SNP functional effects using random forest and neuro-fuzzy classifiers. Proteins.,2008.71(4):p. 1930-1939.
    [54]Dobson, R., Munroe, P., Caulfield, M., and Saqi, M., Predicting deleterious nsSNPs:an analysis of sequence and structural attributes. BMC Bioinformatics, 2006.7(1):p.217.
    [55]Capriotti, E., Arbiza, L., Casadio, R., Dopazo, J., Dopazo, H., and Marti-Renom, M.A., Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans. Hum. Mutat.,2008.29(1):p.198-204.
    [56]Ye, Z.-Q., Zhao, S.-Q., Gao, G., Liu, X.-Q., Langlois, R.E., Lu, H., and Wei, L., Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP). Bioinformatics,2007. 23(12):p.1444-1450.
    [57]Bromberg, Y. and Rost, B., SNAP: predict effect of non-synonymous polymorphisms on function. Nucl. Acids Res.,2007.35(11):p.3823-3835.
    [58]Kaminker, J.S., et al., Distinguishing Cancer-Associated Missense Mutations from Common Polymorphisms. Cancer Res.,2007.67(2):p.465-473.
    [59]Kaminker, J.A., Zhang, Y., Watanabe, C., and Zhang, Z., CanPredict:a computational tool for predicting cancer-associated missense. Nucl. Acids Res., 2007.35(Web Server issue):p. W595-8.
    [60]Hon, L.S., Kaminker, J.S., and Zhang, Z., Computational Approaches for Predicting Causal Missense Mutations in Cancer Genome Projects. Curr. Bioinformatics,2008.3:p.46-55.
    [61]Kulkarni, V., Errami, M., Barber, R., and Garner, H.R., Exhaustive prediction of disease susceptibility to coding base changes in the. BMC Bioinformatics, 2008.9 Suppl 9:p. S3.
    [62]Li, B., Krishnan, V.G., Mort, M.E., Xin, F., Kamati, K.K., Cooper, D.N., Mooney, S.D., and Radivojac, P., Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics,2009. 25(21):p.2744-2750.
    [63]Masso, M. and Vaisman, I.I., Knowledge-based computational mutagenesis for predicting the disease potential of human non-synonymous single nucleotide polymorphisms. J. Theor. Biol.,2010.266(4):p.560-568.
    [64]Li, S., et al., In silico prediction of deleterious single amino acid polymorphisms from amino acid sequence. J. comput. chem. in Press.
    [65]Clauset, A., Finding local community structure in networks. Phys. Rev. E., 2005.72(2):p.026132.
    [66]Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D., Swiss-Prot/TrEMB Gapped BLAST and PSI-BLAST:a new generation of protein database search programs. Nucl. Acids Res.,1997. 25(17):p.3389-3402.
    [67]Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E., The protein data bank. Nucl. Acids Res., 2000.28:p.235-242.
    [68]Carlson, J.M., Chakravarty, A., DeZiel, C.E., and Gross, R.H., SCOPE:a web server for practical de novo motif discovery. Nucl. Acids Res.,2007.35(suppl 2):p. W259-W264.
    [69]Bader, G.D., Betel, D., and Hogue, C.W., BIND:the Biomolecular Interaction Network Database. Nucl. Acids Res.,2003.31(1):p.248-50.
    [70]Zanzoni, A., Montecchi-Palazzi, L., Quondam, M., Ausiello, G., Helmer-Citterich, M., and Cesareni, G., MINT: a Molecular INTeraction database. FEBS Lett.,2002.513(1):p.135-40.
    [71]Ashburner, M., et al., Gene ontology:tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet.,2000.25(1):p.25-9.
    [72]Okuda, S., Yamada, T., Itoh, M., Katayama, T., Bork, P., Goto, S., and Kanehisa, M., KEGG Atlas mapping for global analysis of metabolic pathways. Nucl. Acids Res.,2008.36(Web Server issue):p. W423-6.
    [73]Hubbard, T., et al., The Ensembl genome database project. Nucl. Acids Res., 2002.30(1):p.38-41.
    [74]Wang, P., et al., SNP Function Portal:a web database for exploring the function implication of SNP alleles. Bioinformatics,2006.22(14):p. e523-e529.
    [75]Reumers, J., Maurer-Stroh, S., Schymkowitz, J., and Rousseau, F., SNPeffect v2.0:a new step in investigating the molecular phenotypic effects of. Bioinformatics,2006.22(17):p.2183-5.
    [76]Conde, L., Vaquerizas, J.M., Dopazo, H., Arbiza, L., Reumers, J., Rousseau, F., Schymkowitz, J., and Dopazo, J., PupaSuite:finding functional single nucleotide polymorphisms for large-scale. Nucl. Acids Res.,2006.34(Web Server issue):p. W621-5.
    [77]de Berg, M., Cheong, O., and Van, K.M., Computational Geometry: Algorithms and Approaches.2000, Berlin, New York:Springer.
    [78]Kawabata, T., Ota, M., M, O., and Nishikawa, K., The Protein Mutant Database. Nucl. Acids Res.,1999.27(1):p.355-7.
    [79]Rennell, D., Bouvier, S.E., and Hardy, L.W., Systematic mutation of bacteriophage T4 lysozyme. J. Mol. Biol.,1991.222:p.67-88.
    [80]Markiewicz, P., Kleina, L., and Cruz, C., Genetic studies of the lac repressor. ⅩⅣ. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as 'spacers' which do not require a specific sequence. J. Mol. Biol.,1994.240:p.421-33.
    [81]Packer, B.R., et al., SNP500Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes. Nucl. Acids Res.,2006.34(suppl 1):p. D617-D621.
    [82]Norris, J., Markov Chains.1998:Cambridge University Press.
    [83]Rabiner, L.R., A tutorial on hidden Markov models and selected applications in speech recognition.1990, San Francisco:Morgan Kaufmann Publishers Inc.
    [84]Hughes, B.D., Random walks and random environments,.1996:Oxford University Press.
    [85]Cha, S.-H. and Tappert, C.C., A Genetic Algorithm for Constructing Compact Binary Decision Trees. J. Patt. Rec. Res.2009.4:p.1-13.
    [86]Bhadeshia, H.K.D.H., Neural Networks in Materials Science. ISIJ International 1999.39:p.966-979.
    [87]Neal, R.M., Bayesian Learning for Neural Networks, ed. Verlag.1996, New York:Springer.
    [88]Bremner, D., Demaine, E., Erickson, J., Iacono, J., Langerman, S., Morin, P., and Toussaint, G., Output-Sensitive Algorithms for Computing Nearest-Neighbour Decision Boundaries. Dis. Comput. Geomet.,2005.33(4): p.593-604.
    [89]Atkeson, C.G., Moore, A.W., and Schaal, S., Locally Weighted Learning. Artif. Intell. Rev.,1997.11(1):p.11-73.
    [90]Lukaszyk, S., A new concept of probability metric and its applications in approximation of scattered data sets. Comput. Mech.,2004.33(4):p.299-304.
    [91]Russell, S. and Norvig, P., Artificial Intelligence:A Modern Approach, second edition.2003:Prentice Hall.
    [92]Davidor, Y., Schwefel, H.-P., Manner, R., Eiben, A., Raue, P., and Ruttkay, Z., Genetic algorithms with multi-parent recombination, in Parallel Problem Solving from Nature — PPSN Ⅲ.1994, Springer Berlin / Heidelberg. p. 78-87.
    [93]Han, L.Y., et al., Support vector machines approach for predicting druggable proteins:recent progress in its exploration and investigation of its usefulness. Drug Discov. Today,2007.12:p.304-313.
    [94]Breiman, L., Random Forests. Machine Learning,2001.45(1):p.5-32.
    [95]Schneider, G. and Wrede, P., The rational design of amino acid sequences by artificial neural networks and. Biophys. J.,1994.66(2 Pt 1):p.335-44.
    [96]Grantham, R., Amino acid difference formula to help explain protein evolution. Science,1974.185(4154):p.862-4.
    [97]Reczko, M., Karras, D., and Bohr, H., An update of the DEF database of protein fold class predictions. Nucl. Acids Res.,1997.25(1):p.235-235.
    [98]http://www.genome.jp/aaindex.
    [99]Moreau, G. and Broto, P., Autocorrelation of a topological structure:a new molecular descriptor. Nouv. J. Chim,1980.4:p.359-360.
    [100]David, S.H., Prediction of protein helix content from an autocorrelation analysis of sequence hydrophobicities. Biopolymers,1988.27(3):p.451-477.
    [101]Chou, K.C., Prediction of Protein Subcellular Locations by Incorporating Quasi-Sequence-Order Effect. Biochem. Bioph. Res. Co.,2000.278(2):p. 477-483.
    [102]Robert, R.S. and Barbara, A.T., Population structure inferred by local spatial autocorrelation:An example from an Amerindian tribal population. Am. J. Phys. Anthropol.,2006.129(1):p.121-131.
    [103]Dubchak, I., Muchnik, I., Holbrook, S.R., and Kim, S.H., Prediction of protein folding class using global description of amino acid sequence. P. Natl. Acad. Sci. USA,1995.92(19):p.8700-8704.
    [104]Inna, D., Ilya, M., Christopher, M., Igor, D., and Sung-Hou, K., Recognition of a protein fold in the context of the SCOP classification. Proteins,1999. 35(4):p.401-407.
    [105]Chou, K.C. and Cai, Y.D., Prediction of protein subcellular locations by GO-FunD-PseAA predictor. Biochem. Bioph. Res. Co.,2004.320(4):p. 1236-1239.
    [106]Schneider, G. and Wrede, P., The rational design of amino acid sequences by artificial neural networks and simulated molecular evolution:de novo design of an idealized leader peptidase cleavage site. Biophys J.,1994.66(2):p. 335-344.
    [107]Grantham, R., Amino Acid Difference Formula to Help Explain Protein Evolution. Science,1974.185(4154):p.862-864.
    [108]Kuo-Chen, C., Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins,2001.43(3):p.246-255.
    [109]Chou, K.C., Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics,2005.21(1):p.10-19.
    [110]Feng, Z.P., An overview on predicting the subcellular location of a protein. In Silico Biol.,2002.2(3):p.291-303.
    [111]Li, Z.R., Lin, H.H., Han, L.Y., Jiang, L., Chen, X., and Chen, Y.Z., PROFEAT:a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence. Nucl. Acids Res., 2006.34(suppl_2):p. W32-37.
    [112]Cid, H., Bunster, M., Canales, M., and Gazitua, F., Hydrophobicity and structural classes in proteins. Protein Eng.,1992.5(5):p.373-375.
    [113]Bhaskaran, R. and Ponnuswamy, P., Positional flexibilities of amino acid residues in globular proteins. Int. J. Pept. Protein Res.,1988.32(4):p. 241-255.
    [114]Charton, M. and Charton, B.I., The structural dependence of amino acid hydrophobicity parameters. J. Theor. Biol.,1982.99(4):p.629-644.
    [115]Chothia, C., The nature of the accessible and buried surfaces in proteins. J. Mol.Biol.,1976.105(1):p.1-12.
    [116]Pontius, J., Richelle, J., and Wodak, S.J., Deviations from Standard Atomic Volumes as a Quality Measure for Protein Crystal Structures. J. Mol. Biol., 1996.264(1):p.121-136.
    [117]Fauchere, J.I., Charton, M., Kier, L.B., Verloop, A., and Pliska, V., Amino acid side chain parameters for correlation studies in biology and. Int. J. Pept. Protein Res.,1988.32(4):p.269-78.
    [118]Jones, D.T., Taylor, W.R., and Thornton, J.M., The rapid generation of mutation data matrices from protein sequences. CABIOS,1992.8(3):p. 275-282.
    [119]Furusj, E., Svenson, A., Rahmberg, M., and Andersson, M., The importance of outlier detection and training set selection for reliable environmental QSAR predictions. Chemosphere,2006.63(1):p.99-108.
    [120]Daszykowski, M., Walczak, B., and Massart, D.L., Representative subset selection. Anal. Chim. Acta,2002.468(1):p.91-103.
    [121]Tominaga, Y., Representative subset selection using genetic algorithms. Chemometr. Intell. Lab.,1998.43(1-2):p.157-163.
    [122]Galvao, R.K.H., Araujo, M.C.U., Jose Gledson, E., Pontes, M.J.C., Silva, E.C., and Saldanha, T.C.B., A method for calibration and validation subset partitioning. Talanta,2005.67(4):p.736-740.
    [123]Golbraikh, A. and Tropsha, A., Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. Mol. Divers.,2000.5(4):p.231-243.
    [124]Snee, R.D., Validation of regression models:methods and examples. Technometrics,1977.19:p.415-428.
    [125]Ren, Y.Y., Liu, H.X., Yao, X.J., and Liu, M.C., Three-dimensional topographic index applied to the prediction of acyclic C5-C8 alkenes Kovats retention indices on polydimethylsiloxane and squalane columns. J. Chromatogr. A,2007.1155(1):p.105-111.
    [126]Ren, Y.Y., Liu, H.X., Yao, X.J., Liu, M.C., Hu, Z.D., and Fan, B.T., The accurate QSPR models for the prediction of nonionic surfactant cloud point. J. Colloid Interf. Sci.,2006.302(2):p.669-672.
    [127]Ren, Y.Y., Liu, H.X., Yao, X.J., and Liu, M.C., Prediction of ozone tropospheric degradation rate constants by projection pursuit regression. Anal. Chim. Acta,2007.589(1):p.150-158.
    [128]Ren, Y.Y., Liu, H.X., Li, S.Y., Yao, X.J., and Liu, M.C., Prediction of binding affinities to [beta]1 isoform of human thyroid hormone receptor by genetic algorithm and projection pursuit regression. Bioorg. Med. Chem. Lett.,2007. 17(9):p.2474-2482.
    [129]Ren, Y.Y., Liu, H.X., Yao, X.J., and Liu, M.C., An accurate QSRR model for the prediction of the GC×GC-TOFMS retention time of polychlorinated biphenyl (PCB) congeners. Anal. Bioanal. Chem.,2007.388(1):p.165-172.
    [130]Kennard, R.W. and Stone, L.A., Computer Aided Design of Experiments. Technometrics,1969.11(1):p.137-148.
    [131]Eriksson, L. and Johansson, E., Multivariate design and modeling in QSAR. Chemometr. Intell. Lab.,1996.34(1):p.1-19.
    [132]Mitchell, T.J., An algorithm for the Construction of "D-optimal" Experimental Designs. Technometrics,1974.16:p.203-210.
    [133]Golbraikh, A. and Tropsha, A., Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. J. Comput. Aid Mol. Des.,2002.16(5):p.357-369.
    [134]Golbraikh, A., Molecular Dataset Diversity Indices and Their Applications to Comparison of Chemical Databases and QSAR Analysis. J. Chem. Inf. Model, 2000.40(2):p.414-425.
    [135]Daszykowski, M., Walczak, B., and Massart, D.L., On the Optimal Partitioning of Data with K-Means, Growing K-Means, Neural Gas, and Growing Neural Gas. J. Chem. Inf. Model,2002.42(6):p.1378-1389.
    [136]Adams, M.J., Chemometrics in Analytical Spectroscopy.2004, Cambridge, UK.
    [137]Taylor, R., Simulation Analysis of Experimental Design Strategies for Screening Random Compounds as Potential New Drugs and Agrochemicals. J. Chem. Inf. Model,1995.35(1):p.59-67.
    [138]Potter, T. and Matter, H., Random or Rational Design? Evaluation of Diverse Compound Subsets from Chemical Structure Databases. J. Med. Chem.,1998. 41(4):p.478-488.
    [139]Burden, F.R., Robust QSAR Models Using Bayesian Regularized Neural Networks. J. Med. Chem.,1999.42(16):p.3183-3187.
    [140]Chong, I.G. and Jun, C.H., Performance of some variable selection methods when multicollinearity is present. Chemometr. Intell. Lab.,2005.78(1-2):p. 103-112.
    [141]Mager, P.P. and Sanchez, L., Variable Subset Selection in the Presence of Flagged Observations and Multicollinear Descriptors in QSAR Curr. Comput. Aid. Drug Des.,2005.1(2):p.163-177.
    [142]任月英,QSPR/QSAR在药物、分析化学和环境科学中的应用.2007,兰州大学:兰州.
    [143]Holland, J., Adaptation in natural and artificial systems.1975:Arbor, A., Eds: University of Michigan Press.
    [144]Katritzky, A.R., Lobanov, V.S., and Karelson, M., CODESSA:Reference Manual; Version 2. University of Florida,1994.
    [145]Furnival, G.M. and Wilson, J.R.W., Regression by leaps and bounds. Technometrics,1974.16:p.499-504.
    [146]Agrafiotis, D.K., Bandyopadhyay, D., Wegner, J.K., and van Vlijmen, H., Recent Advances in Chemoinformatics. J. Chem. Inf. Model,2007.47(4):p. 1279-1293.
    [147]Pearl, J.H., Intelligent Search Strategies for Computer Problem Solving.1984: Addison-Wesley.
    [148]Kohavi, R. and John, G.H., Wrappers for feature subset selection. Artifi. Intell, 1997.97:p.273--324.
    [149]Suykens, J.A.K. and Vandewalle, J., Least Squares Support Vector Machine Classifiers. Neural Process Lett.,1999.9(3):p.293-300.
    [150]Vapnik, V.N., The Nature of Statistical Learning Theory.1995:Springer Verlag.
    [151]Vapnik, V.N., Statistical Learning Theory.1998:Wiley.
    [152]Cristianini, N. and Shawe-Taylor, J., Introduction to support vector machine and other kernel based learning methods. 2000, Cambridge:Cambridge University Press.
    [153]刘焕香,基于支持向量机方法的QSAR/QSPR在化学、生物及环境科学中的应用研究.2005,兰州大学:兰州.
    [154]薛春霞,SVM在QSPR中的应用及基于配体的计算机辅助药物设计。2005,兰州大学:兰州.
    [155]赵春燕,QSAR研究在生命分析化学和环境化学中的应用.2006,兰州大学:兰州.
    [156]Yao, X.J., Liu, H.X., Zhang, R.S., Liu, M.C., Hu, Z.D., Panaye, A., Doucet, J.P., and Fan, B.T., QSAR and Classification Study of 1,4-Dihydropyridine Calcium Channel Antagonists Based on Least Squares Support Vector Machines. Mol. Pharmaceutics,2005.2(5):p.348-356.
    [157]Liu, H.X., Papa, E., Walker, J.D., and Gramatica, P., In silico screening of estrogen-like chemicals based on different nonlinear classification models. J. Mol. Graph. Model,2007.26(1):p.135-144.
    [158]Liu, H.X., Yao, X.J., Zhang, R.S., Liu, M.C., Hu, Z.S., and Fan, B.T., Accurate Quantitative Structure-Property Relationship Model To Predict the Solubility of C60 in Various Solvents Based on a Novel Approach Using a Least-Squares Support Vector Machine. J. Phys. Chem. B,2005.109(43):p. 20565-20571.
    [159]Xi, L.L., Computer-aided prediction of drug and protein properties.2010, Lanzhou University:Lanzhou.
    [160]Segal, M.R., Machine Learning Benchmarks and Random Forest Regression. Technical Report, Center for Bioinformatics & Molecular Biostatistics, University of California, San Francisco,2004.
    [161]Liaw, A. and Wiener, M., Classification and Regression by randomForest. R News,2002.2(3):p.18-22.
    [162]Wu, B.L., et al., Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics,2003.19(13):p. 1636-1643.
    [163]Chan, J.C.W. and Paelinckx, D., Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sens. Environ.,2008. 112(6):p.2999-3011.
    [164]Svetnik, V., Liaw, A., Tong, C., Culberson, J.C., Sheridan, R.P., and Feuston, B.P., Random Forest:A Classification and Regression Tool for Compound Classification and QSAR Modeling. J. Chem. Inf. Model,2003.43(6):p. 1947-1958.
    [165]Gunther, E.C., Stone, D.J., Gerwien, R.W., Bento, P., and Heyes, M.P., Prediction of clinical drug efficacy by classification of drug-induced genomic expression profiles in vitro. P. Natl. Acad. Sci. USA,2003.100(16):p. 9608-9613.
    [166]Archer, K.J. and Kimes, R.V., Empirical characterization of random forest variable importance measures. Comput. Stat. Data Anal.,2008.52(4):p. 2249-2260.
    [167]Gramatica, P., Principles of QSAR models validation:internal and external. QSAR Comb. Sci.,2007.26(5):p.694-701.
    [168]Matthews, B.W., Comparison of the predicted and observed secondary structure of T4 phage lysozyme. BBA-Protein Struct. M,1975.405(2):p. 442-451.
    [169]Johnson, S.R., The Trouble with QSAR (or How I Learned To Stop Worrying and Embrace Fallacy). J. Chem. Inf. Model,2007.48(1):p.25-26.
    [170]Cramer Ⅲ, R.D., Bunce, J.D., Patterson, D.E., and Frank, I.E., Crossvalidation, Bootstrapping, and Partial Least Squares Compared with Multiple Regression in Conventional QSAR Studies. QSAR Comb. Sci.,1988.7(1):p.18-25.
    [171]Gentleman, R., Resampling Methods:A Practical Guide to Data Analysis (2nd ed.).1999:Birkhauser.
    [172]Shao, J., Linear Model Selection by Cross-Validation. J. Am. Stat. Assoc., 1993.88(422):p.486-494.
    [173]Golbraikh, A. and Tropsha, A., Beware of q2! J. Mol. Graph. Model,2002.20(4):p.269-276.
    [174]Leach, A.R., Molecular modelling:principles and applications.2001, Harlow, England:Pearson Education Limited.
    [175]Wold, S., Sjostrom, M., and Eriksson, L., Partial least squares projections to latent structures (PLS) in chemistry, in Encyclopedia of computational chemistry, P. von Rague Schleyer, Editor.1998, John Wiley & Sons:Chichester.
    [176]Yasri, A. and Hartsough, D., Toward an Optimal Procedure for Variable Selection and QSAR Model Building. J. Chem. Inf. Model,2001.41(5):p. 1218-1227.
    [177]Wold, S. and Eriksson, L., Statistical validation of QSAR results., in Chemometrics Methods in Molecular Design., H. van de Waterbeemd, Editor. 1995, VCH:Weinheim. p.309-318.
    [178]Vedani, A. and Dobler, M.,5D-QSAR:The Key for Simulating Induced Fit? J. Med. Chem.,2002.45(11):p.2139-2149.
    [179]Gramatica, P., Evaluation of different statistical approaches to the validation of Quantitative Structure-Activity Relationships.2004, ECVAM,JRC:Ispra.
    [180]Polanski, J., Gieleciak, R., and Bak, A., Probability issues in molecular design: predictive and modeling ability in 3D-QSAR schemes. Comb. Chem. High Throughput Screen.,2004.7(8):p.793-807.
    [181]Guha, R., Serra, J.R., and Jurs, P.C., Generation of QSAR sets with a self-organizing map. J. Mol. Graph. Model,2004.23(1):p.1-14.
    [1]Mooney, S., Bioinformatics approaches and resources for single nucleotide polymorphism. Brief. Bioinform.,2005.6(1):p.44-56.
    [2]Karchin, R., Next generation tools for the annotation of human SNPs. Brief. in Bioinform.,2009.10(1):p.35-52.
    [3]Johnson, A.D., Single-nucleotide polymorphism bioinformatics:a comprehensive review of. Circ. Cardiovasc. Genet.,2009.2(5):p.530-6.
    [4]Breiman, L., Random Forests. Machine Learning,2001.45(1):p.5-32.
    [5]Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D., Gapped BLAST and PSI-BLAST:a new generation of protein database search programs. Nucl. Acids Res.,1997. 25(17):p.3389-3402.
    [6]Care, M.A., Needham, C.J., Bulpitt, A.J., and Westhead, D.R., Deleterious SNP prediction:be mindful of your training data! Bioinformatics,2007.23(6): p.664-672.
    [7]Capriotti, E., Arbiza, L., Casadio, R., Dopazo, J., Dopazo, H., and Marti-Renom, M.A., Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans. Hum. Mutat.,2008.29(1):p.198-204.
    [8]Grantham, R., Amino acid difference formula to help explain protein evolution. Science,1974.185(4154):p.862-4.
    [9]Schneider, G. and Wrede, P., The rational design of amino acid sequences by artificial neural networks and. Biophys.J.,1994.66(2 Pt 1):p.335-44.
    [10]Kennard, R.W. and Stone, L.A., Computer aided design of experiments. Technometrics,1969.11:p.137-148.
    [11]Ng, P.C. and Henikoff, S., SIFT:predicting amino acid changes that affect protein function. Nucl. Acids Res.,2003.31(13):p.3812-3814.
    [12]Thomas, P.D., et al., PANTHER:A Library of Protein Families and Subfamilies Indexed by Function. Genome Res.,2003.13(9):p.2129-2141.
    [1]Walsh, C.T., Posttranslational modification of proteins:expanding nature's inventory.2006, Englewood,CO:Roberts and Company Publishers.
    [2]Mann, M. and Jensen, O.N., Proteomic analysis of post-translational modifications. Nat Biotech,2003.21(3):p.255-261.
    [3]Manning, G., Whyte, D.B., Martinez, R., Hunter, T., and Sudarsanam, S., The Protein Kinase Complement of the Human Genome. Science,2002. 298(5600):p.1912-1934.
    [4]Komander, D., Clague, M.J., and Urbe, S., Breaking the chains:structure and function of the deubiquitinases. Nat Rev Mol Cell Biol,2009.10(8):p. 550-563.
    [5]Wang, D., Harper, J.F., and Gribskov, M., Systematic Trans-Genomic Comparison of Protein Kinases between Arabidopsis and Saccharomyces cerevisiae. Plant Physiol.,2003.132(4):p.2152-2165.
    [6]Grasbon-Frodl, E., Lorenz, H., Mann, U., Nitsch, R.M., Windl, O., and Kretzschmar, H.A., Loss of glycosylation associated with the T183A mutation in human prion disease. Acta Neuropathologica,2004.108(6):p.476-484.
    [7]Rogers, M., Taraboulos, A., Scott, M., Groth, D., and Prusiner, S.B., Intracellular accumulation of the cellular prion protein after mutagenesis of its Asn-linked glycosylation sites. Glycobiology,1990.1(1):p.101-109.
    [8]Thomas, M., Dadgar, N., Aphale, A., Harrell, J.M., Kunkel, R., Pratt, W.B., and Lieberman, A.P., Androgen Receptor Acetylation Site Mutations Cause Trafficking Defects, Misfolding, and Aggregation Similar to Expanded Glutamine Tracts. J Biol Chem,2004.279(9):p.8389-8395.
    [9]Ton, K.L., Jones, C.R., He, Y., Eide, E.J., Hinz, W.A., Virshup, D.M., Ptacek, L.J., and Fu, Y.-H., An hPer2 Phosphorylation Site Mutation in Familial Advanced Sleep Phase Syndrome. Science,2001.291(5506):p.1040-1043.
    [10]Wang, Z. and Moult, J., SNPs, protein structure, and disease. Hum. Mutat, 2001.17(4):p.263-270.
    [11]Iakoucheva, L.M., Radivojac, P., Brown, C.J., O'connor, T.R., Sikes, J.G., Obradovic, Z., and Dunker, A.K., The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res,2004.32(3):p.1037-1049.
    [12]Daily, K.M., et al., Intrinsic disorder and protein modifications: building an SVM predictor for methylation, in IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB).2005:San Diego, California, U.S.A. p.475-481.
    [13]Radivojac, P., Vacic, V., Haynes, C., Cocklin, R.R., Mohan, A., Heyen, J.W., Goebl, M.G., and Iakoucheva, L.M., Identification, analysis, and prediction of protein ubiquitination sites. Proteins.,2010.78(2):p.365-380.
    [14]Vogt, G., et al., Gains of glycosylation comprise an unexpectedly large group of pathogenic mutations. Nat Genet,2005.37(7):p.692-700.
    [15]Lee, T.-Y., Huang, H.-D., Hung, J.-H., Huang, H.-Y., Yang, Y.-S., and Wang, T.-H., dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res.34(suppl 1):p. D622-D627.
    [16]Yang, C.-Y., et al., PhosphoPOINT:a comprehensive human kinase interactome and phospho-protein database. Bioinformatics,2008.24(16):p. i14-i20.
    [17]Radivojac, P., Baenziger, P.H., Kann, M.G., Mort, M.E., Hahn, M.W., and Mooney, S.D., Gain and loss of phosphorylation sites in human cancer. Bioinformatics,2008.24(16):p. i241-i247.
    [18]Li, B., Krishnan, V.G., Mort, M.E., Xin, F., Kamati, K.K., Cooper, D.N., Mooney, S.D., and Radivojac, P., Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics,2009. 25(21):p.2744-50.
    [19]Wu, C.H., et al., The Universal Protein Resource (UniProt):an expanding universe of protein information. Nucleic Acids Res.34(suppl 1):p. D187-D191.
    [20]Keshava Prasad, T.S., et al., Human Protein Reference Database-2009 update. Nucleic Acids Res,2009.37(suppl 1):p. D767-D772.
    [21]Diella, F., et al., Phospho.ELM:A database of experimentally verified phosphorylation sites in eukaryotic proteins. BMC Bioinformatics,2004.5(1): p.79.
    [22]Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E., The protein data bank. Nucl. Acids Res., 2000.28:p.235-242.
    [23]Gupta, R., Birch, H., Rapacki, K., Brunak, S., and Hansen, J.E., O-GLYCBASE version 4.0:a revised database of O-glycosylated proteins. Nucleic Acids Research,1999.27(1):p.370-372.
    [24]Biondi, R.M., Phosphoinositide-dependent protein kinase 1, a sensor of protein conformation. Trends in Biochemical Sciences,2004.29(3):p. 136-142.
    [25]Stenson, P.D., et al., Human Gene Mutation Database (HGMD):2003 update. Hum. Mutat,2003.21(6):p.577-581.
    [26]Lee, W., Yue, P., and Zhang, Z., Analytical methods for inferring functional effects of single base pair substitutions in human cancers. Hum Genet,2009. 126(4):p.481-498.
    [27]Songyang, Z., Blechner, S., Hoagland, N., Hoekstra, M.F., Piwnica-Worms, H., and Cantley, L.C., Use of an oriented peptide library to determine the optimal substrates of protein kinases. Curr biol,1994.4(11):p.973-982.
    [28]Oppliger, T., Thony, B., Nar, H., Burgisser, D., Huber, R., Heizmann, C.W., and Blau, N., Structural and Functional Consequences of Mutations in 6-Pyruvoyltetrahydropterin Synthase Causing Hyperphenylalaninemia in Humans. J Biol Chem,1995.270(49):p.29498-29506.
    [29]Scherer-Oppliger, T., Leimbacher, W., Blau, N., and Thony, B., Serine 19 of Human 6-Pyruvoyltetrahydropterin Synthase Is Phosphorylated by cGMP Protein Kinase Ⅱ. Journal of Biological Chemistry,1999.274(44):p. 31341-31348.
    [30]Thony, B., Leimbacher, W., Blau, N., Harvie, A., and Heizmann, C.W., Hyperphenylalaninemia due to defects in tetrahydrobiopterin metabolism: molecular. Am J Hum Genet,1994.54(5):p.782-92.
    [31]Pei, J. and Grishin, N.V., AL2CO:calculation of positional conservation in a protein sequence alignment. Bioinformatics,2001.17(8):p.700-712.
    [32]Thompson, J.D., Higgins, D.G., and Gibson, T.J., CLUSTAL W:improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res,1994.22(22):p.4673-4680.
    [33]Henikoff, S. and Henikoff, J.G., Position-based sequence weights. J Mol Biol, 1994.243(4):p.574-578.
    [34]Ng, P.C. and Henikoff, S., Predicting Deleterious Amino Acid Substitutions. Genome Res.,2001.11(5):p.863-874.
    [35]Beißbarth, T. and Speed, T.P., GOstat:find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics,2004.20(9):p. 1464-1465.
    [36]Dalkilic, M.M., Costello, J.C., Clark, W.T., and Radivojac, P., From protein-disease associations to disease informatics. Front Biosci,2008.13:p. 3391-407.
    [37]Ashburner, M., et al., Gene ontology:tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet.,2000.25(1):p.25-9.
    [1]Hannoush, R.N. and Sun, J., The chemical toolbox for monitoring protein fatty acylation and prenylation. Nat Chem Biol,2010.6(7):p.498-506.
    [2]Huang, K. and El-Husseini, A., Modulation of neuronal protein trafficking and function by palmitoylation. Curr Opin Neurobiol,2005.15(5):p.527-535.
    [3]El-Husseini, A.E.-D. and Bredt, D.S., Protein palmitoylation:a regulator of neuronal development and function. Nat Rev Neurosci,2002.3(10):p. 791-802.
    [4]Zhou, B., Liu, L., Reddivari, M., and Zhang, X.A., The Palmitoylation of Metastasis Suppressor KAI1/CD82 Is Important for Its Motility- and Invasiveness-Inhibitory Activity. Cancer Res,2004.64(20):p.7455-7463.
    [5]Yang, X., Kovalenko, O., Tang, W., Claas, C., Stipp, C., and Hemler, M., Palmitoylation supports assembly and function of integrin-tetraspanin complexes. J Cell Biol,2004.167(6):p.1231-1240.
    [6]Kalinina, E. and Fricker, L., Palmitoylation of carboxypeptidase D. Implications for intracellular trafficking. J Biol Chem,2003.278(11):p.9244-9249.
    [7]Navarro-Lerida, I., Corvi, M., Barrientos, A., Gavilanes, F., Berthiaume, L., and Rodriguez-Crespo, I., Palmitoylation of inducible nitric-oxide synthase at Cys-3 is required for proper intracellular traffic and nitric oxide synthesis. J Biol Chem,2004.279(53):p.55682-55689.
    [8]Salaun, C., Gould, G., and Chamberlain, L., The SNARE proteins SNAP-25 and SNAP-23 display different affinities for lipid rafts in PC12 cells. Regulation by distinct cysteine-rich domains. J Biol Chem,2005.280(2):p. 1236-1240.
    [9]Wong, W. and Schlichter, L., Differential recruitment of Kv1.4 and Kv4.2 to lipid rafts by PSD-95. J Biol Chem,2004.279(1):p.444-452.
    [10]Vazquez, P., Roncero, I., Blazquez, E., and Alvarez, E., Substitution of the cysteine 438 residue in the cytoplasmic tail of the glucagon-like peptide-1 receptor alters signal transduction activity. J Endocrinol,2005.185(1):p. 35-44.
    [11]Kleuss, C. and Krause, E., Gαs is palmitoylated at the N-terminal glycine. EMBO J,2003.22(4):p.826-832.
    [12]Caron, J.M., Vega, L.R., Fleming, J., Bishop, R., and Solomon, F., Single Site α-Tubulin Mutation Affects Astral Microtubules and Nuclear Positioning during Anaphase in Saccharomyces cerevisiae:Possible Role for Palmitoylation of α-Tubulin. Mol Biol Cell,2001.12(9):p.2672-2687.
    [13]Wang, D.-A. and Sebti, S.M., Palmitoylated Cysteine 192 Is Required for RhoB Tumor-suppressive and Apoptotic Activities. Journal of Biological Chemistry,2005.280(19):p.19243-19249.
    [14]Wang, D. and Sebti, S., Palmitoylated cysteine 192 is required for RhoB tumor-suppressive and apoptotic activities. J Biol Chem,2005.280(19):p. 19243-19249.
    [15]Dietrich, L. and Ungermann, C., On the mechanism of protein palmitoylation. EMBO Rep,2004.5(11):p.1053-1057.
    [16]Smotrys, J. and Linder, M., Palmitoylation of intracellular signaling proteins: regulation and function. Annu Rev Biochem,2004.73:p.559-587.
    [17]Roth, A.F., et al., Global Analysis of Protein Palmitoylation in Yeast. Cell, 2006.125(5):p.1003-1013.
    [18]Linder, M.E. and Deschenes, R.J., Palmitoylation:policing protein stability and traffic. Nat Rev Mol Cell Biol,2007.8(1):p.74-84.
    [19]Nadolski, M.J. and Linder, M.E., Protein lipidation. FEBS Journal,2007. 274(20):p.5202-5210.
    [20]Wan, J., Roth, A.F., Bailey, A.O., and Davis, N.G., Palmitoylated proteins: purification and identification. Nat. Protocols,2007.2(7):p.1573-1584.
    [21]Maurer-Stroh, S., Eisenhaber, B., and Eisenhaber, F., N-terminal N-myristoylation of proteins:prediction of substrate proteins from amino acid sequence. J Mol Biol,2002.317(4):p.541-557.
    [22]Maurer-Stroh, S., Eisenhaber, B., and Eisenhaber, F., N-terminal N-myristoylation of proteins:refinement of the sequence motif and its taxon-specific differences. J Mol Biol,2002.317(4):p.523-540.
    [23]Eisenhaber, B., Schneider, G., Wildpaner, M., and Eisenhaber, F., A Sensitive Predictor for Potential GPI Lipid Modification Sites in Fungal Protein Sequences and its Application to Genome-wide Studies for Aspergillus nidulans, Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and Schizosaccharomyces pombe. J Mol Biol,2004.337(2):p.243-253.
    [24]Eisenhaber, F., Eisenhaber, B., Kubina, W., Maurer-Stroh, S., Neuberger, G., Schneider, G., and Wildpaner, M., Prediction of lipid posttranslational modifications and localization signals from protein sequences:big-Π, NMT and PTS1. Nucl. Acids Res.,2003.31(13):p.3631-3634.
    [25]Bologna, G., Yvon, C., Duvaud, S., and Veuthey, A.L., N-Terminal myristoylation predictions by ensembles of neural networks. Proteomics,2004. 4(6):p.1626-1632.
    [26]Podell, S. and Gribskov, M., Predicting N-terminal myristoylation sites in plant proteins. BMC Genomics,2004.5(1):p.37.
    [27]Maurer-Stroh, S. and Eisenhaber, F., Refinement and prediction of protein prenylation motifs. Genome Biol,2005.6(6):p. R55.
    [28]Zhou, F., Xue, Y., Yao, X., and Xu, Y., CSS-Palm:palmitoylation site prediction with a clustering and scoring strategy (CSS). Bioinformatics,2006. 22(7):p.894-896.
    [29]Xue, Y., Chen, H., Jin, C., Sun, Z., and Yao, X., NBA-Palm:prediction of palmitoylation site implemented in Naive Bayes algorithm. BMC Bioinformatics,2006.7(1):p.458.
    [30]Ren, J., Wen, L., Gao, X., Jin, C., Xue, Y., and Yao, X., CSS-Palm 2.0:an updated software for palmitoylation sites prediction. Protein Eng Des Sel, 2008.21(11):p.639-644.
    [31]Li, S., Iakoucheva, L.M., Mooney, S.D., and Radivojac, P., Loss of post-translational modification sites in disease. Pac Symp Biocomput,2010:p. 337-47.
    [32]Radivojac, P., Baenziger, P.H., Kann, M.G., Mort, M.E., Hahn, M.W., and Mooney, S.D., Gain and loss of phosphorylation sites in human cancer. Bioinformatics,2008.24(16):p. i241-i247.
    [33]Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D., Gapped BLAST and PSI-BLAST:a new generation of protein database search programs. Nucl. Acids Res.,1997. 25(17):p.3389-3402.
    [34]Peri, S., et al., Development of Human Protein Reference Database as an Initial Platform for Approaching Systems Biology in Humans. Genome Res, 2003.13(10):p.2363-2371.
    [35]Yip, Y.L., Scheib, H., Diemand, A.V., Gattiker, A., Famiglietti, L.M., Gasteiger, E., and Bairoch, A., The Swiss-Prot variant page and the ModSNP database:A resource for sequence and structure information on human protein variants. Hum. Mutat.,2004.23(5):p.464-470.
    [36]Grantham, R., Amino acid difference formula to help explain protein evolution. Science,1974.185(4154):p.862-4.
    [37]Chothia, C., The nature of the accessible and buried surfaces in proteins. J. Mol. Biol.,1976.105(1):p.1-12.
    [38]Charton, M. and Charton, B.I., The structural dependence of amino acid hydrophobicity parameters. Journal of Theoretical Biology,1982.99(4):p. 629-644.
    [39]Wadsworth, H., Russo, D., Nagayama, Y., Chazenbalk, G., and Rapoport, B., Studies on the role of amino acids 38-45 in the expression of a functional thyrotropin receptor. Mol Endocrinol,1992.6(3):p.394-398.
    [40]Alberti, L., et al., Germline Mutations of TSH Receptor Gene as Cause of Nonautoimmune Subclinical Hypothyroidism. J. Clin. Endocrinol. Metab., 2002.87(6):p.2549-2555.
    [41]Eerola, I., Boon, L.M., Mulliken, J.B., Burrows, P.E., Dompmartin, A., Watanabe, S., Vanwijck, R., and Vikkula, M., Capillary Malformation Arteriovenous Malformation, a New Clinical and Genetic Disorder Caused by RASA1 Mutations. Am. J. Hum. Genet.,2003.73(6):p.1240-1249.
    [42]Cosma, M.P., et al., Molecular and functional analysis of SUMF1 mutations in multiple sulfatase deficiency. Hum. Mutat.,2004.23(6):p.576-581.
    [43]Cosma, M.P., Pepe, S., Annunziata, I., Newbold, R.F., Grompe, M., Parenti, G., and Ballabio, A., The Multiple Sulfatase Deficiency Gene Encodes an Essential and Limiting Factor for the Activity of Sulfatases. Cell,2003.113(4): p.445-456.
    [44]Xu, H. and El-Gewely, M.R., Differentially expressed downstream genes in cells with normal or mutated p53. Oncol. Res.,2003.13(6-10):p.429-36.
    [45]De Vries, E.M., et al., Database of mutations in the p53 and APC tumor suppressor genes designed to facilitate molecular epidemiological analyses. Hum. Mutat.,1996.7(3):p.202-213.
    [46]Pollitt, R., McMahon, R., Nunn, J., Bamford, R., Afifi, A., Bishop, N., and Dalton, A., Mutation analysis of COL1A1 and COL1A2 in patients diagnosed with osteogenesis imperfecta type I-IV. Hum. Mutat.,2006.27(7):p.716-716.
    [47]Nelis, E., Haites, N., and Van Broeckhoven, C., Mutations in the peripheral myelin genes and associated genes in inherited peripheral neuropathies. Hum. Mutat.,1999.13(1):p.11-28.
    [48]Osawa, H., et al., Identification of novel C253Y missense and Y864X nonsense mutations in the insulin receptor gene in type A insulin-resistant patients. Clinical Genetics,2001.59(3):p.194-197.
    [49]Carola, A.G.G.D., Bert, P.M.J., Huub, J.W., Leonoor, D.K., Anke, H.M.V.V., Alfred, J.L.G.P., August, F.D., and Jacques, J.M.J., Null mutation in the human 11-cis retinol dehydrogenase gene associated with fundus albipunctatus. Ophthalmology,2001.108(8):p.1479-1484.
    [50]Pineda-Trujillo, N., et al., A novel Cys212Tyr founder mutation in parkin and allelic heterogeneity of juvenile Parkinsonism in a population from North West Colombia. Neurosci. Lett.,2001.298(2):p.87-90.
    [51]Wang, Q., Montmain, G., Ruano, E., Upadhyaya, M., Dudley, S., Liskay, M., Thibodeau, S., and Puisieux, A., Neurofibromatosis type 1 gene as a mutational target in a mismatch repair-deficient cell type. Hum. Genet.,2003. 112(2):p.117-123.
    [52]Awad, M.M., et al., DSG2 mutations contribute to arrhythmogenic right ventricular dysplasia/cardiomyopathy. Am. J. Hum. Genet.,2006.79(1):p.136-42.
    [53]Biguzzi, E., Molecular diversity and thrombotic risk in protein S deficiency: the PROSIT study. Hum. Mutat.,2005.25(3):p.259-69.
    [1]Reinherz, E.L., et al., The Crystal Structure of a T Cell Receptor in Complex with Peptide and MHC Class Ⅱ. Science,1999.286(5446):p.1913-1921.
    [2]Sung, M.H., Zhao, Y., Martin, R., and Simon, R., T-cell epitope prediction with combinatorial peptide libraries. J Comput Biol,2002.9(3):p.527-39.
    [3]Donnes, P. and Elofsson, A., Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinformatics,2002.3(1):p.25.
    [4]Srinivasan, K.N., Zhang, G.L., Khan, A.M., August, J.T., and Brusic, V., Prediction of class I T-cell epitopes:evidence of presence of immunological hot spots inside antigens. Bioinformatics,2004.20(suppl 1):p. i297-i302.
    [5]Li, S., Yao, X., Liu, H., Li, J., and Fan, B., Prediction of T-cell epitopes based on least squares support vector machines and amino acid properties. Anal Chim Acta,2007.584(1):p.37-42.
    [6]Schonbach, C., Ibe, M., Shiga, H., Takamiya, Y., Miwa, K., Nokihara, K., and Takiguchi, M., Fine tuning of peptide binding to HLA-B*3501 molecules by nonanchor residues. J Immunol,1995.154(11):p.5951-5958.
    [7]Mallios, R.R., Class Ⅱ MHC quantitative binding motifs derived from a large molecular database with a versatile iterative stepwise discriminant analysis meta-algorithm. Bioinformatics,1999.15(6):p.432-439.
    [8]Beißbarth, T., Tye-Din, J.A., Smyth, G.K., Speed, T.P., and Anderson, R.P., A systematic approach for comprehensive T-cell epitope discovery using peptide libraries. Bioinformatics,2005.21(suppl 1):p.129-137.
    [9]Rammensee, H.-G., Friede, T., and Stevanovic, S., MHC ligands and peptide motifs:first listing. Immunogenetics,1995.41(4):p.178-228.
    [10]Guan, P., Doytchinova, I.A., and Flower, D.R., HLA-A3 supermotif defined by quantitative structure-activity relationship analysis. Protein Eng,2003. 16(1):p.11-18.
    [11]Doytchinova, I., Hemsley, S., and Flower, D.R., Transporter Associated with Antigen Processing Preselection of Peptides Binding to the MHC:A Bioinformatic Evaluation. J Immunol,2004.173(11):p.6813-6819.
    [12]Guan, P., Doytchinova, I.A., Zygouri, C., and Flower, D.R., MHCPred:a server for quantitative prediction of peptide-MHC binding. Nucleic Acids Res, 2003.31(13):p.3621-3624.
    [13]Flower, D.R., Towards in silico prediction of immunogenic epitopes. Trends in Immunology,2003.24(12):p.667-674.
    [14]Zhihua, L., Yuzhang, W., Bo, Z., Bing, N., and Li, W., Toward the quantitative prediction of T-cell epitopes:QSAR studies on peptides. J Comput Biol,2004.11(4):p.683-94.
    [15]Doytchinova, I.A., Guan, P., and Flower, D.R., Quantitative structure-activity relationships and the prediction of MHC supermotifs. Methods,2004.34(4):p. 444-453.
    [16]Hillig, R.C., Coulie, P.G., Stroobant, V., Saenger, W., Ziegler, A., and Hulsmeyer, M., High-resolution structure of HLA-A*0201 in complex with a tumour-specific antigenic peptide encoded by the MAGE-A4 gene. J Mol Biol, 2001.310(5):p.1167-1176.
    [17]Doytchinova, I.A. and Flower, D.R., Towards the in silico identification of class Ⅱ restricted T-cell epitopes:a partial least squares iterative self-consistent algorithm for affinity prediction. Bioinformatics,2003.19(17): p.2263-2270.
    [18]Honeyman, M.C., Brusic, V., Stone, N.L., and Harrison, L.C., Neural network-based prediction of candidate T-cell epitopes. Nat Biotech,1998. 16(10):p.966-969.
    [19]Yu, K., Petrovsky, N., Schonbach, C., Koh, J.Y.L., and Brusic, V., Methods for prediction of peptide binding to MHC molecules:a comparative study. Mol Med,2002.8(3):p.137-48.
    [20]Nielsen, M., Lundegaard, C., Worning, P., Lauemøller, S.L., Lamberth, K., Buus, S., Brunak, S., and Lund, O., Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci,2003. 12(5):p.1007-1017.
    [21]Mamitsuka, H., Predicting peptides that bind to MHC molecules using supervised learning of. Proteins,1998.33(4):p.460-74.
    [22]Zhao, Y., Pinilla, C., Valmori, D., Martin, R., and Simon, R., Application of support vector machines for T-cell epitopes prediction. Bioinformatics,2003. 19(15):p.1978-1984.
    [23]Yang, Z.R. and Johnson, F.C., Prediction of T-Cell Epitopes Using Biosupport Vector Machines. J Chem Inf Model,2005.45(5):p.1424-1428.
    [24]Liu, H., Zhang, R., Yao, X., Liu, M., Hu, Z., and Fan, B., QSAR and Classification models of a novel series of COX-2 selective inhibitors:1,5-Diarylimidazoles based on support vector machines. J. Comput. Aid. Mol. Des.,2004.18:p.389-399.
    [25]Han, L.Y., et al., Support vector machines approach for predicting druggable proteins:recent progress in its exploration and investigation of its usefulness. Drug Discov. Today,2007.12:p.304-313.
    [26]Hua, S. and Sun, Z., A novel method of protein secondary structure prediction with high segment overlap measure:support vector machine approach. J Mol Biol,2001.308(2):p.397-407.
    [27]Capriotti, E., Calabrese, R., and Casadio, R., Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics,2006. 22(22):p.2729-2734.
    [28]Chang, C.-C. and Lin, C.-J., LIBSVM:a library of support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm.2001.
    [29]Cristianini, N. and Shawe-Taylor, J., An Introduction to Support vector Machine.2000, U. K.:Cambridge University Press.
    [30]Noble, W.S., What is a support vector machine? Nat. Biotech.,2006.24(12):p. 1565-1567.
    [31]Fan, R.-E., Chen, P.-H., and Lin, C.-J., Working Set Selection Using Second Order Information for Training Support Vector Machines. J. Mach. Learn. Res.,2005.6:p.1889-1918.
    [32]Suykens, J.A.K., Gestel, T.V., Brabanter, J.D., Moor, B.D., and Vandewalle, J., Least Squares Support Vector Machines.2002, Singapore:World Scientific. 308.
    [33]Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I.H., The WEKA data mining software:an update. SIGKDD Explor. Newsl., 2009.11(1):p.10-18.
    [34]Doytchinova, I. and Flower, D.R., Modeling the peptide-T cell receptor interaction by the comparative molecular. J Med Chem,2006.49(7):p. 2193-9.
    [1]Betz, M., Saxena, K., and Schwalbe, H., Biomolecular NMR:a chaperone to drug discovery. Curr. Opin. Chem. Biol.,2006.10(3):p.219-225.
    [2]Diercks, T., Coles, M., and Kessler, H., Applications of NMR in drug discovery. Curr. Opin. Chem. Biol.,2001.5(3):p.285-291.
    [3]Villar, H.O., Yan, J., and Hansen, M.R., Using NMR for ligand discovery and optimization. Curr. Opin. Chem. Biol.,2004.8(4):p.387-391.
    [4]D'Amico, S., Sohier, J.S., and Feller, G., Kinetics and Energetics of Ligand Binding Determined by Microcalorimetry:Insights into Active Site Mobility in a Psychrophilic α-Amylase. J. Mol. Biol.,2006.358(5):p.1296-1304.
    [5]Chavelas, E.A., Zubillaga, R.A., Pulido, N.O., and Garcia-Hernandez, E., Multithermal titration calorimetry:A rapid method to determine binding heat capacities. Biophys. Chem.,2006.120(1):p.10-14.
    [6]Wiseman, T., Williston, S., Brandts, J.F., and Lin, L.-N., Rapid measurement of binding constants and heats of binding using a new titration calorimeter. Anal. Biochem.,1989.179(1):p.131-137.
    [7]Lofas, S., ASSAY and Drug Development Technologies. Assay Drug Dev. Techn.,2004.2:p.407-416.
    [8]Kuntz, I.D., Blaney, J.M., Oatley, S.J., Langridge, R., and Ferrin, T.E., A geometric approach to macromolecule-ligand interactions. J. Mol. Biol.,1982. 161(2):p.269-288.
    [9]Jones, G., Willett, P., Glen, R.C., Leach, A.R., and Taylor, R., Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol.,1997. 267(3):p.727-748.
    [10]Naim, M., et al., Solvated Interaction Energy (SIE) for Scoring Protein-Ligand Binding Affinities.1. Exploring the Parameter Space. J. Chem. Inf. Model., 2007.47:p.122-133.
    [11]Aqvist, J., Luzhkov, V.B., and Brandsdal, B.O., Ligand Binding Affinities from MD Simulations. Acc. Chem. Res.,2002.35(6):p.358-365.
    [12]Gohlke, H., Hendlich, M., and Klebe, G., Knowledge-Based Scoring Function to Predict Protein-Ligand Interactions. J. Mol. Biol.,2000.295:p.337-356.
    [13]Zhang, C., Liu, S., Zhu, Q., and Zhou, Y., A Knowledge-Based Energy Function for Protein-Ligand, Protein-Protein, and Protein-DNA Complexes. J. Med. Chem.,2005.48(7):p.2325-2335.
    [14]Imai, T., Hiraoka, R., Seto, T., Kovalenko, A., and Hirata, F., Three-Dimensional Distribution Function Theory for the Prediction of Protein-Ligand Binding Sites and Affinities:Application to the Binding of Noble Gases to Hen Egg-White Lysozyme in Aqueous Solution. J. Phys. Chem.B,2007.111(39):p.11585-11591.
    [15]Muegge, I. and Martin, Y.C., A General and Fast Scoring Function for Protein-Ligand Interactions:A Simplified Potential Approach. J. Med. Chem., 1999.42(5):p.791-804.
    [16]Muegge, I., PMF Scoring Revisited. J. Med. Chem.,2006.49(20):p. 5895-5902.
    [17]Mitchell, J.B.O., Laskowski, R.A., Alex, A., and Thornton, J.M., BLEEP-potential of mean force describing protein-ligand interactions:Ⅰ. Generating potential. J. Comput. Chem.,1999.20:p.1165-1176.
    [18]Mitchell, J.B.O., Laskowski, R.A., Alex, A., Forster, M.J., and Thornton, J.M., BLEEP—potential of mean force describing protein-ligand interactions:Ⅱ. Calculation of binding energies and comparison with experimental data. J. Comput. Chem,1999.20(11):p.1177-1185.
    [19]Huang, S.-Y. and Zou, X., An iterative knowledge-based scoring function to predict protein-ligand interactions:Ⅱ. Validation of the scoring function. J. Comput. Chem.,2006.27(15):p.1876-1882.
    [20]Yang, C.Y., Wang, R., and Wang, S., M-Score:A Knowledge-Based Potential Scoring Function Accounting for Protein Atom Mobility. J. Med. Chem.,2006. 49(20):p.5903-5911.
    [21]Gehlhaar, D.K., Verkhivker, G.M., Rejto, P.A., Sherman, C.J., Fogel, D.R., Fogel, L.J., and Freer, S.T., Molecular recognition of the inhibitor AG-1343 by HIV-1 protease:conformationally flexible docking by evolutionary programming. Chem. Biol.,1995.2:p.317-324.
    [22]Rarey, M., Kramer, B., Lengauer, T., and Klebe, G., A Fast Flexible Docking Method using an Incremental Construction Algorithm. J. Mol. Biol.,1996. 261(3):p.470-489.
    [23]Head, R.D., Smythe, M.L., Oprea, T.I., Waller, C.L., Green, S.M., and Marshall, G.R., VALIDATE:A new method for the receptor-based prediction of binding affinities of novel ligands. J. Am. Chem. Soc.,1996.118:p.3959-3969.
    [24]Bohm, H.J., Prediction of binding constants of protein ligands:A fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J. Comput.-Aided Mol. Des.,1998.12:p.309-323.
    [25]Wang, R., Liu, L., Lai, L., and Tang, Y., SCORE:A New Empirical Method for Estimating the Binding Affinity of a Protein-Ligand Complex. J. Mol. Model.,1998.4:p.379-394.
    [26]Wang, R., Lai, L., and Wang, S., Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput.-Aided Mol. Des.,2002.16:p.11-26.
    [27]Eldridge, M.D., Murray, C.W., Auton, T.R., Paolini, G.V., and Mee, R.P., Empirical Scoring Functions:Ⅰ. The Development of a Fast, Empirical Scoring Function to Estimate the Binding Affinity of Ligands in Receptor Complexes. J. Comput.-Aided Mol. Des.,1997.11:p.425-445.
    [28]DeWitte, R.S. and Shakhnovich, E.I., SMoG: De novo design method based on simple, fast, and accurate free energy estimates.1.Methodology and supporting evidence. J. Am. Chem. Soc.,1996.118:p.11733-11744.
    [29]Ishchenko, A.V. and Shakhnovich, E.I., SMall Molecule Growth 2001 (SMoG2001):An Improved Knowledge-Based Scoring Function for Protein-Ligand Interactions. J. Med. Chem.,2002.45(13):p.2770-2780.
    [30]Yang, J.-M., Development and evaluation of a generic evolutionary method for protein-ligand docking. J. Comput. Chem.,2004.25(6):p.843-857.
    [31]Chen, H.-M., Liu, B.-F., Huang, H.-L., Hwang, S.-F., and Ho, S.-Y., SODOCK:Swarm optimization for highly flexible protein-ligand docking. J. Comput. Chem.,2007.28(2):p.612-623.
    [32]Brown, S.P. and Muchmore, S.W., High-Throughput Calculation of Protein-Ligand Binding Affinities:Modification and Adaptation of the MM-PBSA Protocol to Enterprise Grid Computing. J. Chem. Inf. Model., 2006.46(3):p.999-1005.
    [33]Brown, S.P. and Muchmore, S.W., Rapid Estimation of Relative Protein-Ligand Binding Affinities Using a High-Throughput Version of MM-PBSA. J. Chem. Inf. Model.,2007.47(4):p.1493-1503.
    [34]Lindstrom, A., Pettersson, F., Almqvist, F., Berglund, A., Kihlberg, J., and Linusson, A., Hierarchical PLS Modeling for Predicting the Binding of a Comprehensive Set of Structurally Diverse Protein-Ligand Complexes. J. Chem. Inf. Model.,2006.46(3):p.1154-1167.
    [35]Tropsha, A., Gramatica, P., and Gombar, V.K., The Importance of Being Earnest:Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR Comb. Sci.,2003.22(1):p.69-77.
    [36]Deng, W., Breneman, C., and Embrechts, M.J., Predicting Protein-Ligand Binding Affinities Using Novel Geometrical Descriptors and Machine-Learning Methods. J. Chem. Inf. Comput. Sci.,2004.44(2):p. 699-703.
    [37]Wang, R., Fang, X., Lu, Y., and Wang, S., The PDBbind Database:Collection of Binding Affinities for Protein-Ligand Complexes with Known Three-Dimensional Structures. J. Med. Chem.,2004.47(12):p.2977-2980.
    [38]Wang, R., Fang, X., Lu, Y., Yang, C.Y., and Wang, S., The PDBbind Database:Methodologies and Updates. J. Med. Chem.,2005.48(12):p. 4111-4119.
    [39]Wang, R., Lu, Y., Fang, X., and Wang, S., An Extensive Test of 14 Scoring Functions Using the PDBbind Refined Set of 800 Protein-Ligand Complexes. J. Chem. Inf. Comput. Sci.,2004.44(6):p.2114-2125.
    [40]Zhang, S., Golbraikh, A., and Tropsha, A., Development of Quantitative Structure-Binding Affinity Relationship Models Based on Novel Geometrical Chemical Descriptors of the Protein-Ligand Interfaces. J. Med. Chem.,2006. 49(9):p.2713-2724.
    [41]Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E., The protein data bank. Nucl. Acids Res., 2000.28:p.235-242.
    [42]Reczko, M., Karras, D., and Bohr, H., An update of the DEF database of protein fold class predictions. Nucl. Acids Res.,1997.25(1):p.235-235.
    [43]Lin, Z. and Pan, X.M., Accurate Prediction of Protein Secondary Structural Content. J. Protein Chem.,2001.20:p.217-220.
    [44]Horne, D.S., Prediction of protein helix content from an autocorrelation analysis of sequence hydrophobicities. Biopolymers,1988.27:p.451-477.
    [45]Sokal, R.R. and Thomson, B.A., Population structure inferred by local spatial autocorrelation:An example from an Amerindian tribal population. Am. J. Phys. Anthropol.,2006.129:p.121-131.
    [46]Chou, K.C., Prediction of protein subcellular locations by incorporating quasi-sequence-order effect. Biochem. Bioph. Res. Co.,2000.278:p.477-483.
    [47]Cai, Y.D. and Chou, K.C., Predicting Enzyme Subclass by Functional Domain Composition and Pseudo Amino Acid Composition. J. Proteome Res.,2005.4: p.967-971.
    [48]Han, L.Y., et al., Support vector machines approach for predicting druggable proteins:recent progress in its exploration and investigation of its usefulness. Drug Discov. Today,2007.12:p.304-313.
    [49]Bock, J.R. and Gough, D.A., Predicting protein-protein interactions from primary structure. Bioinformatics,2001.17:p.455-460.
    [50]HyperChem, Molecular Modeling System (Release 7.03 for Windows).2002, Hypercube, Inc:Gainesville, FL.
    [51]Todeschini, R., Consonni, V., Mauri, A., and Pavan, M., DRAGON:Software for the Calculation of Molecular Descriptors.2005, Talete srl:Milan, Italy.
    [52]SYBYL.2002, Tripos Associates Inc:St. Louis.
    [53]Kennard, R.W. and Stone, L.A., Computer aided design of experiments. Technometrics,1969.11:p.137-148.
    [54]Kira, K. and Rendell, L.A. A Practical Approach to Feature Selection. in Proceedings of the ninth international workshop on Machine learning.1992. Aberdeen, Scotland, United Kingdom:Morgan Kaufmann Publishers Inc. San Francisco, CA, USA.
    [55]Mrozek, A., Karolak-Wojciechowska, J., Amiel, P., and Barbe, J., Five-membered heterocycles. Part Ⅰ. Application of the HOMA index to 1,2,4-triazoles. J. Mol. Struct.,2000.524(1-3):p.151-157.
