Novel Amino Acid Structural Characterization Methods and Their Application in Quantitative Structure-Activity Relationships
Abstract
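Structural characterization of peptides and proteins is a prerequisite for, and a central part of, their quantitative structure-activity relationship (QSAR) studies. Because the spatial structure and functional information of peptides and proteins are encoded in the primary structure, that is, the amino acid sequence, the structural information of amino acids is essential to QSAR studies of peptides and proteins. Starting from the structural features of amino acids, this work constructs two new amino acid structural characterization systems, VHESH and VSTPV. VHESH (principal component score vector of hydrophobic, electronic, steric, and hydrogen bond properties) is derived from 113 physicochemical properties of the 20 natural amino acids, by applying principal component feature extraction separately to 50 hydrophobic, 23 electronic, 35 steric, and 5 hydrogen bond properties. VHESH1 and VHESH2 represent hydrophobic properties; VHESH3 to VHESH6 represent electronic properties; VHESH7 and VHESH8 represent steric properties; and VHESH9 and VHESH10 represent hydrogen bond donor and acceptor properties. VSTPV (principal component score vector of structural and topological variables) is derived from 85 topological structural variables of 166 natural and non-natural amino acids, again by principal component feature extraction. Compared with z-scales and other amino acid descriptors, VHESH has a clear physicochemical meaning, strong characterization power, and easily interpreted results, while VSTPV, which is based on topological properties, is simple to compute, independent of experimental data, and readily extensible.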
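     In peptide QSAR studies, VHESH and VSTPV were applied to angiotensin-converting enzyme (ACE) inhibitors, oxytocin analogues, peptides binding the SH3 domain of human amphiphysin-1, cationic antimicrobial peptides, and cell-penetrating peptides, and gave good modeling results in every case. The VHESH-based models showed the following. For ACE inhibitors, the electronic and hydrophobic properties of the 2nd residue and the steric properties of the 1st residue correlate positively with bioactivity, whereas the electronic properties of the 1st residue correlate negatively. For oxytocin analogues, the electronic and hydrophobic properties of the 1st residue and the steric and hydrogen bond properties of the 3rd residue correlate significantly and positively with bioactivity, whereas the hydrophobic, electronic, and steric properties of the 2nd residue correlate negatively. For the amphiphysin-1 SH3 binding peptides, analysis of the key interactions governing affinity shows that the properties of the residues from P-3 to P2 (inclusive) influence binding affinity most strongly. For cationic antimicrobial peptides, the electronic properties of the 3rd residue, the steric properties of the 6th, 7th, and 12th residues, and the hydrophobicity of the 11th and 12th residues correlate positively with antimicrobial potency, whereas the electronic properties of the 6th, 10th, and 12th residues correlate significantly and negatively. For cell-penetrating peptides, the physicochemical and topological properties of the relevant residues strongly affect membrane penetration. VSTPV-based modeling of the same systems also gave good fitting and prediction results, and the key residue positions it identified largely agree with those of the VHESH models. On this basis, a series of new molecules was designed from the best QSAR model of each system within its applicability domain; their predicted activities exceed, to varying degrees, the highest predicted activity in each system.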
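     VSTPV was also applied to QSAR studies of peptide derivatives containing non-natural amino acids, namely bradykinin-potentiating peptides, bovine lactoferrin hydrolysate peptides, and elastase model substrates, with good results. The topological information of the 2nd and 3rd residues of the bradykinin-potentiating peptides correlates strongly with bioactivity; the topological properties of the 6th and 8th residues of the bovine lactoferrin hydrolysate peptides are closely related to bioactivity; and the quadratic and interaction terms of some variables of residues A and B of the elastase model substrates strongly affect the enzyme-catalyzed reaction.
     QSAR theory and methods were also applied to the prediction of protein properties and functions. Using VHESH and VSTPV structural characterization, human immunodeficiency virus protease (HIV PR) cleavage sites, protein phosphorylation sites, and protein-RNA interaction sites were predicted and their specificity analyzed, with results better than those of other methods. The steric, hydrogen bond, electronic, and hydrophobic properties, or the corresponding topological properties, of the 1st, 2nd, 4th, 5th, and 6th residues are important factors in recognition by HIV PR; the physicochemical (VHESH) and topological (VSTPV) properties of the P-3 position have the greatest influence on phosphorylation at S, T, and Y sites; and the steric, hydrophobic, electronic, and topological information of the 2nd, 5th, and 6th residues of RNA-interacting protein sequences strongly affects RNA-protein interaction sites.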
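     Modeling methods and techniques are an important part of QSAR research. This work compares multiple linear regression (MLR), partial least squares (PLS), linear discriminant analysis (LDA), and support vector machine (SVM) in studies of the structure-function relationships of peptides and proteins. The results show that MLR usually gives good results when its preconditions are met; PLS handles well the case of many variables with multicollinearity; LDA performs well in pattern recognition and yields easily interpreted models; and SVM copes well with small samples, nonlinearity, high dimensionality, and local minima. In addition, stepwise multiple linear regression (SMR) and a genetic algorithm (GA) were used to select variables in order to improve model quality. Both methods were found to remove noise from the original variables effectively.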
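     Model quality assessment and the applicability domain have become key issues in modeling methodology. Here, all samples are divided into a training set and a prediction set; the QSAR model is built on the training set, and its quality is assessed by both internal and external validation. The internal validation methods are leave-one-out (LOO), leave-1/n-out (LNO), leave-many-out (LMO), and the Y random permutation test. Building on internal validation, several evaluation functions are used to assess external predictive ability, to ensure that the model is genuinely valid. The applicability domain of each model is then defined from the standardized model distance of the samples in X space, which avoids the large errors and uncertainty that extrapolation would introduce into activity prediction.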
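One way to picture a distance-based applicability domain is sketched below. It is a minimal illustration under stated assumptions, not the dissertation's procedure: the abstract does not define the standardized distance or its cutoff, so the Euclidean distance to the training centroid in autoscaled X space, the cutoff of mean plus `k` standard deviations, the function names, and the toy training matrix are all hypothetical.

```python
import math

def column_stats(train):
    """Column means and sample standard deviations of the training matrix."""
    n = len(train)
    cols = list(zip(*train))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def standardized_distance(x, means, stds):
    """Euclidean distance to the training centroid after autoscaling."""
    return math.sqrt(sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds)))

def domain_threshold(train, k=3.0):
    """Assumed cutoff: mean training distance plus k standard deviations."""
    means, stds = column_stats(train)
    d = [standardized_distance(x, means, stds) for x in train]
    mean_d = sum(d) / len(d)
    std_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (len(d) - 1))
    return mean_d + k * std_d

def in_domain(x, train, k=3.0):
    """True if the query sample lies within the assumed distance cutoff."""
    means, stds = column_stats(train)
    return standardized_distance(x, means, stds) <= domain_threshold(train, k)

# Toy training descriptors (placeholder numbers, two variables per sample).
TRAIN = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [1.0, -1.0], [1.0, 0.0]]

print(in_domain([1.2, 0.1], TRAIN))    # query close to the training centroid
print(in_domain([10.0, 10.0], TRAIN))  # query far outside the training space
```

A designed molecule that fails such a check would be an extrapolation, so its predicted activity should be treated with caution.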
Structural characterization is crucial to quantitative structure-activity relationship (QSAR) studies of peptides and proteins. Most of the structural and functional information of peptides and proteins is contained in their amino acid sequences, so the characteristics of the amino acid residues are of great significance to their QSAR study. Two kinds of amino acid descriptors were derived by principal component analysis (PCA): the principal component score vector of hydrophobic, electronic, steric, and hydrogen bond properties (VHESH) and the principal component score vector of structural and topological variables (VSTPV). VHESH was derived by applying PCA separately to four families of properties, namely 50 hydrophobic, 23 electronic, 35 steric, and 5 hydrogen bond properties, which together make up 113 physicochemical properties of the 20 coded amino acids. For each amino acid, VHESH1 and VHESH2 describe hydrophobic properties, VHESH3 to VHESH6 electronic properties, VHESH7 and VHESH8 steric properties, and VHESH9 and VHESH10 hydrogen bond properties. VSTPV was derived by PCA of 85 structural and topological variables of 166 coded and non-coded amino acids. Compared with z-scales and other amino acid descriptors, VHESH is physicochemically interpretable and more informative, while VSTPV is easy to compute, independent of experimental data, and readily extended to other non-coded amino acids.
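As an illustration of how such score vectors can be obtained, the sketch below autoscales a small property matrix and extracts principal component scores by power iteration with deflation. It is a minimal sketch, not the procedure or software used in the dissertation: the values in `TOY_PROPERTIES`, the function names, and the choice of power iteration are assumptions made for the example. In the scheme described above, the same step would be run once per property family and the retained scores concatenated into one vector per amino acid.

```python
import math

def autoscale(rows):
    """Center each column and scale it to unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1)) or 1.0
            for c, mu in zip(cols, means)]
    return [[(v - mu) / sd for v, mu, sd in zip(r, means, stds)] for r in rows]

def matvec(mat, vec):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]

def pca_scores(rows, n_components, n_iter=500):
    """Return one list of component scores per row (sample)."""
    x = autoscale(rows)
    n, m = len(x), len(x[0])
    # Covariance of autoscaled data, i.e. the correlation matrix (m x m).
    cov = [[sum(r[a] * r[b] for r in x) / (n - 1) for b in range(m)]
           for a in range(m)]
    scores = [[] for _ in range(n)]
    for _component in range(n_components):
        # Arbitrary non-uniform start vector, normalized to unit length.
        v = [1.0 + 0.1 * j for j in range(m)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _step in range(n_iter):
            w = matvec(cov, v)
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:
                break
            v = [c / norm for c in w]
        eigval = sum(a * b for a, b in zip(v, matvec(cov, v)))
        for i, r in enumerate(x):
            scores[i].append(sum(a * b for a, b in zip(r, v)))
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(m)]
               for a in range(m)]
    return scores

# Placeholder property matrix: 5 residues x 3 properties (illustrative numbers only).
RESIDUES = ["A", "R", "D", "L", "F"]
TOY_PROPERTIES = [
    [1.8, 0.6, 0.9],
    [-4.5, -2.5, 1.7],
    [-3.5, -0.9, 1.1],
    [3.8, 1.1, 1.7],
    [2.8, 1.2, 1.9],
]

for residue, row in zip(RESIDUES, pca_scores(TOY_PROPERTIES, 2)):
    print(residue, [round(s, 3) for s in row])
```

The scores of different components are mutually orthogonal, which is what lets each retained component act as an independent descriptor variable in later regression.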
     VHESH and VSTPV were applied to the structural description of several functional peptides, including angiotensin-converting enzyme (ACE) inhibitors, oxytocin analogues, decapeptides binding to the SH3 domain of human amphiphysin-1, cationic antimicrobial peptides, and cell-penetrating peptides. Robust and predictive QSAR models were obtained with various modeling techniques. The VHESH models showed that the bioactivity of ACE inhibitors could be enhanced by increasing the electronic and hydrophobic properties of the 2nd residue and the steric properties of the 1st residue, among others, and might be decreased by increasing the electronic properties of the 1st residue. The activities of oxytocin analogues correlate strongly and positively with the electronic and hydrophobic properties of the 1st residue and the steric and hydrogen bond properties of the 3rd residue, and strongly and negatively with the hydrophobic, electronic, and steric properties of the 2nd residue. Various properties of the residues between the P-3 and P2 sites of the decapeptide (P4P3P2P1P0P-1P-2P-3P-4P-5) may contribute markedly to its interaction with the human amphiphysin-1 SH3 domain. For antimicrobial peptides, the electronic properties of the 3rd residue, the steric properties of the 6th, 7th, and 12th residues, and the hydrophobic properties of the 11th and 12th residues contribute positively to antibacterial activity, whereas the electronic properties of the 6th, 10th, and 12th residues contribute negatively. For cell-penetrating peptides, various kinds of structural information may be highly correlated with the penetration process. New peptide sequences can be designed from the structure-activity relationships in these peptide panels. The VSTPV models gave results similar to those of the VHESH models in relating sequence positions to bioactivity.
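Statements such as "electronic properties of the 2nd residue" refer to columns of a descriptor matrix in which each residue position is replaced by its descriptor vector. The sketch below shows that encoding and how a column index maps back to a position and component. The table `TOY_TABLE` holds hypothetical three-component placeholder values, not VHESH or VSTPV scores, and the function names are illustrative. Peptides must have equal length (or be aligned) for the encoding to give a fixed-width matrix.

```python
# Hypothetical 3-component descriptor table (placeholder numbers).
# A real VHESH table would hold 10 components for each of the 20 coded residues.
TOY_TABLE = {
    "A": (0.2, -0.4, 0.1),
    "F": (1.1, 0.3, -0.6),
    "K": (-0.9, 0.8, 0.5),
    "P": (0.0, -0.7, 0.9),
}

def encode_peptide(sequence, table):
    """Concatenate the descriptor vectors of all residues, in sequence order."""
    vector = []
    for residue in sequence:
        vector.extend(table[residue])
    return vector

def variable_label(index, n_components):
    """Map a 0-based column index back to residue position and component."""
    position, component = divmod(index, n_components)
    return f"residue {position + 1}, component {component + 1}"

x = encode_peptide("FKP", TOY_TABLE)
print(len(x))                # 3 residues x 3 components
print(variable_label(4, 3))  # which residue and component column 4 belongs to
```

With this layout, a regression coefficient on a given column can be read directly as the contribution of one property of one sequence position.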
     VSTPV was applied to the structural description of several peptides and analogues, including bradykinin-potentiating pentapeptides, bovine lactoferricin-(17–31)-pentadecapeptide, and elastase substrate analogues. Robust and predictive QSAR models were developed with various modeling techniques. The activities of the bradykinin-potentiating pentapeptides were mainly related to the topological information of the 2nd and 3rd residues. The topological variables of the 6th and 8th residues contribute significantly to the bioactivity of bovine lactoferricin-(17–31)-pentadecapeptide. For elastase substrate analogues, the quadratic and interaction terms of some topological variables of residues A and B strongly affect catalytic activity.
     The principles and methods of QSAR were also used to investigate the relationship between protein structure and property or function. VHESH and VSTPV were used to characterize protein amino acid sequences in three prediction tasks: HIV-1 protease (HIV PR) cleavage sites, protein phosphorylation sites, and RNA-binding sites in proteins. The results suggest that HIV PR may recognize only a few key properties at certain positions of the octameric substrate sequence: the steric, hydrogen bond, electronic, hydrophobic, and topological properties of the 1st, 2nd, 4th, 5th, and 6th residues, among others, may be important in determining HIV PR cleavage. The physicochemical (VHESH) and topological (VSTPV) properties of the P-3 site near the S, T, and Y sites were the most significant for predicting phosphorylated S, T, and Y sites. In the 11-residue motifs of protein sequences, the steric, hydrophobic, electronic, and topological properties of the 2nd, 5th, and 6th sites had a marked influence, whereas the other sites had little, which indicates that these properties may be key features for recognition of the RNA-binding region.
     Modeling methods and related techniques are also important to the success of QSAR studies. Regression and pattern recognition methods, namely multiple linear regression (MLR), partial least squares (PLS), linear discriminant analysis (LDA), and support vector machine (SVM), were compared in this dissertation. The results showed that MLR performed as well as the other methods when its application conditions were met. PLS avoids the harmful effects of multicollinearity and is particularly suited to regression when the sample size is smaller than the number of variables. LDA gives robust and interpretable models. As a newer machine learning algorithm, SVM copes well with small datasets, nonlinearity, high-dimensional feature spaces, and local minima. In addition, stepwise multiple regression (SMR) and a genetic algorithm (GA) were used to select variable subsets; the results indicated that variable selection effectively removes noise from the original variable set.
     The QSAR models were then validated and evaluated. The dataset was first divided into a training set and a test set, and the training set was used to build the QSAR models. Internal validation used leave-one-out (LOO) cross-validation (CV), leave-1/n-out (LNO) CV, leave-many-out (LMO) CV, and the Y random permutation test. External validation was then performed on the test set, with several evaluation functions used to assess the predictive power of the models. In addition, the model applicability domain was used to assess the prediction error for the activities of the designed molecules.
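Two of the internal validation steps named above, LOO cross-validation and the Y random permutation test, can be sketched as follows. This is a minimal illustration assuming an ordinary least-squares model solved through the normal equations; the toy data, the function names, the number of permutation rounds, and the use of the overall mean in the q² denominator are assumptions for the example, not details taken from the dissertation.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_mlr(xs, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def loo_q2(xs, ys):
    """q2 = 1 - PRESS / SS_total; each sample is predicted by a model fitted without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(ys)):
        coef = fit_mlr(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(coef, xs[i])) ** 2
    return 1.0 - press / sum((y - mean_y) ** 2 for y in ys)

def y_randomization(xs, ys, n_rounds=20, seed=0):
    """LOO q2 values obtained after shuffling the activities."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        results.append(loo_q2(xs, shuffled))
    return results

# Toy data: y is roughly 1 + 2*x1 - x2 with small noise (placeholder numbers).
XS = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.5], [3.0, 0.5],
      [4.0, 2.0], [5.0, 1.0], [6.0, 3.0], [7.0, 2.5]]
YS = [0.1, 2.9, 3.55, 6.45, 7.1, 9.9, 10.05, 12.45]

q2 = loo_q2(XS, YS)
scrambled = y_randomization(XS, YS)
print(round(q2, 3), round(sum(scrambled) / len(scrambled), 3))
```

A model whose q² stays high after the activities are shuffled is fitting chance correlations, so a large gap between the true and scrambled values is the expected sign of a meaningful model.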
引文
[1]陈凯先,蒋华良,嵇汝运.计算机辅助药物设计——原理、方法及应用[M].上海:上海科学技术出版社, 2000.
    [2]徐筱杰,侯廷军,乔学斌,章威.计算机辅助药物分子设计[M].北京:化学工业出版社, 2004.
    [3]郭宗儒.药物分子设计[M].北京:科学出版社, 2005.
    [4]李仁利.药物构效关系[M].北京:中国医药科技出版社, 2004.
    [5]李志良.定量构效关系研究进展[J].化学通报, 1995, 9: 5-10.
    [6] Urbina JA, Payares G, Molina J, Sanoja C, Liendo A, Lazardi K, Piras M M, Piras R, Perez N, Wincker P, Ryley JF. Cure of short and long-term experimental Chaga’s disease using D0870. Science, 1996, 273(5277): 969-971.
    [7] Kubinyi H. From narcosis to hyperspace: The history of QSAR. Quant. Struct.-Act. Relat., 2002, 21 (4): 348-356.
    [8] Hansch C, Fujita T. Correlation of biological activity of phenoxyacetic acids with hammett substituent constants and partition coefficient. Nature, 1962, 194 (14): 178-180.
    [9] Hansch C, Fujita T. p-σ-πanalysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc., 1964, 86 (8): 1616-1626.
    [10] Free SMJr, Wilson JW. A mathematical contribution to structure activity studies. J. Med. Chem., 1964, 7 (4): 395-399.
    [11] Hansch C, Muir M, Fujita T,Maloney PP, Geiger F, Streich M. The correlation of biological activity of plant growth regulators and chloromycetin derivatives with Hammett constants and partition coefficients. J. Am. Chem. Soc., 1963, 85 (18): 2817-2824.
    [12] Fujita T, Ban T. Structure-activity study of phenethylamines as substrates of biosynthetic enzymes of sympathetic transmitters. J. Med. Chem., 1971, 14(2): 148-152.
    [13] Unger SH, Hansch C. On model building in structure-activity relations. A reexamination of adrenergic blocking activity of beta-halo-beta-arylalkylamines. J. Med. Chem., 1973, 16 (7): 745-749.
    [14] Randic M. On characterization of molecular branching. J. Am. Chem. Soc., 1975, 97 (23): 6609-6615.
    [15] Kier LB, Murray WJ, Hall LH. Molecular connectivity 4: Relationship to biological activity. J. Med. Chem., 1975, 18 (12): 1272-1274.
    [16] Kier LB, Hall LH. An electrotopological state for atoms in molecules. J. Pharm. Res., 1990, 7 (8): 801-807.
    [17] Liu SS, Cai SX, Cao CZ, Li ZL. Molecular electronegative distance vector (MEDV) relating to 15 properties of alkanes. J. Chem. Inf. Comput. Sci., 2000, 40 (6): 1337-1348.
    [18] Liu SS, Yin CS, Li ZL, Cai SX. QSAR study of steroid benchmark and dipeptides based on MEDV-13. J. Chem. Inf. Comput. Sci., 2001, 41 (2): 321-329.
    [19] Cramer RD, Patterson DE, Bunce JD. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc., 1988, 110 (18): 5959-5967.
    [20] Xu Y, Liu H, Niu CY, Luo C, Luo X, Shen J, Chen K, Jiang H. Molecular docking and 3D QSAR studies on 1-amino-2- phenyl-4-(piperidin-1-yl)-butanes based on the structural modeling of human CCR5 receptor. Bioorg. Med. Chem., 2004, 12 (23): 6193-6208.
    [21] Doweyko AM. The hypothetical active site lattic. An approach to modeling active sites from data on inhibitor molecules. J. Med. Chem., 1988, 31 (7): 1396-1406.
    [22] Klebe G., Abraham U, Mietzner T. Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J. Med. Chem., 1994, 37 (24): 4130-4146.
    [23] Todeschini R, Lasagni M, Marengo E. New molecular descriptors for 2D and 3D structures. Theory. J. Chemom.1994, 8(4): 263-272.
    [24] Menezes IRA, Lopes JCD, Montanari CA, Oliva G, Pav?o F, Castilho MS, Vieira PC, Pupo MT. 3D QSAR studies on binding affinities of coumarin natural products for glycosomal GAPDH of Trypanosoma cruzi. J. Comput. Aid. Mol. Des., 2003, 17 (5-6): 277-290.
    [25] Hopfinger AJ, Wang S, Tokarski JS, Jin Baiqiang,Albuquerque M, Madhav PJ, Duraiswami C. Construction of 3D-QSAR models using the 4D-QSAR analysis formalism. J. Am. Chem. Soc., 1997, 119 (43): 10509-10524.
    [26] Albuquerque MG, Hopfinger AJ, Barreiro EJ de Alencastro, R B. Four-dimensional quantitative structure-activity relationship analysis of a series of interphenylene 7-oxabicycloheptane oxazole thromboxane A2 receptor antagonists. J. Chem. Inf. Comput. Sci., 1998, 38 (5): 925-938.
    [27] Vedani A, Briem H, Dobler M, Dollinger H, McMasters DR. Multiple conformation and protonation state representation in 4D-QSAR: The neurokinin-1 receptor system. J. Med. Chem., 2000, 43 (23): 4416-4427.
    [28] Vedani A, Dober M. 5D QSAR: The key for simulating induced fit? J. Med. Chem., 2002, 45(11): 2139-2149.
    [29] Vedani A, Dobler M. Multi-dimentinal QSAR in drug research: Predicting binding affinities, toxicity and pharmacokinetic parameters. Prog. Drug Res., 2000, 55: 105-135.
    [30] Vedani A, Dobler M, Lill MA. Combining protein modeling and 6D-QSAR. Simulating the binding of structurally diverse ligands to the estrogen receptor. J. Med. Chem., 2005, 48 (11): 3700-3703.
    [31]王连生,韩朔睽.分子结构、性质与活性[M].北京:化学工业出版社, 1997.
    [32] Karelson M, Lobanov VS, Katritzky AR. Quantum-chemical descriptors in QSAR/QSPR studies. Chem. Rev., 1996, 96 (3): 1027-1043.
    [33] Livingstone DJ. The characterization of chemical structures using molecular properties. A survey. J. Chem. Inf. Comput. Sci., 2000, 40(2): 195-209.
    [34]李志良,曾鸽鸣,胡芳,梁本熹,村松Y,松本S,李梦龙.多维定量构效关系及药物分子设计研究进展[J].化学研究与应用, 1997, 9 (1): 7-14.
    [35] Hansch, C.; Hoekman, D.; Leo, A.; Weininger, D.; Selassie, C. D. Chem-Bioinformatics: Comparative QSAR at the Interface Between Chemistry and Biology[J]. Chem. Rev. 2002, 102 (3): 783-812.
    [36]梁桂兆,梅虎,周原等.计算机辅助药物设计中的多维定量构效关系模型化方法[J].化学进展, 2006, 18 (1): 120-127.
    [37] Shinoda K, Sugimoto M, Tomita M, Ishihama Y. Informatics for peptide retention properties in proteomic LC-MS. Ptoteomics, 2008, 8: 787-798.
    [38] Pripp AH, Isakssonb T, Stepaniakb L, S?rhaugb T , Ard? Y. Quantitative structure activity relationship modelling of peptides and proteins as a tool in food science. Trends in Food Science & Technology, 2005, 16: 484-494.
    [39] Zhou P, Tian FF, Wu YQ,Li ZL, Shang ZC.Quantitative Sequence-Activity Model (QSAM): Applying QSAR Strategy to Model and Predict Bioactivity and Function of Peptides, Proteins and Nucleic Acids. Current Computer-Aided Drug Design, 2008, 4: 311-321.
    [40] Sneath PH. Relations between chemical structure and biological activity in peptides. J. Theor. Biol., 1966, 12 (2): 157-195.
    [41] Kidera A, Konishi Y, Oka M, Ooi T, Scheraga HA. Statistical analysis of the physical properties of the 20 naturally occuring amino acids. J. Protein Chem., 1985, 4 (1): 23-55.
    [42] Hellberg S, Sj?str?m M, Skagerberg B, Wold S. Peptide quantitative structure-activity relationships, a multivariate approach. J. Med. Chem., 1987, 30 (7): 1126-1135.
    [43] Hellberg S, Eriksson L, Jonsson J, Lindgren F, Sj?str?m M, Skagerberg B, Wold S, Andrews P. Minimum analogue peptide sets (MAPS) for quantitative structure-activity relationships. Int. J. Pept. Protein Res., 1991, 37 (5): 414-424.
    [44] Sandberg M, Eriksson L, Jonsson J, Sj?str?m M, Wold S. New chemical descriptors relevant for the design of biologically active peptides. A multivariate charaterrization of 87 aminoacids. J. Med.Chem., 1998, 41 (14): 2481-2491.
    [45] Wu J, Aluko R E, Nakai S. Structural requirements of angiotensin I-converting enzyme inhibitory peptides: quantitative structure-activity relationship modeling of peptides containing 4-10 amino acid residues. QSAR Comb. Sci., 2006, 25 (10): 873-880.
    [46] Genst E D, Areskoug D, Decanniere K, Muyldermans S, Andersson K. Kinetic and affinity predictions of a protein-protein interaction using multivariate experimental design. J. Biol. Chem., 2002, 277 (33): 29897-29907.
    [47] Guan P, Doytchinova I A, Walshe V A, Borrow P, Flower D R. Analysis of peptide-protein binding using amino acid descriptors: prediction and experimental verification for human histocompatibility complex HLA-A*0201. J. Med. Chem., 2005, 48 (23): 7418-7425.
    [48] Ponce Y M, Marrero R M, Castro E A, de Armas R R, Díaz H G, Zaldivar V R, Torrens F. Protein quadratic indices of the“macromolecular pseudograph′sα-carbon atom adjacency matrix”. 1. Prediction of arc repressor alanine-mutant’s stability. Molecules, 2004, 9 (12): 1124-1147.
    [49] Collantes E R, Dunn W J. Amino acid side chain descriptors for quantitative structure activity relationship studies of peptide analogues. J. Med. Chem., 1995, 38 (14): 2705-2713.
    [50] Cho S J, Zheng W, Tropsha A. Rational combinatorial library design. 2. Rational design of targeted combinatorial peptide libraries using chemical similarity probe and the inverse QSAR approaches. J. Chem. Inf. Comput. Sci., 1998, 38 (2): 259-268.
    [51] Lin Z, Wu Y, Zhu B, Ni B, Wang L. Toward the quantitative prediction of T-Cell epitopes: QSAR studies on peptides having affinity with the class I MHC molecular HLA-A*0201. J. Comput. Biol., 2004, 11 (4): 683-694.
    [52] de Armas R R, Díaz H G, Molina R, Uriarte E. Stochastic-based descriptors studying biopolymers biological properties: extended MARCH-INSIDE methodology describing antibacterial activity of lactoferricin derivatives. Biopolymers, 2005, 77 (5): 247-256.
    [53] Zaliani A, Gancia E. MS-WHIM scores for amino acids: A new 3D-description for peptide QSAR and QSPR studies. J. Chem. Inf. Compt. Sci., 1999, 39 (3): 525-533.
    [54] Cocchi M, Johansson E. Amino acids characterization by GRID and multivariate data analysis. Quant. Struct. -Act. Relat., 1993, 12 (1): 1-8.
    [55] Norinder U, Svensson P. Descriptors for amino acids using MolSurf parametrization. J. Comput. Chem., 1998, 19 (1): 51-59.
    [56]丁俊杰,丁晓琴,赵立峰等.新型三维氨基酸结构描述符的研究及其在多肽QSAR.中的应用[J].药学学报2005, 40 (4): 340– 346.
    [57] Tong JB, Liu SL, Zhou P, Wu B, Li Z. A novel descriptor of amino acids and its application inpeptide QSAR. Journal of Theoretical Biology 2008, 253 (1): 90–97.
    [58] Lin ZH, Long HX, Bo Z, Wang YQ, Wu YZ. New descriptors of amino acids and their application to peptide QSAR study. Peptides, 2008, 29 (10): 1798–1805.
    [59] Liu SS, Yin CS, Cai SX, Li ZL. A novel MHDV descriptor for dipeptide QSAR studies. J. Chin. Chem. Soc., 2001, 48 (2): 253-260.
    [60] Mei H, Zhou Y, Sun L, Li Z. A new descriptor of amino acids and its application in peptide QSAR. Acta Phys. -Chim. Sin. 2004, 20 (8): 821-825.
    [61] Mei H, Liao Z, Zhou Y, Li Z. A new set of amino acid descriptors and its application in peptide QSARs. Biopolymers (Pept. Sci.) 2005, 80 (6): 775-786.
    [62] Tian FF, Zhou P, Li Z. T-scale as a novel vector of topological descriptors for amino acids and its application in QSARs of peptides. J. Mol. Struct., 2007, 830 (1-3): 106-115.
    [63]梁桂兆,周鹏,周原,张巧霞,李志良.一组新氨基酸描述子用于肽定量构效关系研究[J].化学学报2006, 64 (5): 393-396.
    [64] Zhou P, Tian FF, Zhang MJ,Li ZL. Applying generalized hydrophobicity scale of amino acids to quantitative prediction of human leukocyte antigen-A*0201-restricted cytotoxic T lymphocyte epitope. Chin. Sci. Bull., 2006, 51 (12): 1439-1443.
    [65] Liang GZ, Li ZL. A new sequence representation (FASGAI) as applied in better specificity elucidation for human immunodeficiency virus type 1 protease. Biopolymers (Pept. Sci.), 2007, 88 (3): 401-412.
    [66] Shu M, Huo DQ, Mei Hu, Liang GZ, Zhang M, Li ZL. New Descriptors of Amino Acids and Its Applications to Peptide Quantitative Structure-activity Relationship. Chinese J. Struct. Chem. 2008, 27 (11): 1375-1383.
    [67]丁俊杰,晓琴,赵立峰等.多肽定量构效关系与分子设计[J].化学进展, 2005, 17 (1): 130-136.
    [68]来鲁华.蛋白质的结构预测与分子设计[M].北京:北京大学出版社, 1993.
    [69]阎隆飞,孙之荣.蛋白质分子结构[M].北京:清华大学出版社, 1999.
    [70] Nemethy G, Scheraga HA. Protein folding. Q. Rev. Biophys., 1977, 10(3): 239-252.
    [71] Chou KC. Energy-optimized structure of antifreeze protein and its binding mechanism. J. Mol. Biol., 1992, 223 (2): 509-517.
    [72] Klein P. Prediction of protein structural class by discriminant analysis. Biochim. Biophys. Acta, 1986, 874 (2): 205-215.
    [73] Nakashima H, Nishikawa K, Ooi T. The folding type of a protein is relevant to the amino acid composition. J. Biochem., 1986, 99 (1): 152-162.
    [74] Hua S J, Sun Z R. Support vector machine approach for protein subcellular locationprediction. Bioinformatics, 2001, 17 (8):721-728.
    [75] Xie D, Li A, Wang M, Fan Z, Feng H. LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST. Nucleic Acids Res, 2005, 33(Web server issue): W105-W110.
    [76] Chittibabu G., Shankar S. TARGET: a new method for predicting protein subcellular localization in eukaryotes. Bioinformatics, 2005, 21 (21): 3963-3969.
    [77] Xiao X, Shao S, Ding Y, Huang Z, Huang Y, Chou KC. Using complexity measure factor to predict protein subcellular location. Amino Acids., 2005, 28 (1): 57-61.
    [78] Nielsen H, Engelbrecht J, Brunak S, von Heijne G. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng., 1997, 10 (1):1-6.
    [79] Chou KC. Prediction of protein signal sequences and their cleavage sites. Proteins. 2001, 2 (1): 136-139.
    [80] Liu H, Yang J, Ling JG, Chou KC. Prediction of protein signal sequences and their cleavage sites by statistical rulers. Biochem Biophys Res Commun., 2005, 338 (2): 1005-1011.
    [81] Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics, 2004, 4 (6): 1633-1649.
    [82] Koenig M, Grabe N. Highly specific prediction of phosphorylation sites in proteins. Bioinformatics, 2004, 20 (18): 3620-3627.
    [83] Muramatsu T, Suwa M. Statistical analysis and prediction of functional residues effective for GPCR-G-protein coupling selectivity. Protein Eng Des Sel., 2006, 19 (6): 277-283.
    [84] Burgoyne NJ, Jackson RM. Predicting protein interaction sites: binding hot-spots in protein-protein and protein-ligand interfaces. Bioinformatics, 2006, 22(11): 1335-1342.
    [85] Cai CZ, Han LY, Ji ZL, Chen YZ. Enzyme family classification by support vector machine. Proteins, 2004, 55 (1): 66-76.
    [86] Chou KC, Cai YD. Prediction of membrane protein types by incorporating amphipathic effects. J Chem Inf Model, 2005, 45 (2): 407-413.
    [87] Perou CM, Jeffrey, SS, Rijn MVD. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc. Natl. Acad. Sci. USA., 1999, 96 (16): 9212-9217.
    [88] Enright AJ, Iliopoulous I. Kyrpides NC, Ouzounis CA. Protein interaction maps for complete genomes based on gene fusion events. Nature, 1999, 402 (6757): 86-90.
    [89] Marcotte EM., Pellegrini M, Ng HL. Detecting protein function and protein-protein Interactios from genome-wide prediction of protein function.Science, 1999, 285 (5428):751-753.
    [90] Guttman I. Linear models: an introduction. New York: Wiley. 1982: 578-579.
    [91] Rencher A C, Pun F C. Inflation of R2 in best subset regression. Technometrics, 1980, 22 (1): 49-53.
    [92] Smith D W, Gill D S, Hammond J J. Variable selection in multivariate multiple regression. J. Statist. Comput. Simul., 1985, 22 : 217-227.
    [93] Jackson J E. Principal components and factor analysis: part I—principal components. J. Quality Tech., 1980, 12 : 201-213.
    [94]俞汝勤.化学计量学导论[M].长沙:湖南教育出版社, 1991: 1-180.
    [95] Wold S, Ruhe A, Wold H, Dunn, WJ. The collinearity problem in linear regression, the partial least squares approach to generalized inverses, SIAM J. Sci. Stat. Comput., 1984, 5 (9): 735-743.
    [96] Ergon R. PLS score–loading correspondence and a biorthogonal factorization. J. Chemometrics, 2002, 16 (7): 368-373.
    [97] De Jong S. SIMPLS: An alternative approach to partial least squares regression. Chemometrics Intell. Lab. Syst., 1993, 18 (3): 251-263.
    [98] Wold S, Johansson E, Cocchi M. PLS—partial least squares. projections to latent structures. In H. Kubinyi Ed., 3D QSAR in drug design, theory, methods, and applications. ESCOM Science Publishers, Leiden Holland, 1993: 523-550.
    [99]王惠文.偏最小二乘回归方法及其应用[M].北京:国防工业出版社, 2006.
    [100] Wold S, Hellberg S, Lundstedt T et al. PLS modeling with latent variables in two or more dimensions, PLS meeting, Frankfurt, September 1987: 84-85.
    [101] Nomikos P, MacGregor JF. Multi-way partial least squares in monitoring batch processes. Chemometrics Intell. Lab. Syst., 1995, 30 (1): 97-108.
    [102] Johnson RA, Wichern DW. Applied multivariate statistical analysis. New Jersey: Prentice Hall, Upper Saddle River, 2002.
    [103] Vapnik V, The nature of statistical learning theory. New York: Springer, 1995.
    [104] Hibbert D B. Genetic algorithms in chemistry. Chemometr. Intell. Lab. Syst., 1993, 19 (3): 277-293.
    [105] Holland J H. Adaptation in natural and artificial systems. Ann Arbor: University of Michigan Press, 1975: 156-159.
    [106] Goldberg D. Genetic algorithms in search, optimization, and machine learning. New York: Addison-Wesley Publishing Company, Inc., 1989: 49-53.
    [107] Lucasius C B, Kateman G. Understanding and using genetic algorithms. Part 1. Concepts,properties and context. Chemometr. Intell. Lab. Syst., 1993, 19 (1): 1-33.
    [108] Luke B T. An overview of genetic methods. Genetic algorithms in molecular modeling, New York: Academic Press, 1996: 189-193.
    [109] Kaur H, Raghava GPS. A neural network method for prediction ofβ-turn types in proteins using evolutionary information. Bioinformatics, 2004, 20 (16): 2751-2758.
    [110] Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochem. Biophys. Acta, 1975, 405 (2): 442-451.
    [111] Deleo JM. Receiver operating characteristic laboratory (ROCLAB): Software for developing decision strategies that account for uncertainty. In: Proceedings of the second international symposium on uncertainty modelling and analysis. College Park, MD: IEEE, Computer Society Press, 1993.
    [112] Wold S, Eriksson L. Statistical validation of QSAR results, in: van de waterbeemd H(Ed). Chemometrics methods in molecular design, 1995, 309-318.
    [113] Golbraikh A, Tropsha A. Beware of q2! J. Mol. Graphics Mod., 2002, 20 (4): 269-276
    [114] Gramatica P, Pilutti P, Papa E. Validated QSAR prediction of OH tropospheric degradation of VOCs: Splitting into training-test sets and consensus modeling. J. Chem. Inf. Comput. Sci., 2004, 44 (5): 1794-1802.
    [115] Atkinson AC. Plots, transformations and regression, Clarendon Press, Oxford (UK),1985, 282.
    [116] Gramatica P, Corradi M Consonni V, Modelling and prediction of soil sorption coefficients of non-ionic organic pesticides by molecular descriptors. Chemosphere. 2000, 41 (5): 763-777.
    [117] Mandel J. The regression analysis of collinear data. J. Res. Nat. Bur, Stand. 1985,90:465-476.
    [118] Lindberg W, Persson JA, Wold, S. Partial least squares method for spectrofluorimetric analysis of mixtures of humic acid and ligninsulfonate. Anal. Chem. 1983, 55 (4): 643-648.
    [119] Cho SJ, Zheng W, Tropsha A. Rational combinatorial library design. 2. Rational design of targeted combinatorial peptide libraries using chemical similarity probe and the inverse QSAR approaches. J. Chem. Inf. Comput. Sci. 1998, 38 (2): 259-268.
    [120] Zheng W, Tropsha A. Novel variable selection quantitative structure-property relationship approach based on the k-Nearest neighbor principle. J. Chem. Inf. Comput. Sci. 2000, 40 (1): 185-194.
    [121]来鲁华.蛋白质的结构预测与分子设计[M].北京:北京大学出版社, 1993.
    [122]阎隆飞,孙之荣.蛋白质分子结构[M].北京:清华大学出版社, 1999.
    [123] Lipkowitz KB, Boyd DB. Reviews in computational chemistry,Vol. 2. VCH, N. Y., 1991, 81-97.
    [124]李志良,曾鸽鸣,胡芳等.多维定量构效关系及药物分子设计研究进展[J].化学研究与应用, 1997, 9 (1): 7-14.
    [125] Kawashima S, Ogata H, Kanehisa M. AAindex: amino acid index database. Nucleic Acids Res., 1999, 27 (1): 368-369.
    [126] B?ck A, Forchhammer K, Heider J, Leinfelder W, Sawers G, Veprek B, Zinoni F. Selenocysteine: The 21st amino acid. Mol. Microbiol. 1991, 5 (3): 515–520.
    [127] Atkins JF, Gesteland R. The 22nd amino acid. Science, 2002, 296 (5572):1409–1410.
    [128]闫爱新,田桂玲,叶蕴华.非蛋白氨基酸对生物活性多肽的修饰及其构效关系的研究进展[J].有机化学, 2000, 20 (3): 299–305.
    [129] Robert S. Phillips R.S. Synthetic applications of tryptophan synthase. Tetrahedron: Asymmetry, 2004, 15 (18): 2787–2792.
    [130] Caligiuri A, D’Arrigo P, Rosini E, Tessaro D, Molla G, Servi S, and Pollegioni L. Enzymatic Conversion of Unnatural Amino Acids by Yeast D-Amino Acid Oxidase. Adv. Synth. Catal. 2006, 348 (15): 2183– 2190.
    [131] Liu RW , Lam KS. Automatic Edman Microsequencing of Peptides Containing Multiple Unnatural Amino Acids. Analytical Biochemistry, 2001, 295 (1): 9–16.
    [132] Cushman D W, Cheung H S, Sabo E F, Ondetti, M A. Angiotensin-converting enzyme inhibitors: evolution of a new class of antihypertensive drugs. In Horovitz Z P (Ed.). Angiotensin-converting enzyme inhibitors: mechanisms of action and clinical implications (pp. 3-25). Baltimore: Urban and Schwarzenberg. 1981.
    [133]周鹏,田菲菲,李波,吴世容,李志良.一种基于遗传算法的肽蛋白质结合模式虚拟筛选建模技术[J].化学学报, 2006, 64 (7): 691-697.
    [134] Adenot M, Sarrauste de Menthiere C, Chavanieu A, Calas B, Grassy G. Peptides quantitative structure-function relationships: an automated mutation strategy to design peptides and pseudopeptides from substitution matrices. Journal of Molecular Graphics and Modelling, 1999, 17 (5-6): 292-309.
    [135] Dalgarno DC, Botfield MC, Rickles RJ. SH3 domains and drug design: ligands, structure, and biological function. Biopolymers, 1997, 43 (5): 383-400.
    [136] Ren RB, Mayer BJ, Cicchetti P, Baltimore D. Identification of a 10-amino acid proline-rich SH3 binding-site. Science, 1993, 259 (5098): 1157-1161.
    [137] Slepnev VI, Ochoa GC, Butler MH, Grabs D, De Camilli P. Role of phosphorylation in regulation of the assembly of endocytic coat complexes. Science, 1998, 281 (5378): 821-824.
    [138] Rickles RJ, Botfield MC, Zhou XM, Henry PA, Brugge JS, Zoller MJ. Phage display selection of ligand residues important for Src homology 3 domain binding specificity. Proc.Natl. Acad. Sci. U.S.A., 1995, 92 (24): 10909-10913.
    [139] Wang W, Lim WA, Jakalian A, Wang J, Wang J, Luo R, Bayly CI, Kollman PA. An analysis of the interactions between the Sem-5 SH3 domain and its ligands using molecular dynamics, free energy calculations, and sequence analysis. J. Am. Chem. Soc., 2001, 123 (17): 3986-3994.
    [140] Hou TJ, McLaughlin W, Lu BZ, Chen K, Wang W. Prediction of binding affinities between the human amphiphysin-1 SH3 domain and its peptide ligands using homology modeling, molecular dynamics and molecular field analysis. J. Proteome Res., 2006, 5 (1): 32-43.
    [141] Hou TJ, Zhang W, Case DA, Wang W. Characterization of Domain–Peptide Interaction Interface: A Case Study on the Amphiphysin-1 SH3 Domain. J. Mol. Biol., 2008, 376 (4): 1201–1214.
    [142] Landgraf C, Panni S, Montecchi-Palazzi L, Castagnoli L, Schneider-Mergener J, Volkmer-Engert R, Cesareni G. Protein interaction networks by proteome peptide scanning. PLOS Biol., 2004, 2 (1): 94-103.
    [143] Hou TJ, Li ZM, Li Z, Liu J, Xu XJ. Three-dimensional quantitative structure-activity relationship analysis of the new potent sulfonylureas using comparative molecular similarity indices analysis. J. Chem. Inf. Comput. Sci., 2000, 40 (4): 1002-100.
    [144] Sima P, Trebichavsky I, Sigler K. Mammalian antibiotic peptides. Folia Microbiol., 2003, 48 (2): 123-137.
    [145] Miele R, Björklund G, Barra D, Simmaco M, Engström Y. Involvement of rel factors in the expression of antimicrobial peptide genes in amphibian. Eur. J. Biochem., 2001, 268 (2): 443-449.
    [146] Simmaco M, Mignogna G, Barra D. Antimicrobial peptides from amphibian skin: what do they tell us? Biopolymers. 1998, 47 (6): 435–450.
    [147] Khush RS, Leulier F, Lemaitre B. Drosophila immunity: two paths to NF-kappaB. Trends Immunol, 2001, 22 (5): 260-264.
    [148] Fernández M, Caballero J. Analysis of protegrin structure-activity relationships: the structural characteristics important for antimicrobial activity using smoothed amino acid sequence descriptors. Molecular Simulation, 2007, 33 (8): 689–702.
    [149] Jenssen H, Fjell CD, Cherkasov A, Hancock REW. QSAR modeling and computer-aided design of antimicrobial peptides. J. Pept. Sci. 2008; 14 (1): 110–114.
    [150] Bhonsle JB, Venugopal D, Huddler DP, Magill AJ, Hicks RP. Application of 3D-QSAR for Identification of Descriptors Defining Bioactivity of Antimicrobial Peptides. J. Med. Chem. 2007; 50(26):6545-6553.
    [151] Cherkasov A, Jankovic B. Application of 'Inductive' QSAR descriptors for quantification of antibacterial activity of cationic polypeptides. Molecules, 2004, 9 (12): 1034-1052.
    [152] Cronin MTD, Aptula AO, Dearden JC, Netzeva TI, Patel H, Rowe PH, Schultz TW, Worth AP, Voutzoulidis K, Schüürmann G. Structure-based classification of antibacterial activity. J. Chem. Inf. Comp. Sci., 2002, 42 (4): 869-878.
    [153] Molina E, Diaz HG, Gonzalez MP, Rodríguez E, Uriarte E. Designing antibacterial compounds through a topological substructural approach. J. Chem. Inf. Comp. Sci., 2004, 44 (2): 515-521.
    [154] Hancock RE, Lehrer R. Cationic peptides: A new source of antibiotics. Trends Biotechnol., 1998, 16 (2): 82-88.
    [155] Takeshima K, Chikushi A, Lee KK, Yonehara S, Matsuzaki K. Translocation of analogues of the antimicrobial peptides magainin and buforin across human cell membranes. J. Biol. Chem., 2003, 278 (2): 1310-1315.
    [156] Jaen-Oltra J, Salabert-Salvador MT, Garcia-March FJ, Pérez-Giménez F, Tomás-Vert F. Artificial neural network applied to prediction of fluorquinolone antibacterial activity by topological methods. J. Med. Chem., 2000, 43 (6): 1143-1148.
    [157] Baker MA, Maloy WL, Zasloff M, Jacob LS. Anticancer efficacy of Magainin2 and analogue peptides. Cancer Res., 1993, 53 (13): 3052-3057.
    [158] Epand RM, Vogel HJ. Diversity of antimicrobial peptides and their mechanisms of action. Biochim. Biophys. Acta, 1999, 1462 (1-2): 11-28.
    [159] Vivés E, Brodin P, Lebleu B. A truncated HIV-1 Tat protein basic domain rapidly translocates through the plasma membrane and accumulates in the cell nucleus, J. Biol. Chem., 1997, 272 (25):16010–16017.
    [160] Derossi D, Joliot AH, Chassaing G, Prochiantz A. The third helix of the Antennapedia homeodomain translocates through biological membranes, J. Biol. Chem., 1994, 269 (14): 10444–10450.
    [161] Brooks H, Lebleu B, Vives E. Tat peptide-mediated cellular delivery: back to basics. Advanced Drug Delivery Reviews, 2005, 57 (4): 559-577.
    [162] Duchardt F, Fotin-Mleczek M, Schwarz H, Fischer R, Brock R. A comprehensive model for the cellular uptake of cationic cell-penetrating peptides, Traffic, 2007, 8 (7): 848–866.
    [163] Hällbrink M, Kilk K, Elmquist A, Lundberg P, Lindgren M, Jiang Y, Pooga M, Soomets U, Langel Ü. Prediction of cell-penetrating peptides. Int. J. Pept. Res. Ther., 2005, 11 (4): 249–259.
    [164] Sandgren S, Cheng F, Belting M. Nuclear targeting of macromolecular polyanions by an HIV-Tat derived peptide. Role for cell-surface proteoglycans. J. Biol. Chem., 2002, 277 (41): 38877–38883.
    [165] Binder H, Lindblom G. Charge-dependent translocation of the trojan peptide penetratin across lipid membrane. Biophys. J., 2003, 85 (2): 982–995.
    [166] Deshayes S, Plénat T, Charnet P, Divita G, Molle G, Heitz F. Formation of transmembrane ionic channels of primary amphipathic cell-penetrating peptides. Consequences on the mechanism of cell penetration. Biochim. Biophys. Acta, 2006, 1758 (11): 1846–1851.
    [167] Terrone D, Sang SL, Roudaia L, Silvius JR. Penetratin and related cell-penetrating cationic peptides can translocate across lipid bilayers in the presence of a transbilayer potential. Biochemistry, 2003, 42 (47): 13787–13799.
    [168] Hällbrink M, Oehlke J, Papsdorf G, Bienert M. Uptake of cell-penetrating peptides is dependent on peptide-to-cell ratio rather than on peptide concentration. Biochim. Biophys. Acta, 2004, 1667 (2): 222–228.
    [169] Holm T, Netzereab S, Hansen M, Langel Ü, Hällbrink M. Uptake of cell-penetrating peptides in yeasts. FEBS Lett., 2005, 579 (23): 5217–5222.
    [170] Nekhotiaeva N, Awashti KS, Nielsen PE, Good L. Inhibition of Staphylococcus aureus gene expression and growth using antisense peptide nucleic acids. Mol. Ther., 2004, 10 (4): 652–659.
    [171] Gomez JA, Gama V, Yoshida T, Sun W, Hayes P, Leskov K, Boothman D, Matsuyama S. Bax-inhibiting peptides derived from Ku70 and cell-penetrating pentapeptides, Biochem. Soc. Trans., 2007, 35 (4): 797–801.
    [172] Hansen M, Kilk K, Langel Ü. Predicting cell-penetrating peptides. Advanced Drug Delivery Reviews, 2008, 60 (4-5): 572-579.
    [173] Wold S, Jonsson J, Sjöström M et al. DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least squares projections to latent structures. Anal. Chim. Acta, 1993, 277: 239-253.
    [174] Witherow FN, Helmy A, Webb DJ. Bradykinin contributes to the vasodilator effects of chronic angiotensin-converting enzyme inhibition in patients with heart failure [J]. Circulation, 2001, 104 (18): 2177-2181.
    [175] Heudi O, Ramirez-Molina C, Marshall P, Amour A, Peace S, McKeown S, Abou-Shakra F. Investigation of bradykinin metabolism in human and rat plasma in the presence of the dual ACE/NEP inhibitors GW660511X and omapatrilat [J]. J Pept Sci, 2002, 8 (11): 591-600.
    [176] Pawluczyk IZ, Patel SR, Harris KP. The role of bradykinin in the antifibrotic actions of perindoprilat on human mesangial cells[J]. Kidney Int, 2004, 65 (4): 1240-1251.
    [177] Ianzer D, Konno K, Marques-Porto R, Vieira Portaro FC, Stöcklin R, Martins de Camargo AC, Pimenta DC. Identification of five new bradykinin potentiating peptides (BPPs) from Bothrops jararaca crude venom by using electrospray ionization tandem mass spectrometry after a two-step liquid chromatography. Peptides, 2004, 25 (7): 1085-1092.
    [178] Jonsson J, Eriksson L, Hellberg S, Sjöström M, Wold S. Multivariate parametrization of 55 coded and non-coded amino acids [J]. Quant Struct-Act Relat, 1989, 8 (3): 204-209.
    [179] Sorrentino S, Alessandro AMD, Maras B, Di Ciccio L, D'Andrea G, De Prisco R, Bossa F, Libonati M, Oratore A. Purification of a 76-kDa iron-binding protein from human seminal plasma by affinity chromatography specific for ribonuclease: structural and functional identity with milk lactoferrin [J]. Biochimica et Biophysica Acta, 1999, 1430 (1): 103-110.
    [180] Steijns JM, van Hooijdonk AC. Occurrence, structure, biochemical properties and technological characteristics of lactoferrin [J]. Br J Nutr, 2000, 84 (Suppl 1): S11-S17.
    [181] Kawakami H. Effect of iron-saturated lactoferrin on iron absorption [J]. Agric Biol Chem, 1998, 52 (2): 903-908.
    [182] Christina TT. Lactoferrin gene expression and regulation: an overview [J]. Biochemistry and Cell Biology, 2002, 80 (1): 7-16.
    [183] Zhang JM. Lactoferrin, a broad-spectrum natural antibiotic [J]. Foreign Medical Sciences (Section on Antibiotics), 1998, 19 (2): 144-145.
    [184] Tomita M, Takase M, Bellamy W, Shimamura S. A review: The active peptide of lactoferrin. Acta Paediatr. Jpn., 1994, 36 (5): 585–591.
    [185] Rekdal Ø, Andersen J, Vorland LH, Svendsen JS. Construction and synthesis of lactoferricin derivatives with enhanced antibacterial activity. J. Peptide Sci., 1999, 5 (1): 32–45.
    [186] Haug BE, Svendsen JS. The role of tryptophan in the antibacterial activity of a 15-residue bovine lactoferricin peptide. J. Peptide Sci., 2001, 7 (4): 190–196.
    [187] Haug BE, Skar ML, Svendsen JS. Bulky aromatic amino acids increase the antibacterial activity of 15-residue bovine lactoferricin derivatives. J. Peptide Sci., 2001, 7 (8): 425–432.
    [188] Haug BE, Andersen J, Rekdal Ø, Svendsen JS. Simple parameterization of non-proteinogenic amino acids for QSAR of antibacterial peptides. J. Peptide Sci., 2002, 8 (7): 307–313.
    [189] Lejon T, Svendsen JS, Haug BE. Synthesis of a 2-arylsulphonylated tryptophan: the antibacterial activity of bovine lactoferricin peptides containing Trp(2-Pmc). J. Peptide Sci., 2002, 8 (7): 302–306.
    [190] Ohmoto K, Okuma M, Yamamoto T, Kijima H, Sekioka T, Kitagawa K, Yamamoto S, Tanaka K, Kawabata K, Sakata A, Imawaka H, Nakai H, Toda M. Design and synthesis of new orally active inhibitors of human neutrophil elastase. Bioorganic and Medicinal Chemistry, 2001, 9 (5): 1307-1323.
    [191] Snider G L. Animal models of emphysema. Am Rev Respir Dis, 1986, 133 (1): 149-150.
    [192] Nomizu M, Iwaki T, Yamashita T, Inagaki Y, Asano K, Akamatsu M, Fujita T. Quantitative structure-activity relationship (QSAR) study of elastase substrates and inhibitors. Int J Pept Protein Res, 1993, 42 (3): 216-226.
    [193] Kimura T, Miyashita Y, Funatsu K, Sasaki SI. Quantitative structure-activity relationships of the synthetic substrates for elastase enzyme using nonlinear partial least squares regression. J Chem Inf Comput Sci, 1996, 36 (2): 185-189.
    [194] Farmerie WG, Loeb DD, Casavant NC, Hutchison CA 3rd, Edgell MH, Swanstrom R. Expression and processing of the AIDS virus reverse transcriptase in Escherichia coli. Science, 1987, 236 (4799): 305-308.
    [195] Kohl NE, Emini EA, Schleif WA, Davis LJ, Heimbach JC, Dixon RA, Scolnick EM, Sigal IS. Active human immunodeficiency virus protease is required for viral infectivity. Proc. Natl. Acad. Sci. U.S.A., 1988, 85 (13): 4686-4690.
    [196] You L, Garwicz D, Rögnvaldsson T. Comprehensive bioinformatics analysis of the specificity of human immunodeficiency virus type 1 protease. J. Virol., 2005, 79 (19): 12477-12486.
    [197] Schechter I, Berger A. On the size of the active site in proteases. Biochem. Biophys. Res. Commun., 1967, 27 (2): 157-162.
    [198] Chou KC. Review: Prediction of HIV protease cleavage sites in proteins. Anal. Biochem., 1996, 233 (1): 1-14.
    [199] Poorman RA, Tomasselli AG, Heinrikson RL, Kézdy FJ. A cumulative specificity model for protease from human immunodeficiency virus types 1 and 2, inferred from statistical analysis of an extended substrate data base. J. Biol. Chem., 1991, 266 (22): 14554-14561.
    [200] Cai YD, Chou KC. Artificial neural network model for predicting HIV protease cleavage sites in protein. Adv Eng. Software, 1998, 29 (2): 119-128.
    [201] Narayanan A, Wu X, Yang ZR. Mining viral protease data to extract cleavage knowledge. Bioinformatics, 2002, 18 Suppl 1: S5-S13.
    [202] Yang ZR, Chou KC. Bio-support vector machines for computational proteomics. Bioinformatics, 2004, 20 (5): 735-741.
    [203] Thomson R, Hodgman TC, Yang ZR, Doyle AK. Characterizing proteolytic cleavage site activity using bio-basis function neural networks. Bioinformatics, 2003, 19 (14): 1741-1747.
    [204] Prabu-Jeyabalan M, Nalivaika E, Schiffer CA. Substrate shape determines specificity of recognition for HIV-1 protease: Analysis of crystal structures of six substrate complexes. Structure, 2002, 10 (3): 369-381.
    [205] Clemente JC, Moose RE, Hemrajani R, Whitford LR, Govindasamy L, Reutzel R, McKenna R, Agbandje-McKenna M, Goodenow MM, Dunn BM. Comparing the accumulation of active- and nonactive-site mutations in the HIV-1 protease. Biochemistry, 2004, 43(38): 12141-12151.
    [206] Roberto R, Stefano G, Georg C T. Phosphoproteome analysis [J]. Bioscience Reports, 2005, 25 (1/2): 30-44.
    [207] Wu HY, Tseng VS, Liao PC. Mining phosphopeptide signals in liquid chromatography-mass spectrometry data for protein phosphorylation analysis [J]. J Proteome Res, 2007, 6 (5): 1812-1821.
    [208] Yang F, Stenoien DL, Strittmatter EF, Wang JH, Ding LH, Lipton MS, Monroe ME, Nicora CD, Gristenko MA, Tang KQ, Fang RH, Adkins JN, Camp DG, Chen DJ, Smith RD. Phosphoproteome profiling of human skin fibroblast cells in response to low- and high-dose irradiation [J]. J Proteome Res, 2006, 5 (5): 1252-1260.
    [209] Campbell DG, Morrice NA. Identification of protein phosphorylation sites by a combination of mass spectrometry and solid phase Edman sequencing. J Biomol Tech, 2002, 13 (3): 119-130.
    [210] Philip RG, Paul DL. Methodologies for Characterizing Phosphoproteins by Mass Spectrometry. Cell Commun Adhes, 2006, 13 (5-6): 249–262.
    [211] Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol, 1999, 294 (5): 1351–1362.
    [212] Tang YR, Chen YZ, Canchaya CA, Zhang ZD. GANNPhos: a new phosphorylation site predictor based on a genetic algorithm integrated neural network. Protein Engineering, Design & Selection, 2007, 20 (8): 405–412.
    [213] Kim JH, Lee J, Oh B, Kimm K, Koh I. Prediction of phosphorylation sites using SVMs. Bioinformatics, 2004, 20 (17): 3179–3184.
    [214] Huang HD, Lee TY, Tseng SW, Horng JT. KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites. Nucleic Acids Res, 2005, 33 (Web Server issue): W226-W229.
    [215] Xue Y, Li A, Wang L, Feng H, Yao X. PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory. BMC Bioinformatics, 2006, 7: 163.
    [216] Wong YH, Lee TY, Liang HK, Huang CM, Yang YH, Chu CH, Huang HD, Ko MT, Hwang JK. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res, 2007, 35 (Web Server issue): W588-594.
    [217] Obenauer JC, Cantley LC, Yaffe MB. Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res, 2003, 31 (13): 3635–3641.
    [218] Yaffe MB, Leparc GG, Lai J, Obata T, Volinia S, Cantley LC. A motif-based profile scanning approach for genome-wide prediction of signaling pathways. Nat Biotechnol, 2001, 19 (4): 348–353.
    [219] Diella F, Cameron S, Gemund C, Linding R, Via A, Kuster B, Sicheritz-Ponten T, Blom N, Gibson TJ. Phospho.ELM: a database of experimentally verified phosphorylation sites in eukaryotic proteins. BMC Bioinformatics, 2004, 5: 79.
    [220] Songyang Z, Blechner S, Hoagland N, Hoekstra MF, Piwnica Worms H, Cantley LC. Use of an oriented peptide library to determine the optimal substrates of protein kinases. Curr Biol, 1994, 4 (11): 973–982.
    [221] Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, Dunker AK. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res, 2004, 32 (3): 1037-1049.
    [222] Kemp BE, Graves DJ, Benjamini E, Krebs EG. Role of multiple basic residues in determining the substrate specificity of cyclic AMP-dependent protein kinase. J Biol Chem, 1977, 252 (14): 4888-4894.
    [223] Knighton DR, Zheng JH, Teneyck LF, Xuong NH, Taylor SS, Sowadski JM. Structure of a peptide inhibitor bound to the catalytic subunit of cyclic adenosine-monophosphate dependent protein-kinase. Science, 1991, 253 (5018): 414-420.
    [224] Shabb JB. Physiological substrates of cAMP-dependent protein kinase. Chem Rev, 2001, 101 (8): 2381-2411.
    [225] Pearson RB, Kemp BE. Protein kinase phosphorylation site sequences and consensus specificity motifs: tabulations. Methods Enzymol, 1991, 200: 62-81.
    [226] Kennelly PJ, Krebs EG. Consensus sequences as substrate specificity determinants for protein kinases and protein phosphatases. J Biol Chem, 1991, 266 (24): 15555-15558.
    [227] Jurica MS, Moore MJ. Pre-mRNA splicing: awash in a sea of proteins. Mol. Cell, 2003, 12 (1): 5–14.
    [228] Noller HF. RNA structure: reading the ribosome. Science, 2005, 309 (5740): 1508–1514.
    [229] Freed EO, Mouland AJ. The cell biology of HIV-1 and other retroviruses. Retrovirology, 2006, 3: 77.
    [230] Draper DE. Protein-RNA recognition. Annu. Rev. Biochem., 1994, 64: 593-620.
    [231] Draper DE. Themes in RNA-protein recognition. J. Mol. Biol., 1999, 293 (2): 255-270.
    [232] Jones S, Daley DT, Luscombe NM, Berman HM, Thornton JM. Protein-RNA interaction: a structural analysis. Nucleic Acids Research, 2001, 29 (4): 943-954.
    [233] Kim H, Jeong E, Lee SW, Han K. Computational analysis of hydrogen bonds in protein-RNA complexes for interaction patterns. FEBS Lett., 2004, 552 (2-3): 231-239.
    [234] Jeong E, Chung IF, Miyano S. A neural network method for identification of RNA-interacting residues in proteins. Genome Inform. Ser. Workshop Genome Inform., 2004, 15 (1): 105-116.
    [235] Jeong E, Miyano S. A weighted profile based method for Protein-RNA interacting residues prediction. Trans. Comput. Syst. Biol., 2006, IV: 123-139.
    [236] Terribilini M. Prediction of RNA binding sites in proteins from amino acid sequence. RNA, 2006, 12 (1): 1-13.
    [237] Wang LJ, Brown SJ. BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences. Nucleic Acids Res., 2006, Web Server Issue, W243-W248.
    [238] Terribilini M, Sander JD, Lee JH, et al. RNABindR: a server for analyzing and predicting RNA-binding sites in proteins. Nucleic Acids Research, 2007, 35 (5): 1-7.
    [239] Wang G, Dunbrack RL Jr. PISCES: a protein sequence culling server. Bioinformatics, 2003, 19 (12): 1589-1591.