Abstract
[Purposes] Dipeptides offer high biological activity and are easy to synthesize, but they have poor metabolic stability in the body and are readily degraded. Dipeptide stability was therefore studied by quantitative prediction from amino acid properties, to provide a theoretical basis for designing stable dipeptide molecules. [Methods] Using the degradation rates of 209 dipeptide molecules at different time points (5, 30, and 60 minutes), partial least squares was applied, and variable selection by stepwise regression was combined with support vector machines, random forest, and multiple linear regression to establish quantitative prediction models relating peptide degradation rate to the physicochemical properties of amino acids. [Findings] The most notable models were those built for the 60-minute degradation rate of dipeptides. They showed good estimation ability on the training set (R2>0.68, Q2>0.57) and good predictive ability on the test set (R2>0.54), and can effectively predict the degradation rate of dipeptide molecules. In addition, amino acid contributions calculated from the multiple linear regression coefficients identify the amino acids that most affect dipeptide stability and can guide the rational design of highly stable dipeptide molecules. [Conclusions] The modeling approach established in this study is simple and has a clear physical meaning. Several different methods all yielded reasonably satisfactory prediction models, which helps ensure the accuracy of the predictions. The models can be used to guide the design and screening of dipeptide molecules with good stability.
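The R2 and Q2 statistics reported above can be illustrated with a minimal sketch. It assumes that Q2 denotes the leave-one-out cross-validated coefficient of determination, as is common in QSAR work; the abstract does not state which cross-validation scheme was used. The single descriptor, the toy data, and the function names (`fit_line`, `r2`, `q2_loo`) are hypothetical and are not taken from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one descriptor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def r2(x, y):
    """Fitted R2 = 1 - SS_res / SS_tot on the data used for fitting."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def q2_loo(x, y):
    """Leave-one-out Q2 = 1 - PRESS / SS_tot (assumed definition)."""
    my = mean(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        press += (y[i] - (a + b * x[i])) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - press / ss_tot


# Hypothetical toy data: one amino acid descriptor vs. degradation rate (%).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
degradation = [12.0, 19.0, 33.0, 38.0, 52.0, 58.0]

print("R2 =", round(r2(descriptor, degradation), 3))
print("Q2 =", round(q2_loo(descriptor, degradation), 3))
```

Under this definition Q2 cannot exceed the fitted R2, because each held-out prediction error is at least as large in magnitude as the corresponding fitted residual. That is consistent with the training-set figures quoted above, where Q2 is lower than R2.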