Machine Learning Models for Lipophilicity and Their Domain of Applicability

Authors: Timon Schroeter; Anton Schwaighofer; Sebastian Mika; Antonius Ter Laak; Detlev Suelzle; Ursula Ganzer; Nikolaus Heinrich; Klaus-Robert Müller
Journal: Molecular Pharmaceutics
Publication year: 2007
Publication date: August 2007
Year: 2007
Volume: 4
Issue: 4
Pages: 524 - 538
Full-text size: 600K
Year, volume, issue: v.4, no.4 (August 2007)
ISSN: 1543-8392

Abstract

Unfavorable lipophilicity and water solubility cause many drug failures; therefore these properties have to be taken into account early on in lead discovery. Commercial tools for predicting lipophilicity usually have been trained on small and neutral molecules, and are thus often unable to accurately predict in-house data. Using a modern Bayesian machine learning algorithm (a Gaussian process model), this study constructs a log D₇ model based on 14556 drug discovery compounds of Bayer Schering Pharma. Performance is compared with support vector machines, decision trees, ridge regression, and four commercial tools. In a blind test on 7013 new measurements from the last months (including compounds from new projects), 81% were predicted correctly within 1 log unit, compared to only 44% achieved by commercial software. Additional evaluations using public data are presented. We consider error bars for each method (model-based error bars, ensemble-based, and distance-based approaches), and investigate how well they quantify the domain of applicability of each model.
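
The two kinds of evaluation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `fraction_within` and `error_bar_coverage`, the choice of k standard deviations as the error-bar width, and all numbers are assumptions made purely for illustration. The first function computes the share of predictions that fall within a given tolerance (1 log unit by default) of the measured value; the second checks how often the measured value lies inside the predicted error bar, which is one simple way to judge whether error bars are informative.

```python
from typing import Sequence


def fraction_within(measured: Sequence[float],
                    predicted: Sequence[float],
                    tolerance: float = 1.0) -> float:
    """Share of predictions whose absolute error is at most `tolerance`."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for m, p in zip(measured, predicted)
               if abs(m - p) <= tolerance)
    return hits / len(measured)


def error_bar_coverage(measured: Sequence[float],
                       predicted: Sequence[float],
                       sigma: Sequence[float],
                       k: float = 1.0) -> float:
    """Share of measurements lying within predicted +/- k * sigma."""
    if not (len(measured) == len(predicted) == len(sigma)) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for m, p, s in zip(measured, predicted, sigma)
               if abs(m - p) <= k * s)
    return hits / len(measured)


# Made-up log D values, for illustration only (not data from the study).
measured = [1.2, 3.4, 0.5, 2.8]
predicted = [1.0, 2.1, 0.9, 2.7]
sigma = [0.3, 0.7, 0.5, 0.2]  # hypothetical per-compound error bars

within_one_log_unit = fraction_within(measured, predicted)
coverage_1_sigma = error_bar_coverage(measured, predicted, sigma, k=1.0)
coverage_2_sigma = error_bar_coverage(measured, predicted, sigma, k=2.0)
```

If the error bars behaved like Gaussian standard deviations, one would expect roughly 68% coverage at k = 1 and roughly 95% at k = 2 on a large test set; large deviations from those values would suggest over- or under-confident error bars.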
