Statistical machine learning and data mining for chemoinformatics and drug discovery.
Details
  • Author: Azencott, Chloe-Agathe
  • Degree: Doctorate
  • Year: 2010
  • Advisor: Baldi, Pierre F. (advisor); Smyth, Padhraic (committee member); Tsai, Sheryl (committee member)
  • Institution: University of California
  • Department: Computer Science - Ph.D
  • ISBN: 9781124207735
  • CBH: 3422105
  • Country: USA
  • Language: English
  • FileSize: 8936457
  • Pages: 193
Abstract
Modern therapeutic research is a time-consuming, complex and costly process that can benefit considerably from statistical machine learning techniques. In particular, using predictive models to quantify the toxicity or activity of a molecule can considerably reduce the cost of discovering and developing a new drug. We develop and study structure-based feature representations of small molecules and successfully leverage them to create predictors for several of their chemical, physical and biological properties. We address the prediction of biological activity in more depth by studying virtual high-throughput screening (vHTS), which aims at exploiting a first exploratory biological screen to learn how to rank untested compounds according to their activity against a particular target. More specifically, we present a new algorithm, the Influence Relevance Voter (IRV), particularly tailored to that problem, and show that it is preferable to state-of-the-art methods. One of the most desirable qualities of a vHTS algorithm is its ability to place the most active compounds among the very top-ranked molecules. This capacity for what is called "early recognition" allows experimentalists to focus on only a small fraction of the compounds. To properly analyze and compare virtual high-throughput screening algorithms, we develop the concentrated receiver operating characteristic (CROC) framework, an extension of the ROC framework for the quantitative evaluation, visualization, and optimization of early recognition. Finally, we develop machine learning methods for the challenging problem of reaction prediction. Inspired by human chemists, we study elementary reaction steps; in this approach, reaction prediction becomes a matter of learning to rank elementary mechanisms by favorability. We do not address this task directly, but rather tackle two necessary preliminary problems. We first develop a large database of elementary mechanisms, annotated with favorability information. We then propose a feature representation of the atoms of a molecule, which we leverage to predict whether or not they belong to a site of reactivity; eventually such a classifier can be used to filter out disfavored elementary reactions.
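
The early-recognition idea behind CROC can be illustrated with a small sketch. The abstract does not state which transformation the framework applies, so the code below assumes an exponential magnification of the false-positive-rate axis, f(x) = (1 - e^(-αx)) / (1 - e^(-α)). The function names, the value α = 7 and the toy rankings are illustrative assumptions, not taken from the dissertation.

```python
import math

def croc_transform(x, alpha=7.0):
    """Assumed exponential magnification of [0, 1]; stretches the region near 0."""
    return (1.0 - math.exp(-alpha * x)) / (1.0 - math.exp(-alpha))

def roc_points(scores, labels):
    """(FPR, TPR) points from sweeping a threshold down the ranked list."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Items with tied scores are consumed together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def area(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def roc_auc(scores, labels):
    return area(roc_points(scores, labels))

def croc_auc(scores, labels, alpha=7.0):
    # Exact when scores have no ties (the ROC curve is then a step function);
    # with ties, the diagonal segments are only approximated.
    pts = [(croc_transform(x, alpha), y) for x, y in roc_points(scores, labels)]
    return area(pts)

# Two toy rankings of 10 compounds (1 = active), scored 10 down to 1.
scores = list(range(10, 0, -1))
early = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # one active ranked first
middle = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]  # both actives mid-list

print(roc_auc(scores, early), roc_auc(scores, middle))
print(croc_auc(scores, early), croc_auc(scores, middle))
```

Both toy rankings have the same area under the ROC curve, but the ranking that places an active compound first gets a much larger area under the transformed curve, which is the behavior an early-recognition measure is meant to reward.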
