Abstract
Machine learning has recently become widely used in life science research, e.g., for finding quantitative structure–activity relationships (QSARs) between molecular structures and different biological end points. In the work presented here, we applied orthogonal partial least-squares (OPLS), principal component analysis (PCA), and random forests (RF) methods, for both classification and regression analysis, to a publicly available in vivo data set in order to assess the intrinsic metabolic clearance (CLint) in humans. The derived classification models identify compounds with CLint below and above 1500 mL/min with nearly 80% accuracy. The most relevant descriptors are of the lipophilicity and charge/polarizability types. Furthermore, a classification model based on regression analysis, using the same 1500 mL/min cutoff, also reaches an accuracy of around 80%. These results suggest that machine learning techniques are useful for deriving robust and predictive models in the area of in vivo ADMET (absorption, distribution, metabolism, elimination, and toxicity) modeling.
Keywords:
machine learning; OPLS; PCA; RF; in vivo CLint; hepatic CL