Quantitative Structure–Retention Relationship Models To Support Nontarget High-Resolution Mass Spectrometric Screening of Emerging Contaminants in Environmental Samples

Authors: Reza Aalizadeh; Nikolaos S. Thomaidis; Anna A. Bletsou; Pablo Gago-Ferrero
Journal: Journal of Chemical Information and Modeling
Publication year: 2016
Publication date: July 25, 2016
Volume: 56
Issue: 7
Pages: 1384-1398
Full-text size: 608K
ISSN: 1549-960X

Abstract

Over the past decade, the application of liquid chromatography-high resolution mass spectrometry (LC-HRMS) has been growing extensively due to its ability to analyze a wide range of suspected and unknown compounds in environmental samples. However, various criteria, such as mass accuracy and isotopic pattern of the precursor ion, MS/MS spectra evaluation, and retention time plausibility, should be met to reach a certain identification confidence. In this context, a comprehensive workflow based on computational tools was developed to understand the retention time behavior of a large number of compounds belonging to emerging contaminants. Two extensive data sets were built for two chromatographic systems, one for positive and one for negative electrospray ionization mode, containing retention time information for 528 and 298 compounds, respectively, to expand the applicability domain of the developed models. The data sets were then split into training and test sets, employing k-nearest neighborhood clustering, to build the models and validate their internal and external predictive ability. The best subset of molecular descriptors was selected using genetic algorithms. Multiple linear regression, artificial neural networks, and support vector machines were used to correlate the selected descriptors with the experimental retention times. Several validation techniques were used, including Golbraikh–Tropsha acceptable model criteria, a Euclidean-based applicability domain, the modified correlation coefficient (r_m²), and concordance correlation coefficient values, to measure the accuracy and precision of the models. The best linear and nonlinear models for each data set were derived and used to predict the retention times of suspect compounds from a wide-scope survey, which served as the evaluation data set. For efficient outlier detection and interpretation of the origin of the prediction error, a novel procedure and tool were developed and applied, making it possible to determine whether a suspect compound fell within the applicability domain.
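
Two of the validation ideas named in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The first function computes the concordance correlation coefficient in its standard (Lin) form. The second part is only one plausible reading of a "Euclidean-based applicability domain": it assumes that a compound is inside the domain when its descriptor vector lies no farther from the training-set centroid than the most distant training compound. The function names, the threshold rule, and the toy numbers are all hypothetical.

```python
import math
from statistics import fmean


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient, using population moments."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o, mean_p = fmean(observed), fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    # Penalizes both scatter and systematic shift between the two series.
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def centroid_distances(train, queries):
    """Euclidean distance of each query vector to the training-set centroid."""
    centroid = [fmean(column) for column in zip(*train)]
    return [math.dist(q, centroid) for q in queries]


def in_domain(train, queries):
    """ASSUMPTION: the domain boundary is the largest training-set distance."""
    threshold = max(centroid_distances(train, train))
    return [d <= threshold for d in centroid_distances(train, queries)]


if __name__ == "__main__":
    # Toy retention times (minutes); hypothetical values for illustration only.
    rt_observed = [1.0, 2.0, 3.0]
    rt_predicted = [2.0, 3.0, 4.0]
    print(concordance_cc(rt_observed, rt_predicted))

    # Toy two-descriptor training set and two query compounds.
    training = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    print(in_domain(training, [(1.0, 1.0), (5.0, 5.0)]))
```

In the toy data the predictions are perfectly correlated with the observations but shifted by a constant, so the coefficient falls below 1; this sensitivity to systematic bias is what distinguishes it from an ordinary correlation coefficient. The domain check flags the second query compound, whose descriptors lie well outside the training cloud, as a case where a retention time prediction would be an extrapolation.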
