Sesquiterpene Lactones-Based Classification of the Family Asteraceae Using Neural Networks and k-Nearest Neighbors
  • Authors: Dimitar Hristozov ; Fernando B. Da Costa ; Johann Gasteiger
  • Journal: Journal of Chemical Information and Modeling
  • Publication year: 2007
  • Publication date: January 2007
  • Year: 2007
  • Volume: 47
  • Issue: 1
  • Pages: 9 - 19
  • Full-text size: 162K
  • Year/volume/issue: v.47, no.1 (January 2007)
  • ISSN: 1549-960X
Abstract
In a recent publication we described the application of an unsupervised learning method using self-organizing maps to the separation of three tribes and seven subtribes of the plant family Asteraceae based on a set of sesquiterpene lactones (STLs) isolated from individual species. In the present work, two different structure representations, atom counts (2D) and radial distribution function (RDF) (3D), and two supervised classification methods, counterpropagation neural networks and k-nearest neighbors (k-NN), were used to predict the tribe in which a given STL occurs. The data set was extended from 144 to 921 STLs, and the Asteraceae tribes were augmented from three to seven. The k-NN classifier with k = 1 showed the best performance, while the RDF code outperformed the atom counts. The quality of the obtained model was assessed with two test sets, which exemplified two possible applications: (1) finding a plant source for a desired compound and (2) based on a plant species' chemical profile (STLs): (a) study the relationship between the current taxonomic classification and the plant's chemistry and (b) assign a species to a tribe by majority vote. In addition, the problem of defining the applicability domain of the models was assessed by means of two different approaches: principal component analysis combined with the Hotelling T² statistic and an a posteriori probability-based rule.
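
The abstract's two prediction steps, a 1-nearest-neighbor tribe prediction for each STL followed by a majority vote over a species' STLs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three-component descriptor vectors, the Euclidean distance metric, the labels tribe_A and tribe_B, and the function names nearest_tribe and assign_species are all hypothetical.

```python
from collections import Counter
import math


def nearest_tribe(descriptor, training_set):
    """k-NN with k = 1: return the label of the closest training vector.

    Euclidean distance is an assumption; the abstract does not state the metric.
    """
    best_label, best_dist = None, math.inf
    for vector, tribe in training_set:
        dist = math.dist(descriptor, vector)
        if dist < best_dist:
            best_label, best_dist = tribe, dist
    return best_label


def assign_species(descriptors, training_set):
    """Assign a species to a tribe by majority vote over its compounds."""
    votes = Counter(nearest_tribe(d, training_set) for d in descriptors)
    return votes.most_common(1)[0][0]


# Invented toy descriptors (not real STL data) with hypothetical tribe labels.
TRAINING = [
    ((15.0, 20.0, 4.0), "tribe_A"),
    ((15.0, 22.0, 5.0), "tribe_A"),
    ((19.0, 24.0, 7.0), "tribe_B"),
    ((20.0, 26.0, 8.0), "tribe_B"),
]

# Chemical profile of one hypothetical species: three compounds.
SPECIES_PROFILE = [(15.0, 21.0, 4.0), (16.0, 21.0, 5.0), (19.0, 25.0, 7.0)]

predicted = assign_species(SPECIES_PROFILE, TRAINING)
print(predicted)
```

In this toy profile two of the three compounds fall nearest to tribe_A examples, so the vote assigns the species to tribe_A even though one compound points to tribe_B.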
