Feature Selection Methods for Multiphase Reactors Data Classification
Abstract
The design of reliable data-driven classifiers able to predict flow regimes in trickle beds or initial bed behavior (contraction or expansion) in three-phase fluidized beds requires, as a first step, the identification of a restricted number of salient variables among the many available features. Reducing the dimensionality of the feature space is motivated by the fact that fewer training samples may be required, more reliable estimates of the classifier parameters may be obtained, and classification accuracy may improve. This work investigates several methodologies for identifying the relevant features in two classification problems from a multiphase reactor context. The relevance of feature subsets was assessed using the mutual information between the subsets and the class variable (filter approach) and the accuracy rate of a one-nearest-neighbor classifier (wrapper approach). The algorithms investigated for generating candidate subsets that maximize these relevance criteria were sequential forward selection and plus-l-take-away-r. A conceptually different approach to feature ranking was also tested, based on Garson's saliency indices derived from the weights of classification neural networks. The reliability of the feature selection methodologies was first evaluated on two benchmark problems (a synthetic problem and Anderson's iris data). They were then applied to the two multiphase reactor classification problems with the goal of identifying the most appropriate feature subsets for use in classifiers. Finally, a new feature selection algorithm that combines filter and wrapper techniques proved to yield the same solutions as the wrapper technique while being less computationally expensive.
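The wrapper approach described in the abstract can be illustrated with a minimal sketch: sequential forward selection driven by the accuracy of a one-nearest-neighbor classifier. This is not the authors' implementation. The function names, the leave-one-out accuracy estimate, the tie-breaking rule, and the toy data (one informative feature plus two noise features) are all assumptions made for illustration.

```python
import random


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for j in range(len(X)):
            if i == j:
                continue  # the held-out sample cannot be its own neighbor
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, k):
    """Start from an empty subset and greedily add, one at a time, the
    feature whose inclusion gives the highest criterion value, until
    k features are selected."""
    selected, history = [], []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        scores = [(loo_1nn_accuracy(X, y, selected + [f]), f) for f in remaining]
        # Assumed tie-breaking rule: prefer the lowest feature index.
        best_score, best_f = max(scores, key=lambda s: (s[0], -s[1]))
        selected.append(best_f)
        remaining.remove(best_f)
        history.append((list(selected), best_score))
    return selected, history


# Hypothetical toy data: feature 0 separates the two classes,
# features 1 and 2 are pure noise.
rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(30):
        X.append([rng.gauss(3.0 * label, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)

selected, history = sequential_forward_selection(X, y, k=2)
for subset, score in history:
    print(subset, round(score, 3))
```

Replacing `loo_1nn_accuracy` with an estimate of the mutual information between the subset and the class variable would turn the same search loop into a filter-style selection, since the search strategy and the relevance criterion are independent of each other.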
