On the development of conjunctival hyperemia computer-assisted diagnosis tools: Influence of feature selection and class imbalance in automatic gradings


Authors: María Luisa Sánchez Brea^a (luisa.brea@udc.es) ; Noelia Barreira Rodríguez^a (nbarreira@udc.es) ; Noelia Sánchez Maroño^b (nsanchez@udc.es) ; Antonio Mosquera González^c (antonio.mosquera@usc.es) ; Carlos García-Resúa^d (carlos.garcia.resua@usc.es) ; María Jesús Giráldez Fernández^d (mjesus.giraldez@usc.es)
Keywords: ROI, region of interest ; CFS, correlation based feature selection ; MLP, multi-layer perceptron ; RBFN, radial basis function network ; RF, random forest ; MSE, mean square error ; ANNs, artificial neural networks
Journal: Artificial Intelligence in Medicine
Publication year: 2016
Publication date: July 2016
Volume: 71
Issue: Complete
Pages: 30-42
Full text size: 3126 K

Abstract

The sudden increase of blood flow in the bulbar conjunctiva, known as hyperemia, is associated with a red hue of variable intensity. Experts measure hyperemia using levels in a grading scale, a procedure that is subjective, non-repeatable and time-consuming, thus creating a need for its automation. However, the task is far from straightforward due to data issues such as class imbalance and correlated features. In this paper, we study the specific features of hyperemia and propose various approaches to address these problems in the context of an automatic framework for hyperemia grading.

Methodology

Oversampling, undersampling and SMOTE approaches were applied to tackle the problem of class imbalance. Twenty-five features were computed for each image, and regression methods were then used to transform them into a value on the grading scale. The values and relationships among features and experts’ values were analysed, and five feature selection techniques were subsequently studied.
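
The core idea behind SMOTE, as named above, can be sketched as follows: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is only a minimal illustration under stated assumptions, not the authors' implementation. The function name `smote`, the parameters `k` and `seed`, and the toy feature vectors are all hypothetical, and the abstract does not say how grading values were assigned to synthetic samples, so only the features are interpolated here.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between
    minority-class samples and their nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        u = rng.random()
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: three feature values per image for an under-represented grade.
minority_samples = [
    [0.10, 0.20, 0.30],
    [0.15, 0.25, 0.35],
    [0.20, 0.10, 0.40],
    [0.12, 0.22, 0.31],
]
new_samples = smote(minority_samples, n_new=5)
```

Because every synthetic point is a convex combination of two real samples, it always lies within the range already spanned by the minority class, which is what distinguishes this approach from plain oversampling by duplication.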

Results

The lowest mean square error (MSE) for the regression systems trained with individual features is below 0.1 for both scales. Multi-layer perceptron (MLP) obtains the best values, but is less consistent than the random forest (RF) method. When all features are combined, the best results for both scales are achieved with MLP. Correlation based feature selection (CFS) and M5 provide the best results, MSE = 0.108 and MSE = 0.061 respectively. Finally, the class imbalance problem is minimised with the SMOTE approach for both scales (MSE < 0.006).
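
For reference, the mean square error reported above is the average of the squared differences between the system's output and the expert's grading. The sketch below shows that computation only. The function name `mean_square_error` and the grading values are hypothetical and are not taken from the paper.

```python
def mean_square_error(predicted, expert):
    """Average of squared differences between predicted and expert gradings."""
    if len(predicted) != len(expert) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - e) ** 2 for p, e in zip(predicted, expert)) / len(predicted)


# Hypothetical gradings on an arbitrary scale, for illustration only.
system_grades = [1.0, 2.5, 3.0, 4.0]
expert_grades = [1.0, 2.0, 3.5, 4.0]
mse = mean_square_error(system_grades, expert_grades)  # 0.125
```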

Conclusions

Machine learning methods are able to perform an objective assessment of hyperemia grading, removing both intra- and inter-expert subjectivity while providing a gain in computation time. SMOTE and oversampling approaches minimise the class imbalance problem, while feature selection reduces the number of features from 25 to 3–5 without worsening the MSE. As the differences between the system and a human expert are similar to the differences between experts, we conclude that the system behaves like an expert.
