Semi-supervised self-training for decision tree classifiers
  • Authors: Jafar Tanha; Maarten van Someren…
  • Keywords: Semi-supervised learning; Self-training; Ensemble learning; Decision tree learning
  • Journal: International Journal of Machine Learning and Cybernetics
  • Published: February 2017
  • Volume: 8
  • Issue: 1
  • Pages: 355-370
  • Full text size: 1119 KB
  • Journal category: Engineering
  • Journal subjects: Computational Intelligence; Artificial Intelligence (incl. Robotics); Control, Robotics, Mechatronics; Complex Systems; Systems Biology; Pattern Recognition
  • Publisher: Springer Berlin Heidelberg
  • ISSN: 1868-808X
Abstract
We consider semi-supervised learning, the task of learning from both labeled and unlabeled instances, and in particular self-training with decision tree learners as base learners. We show that standard decision tree learning as the base learner is not effective in a self-training algorithm for semi-supervised learning. The main reason is that the basic decision tree learner does not produce reliable probability estimates for its predictions, so these estimates cannot serve as a proper selection criterion in self-training. We consider the effect of several modifications to the basic decision tree learner that produce better probability estimates than the raw class distributions at the leaves of the tree. We show that these modifications do not improve performance when used on the labeled data only, but they do allow the learner to benefit more from the unlabeled data in self-training. The modifications that we consider are Naive Bayes Tree, a combination of no-pruning and Laplace correction, grafting, and a distance-based measure. We then extend this improvement to algorithms for ensembles of decision trees and show that the ensemble learner gives an extra improvement over the adapted decision tree learners.
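To illustrate the general scheme the abstract describes, the following is a minimal sketch of self-training with a decision tree base learner, where an unpruned tree with Laplace-corrected leaf probabilities supplies the confidence scores used to select pseudo-labels. This is not the authors' exact algorithm: the function names, the confidence threshold of 0.9, the iteration cap, and the use of scikit-learn's DecisionTreeClassifier in place of the paper's base learner are all illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def laplace_leaf_proba(tree, X_train, y_train, X):
    """Class probabilities from leaf counts with a Laplace correction:
    p(c | leaf) = (n_c + 1) / (n + K), where n_c is the number of training
    examples of class c in the leaf, n the leaf size, and K the number of classes."""
    classes = list(tree.classes_)
    k = len(classes)
    class_idx = {c: i for i, c in enumerate(classes)}
    counts = {}
    for leaf, label in zip(tree.apply(X_train), y_train):
        counts.setdefault(leaf, np.zeros(k))[class_idx[label]] += 1.0
    proba = np.empty((len(X), k))
    for row, leaf in enumerate(tree.apply(X)):
        c = counts.get(leaf, np.zeros(k))
        proba[row] = (c + 1.0) / (c.sum() + k)
    return proba

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_iter=10):
    """Iteratively pseudo-label the most confidently predicted unlabeled examples."""
    X_lab, y_lab, X_unlab = map(np.asarray, (X_lab, y_lab, X_unlab))
    tree = DecisionTreeClassifier().fit(X_lab, y_lab)  # unpruned by default
    for _ in range(max_iter):
        if len(X_unlab) == 0:
            break
        proba = laplace_leaf_proba(tree, X_lab, y_lab, X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        pseudo = tree.classes_[proba.argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo[confident]])
        X_unlab = X_unlab[~confident]
        tree = DecisionTreeClassifier().fit(X_lab, y_lab)  # retrain on enlarged set
    return tree
```

The Laplace correction smooths the raw leaf frequencies, which is what makes the confidence ranking over unlabeled examples meaningful; the paper's other modifications (Naive Bayes Tree, grafting, a distance-based measure) play the same role of producing better-calibrated selection scores.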