Active learning for penalized logistic regression via sequential experimental design
详细信息    查看全文
文摘
Penalized logistic regression is useful for classification that not only provides class probability estimates but also can overcome overfitting problem. Traditionally, supervised classifier learning has required a lot of labeled data. Due to technical innovation, it is easy to collect large amounts of unlabeled data, while labeling is usually expensive and difficult. Active learning aims to select the most informative subjects for labeling to decrease the amount of labeling requests. Recently, active learning using experimental design techniques have attracted considerable attention. The typical criteria attempt to reduce the generalization error of a model by minimizing either its estimation variance or estimation bias. However, they fail to take into account both components simultaneously. In this article, we introduce a new algorithm of active learning using penalized logistic regression. The most informative subjects are selected as those with the smallest mean squared estimation error. This criterion, integrated with the idea of sequential design, is exploited in our algorithms to guide a procedure for a new subject selection. Experiments on extensive real-world data sets demonstrate the effectiveness and efficiency of the proposed method compared to several state-of-the-art active-learning alternatives.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700