Software measurement data reduction using ensemble techniques

Authors: Huanjing Wang^a (huanjing.wang@wku.edu); Taghi M. Khoshgoftaar^b (khoshgof@fau.edu); Amri Napolitano^b (amrifau@gmail.com)
Keywords: Ensembles of feature ranking techniques; Feature selection; Defect prediction
Journal: Neurocomputing
Year: 2012
Journal code: 95_09252312
Category: cp
Publication date: 1 September 2012
Volume: 92
Issue: Complete
Pages: 124-132
File size: 273 K

Abstract

Software defect prediction models are used to identify program modules that are high-risk, or likely to have a high number of faults. These models are built using software metrics which are collected during the software development process. Various techniques and approaches have been created for improving fault predictions. One of these is feature (metric) selection. Selecting the most relevant features is essential to improving the effectiveness of defect predictors. However, using a single feature subset selection method may generate local optima. Ensembles of feature selection methods attempt to combine multiple feature selection methods instead of using a single one. In this paper, we present a comprehensive empirical study examining 17 different ensembles of feature ranking techniques (rankers) including six commonly used feature ranking techniques, the signal-to-noise filter technique, and 11 threshold-based feature ranking techniques. This study utilized 16 real-world software measurement data sets of different sizes and built 54,400 classification models using four well-known classifiers. The main conclusion is that ensembles of very few rankers are very effective and even better than ensembles of many or all rankers.
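
To illustrate what combining several rankers can look like, here is a minimal Python sketch. The abstract does not state the aggregation rule, so averaging each feature's rank across rankers is an assumption, not necessarily the method used in the paper. The metric names, the scores, and the helpers `ranks_from_scores` and `ensemble_ranking` are hypothetical and exist only for this example.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Map each feature to its rank (1 = best), given feature -> score (higher is better)."""
    ordered = sorted(scores, key=lambda feature: scores[feature], reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def ensemble_ranking(score_tables):
    """Combine several rankers by averaging each feature's rank (assumed aggregation rule)."""
    rank_tables = [ranks_from_scores(scores) for scores in score_tables]
    features = list(rank_tables[0])
    mean_rank = {f: mean(ranks[f] for ranks in rank_tables) for f in features}
    # Lower mean rank is better; ties are broken by feature name for determinism.
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical scores from three rankers for four made-up software metrics.
ranker_a = {"loc": 0.9, "cyclomatic": 0.7, "fan_in": 0.2, "comments": 0.1}
ranker_b = {"loc": 0.6, "cyclomatic": 0.8, "fan_in": 0.3, "comments": 0.4}
ranker_c = {"loc": 0.5, "cyclomatic": 0.9, "fan_in": 0.1, "comments": 0.2}

combined = ensemble_ranking([ranker_a, ranker_b, ranker_c])
top_two = combined[:2]  # keep only the two best-ranked metrics
```

A classifier would then be trained on the selected metrics only. Passing a shorter list of score tables to `ensemble_ranking` corresponds to an ensemble of fewer rankers, which is the case the abstract reports as most effective.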
