Cross-project Software Defect Prediction Based on Transfer Learning (基于迁移学习的跨项目软件缺陷预测)
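  • English title: Cross-project Software Defect Prediction Based on Transfer Learning
  • Authors: 张洋洋 (ZHANG Yang-yang); 荆晓远 (JING Xiao-yuan); 吴飞 (WU Fei)
  • Affiliation: School of Automation, Nanjing University of Posts and Telecommunications (南京邮电大学自动化学院)
  • Keywords: software defect prediction; transfer learning; feature mapping; machine learning
  • Chinese journal title (code): WJFZ
  • English journal title: Computer Technology and Development
  • Publication date: 2018-07-04 10:54
  • Publisher: 计算机技术与发展 (Computer Technology and Development)
  • Year: 2018
  • Issue: v.28; No.260
  • Funding: National Natural Science Foundation of China (61702280); Natural Science Foundation of Jiangsu Province (BK20170900); Natural Science Research Project of Jiangsu Higher Education Institutions (17KJB520025); Nanjing University of Posts and Telecommunications Research Start-up Fund for Introduced Talents (NY217009)
  • Language: Chinese
  • Pages: WJFZ201812018
  • Page count: 4
  • CN: 12
  • ISSN: 61-1450/TP
  • Classification number: 89-91+96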
Abstract
Software defect prediction plays an important role in improving software quality and in controlling and balancing software costs, and it is an active area of software engineering. Researchers have proposed many prediction techniques that address different problems at different levels. Traditional defect prediction algorithms often perform poorly in cross-project defect prediction because the training samples (source data) and the test samples (target data) follow different distributions. To address this problem, a cross-project software defect prediction algorithm based on transfer learning is proposed. The algorithm first adopts a distance measure between distributions and trains a model that minimizes both the distribution difference and the conditional distribution difference between the training data and the test data, so that in the mapped feature space the two data sets have almost the same distribution. A traditional machine learning algorithm can then be used for classification. Experimental results show that the proposed algorithm achieves good prediction performance.
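The pipeline described in the abstract (measure the distribution gap, learn a mapping that shrinks it, then classify) can be illustrated with a small sketch. This is not the authors' implementation: the abstract names neither the distance measure nor the classifier, so the sketch assumes a linear-kernel maximum mean discrepancy (MMD), a linear projection in the style of joint distribution adaptation (Long et al., 2013), and a nearest-centroid classifier. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def mmd_matrix(ys, yt_pseudo, nt):
    """MMD coefficient matrix: marginal term plus per-class (conditional) terms."""
    ns = len(ys)
    n = ns + nt
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    M = np.outer(e, e)                        # marginal distribution term
    if yt_pseudo is not None:                 # conditional terms need target pseudo-labels
        for c in np.unique(ys):
            s_idx = np.flatnonzero(ys == c)
            t_idx = np.flatnonzero(yt_pseudo == c) + ns
            if len(t_idx) == 0:
                continue
            e = np.zeros(n)
            e[s_idx] = 1.0 / len(s_idx)
            e[t_idx] = -1.0 / len(t_idx)
            M += np.outer(e, e)
    return M

def fit_projection(X, M, k, lam=1.0, eps=1e-6):
    """Minimize tr(W^T (X^T M X + lam I) W) subject to W^T (X^T H X) W = I."""
    n, d = X.shape
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    A = X.T @ M @ X + lam * np.eye(d)
    B = X.T @ H @ X + eps * np.eye(d)         # eps keeps B positive definite
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    C = L_inv @ A @ L_inv.T                   # symmetric standard eigenproblem
    _, vecs = np.linalg.eigh((C + C.T) / 2)   # eigenvalues in ascending order
    return L_inv.T @ vecs[:, :k]              # keep the k smallest

def nearest_centroid_predict(Zs, ys, Zt):
    """Stand-in for the 'traditional' classifier (an assumption of this sketch)."""
    classes = np.unique(ys)
    centroids = np.stack([Zs[ys == c].mean(axis=0) for c in classes])
    dists = ((Zt[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def transfer_predict(Xs, ys, Xt, k=2, lam=1.0, n_iter=5):
    """Alternate between fitting the mapping and refreshing target pseudo-labels."""
    X = np.vstack([Xs, Xt])
    ns = len(Xs)
    yt = None                                 # first pass uses the marginal term only
    for _ in range(n_iter):
        W = fit_projection(X, mmd_matrix(ys, yt, len(Xt)), k, lam)
        Z = X @ W
        yt = nearest_centroid_predict(Z[:ns], ys, Z[ns:])
    return yt, W

def normalized_gap(Z, ns):
    """Squared distance between source and target means, relative to total variance."""
    diff = Z[:ns].mean(axis=0) - Z[ns:].mean(axis=0)
    return float(diff @ diff / Z.var(axis=0).sum())

def make_project(rng, n, shift):
    """Synthetic 'project': feature 1 carries the defect signal, feature 0 the domain shift."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 4))
    X[:, 1] += 2.5 * y
    X[:, 0] += shift
    return X, y

rng = np.random.default_rng(0)
Xs, ys = make_project(rng, 100, shift=0.0)       # source project (labeled)
Xt, yt_true = make_project(rng, 100, shift=3.0)  # target project (labels used only for scoring)
yt_pred, W = transfer_predict(Xs, ys, Xt)
X_all = np.vstack([Xs, Xt])
print("normalized mean gap before mapping:", normalized_gap(X_all, len(Xs)))
print("normalized mean gap after mapping: ", normalized_gap(X_all @ W, len(Xs)))
print("target accuracy:", (yt_pred == yt_true).mean())
```

In this toy setup the projection tends to discard the shifted feature while keeping the one that separates the classes, which is the effect the abstract attributes to the mapped feature space. Real defect data sets would need feature scaling and a tuned `k` and `lam`, and any standard classifier could replace the nearest-centroid step.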
